Re: [PATCH v10 33/33] counter: 104-quad-8: Add IRQ support for the ACCES 104-QUAD-8

2021-04-15 Thread Syed Nayyar Waris
On Fri, Mar 19, 2021 at 08:00:52PM +0900, William Breathitt Gray wrote:
> The LSI/CSI LS7266R1 chip provides programmable output via the FLG pins.
> When interrupts are enabled on the ACCES 104-QUAD-8, they occur whenever
> FLG1 is active. Four functions are available for the FLG1 signal: Carry,
> Compare, Carry-Borrow, and Index.
> 
>   Carry:
>   Interrupt generated on active low Carry signal. Carry
>   signal toggles every time the respective channel's
>   counter overflows.
> 
>   Compare:
>   Interrupt generated on active low Compare signal.
>   Compare signal toggles every time the respective channel's
>   preset register is equal to the respective channel's
>   counter.
> 
>   Carry-Borrow:
>   Interrupt generated on active low Carry signal and
>   active low Borrow signal. Carry signal toggles every
>   time the respective channel's counter overflows. Borrow
>   signal toggles every time the respective channel's
>   counter underflows.
> 
>   Index:
>   Interrupt generated on active high Index signal.
> 
> These four functions correspond respectively to the following four
> Counter event types: COUNTER_EVENT_OVERFLOW, COUNTER_EVENT_THRESHOLD,
> COUNTER_EVENT_OVERFLOW_UNDERFLOW, and COUNTER_EVENT_INDEX. Interrupts
> push Counter events to event channel X, where 'X' is the respective
> channel whose FLG1 activated.
> 
> This patch adds IRQ support for the ACCES 104-QUAD-8. The interrupt line
> numbers for the devices may be configured via the irq array module
> parameter.
> 
> Cc: Syed Nayyar Waris 
> Signed-off-by: William Breathitt Gray 
> ---
>  drivers/counter/104-quad-8.c | 167 +--
>  drivers/counter/Kconfig  |   6 +-
>  2 files changed, 164 insertions(+), 9 deletions(-)

Acked-by: Syed Nayyar Waris 

> 
> diff --git a/drivers/counter/104-quad-8.c b/drivers/counter/104-quad-8.c
> index d46b8101f207..09b0b0ba8fe7 100644
> --- a/drivers/counter/104-quad-8.c
> +++ b/drivers/counter/104-quad-8.c
> @@ -11,6 +11,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
> @@ -25,6 +26,10 @@ static unsigned int num_quad8;
>  module_param_hw_array(base, uint, ioport, &num_quad8, 0);
>  MODULE_PARM_DESC(base, "ACCES 104-QUAD-8 base addresses");
>  
> +static unsigned int irq[max_num_isa_dev(QUAD8_EXTENT)];
> +module_param_hw_array(irq, uint, irq, NULL, 0);
> +MODULE_PARM_DESC(irq, "ACCES 104-QUAD-8 interrupt line numbers");
> +
>  #define QUAD8_NUM_COUNTERS 8
>  
>  /**
> @@ -38,6 +43,8 @@ MODULE_PARM_DESC(base, "ACCES 104-QUAD-8 base addresses");
>   * @quadrature_scale: array of quadrature mode scale configurations
>   * @ab_enable: array of A and B inputs enable configurations
>   * @preset_enable: array of set_to_preset_on_index attribute configurations
> + * @irq_trigger: array of current IRQ trigger function configurations
> + * @next_irq_trigger: array of next IRQ trigger function configurations
>   * @synchronous_mode: array of index function synchronous mode configurations
>   * @index_polarity: array of index function polarity configurations
>   * @cable_fault_enable: differential encoder cable status enable configurations
> @@ -53,13 +60,17 @@ struct quad8 {
>   unsigned int quadrature_scale[QUAD8_NUM_COUNTERS];
>   unsigned int ab_enable[QUAD8_NUM_COUNTERS];
>   unsigned int preset_enable[QUAD8_NUM_COUNTERS];
> + unsigned int irq_trigger[QUAD8_NUM_COUNTERS];
> + unsigned int next_irq_trigger[QUAD8_NUM_COUNTERS];
>   unsigned int synchronous_mode[QUAD8_NUM_COUNTERS];
>   unsigned int index_polarity[QUAD8_NUM_COUNTERS];
>   unsigned int cable_fault_enable;
>   unsigned int base;
>  };
>  
> +#define QUAD8_REG_INTERRUPT_STATUS 0x10
>  #define QUAD8_REG_CHAN_OP 0x11
> +#define QUAD8_REG_INDEX_INTERRUPT 0x12
>  #define QUAD8_REG_INDEX_INPUT_LEVELS 0x16
>  #define QUAD8_DIFF_ENCODER_CABLE_STATUS 0x17
>  /* Borrow Toggle flip-flop */
> @@ -92,8 +103,8 @@ struct quad8 {
>  #define QUAD8_RLD_CNTR_OUT 0x10
>  /* Transfer Preset Register LSB to FCK Prescaler */
>  #define QUAD8_RLD_PRESET_PSC 0x18
> -#define QUAD8_CHAN_OP_ENABLE_COUNTERS 0x00
>  #define QUAD8_CHAN_OP_RESET_COUNTERS 0x01
> +#define QUAD8_CHAN_OP_ENABLE_INTERRUPT_FUNC 0x04
>  #define QUAD8_CMR_QUADRATURE_X1 0x08
>  #define QUAD8_CMR_QUADRATURE_X2 0x10
>  #define QUAD8_CMR_QUADRATURE_X4 0x18
> @@ -378,13 +389,103 @@
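
The correspondence between the four FLG1 functions and the four Counter event types described in the commit message can be pictured as a small lookup table. This is only a userspace sketch: the enum flg1_function constants and the flg1_to_event() helper are hypothetical names made up for illustration and are not taken from the driver; only the event type names come from the commit message.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the four FLG1 functions named above. */
enum flg1_function {
	FLG1_CARRY,
	FLG1_COMPARE,
	FLG1_CARRY_BORROW,
	FLG1_INDEX,
	FLG1_NUM_FUNCTIONS
};

/* Event type names as listed in the commit message, kept as strings. */
static const char *const flg1_event_name[FLG1_NUM_FUNCTIONS] = {
	[FLG1_CARRY]        = "COUNTER_EVENT_OVERFLOW",
	[FLG1_COMPARE]      = "COUNTER_EVENT_THRESHOLD",
	[FLG1_CARRY_BORROW] = "COUNTER_EVENT_OVERFLOW_UNDERFLOW",
	[FLG1_INDEX]        = "COUNTER_EVENT_INDEX",
};

/* Return the event name for a FLG1 function, or NULL if out of range. */
static const char *flg1_to_event(unsigned int function)
{
	if (function >= FLG1_NUM_FUNCTIONS)
		return NULL;
	return flg1_event_name[function];
}
```

Using designated initializers keeps each entry next to the function it belongs to, so reordering the enum does not silently change the mapping.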

Re: [PATCH v10 32/33] counter: 104-quad-8: Replace mutex with spinlock

2021-04-15 Thread Syed Nayyar Waris
On Fri, Mar 19, 2021 at 08:00:51PM +0900, William Breathitt Gray wrote:
> This patch replaces the mutex I/O lock with a spinlock. This is in
> preparation for a subsequent patch adding IRQ support for 104-QUAD-8
> devices; we can't sleep in an interrupt context, so we'll need to use a
> spinlock instead.
> 
> Cc: Syed Nayyar Waris 
> Signed-off-by: William Breathitt Gray 
> ---
>  drivers/counter/104-quad-8.c | 90 +---
>  1 file changed, 53 insertions(+), 37 deletions(-)

Acked-by: Syed Nayyar Waris 

> 
> diff --git a/drivers/counter/104-quad-8.c b/drivers/counter/104-quad-8.c
> index eb7d63769f4c..d46b8101f207 100644
> --- a/drivers/counter/104-quad-8.c
> +++ b/drivers/counter/104-quad-8.c
> @@ -16,6 +16,7 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  
>  #define QUAD8_EXTENT 32
>  
> @@ -28,6 +29,7 @@ MODULE_PARM_DESC(base, "ACCES 104-QUAD-8 base addresses");
>  
>  /**
>   * struct quad8 - device private data structure
> + * @lock: lock to prevent clobbering device states during R/W ops
>   * @counter: instance of the counter_device
>   * @fck_prescaler:   array of filter clock prescaler configurations
>   * @preset:  array of preset values
> @@ -42,7 +44,7 @@ MODULE_PARM_DESC(base, "ACCES 104-QUAD-8 base addresses");
>   * @base: base port address of the device
>   */
>  struct quad8 {
> - struct mutex lock;
> + spinlock_t lock;
>   struct counter_device counter;
>   unsigned int fck_prescaler[QUAD8_NUM_COUNTERS];
>   unsigned int preset[QUAD8_NUM_COUNTERS];
> @@ -123,6 +125,7 @@ static int quad8_count_read(struct counter_device *counter,
>   unsigned int flags;
>   unsigned int borrow;
>   unsigned int carry;
> + unsigned long irqflags;
>   int i;
>  
>   flags = inb(base_offset + 1);
> @@ -132,7 +135,7 @@ static int quad8_count_read(struct counter_device *counter,
>   /* Borrow XOR Carry effectively doubles count range */
>   *val = (unsigned long)(borrow ^ carry) << 24;
>  
> - mutex_lock(&priv->lock);
> + spin_lock_irqsave(&priv->lock, irqflags);
>  
>   /* Reset Byte Pointer; transfer Counter to Output Latch */
>   outb(QUAD8_CTR_RLD | QUAD8_RLD_RESET_BP | QUAD8_RLD_CNTR_OUT,
> @@ -141,7 +144,7 @@ static int quad8_count_read(struct counter_device *counter,
>   for (i = 0; i < 3; i++)
>   *val |= (unsigned long)inb(base_offset) << (8 * i);
>  
> - mutex_unlock(&priv->lock);
> + spin_unlock_irqrestore(&priv->lock, irqflags);
>  
>   return 0;
>  }
> @@ -151,13 +154,14 @@ static int quad8_count_write(struct counter_device *counter,
>  {
>   struct quad8 *const priv = counter->priv;
>   const int base_offset = priv->base + 2 * count->id;
> + unsigned long irqflags;
>   int i;
>  
>   /* Only 24-bit values are supported */
>   if (val > 0xFF)
>   return -ERANGE;
>  
> - mutex_lock(&priv->lock);
> + spin_lock_irqsave(&priv->lock, irqflags);
>  
>   /* Reset Byte Pointer */
>   outb(QUAD8_CTR_RLD | QUAD8_RLD_RESET_BP, base_offset + 1);
> @@ -182,7 +186,7 @@ static int quad8_count_write(struct counter_device *counter,
>   /* Reset Error flag */
>   outb(QUAD8_CTR_RLD | QUAD8_RLD_RESET_E, base_offset + 1);
>  
> - mutex_unlock(&priv->lock);
> + spin_unlock_irqrestore(&priv->lock, irqflags);
>  
>   return 0;
>  }
> @@ -200,8 +204,9 @@ static int quad8_function_read(struct counter_device *counter,
>  {
>   struct quad8 *const priv = counter->priv;
>   const int id = count->id;
> + unsigned long irqflags;
>  
> - mutex_lock(&priv->lock);
> + spin_lock_irqsave(&priv->lock, irqflags);
>  
>   if (priv->quadrature_mode[id])
>   switch (priv->quadrature_scale[id]) {
> @@ -218,7 +223,7 @@ static int quad8_function_read(struct counter_device *counter,
>   else
>   *function = COUNTER_FUNCTION_PULSE_DIRECTION;
>  
> - mutex_unlock(&priv->lock);
> + spin_unlock_irqrestore(&priv->lock, irqflags);
>  
>   return 0;
>  }
> @@ -233,10 +238,11 @@ static int quad8_function_write(struct counter_device *counter,
>   unsigned int *const scale = priv->quadrature_scale + id;
>   unsigned int *const synchronous_mode = priv->synchronous_mode + id;
>   const int base_offset = priv->base + 2 * id + 1;
> + unsigned long irqflags;
>   unsigned 
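
The arithmetic that the quoted quad8_count_read() hunk performs while holding the lock can be tried out in userspace. The sketch below replaces the port reads with plain parameters; quad8_assemble_count() is a hypothetical helper name, and the borrow and carry arguments are assumed to be the 0/1 flip-flop states extracted from the flag register.

```c
#include <stdint.h>

/*
 * Combine the borrow/carry flip-flop states and the three counter bytes
 * (least significant byte first) into one value, mirroring the arithmetic
 * in the quoted hunk. No hardware access is performed here.
 */
static unsigned long quad8_assemble_count(unsigned int borrow,
					  unsigned int carry,
					  const uint8_t bytes[3])
{
	/* Borrow XOR Carry acts as an extra bit above the 24-bit counter */
	unsigned long val = (unsigned long)((borrow ^ carry) & 1u) << 24;
	int i;

	/* Each successive read yields the next more significant byte */
	for (i = 0; i < 3; i++)
		val |= (unsigned long)bytes[i] << (8 * i);

	return val;
}
```

In the driver the three byte reads share a byte pointer inside the device, which is why the sequence has to run under the lock: an interleaved access from another context would advance that pointer and corrupt the assembled value.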

Re: [PATCH v10 21/33] counter: Rename counter_count_function to counter_function

2021-04-15 Thread Syed Nayyar Waris
On Fri, Mar 19, 2021 at 08:00:40PM +0900, William Breathitt Gray wrote:
> The phrase "Counter Count function" is verbose and unintentionally
> implies that function is a Count extension. This patch adjusts the
> Counter subsystem code to use the more direct "Counter function" phrase
> to make the intent of this code clearer.
> 
> Cc: Syed Nayyar Waris 
> Cc: Patrick Havelange 
> Cc: Oleksij Rempel 
> Cc: Kamel Bouhara 
> Cc: Fabrice Gasnier 
> Cc: Maxime Coquelin 
> Cc: Alexandre Torgue 
> Cc: David Lechner 
> Signed-off-by: William Breathitt Gray 
> ---
>  drivers/counter/104-quad-8.c    | 10 +++

For the 104-quad-8 driver:
Acked-by: Syed Nayyar Waris 

>  drivers/counter/counter.c   | 38 -
>  drivers/counter/ftm-quaddec.c   |  5 ++--
>  drivers/counter/interrupt-cnt.c |  4 +--
>  drivers/counter/microchip-tcb-capture.c |  4 +--
>  drivers/counter/stm32-lptimer-cnt.c |  6 ++--
>  drivers/counter/stm32-timer-cnt.c   | 10 +++
>  drivers/counter/ti-eqep.c   | 10 +++
>  include/linux/counter.h | 20 ++---
>  9 files changed, 53 insertions(+), 54 deletions(-)
> 
> diff --git a/drivers/counter/104-quad-8.c b/drivers/counter/104-quad-8.c
> index fb0f021c0751..5a49ace2d4a6 100644
> --- a/drivers/counter/104-quad-8.c
> +++ b/drivers/counter/104-quad-8.c
> @@ -194,11 +194,11 @@ enum quad8_count_function {
>   QUAD8_COUNT_FUNCTION_QUADRATURE_X4
>  };
>  
> -static const enum counter_count_function quad8_count_functions_list[] = {
> - [QUAD8_COUNT_FUNCTION_PULSE_DIRECTION] = COUNTER_COUNT_FUNCTION_PULSE_DIRECTION,
> - [QUAD8_COUNT_FUNCTION_QUADRATURE_X1] = COUNTER_COUNT_FUNCTION_QUADRATURE_X1_A,
> - [QUAD8_COUNT_FUNCTION_QUADRATURE_X2] = COUNTER_COUNT_FUNCTION_QUADRATURE_X2_A,
> - [QUAD8_COUNT_FUNCTION_QUADRATURE_X4] = COUNTER_COUNT_FUNCTION_QUADRATURE_X4
> +static const enum counter_function quad8_count_functions_list[] = {
> + [QUAD8_COUNT_FUNCTION_PULSE_DIRECTION] = COUNTER_FUNCTION_PULSE_DIRECTION,
> + [QUAD8_COUNT_FUNCTION_QUADRATURE_X1] = COUNTER_FUNCTION_QUADRATURE_X1_A,
> + [QUAD8_COUNT_FUNCTION_QUADRATURE_X2] = COUNTER_FUNCTION_QUADRATURE_X2_A,
> + [QUAD8_COUNT_FUNCTION_QUADRATURE_X4] = COUNTER_FUNCTION_QUADRATURE_X4
>  };
>  
>  static int quad8_function_get(struct counter_device *counter,
> diff --git a/drivers/counter/counter.c b/drivers/counter/counter.c
> index cb92673552b5..de921e8a3f72 100644
> --- a/drivers/counter/counter.c
> +++ b/drivers/counter/counter.c
> @@ -744,15 +744,15 @@ static ssize_t counter_count_store(struct device *dev,
>   return len;
>  }
>  
> -static const char *const counter_count_function_str[] = {
> - [COUNTER_COUNT_FUNCTION_INCREASE] = "increase",
> - [COUNTER_COUNT_FUNCTION_DECREASE] = "decrease",
> - [COUNTER_COUNT_FUNCTION_PULSE_DIRECTION] = "pulse-direction",
> - [COUNTER_COUNT_FUNCTION_QUADRATURE_X1_A] = "quadrature x1 a",
> - [COUNTER_COUNT_FUNCTION_QUADRATURE_X1_B] = "quadrature x1 b",
> - [COUNTER_COUNT_FUNCTION_QUADRATURE_X2_A] = "quadrature x2 a",
> - [COUNTER_COUNT_FUNCTION_QUADRATURE_X2_B] = "quadrature x2 b",
> - [COUNTER_COUNT_FUNCTION_QUADRATURE_X4] = "quadrature x4"
> +static const char *const counter_function_str[] = {
> + [COUNTER_FUNCTION_INCREASE] = "increase",
> + [COUNTER_FUNCTION_DECREASE] = "decrease",
> + [COUNTER_FUNCTION_PULSE_DIRECTION] = "pulse-direction",
> + [COUNTER_FUNCTION_QUADRATURE_X1_A] = "quadrature x1 a",
> + [COUNTER_FUNCTION_QUADRATURE_X1_B] = "quadrature x1 b",
> + [COUNTER_FUNCTION_QUADRATURE_X2_A] = "quadrature x2 a",
> + [COUNTER_FUNCTION_QUADRATURE_X2_B] = "quadrature x2 b",
> + [COUNTER_FUNCTION_QUADRATURE_X4] = "quadrature x4"
>  };
>  
>  static ssize_t counter_function_show(struct device *dev,
> @@ -764,7 +764,7 @@ static ssize_t counter_function_show(struct device *dev,
>   const struct counter_count_unit *const component = devattr->component;
>   struct counter_count *const count = component->count;
>   size_t func_index;
> - enum counter_count_function function;
> + enum counter_function function;
>  
>   err = counter->ops->function_get(counter, count, &func_index);
>   if (err)
> @@ -773,7 +773,7 @@ static ssize_t counter_function_show(struct device *dev,
>   count->function = func_index;
>  
>   function = count->functions_list[func_index];
> - return sprintf(buf, "%

Re: [PATCH v10 20/33] counter: Rename counter_signal_value to counter_signal_level

2021-04-15 Thread Syed Nayyar Waris
On Fri, Mar 19, 2021 at 08:00:39PM +0900, William Breathitt Gray wrote:
> Signal values will always be levels so let's be explicit about it to
> make the intent of the code clear.
> 
> Cc: Syed Nayyar Waris 
> Cc: Oleksij Rempel 
> Cc: Kamel Bouhara 
> Reviewed-by: David Lechner 
> Signed-off-by: William Breathitt Gray 
> ---
>  drivers/counter/104-quad-8.c|  5 +++--

For the 104-quad-8 driver:
Acked-by: Syed Nayyar Waris 

>  drivers/counter/counter.c   | 12 ++--
>  drivers/counter/interrupt-cnt.c |  4 ++--
>  drivers/counter/microchip-tcb-capture.c |  4 ++--
>  include/linux/counter.h | 12 ++--
>  5 files changed, 19 insertions(+), 18 deletions(-)
> 
> diff --git a/drivers/counter/104-quad-8.c b/drivers/counter/104-quad-8.c
> index 0409b1771fd9..fb0f021c0751 100644
> --- a/drivers/counter/104-quad-8.c
> +++ b/drivers/counter/104-quad-8.c
> @@ -97,7 +97,8 @@ struct quad8 {
>  #define QUAD8_CMR_QUADRATURE_X4 0x18
>  
>  static int quad8_signal_read(struct counter_device *counter,
> - struct counter_signal *signal, enum counter_signal_value *val)
> +  struct counter_signal *signal,
> +  enum counter_signal_level *level)
>  {
>   const struct quad8 *const priv = counter->priv;
>   unsigned int state;
> @@ -109,7 +110,7 @@ static int quad8_signal_read(struct counter_device *counter,
>   state = inb(priv->base + QUAD8_REG_INDEX_INPUT_LEVELS)
>   & BIT(signal->id - 16);
>  
> - *val = (state) ? COUNTER_SIGNAL_HIGH : COUNTER_SIGNAL_LOW;
> + *level = (state) ? COUNTER_SIGNAL_LEVEL_HIGH : COUNTER_SIGNAL_LEVEL_LOW;
>  
>   return 0;
>  }
> diff --git a/drivers/counter/counter.c b/drivers/counter/counter.c
> index 6a683d086008..cb92673552b5 100644
> --- a/drivers/counter/counter.c
> +++ b/drivers/counter/counter.c
> @@ -289,9 +289,9 @@ struct counter_signal_unit {
>   struct counter_signal *signal;
>  };
>  
> -static const char *const counter_signal_value_str[] = {
> - [COUNTER_SIGNAL_LOW] = "low",
> - [COUNTER_SIGNAL_HIGH] = "high"
> +static const char *const counter_signal_level_str[] = {
> + [COUNTER_SIGNAL_LEVEL_LOW] = "low",
> + [COUNTER_SIGNAL_LEVEL_HIGH] = "high"
>  };
>  
>  static ssize_t counter_signal_show(struct device *dev,
> @@ -302,13 +302,13 @@ static ssize_t counter_signal_show(struct device *dev,
>   const struct counter_signal_unit *const component = devattr->component;
>   struct counter_signal *const signal = component->signal;
>   int err;
> - enum counter_signal_value val;
> + enum counter_signal_level level;
>  
> - err = counter->ops->signal_read(counter, signal, &val);
> + err = counter->ops->signal_read(counter, signal, &level);
>   if (err)
>   return err;
>  
> - return sprintf(buf, "%s\n", counter_signal_value_str[val]);
> + return sprintf(buf, "%s\n", counter_signal_level_str[level]);
>  }
>  
>  struct counter_name_unit {
> diff --git a/drivers/counter/interrupt-cnt.c b/drivers/counter/interrupt-cnt.c
> index f27dea317965..cce579c1c6ae 100644
> --- a/drivers/counter/interrupt-cnt.c
> +++ b/drivers/counter/interrupt-cnt.c
> @@ -130,7 +130,7 @@ static int interrupt_cnt_function_get(struct counter_device *counter,
>  
>  static int interrupt_cnt_signal_read(struct counter_device *counter,
>struct counter_signal *signal,
> -  enum counter_signal_value *val)
> +  enum counter_signal_level *level)
>  {
>   struct interrupt_cnt_priv *priv = counter->priv;
>   int ret;
> @@ -142,7 +142,7 @@ static int interrupt_cnt_signal_read(struct counter_device *counter,
>   if (ret < 0)
>   return ret;
>  
> - *val = ret ? COUNTER_SIGNAL_HIGH : COUNTER_SIGNAL_LOW;
> + *level = ret ? COUNTER_SIGNAL_LEVEL_HIGH : COUNTER_SIGNAL_LEVEL_LOW;
>  
>   return 0;
>  }
> diff --git a/drivers/counter/microchip-tcb-capture.c b/drivers/counter/microchip-tcb-capture.c
> index 0c9a61962911..6be3adf74114 100644
> --- a/drivers/counter/microchip-tcb-capture.c
> +++ b/drivers/counter/microchip-tcb-capture.c
> @@ -158,7 +158,7 @@ static int mchp_tc_count_function_set(struct counter_device *counter,
>  
>  static int mchp_tc_count_signal_read(struct counter_device *counter,
>struct counter_signal *signal,
> -  enum counter_signal_value *val)
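
For reference, the bit test in the quoted quad8_signal_read() hunk can be sketched without hardware access. The enum signal_level type and the index_signal_level() helper below are hypothetical names, and the assumption that signal ids 16 to 23 are the index inputs of channels 0 to 7 is inferred from BIT(signal->id - 16) in the hunk, not stated anywhere in the thread.

```c
#include <errno.h>

enum signal_level { SIGNAL_LEVEL_LOW, SIGNAL_LEVEL_HIGH };

/*
 * levels_reg stands in for the byte read from the index input levels
 * register. Assumption: ids 16..23 select bits 0..7 of that byte.
 */
static int index_signal_level(unsigned int levels_reg, unsigned int signal_id,
			      enum signal_level *level)
{
	/* Reject ids that do not map onto a bit of the register */
	if (signal_id < 16 || signal_id > 23)
		return -EINVAL;

	*level = (levels_reg & (1u << (signal_id - 16))) ?
		 SIGNAL_LEVEL_HIGH : SIGNAL_LEVEL_LOW;

	return 0;
}
```

Naming the output parameter level rather than val, as the patch does, makes it clear at the call site that the result is one of two levels and not an arbitrary number.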

Re: [PATCH v10 19/33] counter: Standardize to ERANGE for limit exceeded errors

2021-04-15 Thread Syed Nayyar Waris
On Fri, Mar 19, 2021 at 08:00:38PM +0900, William Breathitt Gray wrote:
> ERANGE is a semantically better error code to return when an argument
> value falls outside the supported limit range of a device.
> 
> Cc: Syed Nayyar Waris 
> Cc: Oleksij Rempel 
> Cc: Fabrice Gasnier 
> Cc: Maxime Coquelin 
> Cc: Alexandre Torgue 
> Reviewed-by: David Lechner 
> Signed-off-by: William Breathitt Gray 
> ---
>  drivers/counter/104-quad-8.c| 6 +++---

For the 104-quad-8 driver:
Acked-by: Syed Nayyar Waris 

>  drivers/counter/interrupt-cnt.c | 3 +++
>  drivers/counter/stm32-lptimer-cnt.c | 2 +-
>  3 files changed, 7 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/counter/104-quad-8.c b/drivers/counter/104-quad-8.c
> index b7d6c1c43655..0409b1771fd9 100644
> --- a/drivers/counter/104-quad-8.c
> +++ b/drivers/counter/104-quad-8.c
> @@ -154,7 +154,7 @@ static int quad8_count_write(struct counter_device *counter,
>  
>   /* Only 24-bit values are supported */
>   if (val > 0xFF)
> - return -EINVAL;
> + return -ERANGE;
>  
>   mutex_lock(&priv->lock);
>  
> @@ -669,7 +669,7 @@ static ssize_t quad8_count_preset_write(struct counter_device *counter,
>  
>   /* Only 24-bit values are supported */
>   if (preset > 0xFF)
> - return -EINVAL;
> + return -ERANGE;
>  
>   mutex_lock(&priv->lock);
>  
> @@ -714,7 +714,7 @@ static ssize_t quad8_count_ceiling_write(struct counter_device *counter,
>  
>   /* Only 24-bit values are supported */
>   if (ceiling > 0xFF)
> - return -EINVAL;
> + return -ERANGE;
>  
>   mutex_lock(&priv->lock);
>  
> diff --git a/drivers/counter/interrupt-cnt.c b/drivers/counter/interrupt-cnt.c
> index 0e07607f2cd3..f27dea317965 100644
> --- a/drivers/counter/interrupt-cnt.c
> +++ b/drivers/counter/interrupt-cnt.c
> @@ -107,6 +107,9 @@ static int interrupt_cnt_write(struct counter_device *counter,
>  {
>   struct interrupt_cnt_priv *priv = counter->priv;
>  
> + if (val != (typeof(priv->count.counter))val)
> + return -ERANGE;
> +
>   atomic_set(&priv->count, val);
>  
>   return 0;
> diff --git a/drivers/counter/stm32-lptimer-cnt.c b/drivers/counter/stm32-lptimer-cnt.c
> index 78f383b77bd2..49aeb9e393f3 100644
> --- a/drivers/counter/stm32-lptimer-cnt.c
> +++ b/drivers/counter/stm32-lptimer-cnt.c
> @@ -283,7 +283,7 @@ static ssize_t stm32_lptim_cnt_ceiling_write(struct counter_device *counter,
>   return ret;
>  
>   if (ceiling > STM32_LPTIM_MAX_ARR)
> - return -EINVAL;
> + return -ERANGE;
>  
>   priv->ceiling = ceiling;
>  
> -- 
> 2.30.2
> 
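
A limit check of the kind this patch converts to ERANGE might look like the sketch below. Note that the quoted hunks compare against 0xFF as posted, while their comments speak of 24-bit values; the sketch assumes the limit the comments describe (2^24 - 1). COUNT_MAX_24BIT and check_count_range() are hypothetical names, not driver symbols.

```c
#include <errno.h>

/* Assumed limit: largest value representable in 24 bits */
#define COUNT_MAX_24BIT 0xFFFFFFUL

/* Return 0 if val fits the assumed 24-bit range, -ERANGE otherwise. */
static int check_count_range(unsigned long val)
{
	if (val > COUNT_MAX_24BIT)
		return -ERANGE;

	return 0;
}
```

Returning -ERANGE rather than -EINVAL lets a caller tell "the value is well formed but too large for this device" apart from "the argument makes no sense".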


Re: [PATCH v10 18/33] counter: Return error code on invalid modes

2021-04-15 Thread Syed Nayyar Waris
On Fri, Mar 19, 2021 at 08:00:37PM +0900, William Breathitt Gray wrote:
> Only a select set of modes (function, action, etc.) are valid for a
> given device configuration. This patch ensures that invalid modes result
> in a return -EINVAL. Such a situation should never occur in reality, but
> it's good to define default switch cases for the sake of making the
> intent of the code clear.
> 
> Cc: Syed Nayyar Waris 
> Cc: Kamel Bouhara 
> Cc: Fabrice Gasnier 
> Cc: Maxime Coquelin 
> Cc: Alexandre Torgue 
> Cc: David Lechner 
> Signed-off-by: William Breathitt Gray 
> ---
>  drivers/counter/104-quad-8.c| 20 +++

For the 104-quad-8 driver:
Acked-by: Syed Nayyar Waris 

>  drivers/counter/microchip-tcb-capture.c |  6 
>  drivers/counter/stm32-lptimer-cnt.c | 10 +++---
>  drivers/counter/ti-eqep.c   | 45 +++--
>  4 files changed, 46 insertions(+), 35 deletions(-)
> 
> diff --git a/drivers/counter/104-quad-8.c b/drivers/counter/104-quad-8.c
> index 09d779544969..b7d6c1c43655 100644
> --- a/drivers/counter/104-quad-8.c
> +++ b/drivers/counter/104-quad-8.c
> @@ -273,6 +273,10 @@ static int quad8_function_set(struct counter_device *counter,
>   *scale = 2;
>   mode_cfg |= QUAD8_CMR_QUADRATURE_X4;
>   break;
> + default:
> + /* should never reach this path */
> + mutex_unlock(&priv->lock);
> + return -EINVAL;
>   }
>   }
>  
> @@ -349,7 +353,7 @@ static int quad8_action_get(struct counter_device *counter,
>   case QUAD8_COUNT_FUNCTION_PULSE_DIRECTION:
>   if (synapse->signal->id == signal_a_id)
>   *action = QUAD8_SYNAPSE_ACTION_RISING_EDGE;
> - break;
> + return 0;
>   case QUAD8_COUNT_FUNCTION_QUADRATURE_X1:
>   if (synapse->signal->id == signal_a_id) {
>   quad8_direction_get(counter, count, &direction);
> @@ -359,17 +363,18 @@ static int quad8_action_get(struct counter_device *counter,
>   else
>   *action = QUAD8_SYNAPSE_ACTION_FALLING_EDGE;
>   }
> - break;
> + return 0;
>   case QUAD8_COUNT_FUNCTION_QUADRATURE_X2:
>   if (synapse->signal->id == signal_a_id)
>   *action = QUAD8_SYNAPSE_ACTION_BOTH_EDGES;
> - break;
> + return 0;
>   case QUAD8_COUNT_FUNCTION_QUADRATURE_X4:
>   *action = QUAD8_SYNAPSE_ACTION_BOTH_EDGES;
> - break;
> + return 0;
> + default:
> + /* should never reach this path */
> + return -EINVAL;
>   }
> -
> - return 0;
>  }
>  
>  static const struct counter_ops quad8_ops = {
> @@ -529,6 +534,9 @@ static int quad8_count_mode_set(struct counter_device *counter,
>   case COUNTER_COUNT_MODE_MODULO_N:
>   cnt_mode = 3;
>   break;
> + default:
> + /* should never reach this path */
> + return -EINVAL;
>   }
>  
>   mutex_lock(&priv->lock);
> diff --git a/drivers/counter/microchip-tcb-capture.c b/drivers/counter/microchip-tcb-capture.c
> index 51b8af80f98b..0c9a61962911 100644
> --- a/drivers/counter/microchip-tcb-capture.c
> +++ b/drivers/counter/microchip-tcb-capture.c
> @@ -133,6 +133,9 @@ static int mchp_tc_count_function_set(struct counter_device *counter,
>   bmr |= ATMEL_TC_QDEN | ATMEL_TC_POSEN;
>   cmr |= ATMEL_TC_ETRGEDG_RISING | ATMEL_TC_ABETRG | ATMEL_TC_XC0;
>   break;
> + default:
> + /* should never reach this path */
> + return -EINVAL;
>   }
>  
>   regmap_write(priv->regmap, ATMEL_TC_BMR, bmr);
> @@ -226,6 +229,9 @@ static int mchp_tc_count_action_set(struct counter_device *counter,
>   case MCHP_TC_SYNAPSE_ACTION_BOTH_EDGE:
>   edge = ATMEL_TC_ETRGEDG_BOTH;
>   break;
> + default:
> + /* should never reach this path */
> + return -EINVAL;
>   }
>  
>   return regmap_write_bits(priv->regmap,
> diff --git a/drivers/counter/stm32-lptimer-cnt.c b/drivers/counter/stm32-lptimer-cnt.c
> index c19d998df5ba..78f383b77bd2 100644
> --- a/drivers/counter/stm32-lptimer-cnt.c
> +++ b/drivers/counter/stm32-lptimer-cnt.c
> @@ -206,9 +206,10 @@ static int stm32_lptim_cnt_function_set(struct counter_device *counter,
>   priv->quadrature

Re: [PATCH v10 12/33] counter: 104-quad-8: Add const qualifier for actions_list array

2021-04-15 Thread Syed Nayyar Waris
On Fri, Mar 19, 2021 at 08:00:31PM +0900, William Breathitt Gray wrote:
> The struct counter_synapse actions_list member expects a const enum
> counter_synapse_action array. This patch adds the const qualifier to the
> quad8_index_actions_list and quad8_synapse_actions_list to match
> actions_list.
> 
> Cc: Syed Nayyar Waris 
> Signed-off-by: William Breathitt Gray 
> ---
>  drivers/counter/104-quad-8.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/counter/104-quad-8.c b/drivers/counter/104-quad-8.c
> index ae89ad7a91c6..09d779544969 100644
> --- a/drivers/counter/104-quad-8.c
> +++ b/drivers/counter/104-quad-8.c
> @@ -305,12 +305,12 @@ enum quad8_synapse_action {
>   QUAD8_SYNAPSE_ACTION_BOTH_EDGES
>  };
>  
> -static enum counter_synapse_action quad8_index_actions_list[] = {
> +static const enum counter_synapse_action quad8_index_actions_list[] = {
>   [QUAD8_SYNAPSE_ACTION_NONE] = COUNTER_SYNAPSE_ACTION_NONE,
>   [QUAD8_SYNAPSE_ACTION_RISING_EDGE] = COUNTER_SYNAPSE_ACTION_RISING_EDGE
>  };
>  
> -static enum counter_synapse_action quad8_synapse_actions_list[] = {
> +static const enum counter_synapse_action quad8_synapse_actions_list[] = {
>   [QUAD8_SYNAPSE_ACTION_NONE] = COUNTER_SYNAPSE_ACTION_NONE,
>   [QUAD8_SYNAPSE_ACTION_RISING_EDGE] = COUNTER_SYNAPSE_ACTION_RISING_EDGE,
>   [QUAD8_SYNAPSE_ACTION_FALLING_EDGE] = COUNTER_SYNAPSE_ACTION_FALLING_EDGE,
> -- 
> 2.30.2
>

Acked-by: Syed Nayyar Waris 


Re: [PATCH v10 07/33] counter: 104-quad-8: Add const qualifier for functions_list array

2021-04-15 Thread Syed Nayyar Waris
On Fri, Mar 19, 2021 at 08:00:26PM +0900, William Breathitt Gray wrote:
> The struct counter_count functions_list member expects a const enum
> counter_count_function array. This patch adds the const qualifier to the
> quad8_count_functions_list to match functions_list.
> 
> Cc: Syed Nayyar Waris 
> Signed-off-by: William Breathitt Gray 
> ---
>  drivers/counter/104-quad-8.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/counter/104-quad-8.c b/drivers/counter/104-quad-8.c
> index 51fba8cf9c2a..ae89ad7a91c6 100644
> --- a/drivers/counter/104-quad-8.c
> +++ b/drivers/counter/104-quad-8.c
> @@ -193,7 +193,7 @@ enum quad8_count_function {
>   QUAD8_COUNT_FUNCTION_QUADRATURE_X4
>  };
>  
> -static enum counter_count_function quad8_count_functions_list[] = {
> +static const enum counter_count_function quad8_count_functions_list[] = {
>   [QUAD8_COUNT_FUNCTION_PULSE_DIRECTION] = COUNTER_COUNT_FUNCTION_PULSE_DIRECTION,
>   [QUAD8_COUNT_FUNCTION_QUADRATURE_X1] = COUNTER_COUNT_FUNCTION_QUADRATURE_X1_A,
>   [QUAD8_COUNT_FUNCTION_QUADRATURE_X2] = COUNTER_COUNT_FUNCTION_QUADRATURE_X2_A,
> -- 
> 2.30.2
>

Acked-by: Syed Nayyar Waris 


Re: [PATCH v10 06/33] counter: 104-quad-8: Add const qualifiers for quad8_preset_register_set

2021-04-15 Thread Syed Nayyar Waris
On Fri, Mar 19, 2021 at 08:00:25PM +0900, William Breathitt Gray wrote:
> Add some safety by qualifying the quad8_preset_register_set() function
> parameters as const.
> 
> Cc: Syed Nayyar Waris 
> Signed-off-by: William Breathitt Gray 
> ---
>  drivers/counter/104-quad-8.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/counter/104-quad-8.c b/drivers/counter/104-quad-8.c
> index 0fd61cc82d30..51fba8cf9c2a 100644
> --- a/drivers/counter/104-quad-8.c
> +++ b/drivers/counter/104-quad-8.c
> @@ -632,8 +632,8 @@ static ssize_t quad8_count_preset_read(struct counter_device *counter,
>   return sprintf(buf, "%u\n", priv->preset[count->id]);
>  }
>  
> -static void quad8_preset_register_set(struct quad8 *priv, int id,
> -   unsigned int preset)
> +static void quad8_preset_register_set(struct quad8 *const priv, const int id,
> +   const unsigned int preset)
>  {
>   const unsigned int base_offset = priv->base + 2 * id;
>   int i;
> -- 
> 2.30.2
>

Acked-by: Syed Nayyar Waris 


Re: [PATCH v10 05/33] counter: 104-quad-8: Annotate hardware config module parameter

2021-04-15 Thread Syed Nayyar Waris
On Fri, Mar 19, 2021 at 08:00:24PM +0900, William Breathitt Gray wrote:
> When the kernel is running in secure boot mode, we lock down the kernel to
> prevent userspace from modifying the running kernel image.  Whilst this
> includes prohibiting access to things like /dev/mem, it must also prevent
> access by means of configuring driver modules in such a way as to cause a
> device to access or modify the kernel image.
> 
> To this end, annotate module_param* statements that refer to hardware
> configuration and indicate for future reference what type of parameter they
> specify.  The parameter parser in the core sees this information and can
> skip such parameters with an error message if the kernel is locked down.
> The module initialisation then runs as normal, but just sees whatever the
> default values for those parameters are.
> 
> Note that we do still need to do the module initialisation because some
> drivers have viable defaults set in case parameters aren't specified and
> some drivers support automatic configuration (e.g. PNP or PCI) in addition
> to manually coded parameters.
> 
> This patch annotates the 104-QUAD-8 driver.
> 
> Cc: Syed Nayyar Waris 
> Signed-off-by: William Breathitt Gray 
> ---
>  drivers/counter/104-quad-8.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/counter/104-quad-8.c b/drivers/counter/104-quad-8.c
> index 233a3acc1377..0fd61cc82d30 100644
> --- a/drivers/counter/104-quad-8.c
> +++ b/drivers/counter/104-quad-8.c
> @@ -21,7 +21,7 @@
>  
>  static unsigned int base[max_num_isa_dev(QUAD8_EXTENT)];
>  static unsigned int num_quad8;
> -module_param_array(base, uint, &num_quad8, 0);
> +module_param_hw_array(base, uint, ioport, &num_quad8, 0);
>  MODULE_PARM_DESC(base, "ACCES 104-QUAD-8 base addresses");
>  
>  #define QUAD8_NUM_COUNTERS 8
> -- 
> 2.30.2
>

Acked-by: Syed Nayyar Waris 


Re: [PATCH v10 04/33] counter: 104-quad-8: Return error when invalid mode during ceiling_write

2021-04-15 Thread Syed Nayyar Waris
On Fri, Mar 19, 2021 at 08:00:23PM +0900, William Breathitt Gray wrote:
> The 104-QUAD-8 only has two count modes where a ceiling value makes
> sense: Range Limit and Modulo-N. Outside of these two modes, setting a
> ceiling value is an invalid operation -- so let's report it as such by
> returning -EINVAL.
> 
> Fixes: fc069262261c ("counter: 104-quad-8: Add lock guards - generic interface")
> Cc: Syed Nayyar Waris 
> Signed-off-by: William Breathitt Gray 
> ---
>  drivers/counter/104-quad-8.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/counter/104-quad-8.c b/drivers/counter/104-quad-8.c
> index 4bb9abffae48..233a3acc1377 100644
> --- a/drivers/counter/104-quad-8.c
> +++ b/drivers/counter/104-quad-8.c
> @@ -714,13 +714,14 @@ static ssize_t quad8_count_ceiling_write(struct counter_device *counter,
>   switch (priv->count_mode[count->id]) {
>   case 1:
>   case 3:
> + mutex_unlock(&priv->lock);
>   quad8_preset_register_set(priv, count->id, ceiling);
> - break;
> + return len;
>   }
>  
>   mutex_unlock(&priv->lock);
>  
> - return len;
> + return -EINVAL;
>  }
>  
>  static ssize_t quad8_count_preset_enable_read(struct counter_device *counter,
> -- 
> 2.30.2
>

Acked-by: Syed Nayyar Waris 
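
The behaviour described in the commit message for 04/33 (accept a ceiling only in Range Limit and Modulo-N modes, otherwise return -EINVAL) can be modelled in userspace as follows. struct fake_quad8 and fake_ceiling_write() are hypothetical, the boolean merely stands in for the driver's lock, and the sketch returns 0 rather than the byte count on success. One deliberate difference from the quoted hunk: the sketch performs the preset update before releasing the lock, which is an assumption of this illustration and not what the patch as posted does.

```c
#include <errno.h>
#include <stdbool.h>

struct fake_quad8 {
	bool locked;			/* stand-in for the driver's lock */
	unsigned int count_mode;	/* 1 = Range Limit, 3 = Modulo-N */
	unsigned int preset;		/* stand-in for the preset register */
};

/* Store a ceiling only in the two modes where it is meaningful. */
static int fake_ceiling_write(struct fake_quad8 *priv, unsigned int ceiling)
{
	int ret = -EINVAL;

	priv->locked = true;

	switch (priv->count_mode) {
	case 1:	/* Range Limit */
	case 3:	/* Modulo-N */
		priv->preset = ceiling;
		ret = 0;
		break;
	}

	/* Single unlock point covers both the success and error paths */
	priv->locked = false;

	return ret;
}
```

Funnelling both outcomes through one unlock keeps the lock balanced even if further modes are added to the switch later.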


Re: [PATCH v10 03/33] counter: 104-quad-8: Remove pointless comment

2021-04-15 Thread Syed Nayyar Waris
On Fri, Mar 19, 2021 at 08:00:22PM +0900, William Breathitt Gray wrote:
> It is obvious that devm_counter_register() is used to register a Counter
> device, so a comment stating such is pointless here.
> 
> Cc: Syed Nayyar Waris 
> Signed-off-by: William Breathitt Gray 
> ---
>  drivers/counter/104-quad-8.c | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/drivers/counter/104-quad-8.c b/drivers/counter/104-quad-8.c
> index 9691f8612be8..4bb9abffae48 100644
> --- a/drivers/counter/104-quad-8.c
> +++ b/drivers/counter/104-quad-8.c
> @@ -1082,7 +1082,6 @@ static int quad8_probe(struct device *dev, unsigned int id)
>   /* Enable all counters */
>   outb(QUAD8_CHAN_OP_ENABLE_COUNTERS, base[id] + QUAD8_REG_CHAN_OP);
>  
> - /* Register Counter device */
>   return devm_counter_register(dev, &priv->counter);
>  }
>  
> -- 
> 2.30.2
>

Acked-by: Syed Nayyar Waris 


[RESEND PATCH v4 3/3] gpio: xilinx: Utilize generic bitmap_get_value and _set_value

2021-04-02 Thread Syed Nayyar Waris
This patch reimplements the xgpio_set_multiple() function in
drivers/gpio/gpio-xilinx.c to use the new generic functions
bitmap_get_value() and bitmap_set_value(). The code is now simpler
to read and understand. Moreover, instead of looping over each bit
in xgpio_set_multiple(), we now handle one channel at a time and
save cycles.

Cc: Bartosz Golaszewski 
Cc: Michal Simek 
Signed-off-by: Syed Nayyar Waris 
Acked-by: William Breathitt Gray 
---
 drivers/gpio/gpio-xilinx.c | 52 +++---
 1 file changed, 26 insertions(+), 26 deletions(-)

diff --git a/drivers/gpio/gpio-xilinx.c b/drivers/gpio/gpio-xilinx.c
index b411d3156e0b..512198250b02 100644
--- a/drivers/gpio/gpio-xilinx.c
+++ b/drivers/gpio/gpio-xilinx.c
@@ -18,6 +18,7 @@
 #include 
 #include 
 #include 
+#include "gpiolib.h"
 
 /* Register Offset Definitions */
 #define XGPIO_DATA_OFFSET   (0x0)  /* Data register  */
@@ -161,35 +162,34 @@ static void xgpio_set_multiple(struct gpio_chip *gc, 
unsigned long *mask,
 {
unsigned long flags;
struct xgpio_instance *chip = gpiochip_get_data(gc);
-   int index = xgpio_index(chip, 0);
-   int offset, i;
 
-   spin_lock_irqsave(&chip->gpio_lock, flags);
+   u32 *state = chip->gpio_state;
+   unsigned int *width = chip->gpio_width;
+   DECLARE_BITMAP(old, 64);
+   DECLARE_BITMAP(new, 64);
+   DECLARE_BITMAP(changed, 64);
 
-   /* Write to GPIO signals */
-   for (i = 0; i < gc->ngpio; i++) {
-   if (*mask == 0)
-   break;
-   /* Once finished with an index write it out to the register */
-   if (index !=  xgpio_index(chip, i)) {
-   xgpio_writereg(chip->regs + XGPIO_DATA_OFFSET +
-  index * XGPIO_CHANNEL_OFFSET,
-  chip->gpio_state[index]);
-   spin_unlock_irqrestore(&chip->gpio_lock, flags);
-   index =  xgpio_index(chip, i);
-   spin_lock_irqsave(&chip->gpio_lock, flags);
-   }
-   if (__test_and_clear_bit(i, mask)) {
-   offset =  xgpio_offset(chip, i);
-   if (test_bit(i, bits))
-   chip->gpio_state[index] |= BIT(offset);
-   else
-   chip->gpio_state[index] &= ~BIT(offset);
-   }
-   }
+   spin_lock_irqsave(&chip->gpio_lock, flags);
 
-   xgpio_writereg(chip->regs + XGPIO_DATA_OFFSET +
-  index * XGPIO_CHANNEL_OFFSET, chip->gpio_state[index]);
+   /* Copy initial value of state bits into 'old' contiguously */
+   bitmap_set_value(old, 64, state[0], width[0], 0);
+   bitmap_set_value(old, 64, state[1], width[1], width[0]);
+   /* Copy value from 'old' into 'new' with mask applied */
+   bitmap_replace(new, old, bits, mask, gc->ngpio);
+
+   bitmap_from_arr32(old, state, 64);
+   /* Update 'state' */
+   state[0] = bitmap_get_value(new, 0, width[0]);
+   state[1] = bitmap_get_value(new, width[0], width[1]);
+   bitmap_from_arr32(new, state, 64);
+   /* XOR operation sets only changed bits */
+   bitmap_xor(changed, old, new, 64);
+
+   if (((u32 *)changed)[0])
+   xgpio_writereg(chip->regs + XGPIO_DATA_OFFSET, state[0]);
+   if (((u32 *)changed)[1])
+   xgpio_writereg(chip->regs + XGPIO_DATA_OFFSET +
+   XGPIO_CHANNEL_OFFSET, state[1]);
 
spin_unlock_irqrestore(&chip->gpio_lock, flags);
 }
-- 
2.29.0
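
As a reading aid, the pack / replace / unpack / compare sequence used in the patch above can be sketched in userspace C with a single 64-bit word. The helper names (demo_low_mask, demo_set_multiple) are hypothetical, and the sketch assumes two channels of 1 to 32 lines each, with 'mask' and 'bits' indexed by GPIO line number. It returns which channels changed instead of writing registers.

```c
#include <assert.h>
#include <stdint.h>

/* Low 'width' bits set; valid for 1 <= width <= 32 in this sketch. */
static uint64_t demo_low_mask(unsigned int width)
{
	return (UINT64_C(1) << width) - 1;
}

/*
 * Updates state[0..1] from line-indexed 'bits' under 'mask'. Returns a
 * flag word: bit 0 set if channel 0 changed, bit 1 if channel 1 changed.
 */
unsigned int demo_set_multiple(uint32_t state[2], const unsigned int width[2],
			       uint64_t mask, uint64_t bits)
{
	uint64_t old, new;
	uint32_t s0, s1;
	unsigned int changed = 0;

	assert(width[0] >= 1 && width[0] <= 32);
	assert(width[1] >= 1 && width[1] <= 32);

	/* Pack both channels contiguously: channel 1 starts at width[0]. */
	old = (state[0] & demo_low_mask(width[0])) |
	      ((state[1] & demo_low_mask(width[1])) << width[0]);

	/* Take 'bits' where 'mask' is set, keep 'old' elsewhere. */
	new = (old & ~mask) | (bits & mask);

	/* Unpack back into per-channel values. */
	s0 = (uint32_t)(new & demo_low_mask(width[0]));
	s1 = (uint32_t)((new >> width[0]) & demo_low_mask(width[1]));

	if (s0 != state[0])
		changed |= 1u;
	if (s1 != state[1])
		changed |= 2u;

	state[0] = s0;
	state[1] = s1;
	return changed;
}
```

A driver would issue one register write per flagged channel, which mirrors the two conditional xgpio_writereg() calls in the patch.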



[RESEND PATCH v4 2/3] gpio: thunderx: Utilize for_each_set_nbits macro

2021-04-02 Thread Syed Nayyar Waris
This patch reimplements the thunderx_gpio_set_multiple() function in
drivers/gpio/gpio-thunderx.c to use the new for_each_set_nbits macro.
Instead of looping over every bank in thunderx_gpio_set_multiple(), we
can now skip banks that have no bits set in the mask and save cycles.

Cc: Robert Richter 
Cc: Bartosz Golaszewski 
Signed-off-by: Syed Nayyar Waris 
Acked-by: William Breathitt Gray 
---
 drivers/gpio/gpio-thunderx.c | 13 -
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/drivers/gpio/gpio-thunderx.c b/drivers/gpio/gpio-thunderx.c
index 9f66deab46ea..4349e7393a1d 100644
--- a/drivers/gpio/gpio-thunderx.c
+++ b/drivers/gpio/gpio-thunderx.c
@@ -16,7 +16,7 @@
 #include 
 #include 
 #include 
-
+#include "gpiolib.h"
 
 #define GPIO_RX_DAT 0x0
 #define GPIO_TX_SET 0x8
@@ -275,12 +275,15 @@ static void thunderx_gpio_set_multiple(struct gpio_chip 
*chip,
   unsigned long *bits)
 {
int bank;
-   u64 set_bits, clear_bits;
+   unsigned long set_bits, clear_bits, gpio_mask;
+   unsigned long offset;
+
struct thunderx_gpio *txgpio = gpiochip_get_data(chip);
 
-   for (bank = 0; bank <= chip->ngpio / 64; bank++) {
-   set_bits = bits[bank] & mask[bank];
-   clear_bits = ~bits[bank] & mask[bank];
+   for_each_set_nbits(offset, gpio_mask, mask, chip->ngpio, 64) {
+   bank = offset / 64;
+   set_bits = bits[bank] & gpio_mask;
+   clear_bits = ~bits[bank] & gpio_mask;
writeq(set_bits, txgpio->register_base + (bank * GPIO_2ND_BANK) 
+ GPIO_TX_SET);
writeq(clear_bits, txgpio->register_base + (bank * 
GPIO_2ND_BANK) + GPIO_TX_CLR);
}
-- 
2.29.0
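
The bank-skipping idea in the patch above can be illustrated with a small userspace sketch. It is not the driver code: struct demo_bank_write and demo_plan_bank_writes() are hypothetical names, banks are modelled as 64-bit words, and the set/clear words are recorded in an array instead of being passed to writeq().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct demo_bank_write {
	size_t bank;		/* index of the 64-line bank */
	uint64_t set_bits;	/* lines to drive high */
	uint64_t clear_bits;	/* lines to drive low */
};

/*
 * Fills 'out' with one entry per bank whose mask word is non-zero and
 * returns the number of entries. 'nbanks' is the number of 64-bit words
 * in 'mask' and 'bits'; 'out' must have room for 'nbanks' entries.
 */
size_t demo_plan_bank_writes(const uint64_t *mask, const uint64_t *bits,
			     size_t nbanks, struct demo_bank_write *out)
{
	size_t bank, n = 0;

	assert(mask != NULL && bits != NULL && out != NULL);

	for (bank = 0; bank < nbanks; bank++) {
		if (mask[bank] == 0)
			continue;	/* nothing requested for this bank */

		out[n].bank = bank;
		out[n].set_bits = bits[bank] & mask[bank];
		out[n].clear_bits = ~bits[bank] & mask[bank];
		n++;
	}
	return n;
}
```

Banks with an all-zero mask produce no entry, so no register access would be issued for them.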



[RESEND PATCH v4 1/3] gpiolib: Introduce the for_each_set_nbits macro

2021-04-02 Thread Syed Nayyar Waris
This macro iterates over each group of bits (clump) that has at least
one set bit within a bitmap memory region. For each iteration, "start"
is set to the bit offset of the found clump, while the respective clump
value is stored to the location pointed to by "clump". Additionally, the
bitmap_get_value() and bitmap_set_value() functions are introduced to
respectively get and set an n-bit value in a bitmap memory region.
The value can be any size from 1 to BITS_PER_LONG bits; a size less
than 1 or more than BITS_PER_LONG causes undefined behaviour.
Moreover, when setting an n-bit value in the bitmap, if the value would
cross a word boundary, it is split so that the lower portion is stored
in the current word and the remaining portion is stored in the next
higher word. The same split occurs when retrieving a value from the
bitmap.
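
The word-boundary split described above can be sketched in userspace C. This is a simplified variant and not the code from the patch: demo_set_value() is a hypothetical name, words are fixed at 64 bits, and the map size is passed as a word count rather than a bit count.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_WORD_BITS 64u

/*
 * Stores the low 'width' bits (1..64) of 'value' at bit offset 'start'.
 * 'nwords' is the number of 64-bit words in 'map'; a value that would run
 * past the last word is truncated rather than written out of bounds.
 */
void demo_set_value(uint64_t *map, size_t nwords, uint64_t value,
		    unsigned int width, size_t start)
{
	size_t index = start / DEMO_WORD_BITS;
	unsigned int offset = start % DEMO_WORD_BITS;
	unsigned int space = DEMO_WORD_BITS - offset;	/* bits left in word */
	uint64_t mask;

	assert(width >= 1 && width <= DEMO_WORD_BITS);
	assert(index < nwords);

	mask = width == 64 ? UINT64_MAX : (UINT64_C(1) << width) - 1;
	value &= mask;

	/* Low part: bits shifted past the top of the word are discarded. */
	map[index] = (map[index] & ~(mask << offset)) | (value << offset);

	if (width <= space || index + 1 >= nwords)
		return;

	/* High part: the remaining (width - space) bits go to the next word. */
	map[index + 1] = (map[index + 1] & ~(mask >> space)) | (value >> space);
}
```

For instance, storing the 24-bit value 0xABCDEF at bit offset 56 places 0xEF in the top byte of the first word and 0xABCD in the low 16 bits of the second.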

Cc: Linus Walleij 
Cc: Bartosz Gołaszewski 
Cc: Arnd Bergmann 
Cc: Andy Shevchenko 
Signed-off-by: Syed Nayyar Waris 
Acked-by: William Breathitt Gray 
---
 drivers/gpio/gpiolib.c | 90 ++
 drivers/gpio/gpiolib.h | 28 +
 2 files changed, 118 insertions(+)

diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c
index 1427c1be749b..5576d1465c81 100644
--- a/drivers/gpio/gpiolib.c
+++ b/drivers/gpio/gpiolib.c
@@ -150,6 +150,96 @@ struct gpio_desc *gpiochip_get_desc(struct gpio_chip *gc,
 }
 EXPORT_SYMBOL_GPL(gpiochip_get_desc);
 
+/**
+ * bitmap_get_value - get a value of n-bits from the memory region
+ * @map: address to the bitmap memory region
+ * @start: bit offset of the n-bit value
+ * @nbits: size of value in bits (must be between 1 and BITS_PER_LONG 
inclusive).
+ *
+ * Returns value of nbits located at the @start bit offset within the @map
+ * memory region.
+ */
+unsigned long bitmap_get_value(const unsigned long *map,
+   unsigned long start,
+   unsigned long nbits)
+{
+   const size_t index = BIT_WORD(start);
+   const unsigned long offset = start % BITS_PER_LONG;
+   const unsigned long ceiling = round_up(start + 1, BITS_PER_LONG);
+   const unsigned long space = ceiling - start;
+   unsigned long value_low, value_high;
+
+   if (space >= nbits)
+   return (map[index] >> offset) & GENMASK(nbits - 1, 0);
+   else {
+   value_low = map[index] & BITMAP_FIRST_WORD_MASK(start);
+   value_high = map[index + 1] & BITMAP_LAST_WORD_MASK(start + 
nbits);
+   return (value_low >> offset) | (value_high << space);
+   }
+}
+EXPORT_SYMBOL_GPL(bitmap_get_value);
+
+/**
+ * bitmap_set_value - set value within a memory region
+ * @map: address to the bitmap memory region
+ * @nbits: size of map in bits
+ * @value: value of clump
+ * @value_width: size of value in bits (must be between 1 and BITS_PER_LONG 
inclusive)
+ * @start: bit offset of the value
+ */
+void bitmap_set_value(unsigned long *map, unsigned long nbits,
+   unsigned long value, unsigned long value_width,
+   unsigned long start)
+{
+   const unsigned long index = BIT_WORD(start);
+   const unsigned long length = BIT_WORD(nbits);
+   const unsigned long offset = start % BITS_PER_LONG;
+   const unsigned long ceiling = round_up(start + 1, BITS_PER_LONG);
+   const unsigned long space = ceiling - start;
+
+   value &= GENMASK(value_width - 1, 0);
+
+   if (space >= value_width) {
+   map[index] &= ~(GENMASK(value_width - 1, 0) << offset);
+   map[index] |= value << offset;
+   } else {
+   map[index + 0] &= ~BITMAP_FIRST_WORD_MASK(start);
+   map[index + 0] |= value << offset;
+
+   if (index + 1 >= length)
+   return;
+
+   map[index + 1] &= ~BITMAP_LAST_WORD_MASK(start + value_width);
+   map[index + 1] |= value >> space;
+   }
+}
+EXPORT_SYMBOL_GPL(bitmap_set_value);
+
+/**
+ * find_next_clump - find next clump with set bits in a memory region
+ * @clump: location to store copy of found clump
+ * @addr: address to base the search on
+ * @size: bitmap size in number of bits
+ * @offset: bit offset at which to start searching
+ * @clump_size: clump size in bits
+ *
+ * Returns the bit offset for the next set clump; the found clump value is
+ * copied to the location pointed by @clump. If no bits are set, returns @size.
+ */
+unsigned long find_next_clump(unsigned long *clump, const unsigned long *addr,
+   unsigned long size, unsigned long offset,
+   unsigned long clump_size)
+{
+   offset = find_next_bit(addr, size, offset);
+   if (offset == size)
+   return size;
+
+   offset = rounddown(offset, clump_size);
+   *clump = bitma

[RESEND PATCH v4 0/3] Introduce the for_each_set_nbits macro

2021-04-02 Thread Syed Nayyar Waris
Hello Bartosz,

Since this patchset primarily affects GPIO drivers, would you like
to pick it up through your GPIO tree?

This patchset introduces for_each_set_nbits, a generic version of the
for_each_set_clump8 macro. for_each_set_clump8 used a fixed-size 8-bit
clump, whereas the new generic version works with clumps of any size
up to and including BITS_PER_LONG bits. The patchset utilizes the new
macro in several GPIO drivers.

The earlier 8-bit for_each_set_clump8 facilitated a for-loop syntax that
iterates over a memory region one entire group of set bits at a time.

For example, suppose you would like to iterate over a 32-bit integer 8
bits at a time, skipping over 8-bit groups with no set bit, where
XXXXXXXX represents the current 8-bit group:

Example:        10111110 00000000 11111111 00110011
First loop:     10111110 00000000 11111111 XXXXXXXX
Second loop:    10111110 00000000 XXXXXXXX 00110011
Third loop:     XXXXXXXX 00000000 11111111 00110011

Each iteration of the loop returns the next 8-bit group that has at
least one set bit.
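
A minimal userspace sketch of this 8-bit iteration follows, using the 32-bit value 10111110 00000000 11111111 00110011 (0xBE00FF33) as input. demo_next_clump8() is a hypothetical helper written for illustration; it is not the kernel's implementation.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Returns the bit offset (0, 8, 16 or 24) of the next non-zero 8-bit group
 * at or after 'offset', storing the group in '*clump'. Returns 32 when no
 * further group has a set bit.
 */
unsigned int demo_next_clump8(uint32_t word, unsigned int offset,
			      uint8_t *clump)
{
	assert(offset % 8 == 0);

	for (; offset < 32; offset += 8) {
		uint8_t group = (uint8_t)(word >> offset);

		if (group) {
			*clump = group;
			return offset;
		}
	}
	return 32;
}
```

Starting at offset 0 and resuming at the returned offset plus 8 each time visits the groups at offsets 0, 8 and 24; the all-zero group at offset 16 is skipped.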

With the new for_each_set_nbits, the clump size can differ from 8 bits.
Moreover, a clump can be split across a word boundary when the word size
is not a multiple of the clump size. The following examples show how the
new macro works for clump sizes of 24 bits and 6 bits.

Example 1:
clump size: 24 bits, Number of clumps (or ports): 10
bitmap stores the bit information from where successive clumps are retrieved.

        /* bitmap memory region */
        0x00aa0000ff000000;  /* Most significant bits */
        0xaaaaaa0000ff0000;
        0x000000aa000000aa;
        0xbbbbabcdeffedcba;  /* Least significant bits */

Different iterations of for_each_set_nbits:
'offset' is the bit position and 'clump' is the 24-bit clump from the
above bitmap.
Iteration first:        offset: 0 clump: 0xfedcba
Iteration second:       offset: 24 clump: 0xabcdef
Iteration third:        offset: 48 clump: 0xaabbbb
Iteration fourth:       offset: 96 clump: 0xaa
Iteration fifth:        offset: 144 clump: 0xff
Iteration sixth:        offset: 168 clump: 0xaaaaaa
Iteration seventh:      offset: 216 clump: 0xff
The loop breaks because the bits remaining at the end (0x00aa) are fewer
than the clump size of 24 bits.

In the above example, it can be seen that in the third iteration the
24-bit clump that was retrieved was split between bitmap[0] and bitmap[1].
This example also shows that 24-bit clumps of all zeroes, if present in
between, are skipped (preserving the previous for_each_set_clump8
behaviour).
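
The 24-bit walk-through can be reproduced with a short userspace sketch. The helpers below (demo_get_value, demo_find_next_clump) are hypothetical stand-ins and make some simplifying assumptions: words are fixed at 64 bits, and the search scans clump by clump rather than using find_next_bit(). The bitmap assumed for illustration is, from least to most significant word, 0xbbbbabcdeffedcba, 0x000000aa000000aa, 0xaaaaaa0000ff0000, 0x00aa0000ff000000, with 10 clumps of 24 bits (240 bits in total).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_WORD_BITS 64u

/* Reads 'nbits' (1..64) starting at bit 'start'; may straddle two words. */
static uint64_t demo_get_value(const uint64_t *map, size_t start,
			       unsigned int nbits)
{
	size_t index = start / DEMO_WORD_BITS;
	unsigned int offset = start % DEMO_WORD_BITS;
	unsigned int space = DEMO_WORD_BITS - offset;	/* bits left in word */
	uint64_t mask = nbits == 64 ? UINT64_MAX : (UINT64_C(1) << nbits) - 1;
	uint64_t value = map[index] >> offset;

	if (nbits > space)
		value |= map[index + 1] << space;	/* pull in high part */

	return value & mask;
}

/*
 * Returns the offset of the next clump with a set bit at or after 'offset'
 * (which must be a multiple of clump_size), or 'size' if there is none.
 * Only whole clumps that fit inside 'size' bits are considered.
 */
size_t demo_find_next_clump(uint64_t *clump, const uint64_t *map, size_t size,
			    size_t offset, unsigned int clump_size)
{
	assert(clump_size >= 1 && clump_size <= DEMO_WORD_BITS);
	assert(offset % clump_size == 0);

	for (; offset + clump_size <= size; offset += clump_size) {
		uint64_t value = demo_get_value(map, offset, clump_size);

		if (value) {
			*clump = value;
			return offset;
		}
	}
	return size;
}
```

Calling demo_find_next_clump() repeatedly, resuming at the returned offset plus the clump size, yields the offsets 0, 24, 48, 96, 144, 168 and 216 for the bitmap assumed above. The clump at offset 48 is assembled from the top 16 bits of the first word and the low 8 bits of the second.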

Example 2:
clump size = 6 bits, Number of clumps (or ports) = 3.

        /* bitmap memory region */
        0x00aa0000ff000000;  /* Most significant bits */
        0xaaaaaa0000ff0000;
        0x0f00000000000000;
        0x0000000000000ac0;  /* Least significant bits */

Different iterations of for_each_set_nbits:
'offset' is the bit position and 'clump' is the 6 bit clump from the
above bitmap.
Iteration first:        offset: 6 clump: 0x2b
Loop breaks because 6 * 3 = 18 bits traversed in bitmap.
Here 6 * 3 is clump size * no. of clumps.

Changes in v4:
 - [Patch 3/3]: Remove extra line and add few comments.
 - [Patch 3/3]: Use single lock (and unlock) call instead of two
   lock (and two unlock) calls.
 - [Patch 3/3]: Use bitmap_from_arr32() where applicable.
 - [Patch 3/3]: Remove unnecessary 'const'.

Changes in v3:
 - [Patch 1/3]: Rename for_each_set_clump to for_each_set_nbits.
 - [Patch 1/3]: Shift function definitions outside 'ifdef CONFIG_DEBUG_FS'
   macro guard to resolve build (linking) error in xilinx Patch[3/3].
 - [Patch 2/3]: Rename for_each_set_clump to for_each_set_nbits.

Changes in v2:
 - [Patch 1/3]: Shift the macros and related functions to gpiolib inside
   gpio/. Reduce the visibility of 'for_each_set_clump' to gpio.
 - [Patch 1/3]: Remove __builtin_unreachable and simply use return
   statement.
 - Remove tests from lib/test_bitmap.c as 'for_each_set_clump' is
   now localised inside gpio/ only.

Syed Nayyar Waris (3):
  gpiolib: Introduce the for_each_set_nbits macro
  gpio: thunderx: Utilize for_each_set_nbits macro
  gpio: xilinx: Utilize generic bitmap_get_value and _set_value

 drivers/gpio/gpio-thunderx.c | 13 --
 drivers/gpio/gpio-xilinx.c   | 52 ++---
 drivers/gpio/gpiolib.c   | 90 
 drivers/gpio/gpiolib.h   | 28 +++
 4 files changed, 152 insertions(+), 31 deletions(-)


base-commit: e1b7033ecdac56c1cc4dff72d67cac25d449efc6
-- 
2.29.0



Re: [PATCH v4 3/3] gpio: xilinx: Utilize generic bitmap_get_value and _set_value

2021-04-02 Thread Syed Nayyar Waris
On Fri, Apr 2, 2021 at 3:42 PM Syed Nayyar Waris  wrote:
>
> This patch reimplements the xgpio_set_multiple() function in
> drivers/gpio/gpio-xilinx.c to use the new generic functions:
> bitmap_get_value() and bitmap_set_value(). The code is now simpler
> to read and understand. Moreover, instead of looping for each bit
> in xgpio_set_multiple() function, now we can check each channel at
> a time and save cycles.
>
> Cc: Bartosz Golaszewski 
> Cc: Michal Simek 
> Signed-off-by: Syed Nayyar Waris 
> Acked-by: William Breathitt Gray 
> ---
>  drivers/gpio/gpio-xilinx.c | 60 +++---
>  1 file changed, 30 insertions(+), 30 deletions(-)
>
> diff --git a/drivers/gpio/gpio-xilinx.c b/drivers/gpio/gpio-xilinx.c
> index b411d3156e0b..e0ad3a81f216 100644
> --- a/drivers/gpio/gpio-xilinx.c
> +++ b/drivers/gpio/gpio-xilinx.c
> @@ -18,6 +18,7 @@
>  #include 
>  #include 
>  #include 
> +#include "gpiolib.h"
>
>  /* Register Offset Definitions */
>  #define XGPIO_DATA_OFFSET   (0x0)  /* Data register  */
> @@ -161,37 +162,36 @@ static void xgpio_set_multiple(struct gpio_chip *gc, 
> unsigned long *mask,
>  {
> unsigned long flags;
> struct xgpio_instance *chip = gpiochip_get_data(gc);
> -   int index = xgpio_index(chip, 0);
> -   int offset, i;
>
> -   spin_lock_irqsave(&chip->gpio_lock, flags);
> -
> -   /* Write to GPIO signals */
> -   for (i = 0; i < gc->ngpio; i++) {
> -   if (*mask == 0)
> -   break;
> -   /* Once finished with an index write it out to the register */
> -   if (index !=  xgpio_index(chip, i)) {
> -   xgpio_writereg(chip->regs + XGPIO_DATA_OFFSET +
> -  index * XGPIO_CHANNEL_OFFSET,
> -  chip->gpio_state[index]);
> -   spin_unlock_irqrestore(&chip->gpio_lock, flags);
> -   index =  xgpio_index(chip, i);
> -   spin_lock_irqsave(&chip->gpio_lock, flags);
> -   }
> -   if (__test_and_clear_bit(i, mask)) {
> -   offset =  xgpio_offset(chip, i);
> -   if (test_bit(i, bits))
> -   chip->gpio_state[index] |= BIT(offset);
> -   else
> -   chip->gpio_state[index] &= ~BIT(offset);
> -   }
> -   }
> -
> -   xgpio_writereg(chip->regs + XGPIO_DATA_OFFSET +
> -  index * XGPIO_CHANNEL_OFFSET, chip->gpio_state[index]);
> -
> -   spin_unlock_irqrestore(&chip->gpio_lock, flags);
> +u32 *state = chip->gpio_state;
> +unsigned int *width = chip->gpio_width;
> +DECLARE_BITMAP(old, 64);
> +DECLARE_BITMAP(new, 64);
> +DECLARE_BITMAP(changed, 64);
> +
> +spin_lock_irqsave(&chip->gpio_lock, flags);
> +
> +/* Copy initial value of state bits into 'old' bit-wise */
> +bitmap_set_value(old, 64, state[0], width[0], 0);
> +bitmap_set_value(old, 64, state[1], width[1], width[0]);
> +/* Copy value from 'old' into 'new' with mask applied */
> +bitmap_replace(new, old, bits, mask, gc->ngpio);
> +
> +bitmap_from_arr32(old, state, 64);
> +/* Update 'state' */
> +state[0] = bitmap_get_value(new, 0, width[0]);
> +state[1] = bitmap_get_value(new, width[0], width[1]);
> +bitmap_from_arr32(new, state, 64);
> +/* XOR operation sets only changed bits */
> +bitmap_xor(changed, old, new, 64);
> +
> +if (((u32 *)changed)[0])
> +xgpio_writereg(chip->regs + XGPIO_DATA_OFFSET, state[0]);
> +if (((u32 *)changed)[1])
> +    xgpio_writereg(chip->regs + XGPIO_DATA_OFFSET +
> +XGPIO_CHANNEL_OFFSET, state[1]);
> +
> +spin_unlock_irqrestore(&chip->gpio_lock, flags);
>  }
>
>  /**
> --
> 2.29.0
>

Hi All,

Indentation errors were reported, so I am resending the patchset.
I am keeping the version the same (v4).

Kindly consider the "RESEND"-prefixed v4 patchset going forward.

Regards
Syed Nayyar Waris


[PATCH v4 3/3] gpio: xilinx: Utilize generic bitmap_get_value and _set_value

2021-04-02 Thread Syed Nayyar Waris
This patch reimplements the xgpio_set_multiple() function in
drivers/gpio/gpio-xilinx.c to use the new generic functions
bitmap_get_value() and bitmap_set_value(). The code is now simpler
to read and understand. Moreover, instead of looping over each bit
in xgpio_set_multiple(), we now handle one channel at a time and
save cycles.

Cc: Bartosz Golaszewski 
Cc: Michal Simek 
Signed-off-by: Syed Nayyar Waris 
Acked-by: William Breathitt Gray 
---
 drivers/gpio/gpio-xilinx.c | 60 +++---
 1 file changed, 30 insertions(+), 30 deletions(-)

diff --git a/drivers/gpio/gpio-xilinx.c b/drivers/gpio/gpio-xilinx.c
index b411d3156e0b..e0ad3a81f216 100644
--- a/drivers/gpio/gpio-xilinx.c
+++ b/drivers/gpio/gpio-xilinx.c
@@ -18,6 +18,7 @@
 #include 
 #include 
 #include 
+#include "gpiolib.h"
 
 /* Register Offset Definitions */
 #define XGPIO_DATA_OFFSET   (0x0)  /* Data register  */
@@ -161,37 +162,36 @@ static void xgpio_set_multiple(struct gpio_chip *gc, 
unsigned long *mask,
 {
unsigned long flags;
struct xgpio_instance *chip = gpiochip_get_data(gc);
-   int index = xgpio_index(chip, 0);
-   int offset, i;
 
-   spin_lock_irqsave(&chip->gpio_lock, flags);
-
-   /* Write to GPIO signals */
-   for (i = 0; i < gc->ngpio; i++) {
-   if (*mask == 0)
-   break;
-   /* Once finished with an index write it out to the register */
-   if (index !=  xgpio_index(chip, i)) {
-   xgpio_writereg(chip->regs + XGPIO_DATA_OFFSET +
-  index * XGPIO_CHANNEL_OFFSET,
-  chip->gpio_state[index]);
-   spin_unlock_irqrestore(&chip->gpio_lock, flags);
-   index =  xgpio_index(chip, i);
-   spin_lock_irqsave(&chip->gpio_lock, flags);
-   }
-   if (__test_and_clear_bit(i, mask)) {
-   offset =  xgpio_offset(chip, i);
-   if (test_bit(i, bits))
-   chip->gpio_state[index] |= BIT(offset);
-   else
-   chip->gpio_state[index] &= ~BIT(offset);
-   }
-   }
-
-   xgpio_writereg(chip->regs + XGPIO_DATA_OFFSET +
-  index * XGPIO_CHANNEL_OFFSET, chip->gpio_state[index]);
-
-   spin_unlock_irqrestore(&chip->gpio_lock, flags);
+u32 *state = chip->gpio_state;
+unsigned int *width = chip->gpio_width;
+DECLARE_BITMAP(old, 64);
+DECLARE_BITMAP(new, 64);
+DECLARE_BITMAP(changed, 64);
+
+spin_lock_irqsave(&chip->gpio_lock, flags);
+
+/* Copy initial value of state bits into 'old' bit-wise */
+bitmap_set_value(old, 64, state[0], width[0], 0);
+bitmap_set_value(old, 64, state[1], width[1], width[0]);
+/* Copy value from 'old' into 'new' with mask applied */
+bitmap_replace(new, old, bits, mask, gc->ngpio);
+
+bitmap_from_arr32(old, state, 64);
+/* Update 'state' */
+state[0] = bitmap_get_value(new, 0, width[0]);
+state[1] = bitmap_get_value(new, width[0], width[1]);
+bitmap_from_arr32(new, state, 64);
+/* XOR operation sets only changed bits */
+bitmap_xor(changed, old, new, 64);
+
+if (((u32 *)changed)[0])
+xgpio_writereg(chip->regs + XGPIO_DATA_OFFSET, state[0]);
+if (((u32 *)changed)[1])
+xgpio_writereg(chip->regs + XGPIO_DATA_OFFSET +
+XGPIO_CHANNEL_OFFSET, state[1]);
+
+spin_unlock_irqrestore(&chip->gpio_lock, flags);
 }
 
 /**
-- 
2.29.0



[PATCH v4 2/3] gpio: thunderx: Utilize for_each_set_nbits macro

2021-04-02 Thread Syed Nayyar Waris
This patch reimplements the thunderx_gpio_set_multiple() function in
drivers/gpio/gpio-thunderx.c to use the new for_each_set_nbits macro.
Instead of looping over every bank in thunderx_gpio_set_multiple(), we
can now skip banks that have no bits set in the mask and save cycles.

Cc: Robert Richter 
Cc: Bartosz Golaszewski 
Signed-off-by: Syed Nayyar Waris 
Acked-by: William Breathitt Gray 
---
 drivers/gpio/gpio-thunderx.c | 13 -
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/drivers/gpio/gpio-thunderx.c b/drivers/gpio/gpio-thunderx.c
index 9f66deab46ea..4349e7393a1d 100644
--- a/drivers/gpio/gpio-thunderx.c
+++ b/drivers/gpio/gpio-thunderx.c
@@ -16,7 +16,7 @@
 #include 
 #include 
 #include 
-
+#include "gpiolib.h"
 
 #define GPIO_RX_DAT 0x0
 #define GPIO_TX_SET 0x8
@@ -275,12 +275,15 @@ static void thunderx_gpio_set_multiple(struct gpio_chip 
*chip,
   unsigned long *bits)
 {
int bank;
-   u64 set_bits, clear_bits;
+   unsigned long set_bits, clear_bits, gpio_mask;
+   unsigned long offset;
+
struct thunderx_gpio *txgpio = gpiochip_get_data(chip);
 
-   for (bank = 0; bank <= chip->ngpio / 64; bank++) {
-   set_bits = bits[bank] & mask[bank];
-   clear_bits = ~bits[bank] & mask[bank];
+   for_each_set_nbits(offset, gpio_mask, mask, chip->ngpio, 64) {
+   bank = offset / 64;
+   set_bits = bits[bank] & gpio_mask;
+   clear_bits = ~bits[bank] & gpio_mask;
writeq(set_bits, txgpio->register_base + (bank * GPIO_2ND_BANK) 
+ GPIO_TX_SET);
writeq(clear_bits, txgpio->register_base + (bank * 
GPIO_2ND_BANK) + GPIO_TX_CLR);
}
-- 
2.29.0



[PATCH v4 1/3] gpiolib: Introduce the for_each_set_nbits macro

2021-04-02 Thread Syed Nayyar Waris
This macro iterates over each group of bits (clump) that has at least
one set bit within a bitmap memory region. For each iteration, "start"
is set to the bit offset of the found clump, while the respective clump
value is stored to the location pointed to by "clump". Additionally, the
bitmap_get_value() and bitmap_set_value() functions are introduced to
respectively get and set an n-bit value in a bitmap memory region.
The value can be any size from 1 to BITS_PER_LONG bits; a size less
than 1 or more than BITS_PER_LONG causes undefined behaviour.
Moreover, when setting an n-bit value in the bitmap, if the value would
cross a word boundary, it is split so that the lower portion is stored
in the current word and the remaining portion is stored in the next
higher word. The same split occurs when retrieving a value from the
bitmap.

Cc: Linus Walleij 
Cc: Bartosz Gołaszewski 
Cc: Arnd Bergmann 
Cc: Andy Shevchenko 
Signed-off-by: Syed Nayyar Waris 
Acked-by: William Breathitt Gray 
---
 drivers/gpio/gpiolib.c | 90 ++
 drivers/gpio/gpiolib.h | 28 +
 2 files changed, 118 insertions(+)

diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c
index 1427c1be749b..5576d1465c81 100644
--- a/drivers/gpio/gpiolib.c
+++ b/drivers/gpio/gpiolib.c
@@ -150,6 +150,96 @@ struct gpio_desc *gpiochip_get_desc(struct gpio_chip *gc,
 }
 EXPORT_SYMBOL_GPL(gpiochip_get_desc);
 
+/**
+ * bitmap_get_value - get a value of n-bits from the memory region
+ * @map: address to the bitmap memory region
+ * @start: bit offset of the n-bit value
+ * @nbits: size of value in bits (must be between 1 and BITS_PER_LONG 
inclusive).
+ *
+ * Returns value of nbits located at the @start bit offset within the @map
+ * memory region.
+ */
+unsigned long bitmap_get_value(const unsigned long *map,
+   unsigned long start,
+   unsigned long nbits)
+{
+   const size_t index = BIT_WORD(start);
+   const unsigned long offset = start % BITS_PER_LONG;
+   const unsigned long ceiling = round_up(start + 1, BITS_PER_LONG);
+   const unsigned long space = ceiling - start;
+   unsigned long value_low, value_high;
+
+   if (space >= nbits)
+   return (map[index] >> offset) & GENMASK(nbits - 1, 0);
+   else {
+   value_low = map[index] & BITMAP_FIRST_WORD_MASK(start);
+   value_high = map[index + 1] & BITMAP_LAST_WORD_MASK(start + 
nbits);
+   return (value_low >> offset) | (value_high << space);
+   }
+}
+EXPORT_SYMBOL_GPL(bitmap_get_value);
+
+/**
+ * bitmap_set_value - set value within a memory region
+ * @map: address to the bitmap memory region
+ * @nbits: size of map in bits
+ * @value: value of clump
+ * @value_width: size of value in bits (must be between 1 and BITS_PER_LONG 
inclusive)
+ * @start: bit offset of the value
+ */
+void bitmap_set_value(unsigned long *map, unsigned long nbits,
+   unsigned long value, unsigned long value_width,
+   unsigned long start)
+{
+   const unsigned long index = BIT_WORD(start);
+   const unsigned long length = BIT_WORD(nbits);
+   const unsigned long offset = start % BITS_PER_LONG;
+   const unsigned long ceiling = round_up(start + 1, BITS_PER_LONG);
+   const unsigned long space = ceiling - start;
+
+   value &= GENMASK(value_width - 1, 0);
+
+   if (space >= value_width) {
+   map[index] &= ~(GENMASK(value_width - 1, 0) << offset);
+   map[index] |= value << offset;
+   } else {
+   map[index + 0] &= ~BITMAP_FIRST_WORD_MASK(start);
+   map[index + 0] |= value << offset;
+
+   if (index + 1 >= length)
+   return;
+
+   map[index + 1] &= ~BITMAP_LAST_WORD_MASK(start + value_width);
+   map[index + 1] |= value >> space;
+   }
+}
+EXPORT_SYMBOL_GPL(bitmap_set_value);
+
+/**
+ * find_next_clump - find next clump with set bits in a memory region
+ * @clump: location to store copy of found clump
+ * @addr: address to base the search on
+ * @size: bitmap size in number of bits
+ * @offset: bit offset at which to start searching
+ * @clump_size: clump size in bits
+ *
+ * Returns the bit offset for the next set clump; the found clump value is
+ * copied to the location pointed by @clump. If no bits are set, returns @size.
+ */
+unsigned long find_next_clump(unsigned long *clump, const unsigned long *addr,
+   unsigned long size, unsigned long offset,
+   unsigned long clump_size)
+{
+   offset = find_next_bit(addr, size, offset);
+   if (offset == size)
+   return size;
+
+   offset = rounddown(offset, clump_size);
+   *clump = bitma

[PATCH v4 0/3] Introduce the for_each_set_nbits macro

2021-04-02 Thread Syed Nayyar Waris
Hello Bartosz,

Since this patchset primarily affects GPIO drivers, would you like
to pick it up through your GPIO tree?

This patchset introduces for_each_set_nbits, a generic version of the
for_each_set_clump8 macro. for_each_set_clump8 used a fixed-size 8-bit
clump, whereas the new generic version works with clumps of any size
up to and including BITS_PER_LONG bits. The patchset utilizes the new
macro in several GPIO drivers.

The earlier 8-bit for_each_set_clump8 facilitated a for-loop syntax that
iterates over a memory region one entire group of set bits at a time.

For example, suppose you would like to iterate over a 32-bit integer 8
bits at a time, skipping over 8-bit groups with no set bit, where
XXXXXXXX represents the current 8-bit group:

Example:        10111110 00000000 11111111 00110011
First loop:     10111110 00000000 11111111 XXXXXXXX
Second loop:    10111110 00000000 XXXXXXXX 00110011
Third loop:     XXXXXXXX 00000000 11111111 00110011

Each iteration of the loop returns the next 8-bit group that has at
least one set bit.

With the new for_each_set_nbits, the clump size can differ from 8 bits.
Moreover, a clump can be split across a word boundary when the word size
is not a multiple of the clump size. The following examples show how the
new macro works for clump sizes of 24 bits and 6 bits.

Example 1:
clump size: 24 bits, Number of clumps (or ports): 10
bitmap stores the bit information from where successive clumps are retrieved.

        /* bitmap memory region */
        0x00aa0000ff000000;  /* Most significant bits */
        0xaaaaaa0000ff0000;
        0x000000aa000000aa;
        0xbbbbabcdeffedcba;  /* Least significant bits */

Different iterations of for_each_set_nbits:
'offset' is the bit position and 'clump' is the 24-bit clump from the
above bitmap.
Iteration first:        offset: 0 clump: 0xfedcba
Iteration second:       offset: 24 clump: 0xabcdef
Iteration third:        offset: 48 clump: 0xaabbbb
Iteration fourth:       offset: 96 clump: 0xaa
Iteration fifth:        offset: 144 clump: 0xff
Iteration sixth:        offset: 168 clump: 0xaaaaaa
Iteration seventh:      offset: 216 clump: 0xff
The loop breaks because the bits remaining at the end (0x00aa) are fewer
than the clump size of 24 bits.

In the above example, it can be seen that in the third iteration the
24-bit clump that was retrieved was split between bitmap[0] and bitmap[1].
This example also shows that 24-bit clumps of all zeroes, if present in
between, are skipped (preserving the previous for_each_set_clump8
behaviour).

Example 2:
clump size = 6 bits, Number of clumps (or ports) = 3.

        /* bitmap memory region */
        0x00aa0000ff000000;  /* Most significant bits */
        0xaaaaaa0000ff0000;
        0x0f00000000000000;
        0x0000000000000ac0;  /* Least significant bits */

Different iterations of for_each_set_nbits:
'offset' is the bit position and 'clump' is the 6 bit clump from the
above bitmap.
Iteration first:        offset: 6 clump: 0x2b
Loop breaks because 6 * 3 = 18 bits traversed in bitmap.
Here 6 * 3 is clump size * no. of clumps.

Changes in v4:
 - [Patch 3/3]: Remove extra line and add few comments.
 - [Patch 3/3]: Use single lock (and unlock) call instead of two 
   lock (and two unlock) calls.
 - [Patch 3/3]: Use bitmap_from_arr32() where applicable.
 - [Patch 3/3]: Remove unnecessary 'const'.

Changes in v3:
 - [Patch 1/3]: Rename for_each_set_clump to for_each_set_nbits.
 - [Patch 1/3]: Shift function definitions outside 'ifdef CONFIG_DEBUG_FS'
   macro guard to resolve build (linking) error in xilinx Patch[3/3].
 - [Patch 2/3]: Rename for_each_set_clump to for_each_set_nbits.

Changes in v2:
 - [Patch 1/3]: Shift the macros and related functions to gpiolib inside
   gpio/. Reduce the visibility of 'for_each_set_clump' to gpio.
 - [Patch 1/3]: Remove __builtin_unreachable and simply use return
   statement.
 - Remove tests from lib/test_bitmap.c as 'for_each_set_clump' is
   now localised inside gpio/ only.

Syed Nayyar Waris (3):
  gpiolib: Introduce the for_each_set_nbits macro
  gpio: thunderx: Utilize for_each_set_nbits macro
  gpio: xilinx: Utilize generic bitmap_get_value and _set_value

 drivers/gpio/gpio-thunderx.c | 13 --
 drivers/gpio/gpio-xilinx.c   | 60 
 drivers/gpio/gpiolib.c   | 90 
 drivers/gpio/gpiolib.h   | 28 +++
 4 files changed, 156 insertions(+), 35 deletions(-)


base-commit: e1b7033ecdac56c1cc4dff72d67cac25d449efc6
-- 
2.29.0



Re: [PATCH v3 3/3] gpio: xilinx: Utilize generic bitmap_get_value and _set_value

2021-04-02 Thread Syed Nayyar Waris
On Mon, Mar 29, 2021 at 8:54 PM Andy Shevchenko
 wrote:
>
> On Sat, Mar 06, 2021 at 07:36:30PM +0530, Syed Nayyar Waris wrote:
> > This patch reimplements the xgpio_set_multiple() function in
> > drivers/gpio/gpio-xilinx.c to use the new generic functions:
> > bitmap_get_value() and bitmap_set_value(). The code is now simpler
> > to read and understand. Moreover, instead of looping for each bit
> > in xgpio_set_multiple() function, now we can check each channel at
> > a time and save cycles.
>
> ...
>
> > + u32 *const state = chip->gpio_state;
>
> Looking at this... What's the point of the const here?
>
> Am I right that this tells: pointer is a const, while the data underneath
> can be modified?

Yes, you are right: the pointer itself is const, while the data underneath
can be modified. I have removed the 'const' in v4.

>
> > + unsigned int *const width = chip->gpio_width;
>
> Ditto.
>
> Putting const:s here and there for sake of the const is not good practice.
> It makes code harder to read.

Okay.

>
> --
> With Best Regards,
> Andy Shevchenko
>
>


Re: [PATCH v3 3/3] gpio: xilinx: Utilize generic bitmap_get_value and _set_value

2021-04-02 Thread Syed Nayyar Waris
On Fri, Mar 26, 2021 at 11:27 PM Andy Shevchenko
 wrote:
>
> On Sat, Mar 6, 2021 at 4:08 PM Syed Nayyar Waris  wrote:
> >
> > This patch reimplements the xgpio_set_multiple() function in
> > drivers/gpio/gpio-xilinx.c to use the new generic functions:
> > bitmap_get_value() and bitmap_set_value(). The code is now simpler
> > to read and understand. Moreover, instead of looping for each bit
> > in xgpio_set_multiple() function, now we can check each channel at
> > a time and save cycles.
>
> ...
>
> > +   u32 *const state = chip->gpio_state;
> > +   unsigned int *const width = chip->gpio_width;
>
> > +
>
> Extra blank line.
>
> > +   DECLARE_BITMAP(old, 64);
> > +   DECLARE_BITMAP(new, 64);
> > +   DECLARE_BITMAP(changed, 64);
>
> > +   spin_lock_irqsave(&chip->gpio_lock[0], flags);
> > +   spin_lock(&chip->gpio_lock[1]);
>
> I understand why this is done at the top of the function in the original code.
> I do not understand why you put some operations under spin lock.
>
> Have you checked what each of these spin locks protects?
> Please check and try to lock as minimum as possible.
>
> > +   bitmap_set_value(old, 64, state[0], width[0], 0);
> > +   bitmap_set_value(old, 64, state[1], width[1], width[0]);
> > +   bitmap_replace(new, old, bits, mask, gc->ngpio);
> > +
> > +   bitmap_set_value(old, 64, state[0], 32, 0);
> > +   bitmap_set_value(old, 64, state[1], 32, 32);
> > +   state[0] = bitmap_get_value(new, 0, width[0]);
> > +   state[1] = bitmap_get_value(new, width[0], width[1]);
> > +   bitmap_set_value(new, 64, state[0], 32, 0);
> > +   bitmap_set_value(new, 64, state[1], 32, 32);
> > +   bitmap_xor(changed, old, new, 64);
>
> Original code and this is cryptic. Can you add a few comments
> explaining what is going on here?
>
> > +   spin_unlock(&chip->gpio_lock[1]);
> > +   spin_unlock_irqrestore(&chip->gpio_lock[0], flags);
>
> --
> With Best Regards,
> Andy Shevchenko

I have removed the extra line and added comments. Regarding locking: I
see that there is now just a single lock available instead of two, so I
have made the necessary changes. Thanks.
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/drivers/gpio/gpio-xilinx.c?id=37ef334680800263b32bb96a5156a4b47f0244a2

Regards

Syed Nayyar Waris


Re: [PATCH v3 3/3] gpio: xilinx: Utilize generic bitmap_get_value and _set_value

2021-04-01 Thread Syed Nayyar Waris
On Sat, Mar 27, 2021 at 10:05 PM Andy Shevchenko
 wrote:
>
> On Sat, Mar 27, 2021 at 2:02 PM William Breathitt Gray
>  wrote:
> > On Sat, Mar 27, 2021 at 09:29:26AM +0200, Andy Shevchenko wrote:
> > > On Saturday, March 27, 2021, Syed Nayyar Waris  
> > > wrote:
> > > > On Fri, Mar 26, 2021 at 11:32 PM Andy Shevchenko
> > > >  wrote:
> > > > > On Sat, Mar 6, 2021 at 4:08 PM Syed Nayyar Waris 
> > > > > 
> > > > wrote:
> > > > >
> > > > > > +   bitmap_set_value(old, 64, state[0], 32, 0);
> > > > > > +   bitmap_set_value(old, 64, state[1], 32, 32);
> > > > >
> > > > > Isn't it effectively bitmap_from_arr32() ?
> > > > >
> > > > > > +   bitmap_set_value(new, 64, state[0], 32, 0);
> > > > > > +   bitmap_set_value(new, 64, state[1], 32, 32);
> > > > >
> > > > > Ditto.
>
> > > > With bitmap_set_value() we also specify the offset (or start)
> > > > position, so that the remainder of the array remains unaffected. I
> > > > think it would not be feasible to use bitmap_from/to_arr32() here.
> > >
> > >
> > > You have hard coded start and nbits parameters to 32. How is it not the
> > > same?
> >
> > Would these four lines become something like this:
> >
> > bitmap_from_arr32(old, state, 64);
> > ...
> > bitmap_from_arr32(new, state, 64);
>
> This is my understanding, but I might miss something. I mean driver
> specifics that make my proposal incorrect.
>
> --
> With Best Regards,
> Andy Shevchenko

I initially (incorrectly) thought that all of the bitmap_set_value()
statements had to be replaced. But now I realise that only those
specific bitmap_set_value() calls with a 32-bit width have to be
replaced.

I will incorporate the above review comments in my next v4 submission.

Regards
Syed Nayyar Waris
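
To illustrate the point settled above, here is a minimal userspace sketch.
It assumes a single 64-bit word stands in for the 64-bit bitmap, and
sketch_set_value(), sketch_from_arr32() and sketch_pack() are hypothetical
stand-ins written for this example, not the kernel functions themselves.
It shows that packing both channels with a hard-coded width of 32 gives the
same result as a plain array-to-bitmap copy, while packing by the real
channel widths does not.

```c
#include <assert.h>
#include <stdint.h>

/* Mask with the low n bits set, for n in 0..64. */
static uint64_t low_mask(unsigned int n)
{
	return n >= 64 ? ~UINT64_C(0) : (UINT64_C(1) << n) - 1;
}

/* Simplified stand-in for bitmap_set_value() on one 64-bit word. */
static void sketch_set_value(uint64_t *map, uint64_t value,
			     unsigned int width, unsigned int start)
{
	uint64_t mask;

	assert(width >= 1 && start < 64 && width <= 64 - start);
	mask = low_mask(width) << start;
	*map = (*map & ~mask) | ((value << start) & mask);
}

/* Simplified stand-in for bitmap_from_arr32() with nbits == 64. */
static uint64_t sketch_from_arr32(const uint32_t buf[2])
{
	return (uint64_t)buf[0] | ((uint64_t)buf[1] << 32);
}

/* Packs channel 0 at bit 0 and channel 1 directly after it. */
static uint64_t sketch_pack(const uint32_t state[2],
			    unsigned int w0, unsigned int w1)
{
	uint64_t map = 0;

	sketch_set_value(&map, state[0], w0, 0);
	sketch_set_value(&map, state[1], w1, w0);
	return map;
}
```

With w0 == w1 == 32 the two packing routes agree, which is why only the
calls with a 32-bit width are candidates for bitmap_from_arr32(); the calls
that use width[0] and width[1] pack the channels back to back and need the
offset-aware helper.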


Re: [PATCH v3 3/3] gpio: xilinx: Utilize generic bitmap_get_value and _set_value

2021-04-01 Thread Syed Nayyar Waris
On Mon, Mar 29, 2021 at 8:54 PM Andy Shevchenko
 wrote:
>
> On Sat, Mar 06, 2021 at 07:36:30PM +0530, Syed Nayyar Waris wrote:
> > This patch reimplements the xgpio_set_multiple() function in
> > drivers/gpio/gpio-xilinx.c to use the new generic functions:
> > bitmap_get_value() and bitmap_set_value(). The code is now simpler
> > to read and understand. Moreover, instead of looping for each bit
> > in xgpio_set_multiple() function, now we can check each channel at
> > a time and save cycles.
>
> ...
>
> > + u32 *const state = chip->gpio_state;
>
> Looking at this... What's the point of the const here?
>
> Am I right that this tells: pointer is a const, while the data underneath
> can be modified?
>
> > + unsigned int *const width = chip->gpio_width;
>
> Ditto.
>
> Putting const:s here and there for sake of the const is not good practice.
> It makes code harder to read.
>
> --
> With Best Regards,
> Andy Shevchenko
>
Okay. I will incorporate your comments in my next submission. Thank You.

Regards
Syed Nayyar Waris
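
For readers following the const discussion, a small sketch of the two
placements of the qualifier may help. struct demo_chip and
demo_const_pointer() are hypothetical names invented for this example; only
the 'u32 *const state' form comes from the patch (written here with
uint32_t so it builds in userspace).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the driver's per-chip data; not the real struct. */
struct demo_chip {
	uint32_t gpio_state[2];
};

/* Sets bit 0 of channel 0 and returns the sum of both channel states. */
static uint32_t demo_const_pointer(struct demo_chip *chip)
{
	assert(chip != NULL);

	/* const pointer to writable data: the form used in the v3 patch */
	uint32_t *const state = chip->gpio_state;
	/* writable pointer to read-only data: the other placement */
	const uint32_t *view = chip->gpio_state;

	state[0] |= 0x1u;	/* allowed: the data is not const */
	/* state = NULL;	   would not compile: the pointer is const */

	view = &chip->gpio_state[1];	/* allowed: the pointer is not const */
	/* *view = 0;		   would not compile: the data is const */

	return state[0] + *view;
}
```

In the patch the pointer is never reassigned anyway, so dropping the
qualifier changes nothing at runtime; it only removes noise from the
declaration, which is the reviewer's point.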


Re: [PATCH v3 3/3] gpio: xilinx: Utilize generic bitmap_get_value and _set_value

2021-04-01 Thread Syed Nayyar Waris
On Wed, Mar 31, 2021 at 8:56 PM Srinivas Neeli  wrote:
>
> Hi,
>
> > -Original Message-
> > From: Bartosz Golaszewski 
> > Sent: Friday, March 26, 2021 10:58 PM
> > To: Michal Simek 
> > Cc: Syed Nayyar Waris ; Srinivas Neeli
> > ; Andy Shevchenko
> > ; William Breathitt Gray
> > ; Arnd Bergmann ; Robert
> > Richter ; Linus Walleij ;
> > Masahiro Yamada ; Andrew Morton
> > ; Zhang Rui ; Daniel
> > Lezcano ; Amit Kucheria
> > ; Linux-Arch ;
> > linux-gpio ; LKML  > ker...@vger.kernel.org>; arm-soc ;
> > linux-pm ; Srinivas Goud 
> > Subject: Re: [PATCH v3 3/3] gpio: xilinx: Utilize generic bitmap_get_value 
> > and
> > _set_value
> >
> > On Mon, Mar 8, 2021 at 8:13 AM Michal Simek 
> > wrote:
> > >
> > >
> > >
> > > On 3/6/21 3:06 PM, Syed Nayyar Waris wrote:
> > > > This patch reimplements the xgpio_set_multiple() function in
> > > > drivers/gpio/gpio-xilinx.c to use the new generic functions:
> > > > bitmap_get_value() and bitmap_set_value(). The code is now simpler
> > > > to read and understand. Moreover, instead of looping for each bit in
> > > > xgpio_set_multiple() function, now we can check each channel at a
> > > > time and save cycles.
> > > >
> > > > Cc: Bartosz Golaszewski 
> > > > Cc: Michal Simek 
> > > > Signed-off-by: Syed Nayyar Waris 
> > > > Acked-by: William Breathitt Gray 
> > > > ---
> > > >  drivers/gpio/gpio-xilinx.c | 63
> > > > +++---
> > > >  1 file changed, 32 insertions(+), 31 deletions(-)
> > > >
> > > > diff --git a/drivers/gpio/gpio-xilinx.c b/drivers/gpio/gpio-xilinx.c
> > > > index be539381fd82..8445e69cf37b 100644
> > > > --- a/drivers/gpio/gpio-xilinx.c
> > > > +++ b/drivers/gpio/gpio-xilinx.c
> > > > @@ -15,6 +15,7 @@
> > > >  #include 
> > > >  #include 
> > > >  #include 
> > > > +#include "gpiolib.h"
> > > >
> > > >  /* Register Offset Definitions */
> > > >  #define XGPIO_DATA_OFFSET   (0x0)/* Data register  */
> > > > @@ -141,37 +142,37 @@ static void xgpio_set_multiple(struct
> > > > gpio_chip *gc, unsigned long *mask,  {
> > > >   unsigned long flags;
> > > >   struct xgpio_instance *chip = gpiochip_get_data(gc);
> > > > - int index = xgpio_index(chip, 0);
> > > > - int offset, i;
> > > > -
> > > > - spin_lock_irqsave(&chip->gpio_lock[index], flags);
> > > > -
> > > > - /* Write to GPIO signals */
> > > > - for (i = 0; i < gc->ngpio; i++) {
> > > > - if (*mask == 0)
> > > > - break;
> > > > - /* Once finished with an index write it out to the 
> > > > register */
> > > > - if (index !=  xgpio_index(chip, i)) {
> > > > - xgpio_writereg(chip->regs + XGPIO_DATA_OFFSET +
> > > > -index * XGPIO_CHANNEL_OFFSET,
> > > > -chip->gpio_state[index]);
> > > > - spin_unlock_irqrestore(&chip->gpio_lock[index], 
> > > > flags);
> > > > - index =  xgpio_index(chip, i);
> > > > - spin_lock_irqsave(&chip->gpio_lock[index], flags);
> > > > - }
> > > > - if (__test_and_clear_bit(i, mask)) {
> > > > - offset =  xgpio_offset(chip, i);
> > > > - if (test_bit(i, bits))
> > > > - chip->gpio_state[index] |= BIT(offset);
> > > > - else
> > > > - chip->gpio_state[index] &= ~BIT(offset);
> > > > - }
> > > > - }
> > > > -
> > > > - xgpio_writereg(chip->regs + XGPIO_DATA_OFFSET +
> > > > -index * XGPIO_CHANNEL_OFFSET, 
> > > > chip->gpio_state[index]);
> > > > -
> > > > - spin_unlock_irqrestore(&chip->gpio_lock[index], flags);
> > > > + u32 *const state = chip->gpio_state;
> > > > + unsigned int *const width = chip->gpio_width;
> > > > +
> > > > + DECLARE

Re: [PATCH v3 3/3] gpio: xilinx: Utilize generic bitmap_get_value and _set_value

2021-03-26 Thread Syed Nayyar Waris
On Fri, Mar 26, 2021 at 11:32 PM Andy Shevchenko
 wrote:
>
> On Sat, Mar 6, 2021 at 4:08 PM Syed Nayyar Waris  wrote:
>
> > +   bitmap_set_value(old, 64, state[0], 32, 0);
> > +   bitmap_set_value(old, 64, state[1], 32, 32);
>
> Isn't it effectively bitmap_from_arr32() ?
>
> > +   bitmap_set_value(new, 64, state[0], 32, 0);
> > +   bitmap_set_value(new, 64, state[1], 32, 32);
>
> Ditto.
>
> --
> With Best Regards,
> Andy Shevchenko

Hi Andy,

With bitmap_set_value() we also specify the offset (or start)
position, so that the remainder of the array remains unaffected. I
think it would not be feasible to use bitmap_from/to_arr32() here.

Regards
Syed Nayyar Waris


[PATCH v3 3/3] gpio: xilinx: Utilize generic bitmap_get_value and _set_value

2021-03-06 Thread Syed Nayyar Waris
This patch reimplements the xgpio_set_multiple() function in
drivers/gpio/gpio-xilinx.c to use the new generic functions
bitmap_get_value() and bitmap_set_value(). The code is now simpler
to read and understand. Moreover, instead of looping over each bit
in xgpio_set_multiple(), we can now check one channel at a time
and save cycles.
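
The per-channel idea can be sketched in userspace as follows. This is a
simplified illustration, not the driver code: sketch_update_channels() is a
hypothetical helper, a single uint64_t stands in for the 64-bit bitmaps,
and the locking and register writes are left out. It packs both channel
states by their widths, applies the (old & ~mask) | (bits & mask) update
that bitmap_replace() performs, unpacks the result, and reports which
channels changed.

```c
#include <assert.h>
#include <stdint.h>

/* Mask with the low n bits set, for n in 0..64. */
static uint64_t low_mask(unsigned int n)
{
	return n >= 64 ? ~UINT64_C(0) : (UINT64_C(1) << n) - 1;
}

/*
 * Applies bits/mask to two channels packed back to back.
 * Returns a bitmask of changed channels (bit 0: channel 0, bit 1: channel 1)
 * and stores the new per-channel values in state[].
 */
static unsigned int sketch_update_channels(uint32_t state[2],
					   const unsigned int width[2],
					   uint64_t bits, uint64_t mask)
{
	uint64_t old_map, new_map;
	uint32_t new0, new1;
	unsigned int changed = 0;

	assert(width[0] >= 1 && width[0] <= 32 && width[1] <= 32);

	/* Pack: channel 0 at bit 0, channel 1 right after it. */
	old_map = (state[0] & low_mask(width[0])) |
		  ((state[1] & low_mask(width[1])) << width[0]);

	/* Same update rule as bitmap_replace(). */
	new_map = (old_map & ~mask) | (bits & mask);

	/* Unpack back into per-channel values. */
	new0 = (uint32_t)(new_map & low_mask(width[0]));
	new1 = (uint32_t)((new_map >> width[0]) & low_mask(width[1]));

	if (new0 != (state[0] & low_mask(width[0])))
		changed |= 1u << 0;
	if (new1 != (state[1] & low_mask(width[1])))
		changed |= 1u << 1;

	state[0] = new0;
	state[1] = new1;
	return changed;
}
```

A caller would then issue one register write per changed channel, which is
what the two xgpio_writereg() calls in the patch do.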

Cc: Bartosz Golaszewski 
Cc: Michal Simek 
Signed-off-by: Syed Nayyar Waris 
Acked-by: William Breathitt Gray 
---
 drivers/gpio/gpio-xilinx.c | 63 +++---
 1 file changed, 32 insertions(+), 31 deletions(-)

diff --git a/drivers/gpio/gpio-xilinx.c b/drivers/gpio/gpio-xilinx.c
index be539381fd82..8445e69cf37b 100644
--- a/drivers/gpio/gpio-xilinx.c
+++ b/drivers/gpio/gpio-xilinx.c
@@ -15,6 +15,7 @@
 #include 
 #include 
 #include 
+#include "gpiolib.h"
 
 /* Register Offset Definitions */
 #define XGPIO_DATA_OFFSET   (0x0)  /* Data register  */
@@ -141,37 +142,37 @@ static void xgpio_set_multiple(struct gpio_chip *gc, 
unsigned long *mask,
 {
unsigned long flags;
struct xgpio_instance *chip = gpiochip_get_data(gc);
-   int index = xgpio_index(chip, 0);
-   int offset, i;
-
-   spin_lock_irqsave(&chip->gpio_lock[index], flags);
-
-   /* Write to GPIO signals */
-   for (i = 0; i < gc->ngpio; i++) {
-   if (*mask == 0)
-   break;
-   /* Once finished with an index write it out to the register */
-   if (index !=  xgpio_index(chip, i)) {
-   xgpio_writereg(chip->regs + XGPIO_DATA_OFFSET +
-  index * XGPIO_CHANNEL_OFFSET,
-  chip->gpio_state[index]);
-   spin_unlock_irqrestore(&chip->gpio_lock[index], flags);
-   index =  xgpio_index(chip, i);
-   spin_lock_irqsave(&chip->gpio_lock[index], flags);
-   }
-   if (__test_and_clear_bit(i, mask)) {
-   offset =  xgpio_offset(chip, i);
-   if (test_bit(i, bits))
-   chip->gpio_state[index] |= BIT(offset);
-   else
-   chip->gpio_state[index] &= ~BIT(offset);
-   }
-   }
-
-   xgpio_writereg(chip->regs + XGPIO_DATA_OFFSET +
-  index * XGPIO_CHANNEL_OFFSET, chip->gpio_state[index]);
-
-   spin_unlock_irqrestore(&chip->gpio_lock[index], flags);
+   u32 *const state = chip->gpio_state;
+   unsigned int *const width = chip->gpio_width;
+
+   DECLARE_BITMAP(old, 64);
+   DECLARE_BITMAP(new, 64);
+   DECLARE_BITMAP(changed, 64);
+
+   spin_lock_irqsave(&chip->gpio_lock[0], flags);
+   spin_lock(&chip->gpio_lock[1]);
+
+   bitmap_set_value(old, 64, state[0], width[0], 0);
+   bitmap_set_value(old, 64, state[1], width[1], width[0]);
+   bitmap_replace(new, old, bits, mask, gc->ngpio);
+
+   bitmap_set_value(old, 64, state[0], 32, 0);
+   bitmap_set_value(old, 64, state[1], 32, 32);
+   state[0] = bitmap_get_value(new, 0, width[0]);
+   state[1] = bitmap_get_value(new, width[0], width[1]);
+   bitmap_set_value(new, 64, state[0], 32, 0);
+   bitmap_set_value(new, 64, state[1], 32, 32);
+   bitmap_xor(changed, old, new, 64);
+
+   if (((u32 *)changed)[0])
+   xgpio_writereg(chip->regs + XGPIO_DATA_OFFSET,
+   state[0]);
+   if (((u32 *)changed)[1])
+   xgpio_writereg(chip->regs + XGPIO_DATA_OFFSET +
+   XGPIO_CHANNEL_OFFSET, state[1]);
+
+   spin_unlock(&chip->gpio_lock[1]);
+   spin_unlock_irqrestore(&chip->gpio_lock[0], flags);
 }
 
 /**
-- 
2.29.0



[PATCH v3 2/3] gpio: thunderx: Utilize for_each_set_nbits macro

2021-03-06 Thread Syed Nayyar Waris
This patch reimplements the thunderx_gpio_set_multiple function in
drivers/gpio/gpio-thunderx.c to use the new for_each_set_nbits macro.
Instead of looping over every bank in thunderx_gpio_set_multiple, we
can now skip banks whose mask has no set bits and save cycles.
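
As a rough illustration of the bank skipping described above, the sketch
below plans the writes instead of performing them. struct bank_write and
sketch_plan_bank_writes() are hypothetical names for this example, and
uint64_t words stand in for the 64-bit banks; the real driver issues
writeq() calls to the TX_SET and TX_CLR registers instead.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One planned register update for a single 64-bit bank. */
struct bank_write {
	size_t bank;		/* bank index */
	uint64_t set_bits;	/* lines to drive high */
	uint64_t clear_bits;	/* lines to drive low */
};

/*
 * Fills 'out' with one entry per bank whose mask has at least one set bit.
 * Banks with an all-zero mask are skipped, so no write is planned for them.
 * Returns the number of entries written.
 */
static size_t sketch_plan_bank_writes(const uint64_t *bits,
				      const uint64_t *mask, size_t nbanks,
				      struct bank_write *out, size_t max)
{
	size_t bank, n = 0;

	for (bank = 0; bank < nbanks && n < max; bank++) {
		if (mask[bank] == 0)
			continue;	/* nothing requested in this bank */

		out[n].bank = bank;
		out[n].set_bits = bits[bank] & mask[bank];
		out[n].clear_bits = ~bits[bank] & mask[bank];
		n++;
	}
	return n;
}
```

Only masked lines appear in either word, and a masked line lands in exactly
one of set_bits or clear_bits, so the two writes for a bank never conflict.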

Cc: Robert Richter 
Cc: Bartosz Golaszewski 
Signed-off-by: Syed Nayyar Waris 
Acked-by: William Breathitt Gray 
---
 drivers/gpio/gpio-thunderx.c | 13 -
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/drivers/gpio/gpio-thunderx.c b/drivers/gpio/gpio-thunderx.c
index 9f66deab46ea..4349e7393a1d 100644
--- a/drivers/gpio/gpio-thunderx.c
+++ b/drivers/gpio/gpio-thunderx.c
@@ -16,7 +16,7 @@
 #include 
 #include 
 #include 
-
+#include "gpiolib.h"
 
 #define GPIO_RX_DAT0x0
 #define GPIO_TX_SET0x8
@@ -275,12 +275,15 @@ static void thunderx_gpio_set_multiple(struct gpio_chip 
*chip,
   unsigned long *bits)
 {
int bank;
-   u64 set_bits, clear_bits;
+   unsigned long set_bits, clear_bits, gpio_mask;
+   unsigned long offset;
+
struct thunderx_gpio *txgpio = gpiochip_get_data(chip);
 
-   for (bank = 0; bank <= chip->ngpio / 64; bank++) {
-   set_bits = bits[bank] & mask[bank];
-   clear_bits = ~bits[bank] & mask[bank];
+   for_each_set_nbits(offset, gpio_mask, mask, chip->ngpio, 64) {
+   bank = offset / 64;
+   set_bits = bits[bank] & gpio_mask;
+   clear_bits = ~bits[bank] & gpio_mask;
writeq(set_bits, txgpio->register_base + (bank * GPIO_2ND_BANK) 
+ GPIO_TX_SET);
writeq(clear_bits, txgpio->register_base + (bank * 
GPIO_2ND_BANK) + GPIO_TX_CLR);
}
-- 
2.29.0



[PATCH v3 1/3] gpiolib: Introduce the for_each_set_nbits macro

2021-03-06 Thread Syed Nayyar Waris
This macro iterates over each group of bits (clump) that has at least
one set bit within a bitmap memory region. For each iteration, "start"
is set to the bit offset of the found clump, while the respective clump
value is stored to the location pointed to by "clump". Additionally, the
bitmap_get_value() and bitmap_set_value() functions are introduced to
respectively get and set an n-bit value in a bitmap memory region.
The value can have any size from 1 to BITS_PER_LONG bits; a size less
than 1 or more than BITS_PER_LONG causes undefined behaviour.
Moreover, when setting an n-bit value in the bitmap, if the value would
extend past a word boundary, it is split so that the lower portion is
stored in the current word and the remaining portion is stored in the
next higher word. The same split applies when retrieving a value from
the bitmap.
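
The word-boundary split can be tried out in userspace with the sketch
below. It is modelled on the functions in this patch but is not the kernel
code: it assumes fixed 64-bit words (uint64_t) instead of unsigned long,
takes the map length in words rather than bits, and the sketch_ names are
hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mask with the low n bits set, for n in 0..64. */
static uint64_t low_mask(unsigned int n)
{
	return n >= 64 ? ~UINT64_C(0) : (UINT64_C(1) << n) - 1;
}

/* Stores 'width' bits of 'value' at bit 'start', spilling into the next word. */
static void sketch_set_value(uint64_t *map, size_t nwords, uint64_t value,
			     unsigned int width, size_t start)
{
	size_t index = start / 64;
	unsigned int offset = (unsigned int)(start % 64);
	unsigned int space = 64 - offset;	/* bits left in this word */

	assert(width >= 1 && width <= 64 && index < nwords);
	value &= low_mask(width);

	if (space >= width) {
		map[index] &= ~(low_mask(width) << offset);
		map[index] |= value << offset;
		return;
	}

	/* Lower part of the value fills the rest of the current word. */
	map[index] &= low_mask(offset);
	map[index] |= value << offset;
	if (index + 1 >= nwords)
		return;		/* no next word: the upper part is dropped */

	/* Upper part goes to the low bits of the next word. */
	map[index + 1] &= ~low_mask(width - space);
	map[index + 1] |= value >> space;
}

/* Reads 'nbits' bits starting at bit 'start', joining two words if needed. */
static uint64_t sketch_get_value(const uint64_t *map, size_t start,
				 unsigned int nbits)
{
	size_t index = start / 64;
	unsigned int offset = (unsigned int)(start % 64);
	unsigned int space = 64 - offset;
	uint64_t value;

	assert(nbits >= 1 && nbits <= 64);
	value = map[index] >> offset;
	if (space < nbits)
		value |= map[index + 1] << space;
	return value & low_mask(nbits);
}
```

Writing the 24-bit value 0xabcdef at bit offset 48 leaves 0xcdef in the top
16 bits of word 0 and 0xab in the low 8 bits of word 1; reading 24 bits
back from offset 48 returns 0xabcdef again.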

Cc: Linus Walleij 
Cc: Bartosz Gołaszewski 
Cc: Arnd Bergmann 
Cc: Andy Shevchenko 
Signed-off-by: Syed Nayyar Waris 
Acked-by: William Breathitt Gray 
---
 drivers/gpio/gpiolib.c | 90 ++
 drivers/gpio/gpiolib.h | 28 +
 2 files changed, 118 insertions(+)

diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c
index b02cc2abd3b6..1e3cfc6bc73f 100644
--- a/drivers/gpio/gpiolib.c
+++ b/drivers/gpio/gpiolib.c
@@ -148,6 +148,96 @@ struct gpio_desc *gpiochip_get_desc(struct gpio_chip *gc,
 }
 EXPORT_SYMBOL_GPL(gpiochip_get_desc);
 
+/**
+ * bitmap_get_value - get a value of n-bits from the memory region
+ * @map: address to the bitmap memory region
+ * @start: bit offset of the n-bit value
+ * @nbits: size of value in bits (must be between 1 and BITS_PER_LONG 
inclusive).
+ *
+ * Returns value of nbits located at the @start bit offset within the @map
+ * memory region.
+ */
+unsigned long bitmap_get_value(const unsigned long *map,
+   unsigned long start,
+   unsigned long nbits)
+{
+   const size_t index = BIT_WORD(start);
+   const unsigned long offset = start % BITS_PER_LONG;
+   const unsigned long ceiling = round_up(start + 1, BITS_PER_LONG);
+   const unsigned long space = ceiling - start;
+   unsigned long value_low, value_high;
+
+   if (space >= nbits)
+   return (map[index] >> offset) & GENMASK(nbits - 1, 0);
+   else {
+   value_low = map[index] & BITMAP_FIRST_WORD_MASK(start);
+   value_high = map[index + 1] & BITMAP_LAST_WORD_MASK(start + 
nbits);
+   return (value_low >> offset) | (value_high << space);
+   }
+}
+EXPORT_SYMBOL_GPL(bitmap_get_value);
+
+/**
+ * bitmap_set_value - set value within a memory region
+ * @map: address to the bitmap memory region
+ * @nbits: size of map in bits
+ * @value: value of clump
+ * @value_width: size of value in bits (must be between 1 and BITS_PER_LONG 
inclusive)
+ * @start: bit offset of the value
+ */
+void bitmap_set_value(unsigned long *map, unsigned long nbits,
+   unsigned long value, unsigned long value_width,
+   unsigned long start)
+{
+   const unsigned long index = BIT_WORD(start);
+   const unsigned long length = BIT_WORD(nbits);
+   const unsigned long offset = start % BITS_PER_LONG;
+   const unsigned long ceiling = round_up(start + 1, BITS_PER_LONG);
+   const unsigned long space = ceiling - start;
+
+   value &= GENMASK(value_width - 1, 0);
+
+   if (space >= value_width) {
+   map[index] &= ~(GENMASK(value_width - 1, 0) << offset);
+   map[index] |= value << offset;
+   } else {
+   map[index + 0] &= ~BITMAP_FIRST_WORD_MASK(start);
+   map[index + 0] |= value << offset;
+
+   if (index + 1 >= length)
+   return;
+
+   map[index + 1] &= ~BITMAP_LAST_WORD_MASK(start + value_width);
+   map[index + 1] |= value >> space;
+   }
+}
+EXPORT_SYMBOL_GPL(bitmap_set_value);
+
+/**
+ * find_next_clump - find next clump with set bits in a memory region
+ * @clump: location to store copy of found clump
+ * @addr: address to base the search on
+ * @size: bitmap size in number of bits
+ * @offset: bit offset at which to start searching
+ * @clump_size: clump size in bits
+ *
+ * Returns the bit offset for the next set clump; the found clump value is
+ * copied to the location pointed by @clump. If no bits are set, returns @size.
+ */
+unsigned long find_next_clump(unsigned long *clump, const unsigned long *addr,
+   unsigned long size, unsigned long offset,
+   unsigned long clump_size)
+{
+   offset = find_next_bit(addr, size, offset);
+   if (offset == size)
+   return size;
+
+   offset = rounddown(offset, clump_size);
+   *clump = bitma

[PATCH v3 0/3] Introduce the for_each_set_nbits macro

2021-03-06 Thread Syed Nayyar Waris
Hello Bartosz,

Since this patchset primarily affects GPIO drivers, would you like
to pick it up through your GPIO tree?

This patchset introduces for_each_set_nbits, a new generic version of
for_each_set_clump8. The previous for_each_set_clump8 used a fixed-size
8-bit clump, but the new generic version works with clumps of any size
up to BITS_PER_LONG. The patchset utilizes the new macro in several
GPIO drivers.

The earlier 8-bit for_each_set_clump8 facilitated a for-loop syntax
that iterates over a memory region one group of bits at a time,
visiting only the groups that contain set bits.

For example, suppose you would like to iterate over a 32-bit integer 8
bits at a time, skipping over 8-bit groups with no set bit, where
XXXXXXXX represents the current 8-bit group:

    Example:        10111110 00000000 11111111 00110011
    First loop:     10111110 00000000 11111111 XXXXXXXX
    Second loop:    10111110 00000000 XXXXXXXX 00110011
    Third loop:     XXXXXXXX 00000000 11111111 00110011
Each iteration of the loop returns the next 8-bit group that has at
least one set bit.

With the new for_each_set_nbits, however, the clump size can differ from
8 bits. Moreover, a clump can be split across a word boundary when the
word size is not a multiple of the clump size. The following examples show
how the new macro works for clump sizes of 24 bits and 6 bits.

Example 1:
clump size: 24 bits, Number of clumps (or ports): 10
bitmap stores the bit information from where successive clumps are retrieved.

    /* bitmap memory region */
    0x00aa0000ff000000;  /* Most significant bits */
    0xaaaaaa0000ff0000;
    0x000000aa000000aa;
    0xbbbbabcdeffedcba;  /* Least significant bits */

Different iterations of for_each_set_nbits:
'offset' is the bit position and 'clump' is the 24 bit clump from the
above bitmap.
Iteration first:    offset: 0   clump: 0xfedcba
Iteration second:   offset: 24  clump: 0xabcdef
Iteration third:    offset: 48  clump: 0xaabbbb
Iteration fourth:   offset: 96  clump: 0xaa
Iteration fifth:    offset: 144 clump: 0xff
Iteration sixth:    offset: 168 clump: 0xaaaaaa
Iteration seventh:  offset: 216 clump: 0xff
Loop breaks because in the end the remaining bits (0x00aa) size was less
than clump size of 24 bits.

In the above example, the 24-bit clump retrieved in the third iteration
was split between bitmap[0] and bitmap[1]. The example also shows that
24-bit clumps consisting entirely of zeroes were skipped (preserving the
previous for_each_set_clump8 behaviour).

Example 2:
clump size = 6 bits, Number of clumps (or ports) = 3.

    /* bitmap memory region */
    0x00aa0000ff000000;  /* Most significant bits */
    0xaaaaaa0000ff0000;
    0x0f00000000000000;
    0x0000000000000ac0;  /* Least significant bits */

Different iterations of for_each_set_nbits:
'offset' is the bit position and 'clump' is the 6 bit clump from the
above bitmap.
Iteration first:    offset: 6 clump: 0x2b
Loop breaks because 6 * 3 = 18 bits traversed in bitmap.
Here 6 * 3 is clump size * no. of clumps.
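
The iteration can also be reproduced outside the kernel with the sketch
below. It is a simplified userspace model, not the macro from this series:
it assumes 64-bit words (uint64_t), uses a naive bit-by-bit search, and all
sketch_ names and the sample values are invented for illustration. As in
the examples above, the size in bits is expected to be a multiple of the
clump size so that a clump never reads past the end of the map.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define WORD_BITS 64u

/* Mask with the low n bits set, for n in 0..64. */
static uint64_t low_mask(unsigned int n)
{
	return n >= 64 ? ~UINT64_C(0) : (UINT64_C(1) << n) - 1;
}

/* Reads nbits starting at bit 'start'; the value may span two words. */
static uint64_t sketch_get_value(const uint64_t *map, size_t start,
				 unsigned int nbits)
{
	size_t index = start / WORD_BITS;
	unsigned int offset = (unsigned int)(start % WORD_BITS);
	unsigned int space = WORD_BITS - offset;
	uint64_t value;

	assert(nbits >= 1 && nbits <= WORD_BITS);
	value = map[index] >> offset;
	if (space < nbits)
		value |= map[index + 1] << space;
	return value & low_mask(nbits);
}

/* Naive search for the next set bit at or after 'from'; returns size if none. */
static size_t sketch_find_next_bit(const uint64_t *map, size_t size, size_t from)
{
	size_t i;

	for (i = from; i < size; i++)
		if ((map[i / WORD_BITS] >> (i % WORD_BITS)) & 1u)
			return i;
	return size;
}

/* Finds the next clump containing a set bit and copies it to *clump. */
static size_t sketch_find_next_clump(uint64_t *clump, const uint64_t *map,
				     size_t size, size_t offset,
				     unsigned int clump_size)
{
	offset = sketch_find_next_bit(map, size, offset);
	if (offset == size)
		return size;

	offset -= offset % clump_size;	/* round down to the clump start */
	*clump = sketch_get_value(map, offset, clump_size);
	return offset;
}

/* Walks all set clumps, recording up to 'max' (offset, clump) pairs. */
static size_t sketch_collect_clumps(const uint64_t *map, size_t size,
				    unsigned int clump_size,
				    size_t *offsets, uint64_t *clumps, size_t max)
{
	uint64_t clump = 0;
	size_t start, n = 0;

	for (start = sketch_find_next_clump(&clump, map, size, 0, clump_size);
	     start < size && n < max;
	     start = sketch_find_next_clump(&clump, map, size,
					    start + clump_size, clump_size)) {
		offsets[n] = start;
		clumps[n] = clump;
		n++;
	}
	return n;
}
```

For the two-word map { 0xcdef000000123456, 0xab } with 120 bits and a clump
size of 24, the walk visits offset 0 (clump 0x123456) and offset 48 (clump
0xabcdef, joined across the word boundary) and skips the all-zero clumps at
offsets 24, 72 and 96.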

Changes in v3:
 - [Patch 1/3]: Rename for_each_set_clump to for_each_set_nbits.
 - [Patch 1/3]: Shift function definitions outside 'ifdef CONFIG_DEBUG_FS'
   macro guard to resolve build (linking) error in xilinx Patch[3/3].
 - [Patch 2/3]: Rename for_each_set_clump to for_each_set_nbits.

Changes in v2:
 - [Patch 1/3]: Shift the macros and related functions to gpiolib inside
   gpio/. Reduce the visibility of 'for_each_set_clump' to gpio.
 - [Patch 1/3]: Remove __builtin_unreachable and simply use return
   statement.
 - Remove tests from lib/test_bitmap.c as 'for_each_set_clump' is
   now localised inside gpio/ only.

Syed Nayyar Waris (3):
  gpiolib: Introduce the for_each_set_nbits macro
  gpio: thunderx: Utilize for_each_set_nbits macro
  gpio: xilinx: Utilize generic bitmap_get_value and _set_value

 drivers/gpio/gpio-thunderx.c | 13 --
 drivers/gpio/gpio-xilinx.c   | 63 -
 drivers/gpio/gpiolib.c   | 90 
 drivers/gpio/gpiolib.h   | 28 +++
 4 files changed, 158 insertions(+), 36 deletions(-)


base-commit: e71ba9452f0b5b2e8dc8aa5445198cd9214a6a62
-- 
2.29.0



Re: [PATCH v2 0/3] Introduce the for_each_set_clump macro

2021-03-06 Thread Syed Nayyar Waris
On Wed, Mar 3, 2021 at 8:13 PM Bartosz Golaszewski
 wrote:
>
> On Fri, Feb 12, 2021 at 2:19 PM Syed Nayyar Waris  
> wrote:
> >
> > Hello Bartosz,
> >
> > Since this patchset primarily affects GPIO drivers, would you like
> > to pick it up through your GPIO tree?
> >
>
> Sure, as soon as you figure out what's wrong with the xilinx patch.
> Could you also follow William's suggestion and rename the functions?
>
> Bart

I have incorporated William's suggestions and have also resolved the
build error in the xilinx patch.

I am sharing the v3 patchset. Thanks!

Regards


Syed Nayyar Waris


[PATCH v2 3/3] gpio: xilinx: Utilize generic bitmap_get_value and _set_value

2021-02-12 Thread Syed Nayyar Waris
This patch reimplements the xgpio_set_multiple() function in
drivers/gpio/gpio-xilinx.c to use the new generic functions
bitmap_get_value() and bitmap_set_value(). The code is now simpler
to read and understand. Moreover, instead of looping over each bit
in xgpio_set_multiple(), we can now check one channel at a time
and save cycles.

Cc: William Breathitt Gray 
Cc: Bartosz Golaszewski 
Cc: Michal Simek 
Signed-off-by: Syed Nayyar Waris 
---
 drivers/gpio/gpio-xilinx.c | 63 +++---
 1 file changed, 32 insertions(+), 31 deletions(-)

diff --git a/drivers/gpio/gpio-xilinx.c b/drivers/gpio/gpio-xilinx.c
index be539381fd82..8445e69cf37b 100644
--- a/drivers/gpio/gpio-xilinx.c
+++ b/drivers/gpio/gpio-xilinx.c
@@ -15,6 +15,7 @@
 #include 
 #include 
 #include 
+#include "gpiolib.h"
 
 /* Register Offset Definitions */
 #define XGPIO_DATA_OFFSET   (0x0)  /* Data register  */
@@ -141,37 +142,37 @@ static void xgpio_set_multiple(struct gpio_chip *gc, 
unsigned long *mask,
 {
unsigned long flags;
struct xgpio_instance *chip = gpiochip_get_data(gc);
-   int index = xgpio_index(chip, 0);
-   int offset, i;
-
-   spin_lock_irqsave(&chip->gpio_lock[index], flags);
-
-   /* Write to GPIO signals */
-   for (i = 0; i < gc->ngpio; i++) {
-   if (*mask == 0)
-   break;
-   /* Once finished with an index write it out to the register */
-   if (index !=  xgpio_index(chip, i)) {
-   xgpio_writereg(chip->regs + XGPIO_DATA_OFFSET +
-  index * XGPIO_CHANNEL_OFFSET,
-  chip->gpio_state[index]);
-   spin_unlock_irqrestore(&chip->gpio_lock[index], flags);
-   index =  xgpio_index(chip, i);
-   spin_lock_irqsave(&chip->gpio_lock[index], flags);
-   }
-   if (__test_and_clear_bit(i, mask)) {
-   offset =  xgpio_offset(chip, i);
-   if (test_bit(i, bits))
-   chip->gpio_state[index] |= BIT(offset);
-   else
-   chip->gpio_state[index] &= ~BIT(offset);
-   }
-   }
-
-   xgpio_writereg(chip->regs + XGPIO_DATA_OFFSET +
-  index * XGPIO_CHANNEL_OFFSET, chip->gpio_state[index]);
-
-   spin_unlock_irqrestore(&chip->gpio_lock[index], flags);
+   u32 *const state = chip->gpio_state;
+   unsigned int *const width = chip->gpio_width;
+
+   DECLARE_BITMAP(old, 64);
+   DECLARE_BITMAP(new, 64);
+   DECLARE_BITMAP(changed, 64);
+
+   spin_lock_irqsave(&chip->gpio_lock[0], flags);
+   spin_lock(&chip->gpio_lock[1]);
+
+   bitmap_set_value(old, 64, state[0], width[0], 0);
+   bitmap_set_value(old, 64, state[1], width[1], width[0]);
+   bitmap_replace(new, old, bits, mask, gc->ngpio);
+
+   bitmap_set_value(old, 64, state[0], 32, 0);
+   bitmap_set_value(old, 64, state[1], 32, 32);
+   state[0] = bitmap_get_value(new, 0, width[0]);
+   state[1] = bitmap_get_value(new, width[0], width[1]);
+   bitmap_set_value(new, 64, state[0], 32, 0);
+   bitmap_set_value(new, 64, state[1], 32, 32);
+   bitmap_xor(changed, old, new, 64);
+
+   if (((u32 *)changed)[0])
+   xgpio_writereg(chip->regs + XGPIO_DATA_OFFSET,
+   state[0]);
+   if (((u32 *)changed)[1])
+   xgpio_writereg(chip->regs + XGPIO_DATA_OFFSET +
+   XGPIO_CHANNEL_OFFSET, state[1]);
+
+   spin_unlock(&chip->gpio_lock[1]);
+   spin_unlock_irqrestore(&chip->gpio_lock[0], flags);
 }
 
 /**
-- 
2.29.0



[PATCH v2 2/3] gpio: thunderx: Utilize for_each_set_clump macro

2021-02-12 Thread Syed Nayyar Waris
This patch reimplements the thunderx_gpio_set_multiple function in
drivers/gpio/gpio-thunderx.c to use the new for_each_set_clump macro.
Instead of looping over every bank in thunderx_gpio_set_multiple, we
can now skip banks whose mask has no set bits and save cycles.

Cc: William Breathitt Gray 
Cc: Robert Richter 
Cc: Bartosz Golaszewski 
Signed-off-by: Syed Nayyar Waris 
---
 drivers/gpio/gpio-thunderx.c | 13 -
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/drivers/gpio/gpio-thunderx.c b/drivers/gpio/gpio-thunderx.c
index 9f66deab46ea..0398b2d2af4b 100644
--- a/drivers/gpio/gpio-thunderx.c
+++ b/drivers/gpio/gpio-thunderx.c
@@ -16,7 +16,7 @@
 #include 
 #include 
 #include 
-
+#include "gpiolib.h"
 
 #define GPIO_RX_DAT0x0
 #define GPIO_TX_SET0x8
@@ -275,12 +275,15 @@ static void thunderx_gpio_set_multiple(struct gpio_chip 
*chip,
   unsigned long *bits)
 {
int bank;
-   u64 set_bits, clear_bits;
+   unsigned long set_bits, clear_bits, gpio_mask;
+   unsigned long offset;
+
struct thunderx_gpio *txgpio = gpiochip_get_data(chip);
 
-   for (bank = 0; bank <= chip->ngpio / 64; bank++) {
-   set_bits = bits[bank] & mask[bank];
-   clear_bits = ~bits[bank] & mask[bank];
+   for_each_set_clump(offset, gpio_mask, mask, chip->ngpio, 64) {
+   bank = offset / 64;
+   set_bits = bits[bank] & gpio_mask;
+   clear_bits = ~bits[bank] & gpio_mask;
writeq(set_bits, txgpio->register_base + (bank * GPIO_2ND_BANK) 
+ GPIO_TX_SET);
writeq(clear_bits, txgpio->register_base + (bank * 
GPIO_2ND_BANK) + GPIO_TX_CLR);
}
-- 
2.29.0



[PATCH v2 0/3] Introduce the for_each_set_clump macro

2021-02-12 Thread Syed Nayyar Waris
Hello Bartosz,

Since this patchset primarily affects GPIO drivers, would you like
to pick it up through your GPIO tree?

This patchset introduces for_each_set_clump, a new generic version of
for_each_set_clump8. The previous for_each_set_clump8 used a fixed-size
8-bit clump, but the new generic version works with clumps of any size
up to BITS_PER_LONG. The patchset utilizes the new macro in several
GPIO drivers.

The earlier 8-bit for_each_set_clump8 facilitated a for-loop syntax
that iterates over a memory region one group of bits at a time,
visiting only the groups that contain set bits.

For example, suppose you would like to iterate over a 32-bit integer 8
bits at a time, skipping over 8-bit groups with no set bit, where
XXXXXXXX represents the current 8-bit group:

    Example:        10111110 00000000 11111111 00110011
    First loop:     10111110 00000000 11111111 XXXXXXXX
    Second loop:    10111110 00000000 XXXXXXXX 00110011
    Third loop:     XXXXXXXX 00000000 11111111 00110011
Each iteration of the loop returns the next 8-bit group that has at
least one set bit.

With the new for_each_set_clump, however, the clump size can differ from
8 bits. Moreover, a clump can be split across a word boundary when the
word size is not a multiple of the clump size. The following examples show
how the new macro works for clump sizes of 24 bits and 6 bits.

Example 1:
clump size: 24 bits, Number of clumps (or ports): 10
bitmap stores the bit information from where successive clumps are retrieved.

    /* bitmap memory region */
    0x00aa0000ff000000;  /* Most significant bits */
    0xaaaaaa0000ff0000;
    0x000000aa000000aa;
    0xbbbbabcdeffedcba;  /* Least significant bits */

Different iterations of for_each_set_clump:
'offset' is the bit position and 'clump' is the 24 bit clump from the
above bitmap.
Iteration first:    offset: 0   clump: 0xfedcba
Iteration second:   offset: 24  clump: 0xabcdef
Iteration third:    offset: 48  clump: 0xaabbbb
Iteration fourth:   offset: 96  clump: 0xaa
Iteration fifth:    offset: 144 clump: 0xff
Iteration sixth:    offset: 168 clump: 0xaaaaaa
Iteration seventh:  offset: 216 clump: 0xff
Loop breaks because in the end the remaining bits (0x00aa) size was less
than clump size of 24 bits.

In the above example, the 24-bit clump retrieved in the third iteration
was split between bitmap[0] and bitmap[1]. The example also shows that
24-bit clumps consisting entirely of zeroes were skipped (preserving the
previous for_each_set_clump8 behaviour).

Example 2:
clump size = 6 bits, Number of clumps (or ports) = 3.

    /* bitmap memory region */
    0x00aa0000ff000000;  /* Most significant bits */
    0xaaaaaa0000ff0000;
    0x0f00000000000000;
    0x0000000000000ac0;  /* Least significant bits */

Different iterations of for_each_set_clump:
'offset' is the bit position and 'clump' is the 6 bit clump from the
above bitmap.
Iteration first:    offset: 6 clump: 0x2b
Loop breaks because 6 * 3 = 18 bits traversed in bitmap.
Here 6 * 3 is clump size * no. of clumps.

Changes in v2:
 - [Patch 1/3]: Shift the macros and related functions to gpiolib inside
   gpio/. Reduce the visibility of 'for_each_set_clump' to gpio.
 - [Patch 1/3]: Remove __builtin_unreachable and simply use return
   statement.
 - Remove tests from lib/test_bitmap.c as 'for_each_set_clump' is
   now localised inside gpio/ only.

Syed Nayyar Waris (3):
  gpiolib: Introduce the for_each_set_clump macro
  gpio: thunderx: Utilize for_each_set_clump macro
  gpio: xilinx: Utilize generic bitmap_get_value and _set_value

 drivers/gpio/gpio-thunderx.c | 13 --
 drivers/gpio/gpio-xilinx.c   | 63 -
 drivers/gpio/gpiolib.c   | 90 
 drivers/gpio/gpiolib.h   | 28 +++
 4 files changed, 158 insertions(+), 36 deletions(-)


base-commit: e71ba9452f0b5b2e8dc8aa5445198cd9214a6a62
-- 
2.29.0



[PATCH v2 1/3] gpiolib: Introduce the for_each_set_clump macro

2021-02-12 Thread Syed Nayyar Waris
This macro iterates for each group of bits (clump) with set bits,
within a bitmap memory region. For each iteration, "start" is set to
the bit offset of the found clump, while the respective clump value is
stored to the location pointed by "clump". Additionally, the
bitmap_get_value() and bitmap_set_value() functions are introduced to
respectively get and set a value of n-bits in a bitmap memory region.
The n-bit value can have any size from 1 to BITS_PER_LONG. A size less
than 1 or more than BITS_PER_LONG causes undefined behaviour.
Moreover, when setting an n-bit value in the bitmap, if the value would
cross a word boundary, it is split so that one portion is stored in the
current word and the remaining portion in the next higher word. The same
split occurs when retrieving a value from the bitmap.
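
To illustrate the word-boundary split when retrieving a value, here is a small userspace sketch. It is not the kernel code: it uses fixed 64-bit words instead of unsigned long, and the names WORD_BITS, low_mask() and get_value() are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define WORD_BITS 64u

/* Mask with the low 'n' bits set, valid for 1 <= n <= 64. */
static uint64_t low_mask(unsigned int n)
{
	return UINT64_MAX >> (WORD_BITS - n);
}

/* Return the 'nbits'-wide value located at bit offset 'start'. */
static uint64_t get_value(const uint64_t *map, unsigned int start,
			  unsigned int nbits)
{
	const unsigned int index = start / WORD_BITS;
	const unsigned int offset = start % WORD_BITS;
	const unsigned int space = WORD_BITS - offset; /* bits left in word */
	uint64_t low, high;

	assert(nbits >= 1 && nbits <= WORD_BITS);

	/* The value fits entirely in the current word. */
	if (space >= nbits)
		return (map[index] >> offset) & low_mask(nbits);

	/* Otherwise take 'space' bits here and the rest from the next word. */
	low = map[index] >> offset;
	high = map[index + 1] & low_mask(nbits - space);
	return low | (high << space);
}
```

For a two-word map {0xcdef000000fedcba, 0xab}, get_value(map, 48, 24) combines the top 16 bits of the first word with the low 8 bits of the second and yields 0xabcdef.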

Cc: Linus Walleij 
Cc: Bartosz Gołaszewski 
Cc: Arnd Bergmann 
Cc: William Breathitt Gray 
Cc: Andy Shevchenko 
Signed-off-by: Syed Nayyar Waris 
---
 drivers/gpio/gpiolib.c | 90 ++
 drivers/gpio/gpiolib.h | 28 +
 2 files changed, 118 insertions(+)

diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c
index b02cc2abd3b6..282ae599c143 100644
--- a/drivers/gpio/gpiolib.c
+++ b/drivers/gpio/gpiolib.c
@@ -4342,6 +4342,96 @@ static int gpiolib_seq_show(struct seq_file *s, void *v)
return 0;
 }
 
+/**
+ * bitmap_get_value - get a value of n-bits from the memory region
+ * @map: address to the bitmap memory region
+ * @start: bit offset of the n-bit value
+ * @nbits: size of value in bits (must be between 1 and BITS_PER_LONG inclusive).
+ *
+ * Returns value of nbits located at the @start bit offset within the @map
+ * memory region.
+ */
+unsigned long bitmap_get_value(const unsigned long *map,
+   unsigned long start,
+   unsigned long nbits)
+{
+   const size_t index = BIT_WORD(start);
+   const unsigned long offset = start % BITS_PER_LONG;
+   const unsigned long ceiling = round_up(start + 1, BITS_PER_LONG);
+   const unsigned long space = ceiling - start;
+   unsigned long value_low, value_high;
+
+   if (space >= nbits)
+   return (map[index] >> offset) & GENMASK(nbits - 1, 0);
+   else {
+   value_low = map[index] & BITMAP_FIRST_WORD_MASK(start);
+   value_high = map[index + 1] & BITMAP_LAST_WORD_MASK(start + nbits);
+   return (value_low >> offset) | (value_high << space);
+   }
+}
+EXPORT_SYMBOL_GPL(bitmap_get_value);
+
+/**
+ * bitmap_set_value - set value within a memory region
+ * @map: address to the bitmap memory region
+ * @nbits: size of map in bits
+ * @value: value of clump
+ * @value_width: size of value in bits (must be between 1 and BITS_PER_LONG inclusive)
+ * @start: bit offset of the value
+ */
+void bitmap_set_value(unsigned long *map, unsigned long nbits,
+   unsigned long value, unsigned long value_width,
+   unsigned long start)
+{
+   const unsigned long index = BIT_WORD(start);
+   const unsigned long length = BIT_WORD(nbits);
+   const unsigned long offset = start % BITS_PER_LONG;
+   const unsigned long ceiling = round_up(start + 1, BITS_PER_LONG);
+   const unsigned long space = ceiling - start;
+
+   value &= GENMASK(value_width - 1, 0);
+
+   if (space >= value_width) {
+   map[index] &= ~(GENMASK(value_width - 1, 0) << offset);
+   map[index] |= value << offset;
+   } else {
+   map[index + 0] &= ~BITMAP_FIRST_WORD_MASK(start);
+   map[index + 0] |= value << offset;
+
+   if (index + 1 >= length)
+   return;
+
+   map[index + 1] &= ~BITMAP_LAST_WORD_MASK(start + value_width);
+   map[index + 1] |= value >> space;
+   }
+}
+EXPORT_SYMBOL_GPL(bitmap_set_value);
+
+/**
+ * find_next_clump - find next clump with set bits in a memory region
+ * @clump: location to store copy of found clump
+ * @addr: address to base the search on
+ * @size: bitmap size in number of bits
+ * @offset: bit offset at which to start searching
+ * @clump_size: clump size in bits
+ *
+ * Returns the bit offset for the next set clump; the found clump value is
+ * copied to the location pointed by @clump. If no bits are set, returns @size.
+ */
+unsigned long find_next_clump(unsigned long *clump, const unsigned long *addr,
+   unsigned long size, unsigned long offset,
+   unsigned long clump_size)
+{
+   offset = find_next_bit(addr, size, offset);
+   if (offset == size)
+   return size;
+
+   offset = rounddown(offset, clump_size);
+   *clump = bitmap_get_value(addr, offset, clump_size);
+

Re: [PATCH 2/5] lib/test_bitmap.c: Add for_each_set_clump test cases

2021-02-06 Thread Syed Nayyar Waris
On Thu, Feb 4, 2021 at 2:25 PM Syed Nayyar Waris  wrote:
>
> On Sat, Dec 26, 2020 at 8:15 PM Andy Shevchenko
>  wrote:
> >
> >
> >
> > On Saturday, December 26, 2020, Syed Nayyar Waris  
> > wrote:
> >>
> >> The introduction of the generic for_each_set_clump macro need test
> >> cases to verify the implementation. This patch adds test cases for
> >> scenarios in which clump sizes are 8 bits, 24 bits, 30 bits and 6 bits.
> >> The cases contain situations where clump is getting split at the word
> >> boundary and also when zeroes are present in the start and middle of
> >> bitmap.
> >
> >
> > You have to split it to a separate test under drivers/gpio, because now it 
> > has no sense to be like this.
>
> Hi Andy,
>
> How do I split it into separate test under drivers/gpio ? I have
> thought of making a test_clump_bits.c file in drivers/gpio.
> But how do I integrate this test file so that tests are executed at
> runtime? Similar to tests in lib/test_bitmap.c ?
>
> I believe I need to make changes in config files so that tests in
> test_clump_bits.c ( in drivers/gpio ) are executed at runtime. Could
> you please provide some steps on how to do that. Thank You !
>
> Regards
> Syed Nayyar Waris

Hi Andy, could you please help me on the above. Thanks !

Regards
Syed Nayyar Waris


Re: [PATCH 2/5] lib/test_bitmap.c: Add for_each_set_clump test cases

2021-02-04 Thread Syed Nayyar Waris
On Sat, Dec 26, 2020 at 8:15 PM Andy Shevchenko
 wrote:
>
>
>
> On Saturday, December 26, 2020, Syed Nayyar Waris  
> wrote:
>>
>> The introduction of the generic for_each_set_clump macro need test
>> cases to verify the implementation. This patch adds test cases for
>> scenarios in which clump sizes are 8 bits, 24 bits, 30 bits and 6 bits.
>> The cases contain situations where clump is getting split at the word
>> boundary and also when zeroes are present in the start and middle of
>> bitmap.
>
>
> You have to split it to a separate test under drivers/gpio, because now it 
> has no sense to be like this.

Hi Andy,

How do I split it into a separate test under drivers/gpio? I have
thought of making a test_clump_bits.c file in drivers/gpio.
But how do I integrate this test file so that the tests are executed at
runtime, similar to the tests in lib/test_bitmap.c?

I believe I need to make changes in the config files so that the tests in
test_clump_bits.c (in drivers/gpio) are executed at runtime. Could
you please provide some steps on how to do that? Thank you!

Regards
Syed Nayyar Waris


[PATCH 5/5] gpio: xilinx: Add extra check if sum of widths exceed 64

2020-12-25 Thread Syed Nayyar Waris
Add an extra check that the sum of the widths does not exceed 64. If it
does, return -EINVAL along with an appropriate error message.

Cc: Michal Simek 
Signed-off-by: Syed Nayyar Waris 
---
 drivers/gpio/gpio-xilinx.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/gpio/gpio-xilinx.c b/drivers/gpio/gpio-xilinx.c
index d565fbf128b7..c9d740ac711b 100644
--- a/drivers/gpio/gpio-xilinx.c
+++ b/drivers/gpio/gpio-xilinx.c
@@ -319,6 +319,12 @@ static int xgpio_probe(struct platform_device *pdev)
 
chip->gc.base = -1;
chip->gc.ngpio = chip->gpio_width[0] + chip->gpio_width[1];
+
+   if (chip->gc.ngpio > 64) {
+   dev_err(&pdev->dev, "invalid configuration: number of GPIO is greater than 64");
+   return -EINVAL;
+   }
+
chip->gc.parent = &pdev->dev;
chip->gc.direction_input = xgpio_dir_in;
chip->gc.direction_output = xgpio_dir_out;
-- 
2.29.0



[PATCH 2/5] lib/test_bitmap.c: Add for_each_set_clump test cases

2020-12-25 Thread Syed Nayyar Waris
The introduction of the generic for_each_set_clump macro needs test
cases to verify the implementation. This patch adds test cases for
scenarios in which clump sizes are 8 bits, 24 bits, 30 bits and 6 bits.
The cases cover situations where a clump is split at the word
boundary and where zeroes are present at the start and in the middle of
the bitmap.

Cc: Andy Shevchenko 
Cc: William Breathitt Gray 
Signed-off-by: Syed Nayyar Waris 
---
 lib/test_bitmap.c | 144 ++
 1 file changed, 144 insertions(+)

diff --git a/lib/test_bitmap.c b/lib/test_bitmap.c
index 4425a1dd4ef1..c5b5fb98c9dd 100644
--- a/lib/test_bitmap.c
+++ b/lib/test_bitmap.c
@@ -13,6 +13,7 @@
 #include 
 #include 
 #include 
+#include <../drivers/gpio/clump_bits.h>
 
 #include "../tools/testing/selftests/kselftest_module.h"
 
@@ -155,6 +156,37 @@ static bool __init __check_eq_clump8(const char *srcfile, unsigned int line,
return true;
 }
 
+static bool __init __check_eq_clump(const char *srcfile, unsigned int line,
+   const unsigned int offset,
+   const unsigned int size,
+   const unsigned long *const clump_exp,
+   const unsigned long *const clump,
+   const unsigned long clump_size)
+{
+   unsigned long exp;
+
+   if (offset >= size) {
+   pr_warn("[%s:%u] bit offset for clump out-of-bounds: expected less than %u, got %u\n",
+   srcfile, line, size, offset);
+   return false;
+   }
+
+   exp = clump_exp[offset / clump_size];
+   if (!exp) {
+   pr_warn("[%s:%u] bit offset for zero clump: expected nonzero clump, got bit offset %u with clump value 0",
+   srcfile, line, offset);
+   return false;
+   }
+
+   if (*clump != exp) {
+   pr_warn("[%s:%u] expected clump value of 0x%lX, got clump value of 0x%lX",
+   srcfile, line, exp, *clump);
+   return false;
+   }
+
+   return true;
+}
+
 #define __expect_eq(suffix, ...)   \
({  \
int result = 0; \
@@ -172,6 +204,7 @@ static bool __init __check_eq_clump8(const char *srcfile, unsigned int line,
 #define expect_eq_pbl(...) __expect_eq(pbl, ##__VA_ARGS__)
 #define expect_eq_u32_array(...)   __expect_eq(u32_array, ##__VA_ARGS__)
 #define expect_eq_clump8(...)  __expect_eq(clump8, ##__VA_ARGS__)
+#define expect_eq_clump(...)   __expect_eq(clump, ##__VA_ARGS__)
 
 static void __init test_zero_clear(void)
 {
@@ -530,6 +563,28 @@ static void noinline __init test_mem_optimisations(void)
}
 }
 
+static const unsigned long clump_bitmap_data[] __initconst = {
+   0x38000201,
+   0x05ff0f38,
+   0xeffedcba,
+   0xabcd,
+   0x00aa,
+   0x00aa,
+   0x00ff,
+   0xaa00,
+   0xff00,
+   0x00aa,
+   0x,
+   0x,
+   0x,
+   0x0f00,
+   0x00ff,
+   0xaa00,
+   0xff00,
+   0x00aa,
+   0x0ac0,
+};
+
 static const unsigned char clump_exp[] __initconst = {
0x01,   /* 1 bit set */
0x02,   /* non-edge 1 bit set */
@@ -541,6 +596,94 @@ static const unsigned char clump_exp[] __initconst = {
0x05,   /* non-adjacent 2 bits set */
 };
 
+static const unsigned long clump_exp1[] __initconst = {
+   0x01,   /* 1 bit set */
+   0x02,   /* non-edge 1 bit set */
+   0x00,   /* zero bits set */
+   0x38,   /* 3 bits set across 4-bit boundary */
+   0x38,   /* Repeated clump */
+   0x0F,   /* 4 bits set */
+   0xFF,   /* all bits set */
+   0x05,   /* non-adjacent 2 bits set */
+};
+
+static const unsigned long clump_exp2[] __initconst = {
+   0xfedcba,   /* 24 bits */
+   0xabcdef,
+   0xaa,   /* Clump split between 2 words */
+   0x00,   /* zeroes in between */
+   0xaa,
+   0x00,
+   0xff,
+   0xaa,
+   0x00,
+   0xff,
+};
+
+static const unsigned long clump_exp3[] __initconst = {
+   0x, /* starting with 0s*/
+   0x, /* All 0s */
+   0x,
+   0x,
+   0x3f0f, /* Non zero set */
+   0x2aa80003,
+   0x0aaa,
+   0x3fc0,
+};
+
+static const unsigned long clump_exp4[] __initconst = {
+   0x00,
+   0x2b,
+};
+
+struct clump_test_data_params {
+   DECLARE_BITMAP(data, 256);
+   unsigned long count;
+   unsigned long offset;
+   unsigned long limit;
+   unsigned long clump_size;
+   unsigned long const 

[PATCH 4/5] gpio: xilinx: Utilize generic bitmap_get_value and _set_value

2020-12-25 Thread Syed Nayyar Waris
This patch reimplements the xgpio_set_multiple() function in
drivers/gpio/gpio-xilinx.c to use the new generic functions:
bitmap_get_value() and bitmap_set_value(). The code is now simpler
to read and understand. Moreover, instead of looping over each bit
in xgpio_set_multiple(), we can now handle one channel at a time and
save cycles.
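
The per-channel idea can be sketched in userspace as follows. This is only an illustration under stated assumptions: a dual-channel device with each width between 1 and 32, and channel state that already fits within its width. It uses a plain 64-bit integer in place of the kernel bitmap helpers, and struct two_channel, low_mask() and update_channels() are hypothetical names. The return value says which channel registers would need a write.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the driver's per-chip state. */
struct two_channel {
	uint32_t state[2];      /* current output value of each channel */
	unsigned int width[2];  /* number of lines in each channel, 1..32 */
};

/* Mask with the low 'n' bits set, valid for 1 <= n <= 64. */
static uint64_t low_mask(unsigned int n)
{
	return UINT64_MAX >> (64u - n);
}

/*
 * Apply 'bits' to the lines selected by 'mask' and return a bitmask of
 * the channels whose state changed (bit 0: channel 0, bit 1: channel 1).
 */
static unsigned int update_channels(struct two_channel *c, uint64_t mask,
				    uint64_t bits)
{
	const unsigned int w0 = c->width[0], w1 = c->width[1];
	uint64_t old_map, new_map;
	uint32_t new0, new1;
	unsigned int changed = 0;

	assert(w0 >= 1 && w0 <= 32 && w1 >= 1 && w1 <= 32);

	/* Pack channel 0 at bit 0 and channel 1 directly above it. */
	old_map = (c->state[0] & low_mask(w0)) |
		  ((c->state[1] & low_mask(w1)) << w0);

	/* Replace only the selected lines. */
	mask &= low_mask(w0 + w1);
	new_map = (old_map & ~mask) | (bits & mask);

	/* Unpack back into per-channel values. */
	new0 = (uint32_t)(new_map & low_mask(w0));
	new1 = (uint32_t)((new_map >> w0) & low_mask(w1));

	if (new0 != c->state[0])
		changed |= 1u << 0;
	if (new1 != c->state[1])
		changed |= 1u << 1;

	c->state[0] = new0;
	c->state[1] = new1;
	return changed;
}
```

A caller would write a channel's data register only when the corresponding bit is set in the return value, so an update that touches one channel costs one register write.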

Cc: William Breathitt Gray 
Cc: Bartosz Golaszewski 
Cc: Michal Simek 
Signed-off-by: Syed Nayyar Waris 
---
 drivers/gpio/gpio-xilinx.c | 66 +++---
 1 file changed, 33 insertions(+), 33 deletions(-)

diff --git a/drivers/gpio/gpio-xilinx.c b/drivers/gpio/gpio-xilinx.c
index 67f9f82e0db0..d565fbf128b7 100644
--- a/drivers/gpio/gpio-xilinx.c
+++ b/drivers/gpio/gpio-xilinx.c
@@ -14,6 +14,7 @@
 #include 
 #include 
 #include 
+#include <../drivers/gpio/clump_bits.h>
 
 /* Register Offset Definitions */
 #define XGPIO_DATA_OFFSET   (0x0)  /* Data register  */
@@ -138,37 +139,37 @@ static void xgpio_set_multiple(struct gpio_chip *gc, unsigned long *mask,
 {
unsigned long flags;
struct xgpio_instance *chip = gpiochip_get_data(gc);
-   int index = xgpio_index(chip, 0);
-   int offset, i;
-
-   spin_lock_irqsave(&chip->gpio_lock[index], flags);
-
-   /* Write to GPIO signals */
-   for (i = 0; i < gc->ngpio; i++) {
-   if (*mask == 0)
-   break;
-   /* Once finished with an index write it out to the register */
-   if (index !=  xgpio_index(chip, i)) {
-   xgpio_writereg(chip->regs + XGPIO_DATA_OFFSET +
-  index * XGPIO_CHANNEL_OFFSET,
-  chip->gpio_state[index]);
-   spin_unlock_irqrestore(&chip->gpio_lock[index], flags);
-   index =  xgpio_index(chip, i);
-   spin_lock_irqsave(&chip->gpio_lock[index], flags);
-   }
-   if (__test_and_clear_bit(i, mask)) {
-   offset =  xgpio_offset(chip, i);
-   if (test_bit(i, bits))
-   chip->gpio_state[index] |= BIT(offset);
-   else
-   chip->gpio_state[index] &= ~BIT(offset);
-   }
-   }
-
-   xgpio_writereg(chip->regs + XGPIO_DATA_OFFSET +
-  index * XGPIO_CHANNEL_OFFSET, chip->gpio_state[index]);
-
-   spin_unlock_irqrestore(&chip->gpio_lock[index], flags);
+   u32 *const state = chip->gpio_state;
+   unsigned int *const width = chip->gpio_width;
+
+   DECLARE_BITMAP(old, 64);
+   DECLARE_BITMAP(new, 64);
+   DECLARE_BITMAP(changed, 64);
+
+   spin_lock_irqsave(&chip->gpio_lock[0], flags);
+   spin_lock(&chip->gpio_lock[1]);
+
+   bitmap_set_value(old, 64, state[0], width[0], 0);
+   bitmap_set_value(old, 64, state[1], width[1], width[0]);
+   bitmap_replace(new, old, bits, mask, gc->ngpio);
+
+   bitmap_set_value(old, 64, state[0], 32, 0);
+   bitmap_set_value(old, 64, state[1], 32, 32);
+   state[0] = bitmap_get_value(new, 0, width[0]);
+   state[1] = bitmap_get_value(new, width[0], width[1]);
+   bitmap_set_value(new, 64, state[0], 32, 0);
+   bitmap_set_value(new, 64, state[1], 32, 32);
+   bitmap_xor(changed, old, new, 64);
+
+   if (((u32 *)changed)[0])
+   xgpio_writereg(chip->regs + XGPIO_DATA_OFFSET,
+   state[0]);
+   if (((u32 *)changed)[1])
+   xgpio_writereg(chip->regs + XGPIO_DATA_OFFSET +
+   XGPIO_CHANNEL_OFFSET, state[1]);
+
+   spin_unlock(&chip->gpio_lock[1]);
+   spin_unlock_irqrestore(&chip->gpio_lock[0], flags);
 }
 
 /**
@@ -292,6 +293,7 @@ static int xgpio_probe(struct platform_device *pdev)
chip->gpio_width[0] = 32;
 
spin_lock_init(&chip->gpio_lock[0]);
+   spin_lock_init(&chip->gpio_lock[1]);
 
if (of_property_read_u32(np, "xlnx,is-dual", &is_dual))
is_dual = 0;
@@ -313,8 +315,6 @@ static int xgpio_probe(struct platform_device *pdev)
if (of_property_read_u32(np, "xlnx,gpio2-width",
 &chip->gpio_width[1]))
chip->gpio_width[1] = 32;
-
-   spin_lock_init(&chip->gpio_lock[1]);
}
 
chip->gc.base = -1;
-- 
2.29.0



[PATCH 3/5] gpio: thunderx: Utilize for_each_set_clump macro

2020-12-25 Thread Syed Nayyar Waris
This patch reimplements the thunderx_gpio_set_multiple function in
drivers/gpio/gpio-thunderx.c to use the new for_each_set_clump macro.
Instead of looping over every bank in thunderx_gpio_set_multiple(),
we can now skip banks that have no bits set in the mask and save cycles.
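
As a rough userspace illustration of skipping empty banks, the sketch below plans the set and clear values per 64-bit bank and emits nothing for banks with an all-zero mask. It does not touch hardware, and struct bank_write and plan_bank_writes() are hypothetical names, not part of the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One planned pair of register writes for a 64-line bank. */
struct bank_write {
	unsigned int bank;
	uint64_t set_bits;    /* lines to drive high */
	uint64_t clear_bits;  /* lines to drive low */
};

/*
 * Fill 'out' with one entry per bank that has at least one line selected
 * in 'mask'; banks with an all-zero mask are skipped. Returns the number
 * of entries written. 'out' must have room for 'nbanks' entries.
 */
static unsigned int plan_bank_writes(const uint64_t *mask,
				     const uint64_t *bits,
				     unsigned int nbanks,
				     struct bank_write *out)
{
	unsigned int bank, n = 0;

	assert(mask != NULL && bits != NULL && out != NULL);

	for (bank = 0; bank < nbanks; bank++) {
		if (!mask[bank])
			continue; /* nothing selected in this bank */

		out[n].bank = bank;
		out[n].set_bits = bits[bank] & mask[bank];
		out[n].clear_bits = ~bits[bank] & mask[bank];
		n++;
	}
	return n;
}
```

With mask {0, 0x5} and bits {0xff, 0x4}, only bank 1 produces an entry, with set_bits 0x4 and clear_bits 0x1.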

Cc: William Breathitt Gray 
Cc: Robert Richter 
Cc: Bartosz Golaszewski 
Signed-off-by: Syed Nayyar Waris 
---
 drivers/gpio/gpio-thunderx.c | 12 
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/drivers/gpio/gpio-thunderx.c b/drivers/gpio/gpio-thunderx.c
index 9f66deab46ea..716b75ba7df6 100644
--- a/drivers/gpio/gpio-thunderx.c
+++ b/drivers/gpio/gpio-thunderx.c
@@ -16,6 +16,7 @@
 #include 
 #include 
 #include 
+#include <../drivers/gpio/clump_bits.h>
 
 
 #define GPIO_RX_DAT0x0
@@ -275,12 +276,15 @@ static void thunderx_gpio_set_multiple(struct gpio_chip *chip,
   unsigned long *bits)
 {
int bank;
-   u64 set_bits, clear_bits;
+   unsigned long set_bits, clear_bits, gpio_mask;
+   unsigned long offset;
+
struct thunderx_gpio *txgpio = gpiochip_get_data(chip);
 
-   for (bank = 0; bank <= chip->ngpio / 64; bank++) {
-   set_bits = bits[bank] & mask[bank];
-   clear_bits = ~bits[bank] & mask[bank];
+   for_each_set_clump(offset, gpio_mask, mask, chip->ngpio, 64) {
+   bank = offset / 64;
+   set_bits = bits[bank] & gpio_mask;
+   clear_bits = ~bits[bank] & gpio_mask;
writeq(set_bits, txgpio->register_base + (bank * GPIO_2ND_BANK) + GPIO_TX_SET);
writeq(clear_bits, txgpio->register_base + (bank * GPIO_2ND_BANK) + GPIO_TX_CLR);
}
-- 
2.29.0



[PATCH 1/5] clump_bits: Introduce the for_each_set_clump macro

2020-12-25 Thread Syed Nayyar Waris
This macro iterates for each group of bits (clump) with set bits,
within a bitmap memory region. For each iteration, "start" is set to
the bit offset of the found clump, while the respective clump value is
stored to the location pointed by "clump". Additionally, the
bitmap_get_value() and bitmap_set_value() functions are introduced to
respectively get and set a value of n-bits in a bitmap memory region.
The n-bit value can have any size from 1 to BITS_PER_LONG. A size less
than 1 or more than BITS_PER_LONG causes undefined behaviour.
Moreover, when setting an n-bit value in the bitmap, if the value would
cross a word boundary, it is split so that one portion is stored in the
current word and the remaining portion in the next higher word. The same
split occurs when retrieving a value from the bitmap.

GCC gives a warning in bitmap_set_value(): https://godbolt.org/z/rjx34r
Add an explicit check that the value being written into the bitmap
does not fall outside the bitmap.
The situation that it is falling outside would never be possible in the
code because the boundaries are required to be correct before the
function is called. The responsibility is on the caller for ensuring the
boundaries are correct.
The code change is simply to silence the GCC warning messages
because GCC is not aware that the boundaries have already been checked.
As such, we're better off using __builtin_unreachable() here because we
can avoid the latency of the conditional check entirely.
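
For illustration, a userspace sketch of a set operation with an explicit end-of-map guard might look like the following. It is not the patch itself: it uses fixed 64-bit words, rounds the map length up to whole words (a choice of this sketch), and the names low_mask() and set_value() are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define WORD_BITS 64u

/* Mask with the low 'n' bits set, valid for 1 <= n <= 64. */
static uint64_t low_mask(unsigned int n)
{
	return UINT64_MAX >> (WORD_BITS - n);
}

/*
 * Store the low 'value_width' bits of 'value' at bit offset 'start' in a
 * map that is 'map_bits' bits long.
 */
static void set_value(uint64_t *map, unsigned int map_bits, uint64_t value,
		      unsigned int value_width, unsigned int start)
{
	const unsigned int index = start / WORD_BITS;
	const unsigned int length = (map_bits + WORD_BITS - 1) / WORD_BITS;
	const unsigned int offset = start % WORD_BITS;
	const unsigned int space = WORD_BITS - offset; /* bits left in word */

	assert(value_width >= 1 && value_width <= WORD_BITS);
	value &= low_mask(value_width);

	/* The value fits entirely in the current word. */
	if (space >= value_width) {
		map[index] &= ~(low_mask(value_width) << offset);
		map[index] |= value << offset;
		return;
	}

	/* The low part fills the rest of the current word. */
	map[index] &= ~(UINT64_MAX << offset);
	map[index] |= value << offset;

	/* Explicit guard: never touch a word beyond the end of the map. */
	if (index + 1 >= length)
		return;

	/* The high part goes into the next word. */
	map[index + 1] &= ~low_mask(value_width - space);
	map[index + 1] |= value >> space;
}
```

Writing the 24-bit value 0xabcdef at offset 48 of a 128-bit map leaves 0xcdef in the top 16 bits of the first word and 0xab in the second. With a 64-bit map the guard stops after the first word, so the memory after it is left untouched.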

Cc: Linus Walleij 
Cc: Arnd Bergmann 
Cc: William Breathitt Gray 
Cc: Andy Shevchenko 
Signed-off-by: Syed Nayyar Waris 
---
 drivers/gpio/clump_bits.h | 101 ++
 1 file changed, 101 insertions(+)
 create mode 100644 drivers/gpio/clump_bits.h

diff --git a/drivers/gpio/clump_bits.h b/drivers/gpio/clump_bits.h
new file mode 100644
index ..72ef772b83c8
--- /dev/null
+++ b/drivers/gpio/clump_bits.h
@@ -0,0 +1,101 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef __CLUMP_BITS_H
+#define __CLUMP_BITS_H
+
+/**
+ * find_next_clump - find next clump with set bits in a memory region
+ * @clump: location to store copy of found clump
+ * @addr: address to base the search on
+ * @size: bitmap size in number of bits
+ * @offset: bit offset at which to start searching
+ * @clump_size: clump size in bits
+ *
+ * Returns the bit offset for the next set clump; the found clump value is
+ * copied to the location pointed by @clump. If no bits are set, returns @size.
+ */
+extern unsigned long find_next_clump(unsigned long *clump,
+ const unsigned long *addr,
+ unsigned long size, unsigned long offset,
+ unsigned long clump_size);
+
+#define find_first_clump(clump, bits, size, clump_size) \
+   find_next_clump((clump), (bits), (size), 0, (clump_size))
+
+/**
+ * bitmap_get_value - get a value of n-bits from the memory region
+ * @map: address to the bitmap memory region
+ * @start: bit offset of the n-bit value
+ * @nbits: size of value in bits (must be between 1 and BITS_PER_LONG inclusive).
+ *
+ * Returns value of nbits located at the @start bit offset within the @map
+ * memory region.
+ */
+static inline unsigned long bitmap_get_value(const unsigned long *map,
+ unsigned long start,
+ unsigned long nbits)
+{
+   const size_t index = BIT_WORD(start);
+   const unsigned long offset = start % BITS_PER_LONG;
+   const unsigned long ceiling = round_up(start + 1, BITS_PER_LONG);
+   const unsigned long space = ceiling - start;
+   unsigned long value_low, value_high;
+
+   if (space >= nbits)
+   return (map[index] >> offset) & GENMASK(nbits - 1, 0);
+   else {
+   value_low = map[index] & BITMAP_FIRST_WORD_MASK(start);
+   value_high = map[index + 1] & BITMAP_LAST_WORD_MASK(start + nbits);
+   return (value_low >> offset) | (value_high << space);
+   }
+}
+
+/**
+ * bitmap_set_value - set value within a memory region
+ * @map: address to the bitmap memory region
+ * @nbits: size of map in bits
+ * @value: value of clump
+ * @value_width: size of value in bits (must be between 1 and BITS_PER_LONG inclusive)
+ * @start: bit offset of the value
+ */
+static inline void bitmap_set_value(unsigned long *map, unsigned long nbits,
+   unsigned long value, unsigned long value_width,
+   unsigned long start)
+{
+   const unsigned long index = BIT_WORD(start);
+   const unsigned long length = BIT_WORD(nbits);
+   const unsigned long offset = start % BITS_PER_LONG;
+   const unsigned long ceiling = round_up(start + 1, BITS_PER_LONG);
+   const unsigne

[PATCH 0/5] Introduce the for_each_set_clump macro

2020-12-25 Thread Syed Nayyar Waris
Hello Linus,

Since this patchset primarily affects GPIO drivers, would you like
to pick it up through your GPIO tree?

(Note: Patchset resent with the new macro and relevant
functions shifted to a new header clump_bits.h [Linus Torvalds])

Michal,
What do you think of [PATCH 5/5]? Is the conditional check needed? And
also does returning -EINVAL look good?

This patchset introduces a new generic version of for_each_set_clump.
The previous for_each_set_clump8 used a fixed-size 8-bit clump, but the
new generic version works with clumps of any size up to BITS_PER_LONG.
The patchset utilizes the new macro in several GPIO drivers.

The earlier 8-bit for_each_set_clump8 facilitated a for-loop syntax that
iterates over a memory region one entire group of set bits at a time.

For example, suppose you would like to iterate over a 32-bit integer 8
bits at a time, skipping over 8-bit groups with no set bit, where
XXXXXXXX represents the current 8-bit group:

Example:        10111110 00000000 11111111 00110011
First loop:     10111110 00000000 11111111 XXXXXXXX
Second loop:    10111110 00000000 XXXXXXXX 00110011
Third loop:     XXXXXXXX 00000000 11111111 00110011

Each iteration of the loop returns the next 8-bit group that has at
least one set bit.
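
The 8-bit behaviour can be sketched in plain userspace C as shown below. This is an illustration rather than the kernel macro: it walks a single 32-bit value, and next_clump8_u32() and collect_clumps8() are hypothetical helper names.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Return the bit offset of the next 8-bit group at or after 'offset'
 * that has at least one set bit, storing the group in *clump.
 * Returns 32 when no such group is left.
 */
static unsigned int next_clump8_u32(uint8_t *clump, uint32_t word,
				    unsigned int offset)
{
	assert(offset % 8 == 0);

	for (; offset < 32; offset += 8) {
		uint8_t group = (word >> offset) & 0xff;

		if (group) {
			*clump = group;
			return offset;
		}
	}
	return 32;
}

/* Collect every non-zero 8-bit group; returns the number found. */
static unsigned int collect_clumps8(uint32_t word, unsigned int offsets[4],
				    uint8_t clumps[4])
{
	unsigned int n = 0, off;
	uint8_t clump;

	for (off = next_clump8_u32(&clump, word, 0); off < 32;
	     off = next_clump8_u32(&clump, word, off + 8)) {
		offsets[n] = off;
		clumps[n] = clump;
		n++;
	}
	return n;
}
```

For the illustrative value 0xbe00ff33, collect_clumps8() finds three groups: 0x33 at offset 0, 0xff at offset 8 and 0xbe at offset 24. The all-zero group at offset 16 is skipped.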

With the new for_each_set_clump, the clump size need not be 8 bits.
Moreover, a clump can be split across a word boundary when the word
size is not a multiple of the clump size. The following examples show how
the new macro works for clump sizes of 24 bits and 6 bits.

Example 1:
clump size: 24 bits, Number of clumps (or ports): 10
bitmap stores the bit information from where successive clumps are retrieved.

 /* bitmap memory region */
0x00aaff00;  /* Most significant bits */
0xaaff;
0x00aa00aa;
0xabcdeffedcba;  /* Least significant bits */

Different iterations of for_each_set_clump:
'offset' is the bit position and 'clump' is the 24-bit clump from the
above bitmap.
Iteration first:    offset: 0   clump: 0xfedcba
Iteration second:   offset: 24  clump: 0xabcdef
Iteration third:    offset: 48  clump: 0xaa
Iteration fourth:   offset: 96  clump: 0xaa
Iteration fifth:    offset: 144 clump: 0xff
Iteration sixth:    offset: 168 clump: 0xaa
Iteration seventh:  offset: 216 clump: 0xff
Loop breaks because the remaining bits at the end (0x00aa) were fewer
than the clump size of 24 bits.

In the above example, the 24-bit clump retrieved in the third iteration
was split between bitmap[0] and bitmap[1]. The example also shows that
24-bit clumps of all zeroes in between were skipped (preserving
the previous for_each_set_clump8 behaviour).

Example 2:
clump size = 6 bits, Number of clumps (or ports) = 3.

 /* bitmap memory region */
0x00aaff00;  /* Most significant bits */
0xaaff;
0x0f00;
0x0ac0;  /* Least significant bits */

Different iterations of for_each_set_clump:
'offset' is the bit position and 'clump' is the 6 bit clump from the
above bitmap.
Iteration first:    offset: 6 clump: 0x2b
Loop breaks because 6 * 3 = 18 bits traversed in bitmap.
Here 6 * 3 is clump size * no. of clumps.

GCC gives a warning in bitmap_set_value(): https://godbolt.org/z/rjx34r
Add an explicit check that the value being written into the bitmap
does not fall outside the bitmap.
The situation that it is falling outside would never be possible in the
code because the boundaries are required to be correct before the
function is called. The responsibility is on the caller for ensuring the
boundaries are correct.
The code change is simply to silence the GCC warning messages
because GCC is not aware that the boundaries have already been checked.
As such, we're better off using __builtin_unreachable() here because we
can avoid the latency of the conditional check entirely.

Syed Nayyar Waris (5):
  clump_bits: Introduce the for_each_set_clump macro
  lib/test_bitmap.c: Add for_each_set_clump test cases
  gpio: thunderx: Utilize for_each_set_clump macro
  gpio: xilinx: Utilize generic bitmap_get_value and _set_value
  gpio: xilinx: Add extra check if sum of widths exceed 64

 drivers/gpio/clump_bits.h| 101 
 drivers/gpio/gpio-thunderx.c |  12 ++-
 drivers/gpio/gpio-xilinx.c   |  72 ++
 lib/test_bitmap.c| 144 +++
 4 files changed, 292 insertions(+), 37 deletions(-)
 create mode 100644 drivers/gpio/clump_bits.h


base-commit: bbe2ba04c5a92a49db8a42c850a5a2f6481e47eb
-- 
2.29.0



Re: [RESEND PATCH 3/4] gpio: xilinx: Modify bitmap_set_value() calls

2020-12-12 Thread Syed Nayyar Waris
On Tue, Dec 1, 2020 at 9:03 PM Bartosz Golaszewski
 wrote:
>
> On Fri, Nov 20, 2020 at 7:46 PM Syed Nayyar Waris  
> wrote:
> >
> > Modify the bitmap_set_value() calls. bitmap_set_value()
> > now takes an extra bitmap width as second argument and the width of
> > value is now present as the fourth argument.
> >
> > Cc: Michal Simek 
> > Signed-off-by: Syed Nayyar Waris 
> > ---
> >  drivers/gpio/gpio-xilinx.c | 12 ++--
> >  1 file changed, 6 insertions(+), 6 deletions(-)
> >
> > diff --git a/drivers/gpio/gpio-xilinx.c b/drivers/gpio/gpio-xilinx.c
> > index ad4ee4145db4..05dae086c4d0 100644
> > --- a/drivers/gpio/gpio-xilinx.c
> > +++ b/drivers/gpio/gpio-xilinx.c
> > @@ -151,16 +151,16 @@ static void xgpio_set_multiple(struct gpio_chip *gc, unsigned long *mask,
> > spin_lock_irqsave(&chip->gpio_lock[0], flags);
> > spin_lock(&chip->gpio_lock[1]);
> >
> > -   bitmap_set_value(old, state[0], 0, width[0]);
> > -   bitmap_set_value(old, state[1], width[0], width[1]);
> > +   bitmap_set_value(old, 64, state[0], width[0], 0);
> > +   bitmap_set_value(old, 64, state[1], width[1], width[0]);
> > bitmap_replace(new, old, bits, mask, gc->ngpio);
> >
> > -   bitmap_set_value(old, state[0], 0, 32);
> > -   bitmap_set_value(old, state[1], 32, 32);
> > +   bitmap_set_value(old, 64, state[0], 32, 0);
> > +   bitmap_set_value(old, 64, state[1], 32, 32);
> > state[0] = bitmap_get_value(new, 0, width[0]);
> > state[1] = bitmap_get_value(new, width[0], width[1]);
> > -   bitmap_set_value(new, state[0], 0, 32);
> > -   bitmap_set_value(new, state[1], 32, 32);
> > +   bitmap_set_value(new, 64, state[0], 32, 0);
> > +   bitmap_set_value(new, 64, state[1], 32, 32);
> > bitmap_xor(changed, old, new, 64);
> >
> > if (((u32 *)changed)[0])
> > --
> > 2.29.0
> >
>
> This series is not bisectable because you modify the interface -
> breaking existing users - and you only fix them later. Please squash
> those changes into a single commit.
>
> Bartosz

Hi Bartosz,

I have squashed the changes and have sent a new patchset v2.

Regards
Syed Nayyar Waris


[PATCH v2 2/2] gpio: xilinx: Add extra check if sum of widths exceed 64

2020-12-12 Thread Syed Nayyar Waris
Add an extra check that the sum of the widths does not exceed 64. If it
does, return -EINVAL along with an appropriate error message.

Cc: Michal Simek 
Signed-off-by: Syed Nayyar Waris 
---
 drivers/gpio/gpio-xilinx.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/gpio/gpio-xilinx.c b/drivers/gpio/gpio-xilinx.c
index 05dae086c4d0..a2e92a1cf50b 100644
--- a/drivers/gpio/gpio-xilinx.c
+++ b/drivers/gpio/gpio-xilinx.c
@@ -340,6 +340,12 @@ static int xgpio_probe(struct platform_device *pdev)
 
chip->gc.base = -1;
chip->gc.ngpio = chip->gpio_width[0] + chip->gpio_width[1];
+
+   if (chip->gc.ngpio > 64) {
+   dev_err(&pdev->dev, "invalid configuration: number of GPIO is greater than 64");
+   return -EINVAL;
+   }
+
chip->gc.parent = &pdev->dev;
chip->gc.direction_input = xgpio_dir_in;
chip->gc.direction_output = xgpio_dir_out;
-- 
2.29.0



[PATCH v2 0/2] Modify bitmap_set_value() to suppress compiler warning

2020-12-12 Thread Syed Nayyar Waris
Hi All,

The purpose of this patchset is to suppress the compiler warning (-Wtype-limits).

In function bitmap_set_value(), add explicit check to see if the value being
written into the bitmap does not fall outside the bitmap.
The situation that it is falling outside is never possible in the code
because the boundaries are required to be correct before the function is
called. The responsibility is on the caller for ensuring the boundaries
are correct.
The code change is simply to silence the GCC warning messages
because GCC is not aware that the boundaries have already been checked.
As such, we're better off using __builtin_unreachable() here because we
can avoid the latency of the conditional check entirely.

Michal,
What do you think of [PATCH 2/2]? Is the conditional check needed, and also does
returning -EINVAL look good?

Changes in v2:
 - [Patch 1/2]: Squashed earlier three patches into one.

Syed Nayyar Waris (2):
  bitmap: Modify bitmap_set_value() to check bitmap length
  gpio: xilinx: Add extra check if sum of widths exceed 64

 drivers/gpio/gpio-xilinx.c | 18 --
 include/linux/bitmap.h | 35 +--
 lib/test_bitmap.c  |  4 ++--
 3 files changed, 35 insertions(+), 22 deletions(-)


base-commit: b640c4e12bbe1f0b6383c3ef788a89e5427c763f
-- 
2.29.0



[PATCH v2 1/2] bitmap: Modify bitmap_set_value() to check bitmap length

2020-12-12 Thread Syed Nayyar Waris
Add an explicit check that the value being written into the bitmap
does not fall outside the bitmap.
This situation can never occur in practice because the caller is
responsible for ensuring the boundaries are correct before the function
is called.
The code change exists only to silence the GCC warning messages, because
GCC is not aware that the boundaries have already been checked.
For that reason the check uses __builtin_unreachable(), which lets us
avoid the latency of a real conditional check entirely.

Cc: Arnd Bergmann 
Signed-off-by: Syed Nayyar Waris 
Acked-by: William Breathitt Gray 

lib/test_bitmap.c: Modify for_each_set_clump test

Modify the test where bitmap_set_value() is called. bitmap_set_value()
now takes an extra bitmap-width as second argument and the width of
value is now present as the fourth argument.

Signed-off-by: Syed Nayyar Waris 

gpio: xilinx: Modify bitmap_set_value() calls

Modify the bitmap_set_value() calls. bitmap_set_value()
now takes an extra bitmap width as second argument and the width of
value is now present as the fourth argument.

Cc: Michal Simek 
Signed-off-by: Syed Nayyar Waris 
---
 drivers/gpio/gpio-xilinx.c | 12 ++--
 include/linux/bitmap.h | 35 +--
 lib/test_bitmap.c  |  4 ++--
 3 files changed, 29 insertions(+), 22 deletions(-)

diff --git a/drivers/gpio/gpio-xilinx.c b/drivers/gpio/gpio-xilinx.c
index ad4ee4145db4..05dae086c4d0 100644
--- a/drivers/gpio/gpio-xilinx.c
+++ b/drivers/gpio/gpio-xilinx.c
@@ -151,16 +151,16 @@ static void xgpio_set_multiple(struct gpio_chip *gc, unsigned long *mask,
spin_lock_irqsave(&chip->gpio_lock[0], flags);
spin_lock(&chip->gpio_lock[1]);
 
-   bitmap_set_value(old, state[0], 0, width[0]);
-   bitmap_set_value(old, state[1], width[0], width[1]);
+   bitmap_set_value(old, 64, state[0], width[0], 0);
+   bitmap_set_value(old, 64, state[1], width[1], width[0]);
bitmap_replace(new, old, bits, mask, gc->ngpio);
 
-   bitmap_set_value(old, state[0], 0, 32);
-   bitmap_set_value(old, state[1], 32, 32);
+   bitmap_set_value(old, 64, state[0], 32, 0);
+   bitmap_set_value(old, 64, state[1], 32, 32);
state[0] = bitmap_get_value(new, 0, width[0]);
state[1] = bitmap_get_value(new, width[0], width[1]);
-   bitmap_set_value(new, state[0], 0, 32);
-   bitmap_set_value(new, state[1], 32, 32);
+   bitmap_set_value(new, 64, state[0], 32, 0);
+   bitmap_set_value(new, 64, state[1], 32, 32);
bitmap_xor(changed, old, new, 64);
 
if (((u32 *)changed)[0])
diff --git a/include/linux/bitmap.h b/include/linux/bitmap.h
index 386d08777342..efb6199ea1e7 100644
--- a/include/linux/bitmap.h
+++ b/include/linux/bitmap.h
@@ -78,8 +78,9 @@
  *  bitmap_get_value(map, start, nbits)Get bit value of size
  *  'nbits' from map at start
  *  bitmap_set_value8(map, value, start)Set 8bit value to map at start
- *  bitmap_set_value(map, value, start, nbits) Set bit value of size 'nbits'
- *  of map at start
+ *  bitmap_set_value(map, nbits, value, value_width, start)
+ *  Set bit value of size value_width
+ *  to map at start
  *
  * Note, bitmap_zero() and bitmap_fill() operate over the region of
  * unsigned longs, that is, bits behind bitmap till the unsigned long
@@ -610,30 +611,36 @@ static inline void bitmap_set_value8(unsigned long *map, unsigned long value,
 }
 
 /**
- * bitmap_set_value - set n-bit value within a memory region
+ * bitmap_set_value - set value within a memory region
  * @map: address to the bitmap memory region
- * @value: value of nbits
- * @start: bit offset of the n-bit value
- * @nbits: size of value in bits (must be between 1 and BITS_PER_LONG inclusive).
+ * @nbits: size of map in bits
+ * @value: value of clump
+ * @value_width: size of value in bits (must be between 1 and BITS_PER_LONG inclusive)
+ * @start: bit offset of the value
  */
-static inline void bitmap_set_value(unsigned long *map,
-   unsigned long value,
-   unsigned long start, unsigned long nbits)
+static inline void bitmap_set_value(unsigned long *map, unsigned long nbits,
+   unsigned long value, unsigned long value_width,
+   unsigned long start)
 {
-   const size_t index = BIT_WORD(start);
+   const unsigned long index = BIT_WORD(start);
+   const unsigned long length = BIT_WORD(nbits);
const unsigned long offset = start % BITS_PER_LONG;
const unsigned long ceiling = round_up(start + 1, BITS_PER_LONG);
c

Re: [PATCH 1/4] bitmap: Modify bitmap_set_value() to check bitmap length

2020-11-20 Thread Syed Nayyar Waris
On Fri, Nov 20, 2020 at 11:29 PM William Breathitt Gray
 wrote:
>
> On Fri, Nov 20, 2020 at 11:14:16PM +0530, Syed Nayyar Waris wrote:
> > Add explicit check to see if the value being written into the bitmap
> > does not fall outside the bitmap.
> > The situation that it is falling outside would never be possible in the
> > code because the boundaries are required to be correct before the function
> > is called. The responsibility is on the caller for ensuring the boundaries
> > are correct.
> > This is just to suppress the GCC -Wtype-limits warnings.
>
> Hi Syed,
>
> This commit message sounds a bit strange without the context of our
> earlier discussion thread. Would you be able to reword the commit
> message to explain the motivation for using __builtin_unreachable()?
>
> Thanks,
>
> William Breathitt Gray

Hi William,

Actually, I explained the motivation for using __builtin_unreachable()
in the cover letter, so I left it out of this patch.

I will resend this patch with an updated commit message.

Regards
Syed Nayyar Waris

>
> >
> > Cc: Arnd Bergmann 
> > Signed-off-by: Syed Nayyar Waris 
> > Acked-by: William Breathitt Gray 
> > ---
> >  include/linux/bitmap.h | 35 +--
> >  1 file changed, 21 insertions(+), 14 deletions(-)
> >
> > diff --git a/include/linux/bitmap.h b/include/linux/bitmap.h
> > index 386d08777342..efb6199ea1e7 100644
> > --- a/include/linux/bitmap.h
> > +++ b/include/linux/bitmap.h
> > @@ -78,8 +78,9 @@
> >   *  bitmap_get_value(map, start, nbits)  Get bit value of size
> >   *  'nbits' from map at start
> >   *  bitmap_set_value8(map, value, start)Set 8bit value to map at 
> > start
> > - *  bitmap_set_value(map, value, start, nbits)   Set bit value of size 
> > 'nbits'
> > - *  of map at start
> > + *  bitmap_set_value(map, nbits, value, value_width, start)
> > + *  Set bit value of size 
> > value_width
> > + *  to map at start
> >   *
> >   * Note, bitmap_zero() and bitmap_fill() operate over the region of
> >   * unsigned longs, that is, bits behind bitmap till the unsigned long
> > @@ -610,30 +611,36 @@ static inline void bitmap_set_value8(unsigned long 
> > *map, unsigned long value,
> >  }
> >
> >  /**
> > - * bitmap_set_value - set n-bit value within a memory region
> > + * bitmap_set_value - set value within a memory region
> >   * @map: address to the bitmap memory region
> > - * @value: value of nbits
> > - * @start: bit offset of the n-bit value
> > - * @nbits: size of value in bits (must be between 1 and BITS_PER_LONG 
> > inclusive).
> > + * @nbits: size of map in bits
> > + * @value: value of clump
> > + * @value_width: size of value in bits (must be between 1 and 
> > BITS_PER_LONG inclusive)
> > + * @start: bit offset of the value
> >   */
> > -static inline void bitmap_set_value(unsigned long *map,
> > - unsigned long value,
> > - unsigned long start, unsigned long nbits)
> > +static inline void bitmap_set_value(unsigned long *map, unsigned long 
> > nbits,
> > + unsigned long value, unsigned long 
> > value_width,
> > + unsigned long start)
> >  {
> > - const size_t index = BIT_WORD(start);
> > + const unsigned long index = BIT_WORD(start);
> > + const unsigned long length = BIT_WORD(nbits);
> >   const unsigned long offset = start % BITS_PER_LONG;
> >   const unsigned long ceiling = round_up(start + 1, BITS_PER_LONG);
> >   const unsigned long space = ceiling - start;
> >
> > - value &= GENMASK(nbits - 1, 0);
> > + value &= GENMASK(value_width - 1, 0);
> >
> > - if (space >= nbits) {
> > - map[index] &= ~(GENMASK(nbits - 1, 0) << offset);
> > + if (space >= value_width) {
> > + map[index] &= ~(GENMASK(value_width - 1, 0) << offset);
> >   map[index] |= value << offset;
> >   } else {
> >   map[index + 0] &= ~BITMAP_FIRST_WORD_MASK(start);
> >   map[index + 0] |= value << offset;
> > - map[index + 1] &= ~BITMAP_LAST_WORD_MASK(start + nbits);
> > +
> > + if (index + 1 >= length)
> > + __builtin_unreachable();
> > +
> > + map[index + 1] &= ~BITMAP_LAST_WORD_MASK(start + value_width);
> >   map[index + 1] |= value >> space;
> >   }
> >  }
> > --
> > 2.29.0
> >


[PATCH 4/4] gpio: xilinx: Add extra check to see if sum of widths exceed 64

2020-11-20 Thread Syed Nayyar Waris
Add an extra check that the sum of the widths does not exceed 64. If it
does, return -EINVAL along with an appropriate error message.

Cc: Michal Simek 
Signed-off-by: Syed Nayyar Waris 
---
 drivers/gpio/gpio-xilinx.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/drivers/gpio/gpio-xilinx.c b/drivers/gpio/gpio-xilinx.c
index 05dae086c4d0..a2e92a1cf50b 100644
--- a/drivers/gpio/gpio-xilinx.c
+++ b/drivers/gpio/gpio-xilinx.c
@@ -340,6 +340,12 @@ static int xgpio_probe(struct platform_device *pdev)
 
chip->gc.base = -1;
chip->gc.ngpio = chip->gpio_width[0] + chip->gpio_width[1];
+
+   if (chip->gc.ngpio > 64) {
+   dev_err(&pdev->dev, "invalid configuration: number of GPIO is greater than 64");
+   return -EINVAL;
+   }
+
chip->gc.parent = &pdev->dev;
chip->gc.direction_input = xgpio_dir_in;
chip->gc.direction_output = xgpio_dir_out;
-- 
2.29.0
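
The probe-time check in the patch above can be sketched in userspace as
follows. This is only an illustration under stated assumptions: the helper
name xgpio_check_widths() and the XGPIO_MAX_LINES constant are hypothetical
and do not appear in the driver, and fprintf() stands in for dev_err().

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical constant: the 64-bit bitmaps used by xgpio_set_multiple(). */
#define XGPIO_MAX_LINES 64U

/*
 * Hypothetical helper mirroring the check added to xgpio_probe():
 * reject configurations whose two channel widths sum to more than 64.
 * Returns 0 on success or -EINVAL on an invalid configuration.
 */
static int xgpio_check_widths(unsigned int width0, unsigned int width1)
{
	unsigned int ngpio = width0 + width1;

	if (ngpio > XGPIO_MAX_LINES) {
		/* Stand-in for dev_err() in the real driver. */
		fprintf(stderr,
			"invalid configuration: number of GPIO is greater than 64\n");
		return -EINVAL;
	}

	return 0;
}
```

With two 32-bit channels the helper returns 0; any combination that sums
to 65 or more returns -EINVAL.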



[PATCH 2/4] lib/test_bitmap.c: Modify for_each_set_clump test

2020-11-20 Thread Syed Nayyar Waris
Modify the test where bitmap_set_value() is called. bitmap_set_value()
now takes an extra bitmap-width as second argument and the width of
value is now present as the fourth argument.

Signed-off-by: Syed Nayyar Waris 
---
 lib/test_bitmap.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/test_bitmap.c b/lib/test_bitmap.c
index 1c5791ff02cb..7fafe6a0bc08 100644
--- a/lib/test_bitmap.c
+++ b/lib/test_bitmap.c
@@ -656,8 +656,8 @@ static void __init prepare_test_data(unsigned int index)
unsigned long width = 0;
 
for (i = 0; i < clump_test_data[index].count; i++) {
-   bitmap_set_value(clump_test_data[index].data,
-   clump_bitmap_data[(clump_test_data[index].offset)++], width, 32);
+   bitmap_set_value(clump_test_data[index].data, 256,
+   clump_bitmap_data[(clump_test_data[index].offset)++], 32, width);
width += 32;
}
 }
-- 
2.29.0
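
To make the new argument order (map, nbits, value, value_width, start)
concrete, here is a minimal userspace sketch of the function and of a
fill loop shaped like prepare_test_data(). The macros are simplified
stand-ins for the kernel ones, and bitmap_set_value_sketch(), read_bits(),
fill_clumps(), demo_map and demo_clumps are hypothetical names used only
for illustration. The sketch computes 'space' directly from the offset,
which should be equivalent to the round_up() form in the patch.

```c
#include <limits.h>

/* Userspace stand-ins for the kernel macros (assumptions, not kernel headers). */
#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define BIT_WORD(nr) ((nr) / BITS_PER_LONG)
#define GENMASK(h, l) ((~0UL << (l)) & (~0UL >> (BITS_PER_LONG - 1 - (h))))
#define BITMAP_FIRST_WORD_MASK(start) (~0UL << ((start) & (BITS_PER_LONG - 1)))
#define BITMAP_LAST_WORD_MASK(nbits) (~0UL >> (-(nbits) & (BITS_PER_LONG - 1)))

#define DEMO_NBITS 256UL

static unsigned long demo_map[DEMO_NBITS / BITS_PER_LONG];
static const unsigned long demo_clumps[] = {
	0x11111111UL, 0x22222222UL, 0x33333333UL,
};

/* Sketch of bitmap_set_value() with the argument order used in this series. */
static void bitmap_set_value_sketch(unsigned long *map, unsigned long nbits,
				    unsigned long value,
				    unsigned long value_width,
				    unsigned long start)
{
	const unsigned long index = BIT_WORD(start);
	const unsigned long length = BIT_WORD(nbits);
	const unsigned long offset = start % BITS_PER_LONG;
	const unsigned long space = BITS_PER_LONG - offset;

	value &= GENMASK(value_width - 1, 0);

	if (space >= value_width) {
		/* The value fits entirely in one word. */
		map[index] &= ~(GENMASK(value_width - 1, 0) << offset);
		map[index] |= value << offset;
	} else {
		/* The value straddles two words. */
		map[index + 0] &= ~BITMAP_FIRST_WORD_MASK(start);
		map[index + 0] |= value << offset;

		/* Callers must guarantee the second word exists. */
		if (index + 1 >= length)
			__builtin_unreachable();

		map[index + 1] &= ~BITMAP_LAST_WORD_MASK(start + value_width);
		map[index + 1] |= value >> space;
	}
}

/* Slow bit-by-bit reader, used only to check results portably. */
static unsigned long read_bits(const unsigned long *map, unsigned long start,
			       unsigned long width)
{
	unsigned long i, out = 0;

	for (i = 0; i < width; i++) {
		unsigned long bit = start + i;

		if (map[BIT_WORD(bit)] & (1UL << (bit % BITS_PER_LONG)))
			out |= 1UL << i;
	}
	return out;
}

/* Same shape as the loop in prepare_test_data(): 32-bit clumps, back to back. */
static void fill_clumps(void)
{
	unsigned long width = 0;
	unsigned int i;

	for (i = 0; i < sizeof(demo_clumps) / sizeof(demo_clumps[0]); i++) {
		bitmap_set_value_sketch(demo_map, DEMO_NBITS, demo_clumps[i],
					32, width);
		width += 32;
	}
}
```

After fill_clumps(), reading 32 bits at offsets 0, 32 and 64 with
read_bits() gives back the three clumps in order.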



[PATCH 3/4] gpio: xilinx: Modify bitmap_set_value() calls

2020-11-20 Thread Syed Nayyar Waris
Modify the bitmap_set_value() calls. bitmap_set_value()
now takes an extra bitmap width as second argument and the width of
value is now present as the fourth argument.

Cc: Michal Simek 
Signed-off-by: Syed Nayyar Waris 
---
 drivers/gpio/gpio-xilinx.c | 12 ++--
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/gpio/gpio-xilinx.c b/drivers/gpio/gpio-xilinx.c
index ad4ee4145db4..05dae086c4d0 100644
--- a/drivers/gpio/gpio-xilinx.c
+++ b/drivers/gpio/gpio-xilinx.c
@@ -151,16 +151,16 @@ static void xgpio_set_multiple(struct gpio_chip *gc, unsigned long *mask,
spin_lock_irqsave(&chip->gpio_lock[0], flags);
spin_lock(&chip->gpio_lock[1]);
 
-   bitmap_set_value(old, state[0], 0, width[0]);
-   bitmap_set_value(old, state[1], width[0], width[1]);
+   bitmap_set_value(old, 64, state[0], width[0], 0);
+   bitmap_set_value(old, 64, state[1], width[1], width[0]);
bitmap_replace(new, old, bits, mask, gc->ngpio);
 
-   bitmap_set_value(old, state[0], 0, 32);
-   bitmap_set_value(old, state[1], 32, 32);
+   bitmap_set_value(old, 64, state[0], 32, 0);
+   bitmap_set_value(old, 64, state[1], 32, 32);
state[0] = bitmap_get_value(new, 0, width[0]);
state[1] = bitmap_get_value(new, width[0], width[1]);
-   bitmap_set_value(new, state[0], 0, 32);
-   bitmap_set_value(new, state[1], 32, 32);
+   bitmap_set_value(new, 64, state[0], 32, 0);
+   bitmap_set_value(new, 64, state[1], 32, 32);
bitmap_xor(changed, old, new, 64);
 
if (((u32 *)changed)[0])
-- 
2.29.0



[PATCH 1/4] bitmap: Modify bitmap_set_value() to check bitmap length

2020-11-20 Thread Syed Nayyar Waris
Add an explicit check that the value being written into the bitmap
does not fall outside the bitmap.
This situation can never occur in practice because the caller is
responsible for ensuring the boundaries are correct before the function
is called.
This is just to suppress the GCC -Wtype-limits warnings.

Cc: Arnd Bergmann 
Signed-off-by: Syed Nayyar Waris 
Acked-by: William Breathitt Gray 
---
 include/linux/bitmap.h | 35 +--
 1 file changed, 21 insertions(+), 14 deletions(-)

diff --git a/include/linux/bitmap.h b/include/linux/bitmap.h
index 386d08777342..efb6199ea1e7 100644
--- a/include/linux/bitmap.h
+++ b/include/linux/bitmap.h
@@ -78,8 +78,9 @@
  *  bitmap_get_value(map, start, nbits)Get bit value of size
  *  'nbits' from map at start
  *  bitmap_set_value8(map, value, start)Set 8bit value to map at start
- *  bitmap_set_value(map, value, start, nbits) Set bit value of size 'nbits'
- *  of map at start
+ *  bitmap_set_value(map, nbits, value, value_width, start)
+ *  Set bit value of size value_width
+ *  to map at start
  *
  * Note, bitmap_zero() and bitmap_fill() operate over the region of
  * unsigned longs, that is, bits behind bitmap till the unsigned long
@@ -610,30 +611,36 @@ static inline void bitmap_set_value8(unsigned long *map, unsigned long value,
 }
 
 /**
- * bitmap_set_value - set n-bit value within a memory region
+ * bitmap_set_value - set value within a memory region
  * @map: address to the bitmap memory region
- * @value: value of nbits
- * @start: bit offset of the n-bit value
- * @nbits: size of value in bits (must be between 1 and BITS_PER_LONG inclusive).
+ * @nbits: size of map in bits
+ * @value: value of clump
+ * @value_width: size of value in bits (must be between 1 and BITS_PER_LONG inclusive)
+ * @start: bit offset of the value
  */
-static inline void bitmap_set_value(unsigned long *map,
-   unsigned long value,
-   unsigned long start, unsigned long nbits)
+static inline void bitmap_set_value(unsigned long *map, unsigned long nbits,
+   unsigned long value, unsigned long value_width,
+   unsigned long start)
 {
-   const size_t index = BIT_WORD(start);
+   const unsigned long index = BIT_WORD(start);
+   const unsigned long length = BIT_WORD(nbits);
const unsigned long offset = start % BITS_PER_LONG;
const unsigned long ceiling = round_up(start + 1, BITS_PER_LONG);
const unsigned long space = ceiling - start;
 
-   value &= GENMASK(nbits - 1, 0);
+   value &= GENMASK(value_width - 1, 0);
 
-   if (space >= nbits) {
-   map[index] &= ~(GENMASK(nbits - 1, 0) << offset);
+   if (space >= value_width) {
+   map[index] &= ~(GENMASK(value_width - 1, 0) << offset);
map[index] |= value << offset;
} else {
map[index + 0] &= ~BITMAP_FIRST_WORD_MASK(start);
map[index + 0] |= value << offset;
-   map[index + 1] &= ~BITMAP_LAST_WORD_MASK(start + nbits);
+
+   if (index + 1 >= length)
+   __builtin_unreachable();
+
+   map[index + 1] &= ~BITMAP_LAST_WORD_MASK(start + value_width);
map[index + 1] |= value >> space;
}
 }
-- 
2.29.0



[PATCH 0/4] Modify bitmap_set_value() to suppress compiler warning

2020-11-20 Thread Syed Nayyar Waris
Hi All,

The purpose of this patchset is to suppress the compiler warning 
(-Wtype-limits).

In bitmap_set_value(), add an explicit check that the value being
written into the bitmap does not fall outside the bitmap.
This situation can never occur in practice because the caller is
responsible for ensuring the boundaries are correct before the function
is called.
The code change exists only to silence the GCC warning messages, because
GCC is not aware that the boundaries have already been checked.
For that reason the check uses __builtin_unreachable(), which lets us
avoid the latency of a real conditional check entirely.

Michal,
What do you think of [PATCH 4/4]? Is the conditional check needed, and does
returning -EINVAL look good?

Syed Nayyar Waris (4):
  bitmap: Modify bitmap_set_value() to check bitmap length
  lib/test_bitmap.c: Modify for_each_set_clump test
  gpio: xilinx: Modify bitmap_set_value() calls
  gpio: xilinx: Add extra check if sum of widths exceed 64

 drivers/gpio/gpio-xilinx.c | 18 --
 include/linux/bitmap.h | 35 +--
 lib/test_bitmap.c  |  4 ++--
 3 files changed, 35 insertions(+), 22 deletions(-)


base-commit: b640c4e12bbe1f0b6383c3ef788a89e5427c763f
-- 
2.29.0



Re: [PATCH v12 4/4] gpio: xilinx: Utilize generic bitmap_get_value and _set_value

2020-11-13 Thread Syed Nayyar Waris
On Wed, Nov 11, 2020 at 3:30 AM Syed Nayyar Waris  wrote:
>
> On Tue, Nov 10, 2020 at 12:43:16PM -0500, William Breathitt Gray wrote:
> > On Tue, Nov 10, 2020 at 10:52:42PM +0530, Syed Nayyar Waris wrote:
> > > On Tue, Nov 10, 2020 at 6:05 PM William Breathitt Gray
> > >  wrote:
> > > >
> > > > On Tue, Nov 10, 2020 at 11:02:43AM +0100, Michal Simek wrote:
> > > > >
> > > > >
> > > > > On 09. 11. 20 18:31, William Breathitt Gray wrote:
> > > > > > On Mon, Nov 09, 2020 at 07:22:20PM +0200, Andy Shevchenko wrote:
> > > > > >> On Mon, Nov 09, 2020 at 12:11:40PM -0500, William Breathitt Gray 
> > > > > >> wrote:
> > > > > >>> On Mon, Nov 09, 2020 at 10:15:29PM +0530, Syed Nayyar Waris wrote:
> > > > > >>>> On Mon, Nov 09, 2020 at 03:41:53PM +0100, Arnd Bergmann wrote:
> > > > > >>
> > > > > >> ...
> > > > > >>
> > > > > >>>>  static inline void bitmap_set_value(unsigned long *map,
> > > > > >>>> -unsigned long value,
> > > > > >>>> +unsigned long value, const 
> > > > > >>>> size_t length,
> > > > > >>>>  unsigned long start, 
> > > > > >>>> unsigned long nbits)
> > > > > >>>>  {
> > > > > >>>>  const size_t index = BIT_WORD(start);
> > > > > >>>> @@ -15,6 +15,10 @@ static inline void bitmap_set_value(unsigned 
> > > > > >>>> long *map,
> > > > > >>>>  } else {
> > > > > >>>>  map[index + 0] &= 
> > > > > >>>> ~BITMAP_FIRST_WORD_MASK(start);
> > > > > >>>>  map[index + 0] |= value << offset;
> > > > > >>>> +
> > > > > >>>> +   if (index + 1 >= length)
> > > > > >>>> +   __builtin_unreachable();
> > > > > >>>> +
> > > > > >>>>  map[index + 1] &= ~BITMAP_LAST_WORD_MASK(start 
> > > > > >>>> + nbits);
> > > > > >>>>  map[index + 1] |= value >> space;
> > > > > >>>>  }
> > > > > >>>
> > > > > >>> Hi Syed,
> > > > > >>>
> > > > > >>> Let's rename 'length' to 'nbits' as Arnd suggested, and rename 
> > > > > >>> 'nbits'
> > > > > >>> to value_width.
> > > > > >>
> > > > > >> length here is in longs. I guess this is the point of entire patch.
> > > > > >
> > > > > > Ah yes, this should become 'const unsigned long nbits' and 
> > > > > > represent the
> > > > > > length of the bitmap in bits and not longs.
> > >
> > > Hi William, Andy and All,
> > >
> > > Thank You for reviewing. I was looking into the review comments and I
> > > have a question on the above.
> > >
> > > Actually, in bitmap_set_value(), the intended comparison is to be made
> > > between 'index + 1' and 'length' (which is now renamed as 'nbits').
> > > That is, the comparison would look-like as follows:
> > > if (index + 1 >= nbits)
> > >
> > > The 'index' is getting populated with BIT_WORD(start).
> > > The 'index' variable in above is the actual index of the bitmap array,
> > > while in previous mail it is suggested to use 'nbits' which represent
> > > the length of the bitmap in bits and not longs.
> > >
> > > Isn't it comparing two different things? index of array (not the
> > > bit-wise-length) on left hand side and nbits (bit-wise-length) on
> > > right hand side?
> > >
> > > Have I misunderstood something? If yes, request to clarify.
> > >
> > > Or do I have to first divide 'nbits' by BITS_PER_LONG and then compare
> > > it with 'index + 1'? Something like this?
> > >
> > > Regards
> > > Syed Nayyar Waris
> >
> >

Re: [PATCH v12 4/4] gpio: xilinx: Utilize generic bitmap_get_value and _set_value

2020-11-10 Thread Syed Nayyar Waris
On Tue, Nov 10, 2020 at 12:43:16PM -0500, William Breathitt Gray wrote:
> On Tue, Nov 10, 2020 at 10:52:42PM +0530, Syed Nayyar Waris wrote:
> > On Tue, Nov 10, 2020 at 6:05 PM William Breathitt Gray
> >  wrote:
> > >
> > > On Tue, Nov 10, 2020 at 11:02:43AM +0100, Michal Simek wrote:
> > > >
> > > >
> > > > On 09. 11. 20 18:31, William Breathitt Gray wrote:
> > > > > On Mon, Nov 09, 2020 at 07:22:20PM +0200, Andy Shevchenko wrote:
> > > > >> On Mon, Nov 09, 2020 at 12:11:40PM -0500, William Breathitt Gray 
> > > > >> wrote:
> > > > >>> On Mon, Nov 09, 2020 at 10:15:29PM +0530, Syed Nayyar Waris wrote:
> > > > >>>> On Mon, Nov 09, 2020 at 03:41:53PM +0100, Arnd Bergmann wrote:
> > > > >>
> > > > >> ...
> > > > >>
> > > > >>>>  static inline void bitmap_set_value(unsigned long *map,
> > > > >>>> -unsigned long value,
> > > > >>>> +unsigned long value, const 
> > > > >>>> size_t length,
> > > > >>>>  unsigned long start, unsigned 
> > > > >>>> long nbits)
> > > > >>>>  {
> > > > >>>>  const size_t index = BIT_WORD(start);
> > > > >>>> @@ -15,6 +15,10 @@ static inline void bitmap_set_value(unsigned 
> > > > >>>> long *map,
> > > > >>>>  } else {
> > > > >>>>  map[index + 0] &= ~BITMAP_FIRST_WORD_MASK(start);
> > > > >>>>  map[index + 0] |= value << offset;
> > > > >>>> +
> > > > >>>> +   if (index + 1 >= length)
> > > > >>>> +   __builtin_unreachable();
> > > > >>>> +
> > > > >>>>  map[index + 1] &= ~BITMAP_LAST_WORD_MASK(start + 
> > > > >>>> nbits);
> > > > >>>>  map[index + 1] |= value >> space;
> > > > >>>>  }
> > > > >>>
> > > > >>> Hi Syed,
> > > > >>>
> > > > >>> Let's rename 'length' to 'nbits' as Arnd suggested, and rename 
> > > > >>> 'nbits'
> > > > >>> to value_width.
> > > > >>
> > > > >> length here is in longs. I guess this is the point of entire patch.
> > > > >
> > > > > Ah yes, this should become 'const unsigned long nbits' and represent 
> > > > > the
> > > > > length of the bitmap in bits and not longs.
> > 
> > Hi William, Andy and All,
> > 
> > Thank You for reviewing. I was looking into the review comments and I
> > have a question on the above.
> > 
> > Actually, in bitmap_set_value(), the intended comparison is to be made
> > between 'index + 1' and 'length' (which is now renamed as 'nbits').
> > That is, the comparison would look-like as follows:
> > if (index + 1 >= nbits)
> > 
> > The 'index' is getting populated with BIT_WORD(start).
> > The 'index' variable in above is the actual index of the bitmap array,
> > while in previous mail it is suggested to use 'nbits' which represent
> > the length of the bitmap in bits and not longs.
> > 
> > Isn't it comparing two different things? index of array (not the
> > bit-wise-length) on left hand side and nbits (bit-wise-length) on
> > right hand side?
> > 
> > Have I misunderstood something? If yes, request to clarify.
> > 
> > Or do I have to first divide 'nbits' by BITS_PER_LONG and then compare
> > it with 'index + 1'? Something like this?
> > 
> > Regards
> > Syed Nayyar Waris
> 
> The array elements of the bitmap memory region are abstracted away for
> the convenience of the users of the bitmap_* functions; the driver
> authors are able to treat their bitmaps as just a set of contiguous bits
> and not worry about where the division between array elements happens.
> 
> So to match the interface of the other bitmap_* functions, you should
> take in nbits and figure out the actual array length by dividing by
> BITS_PER_LONG inside bitmap_set_val

Re: [PATCH v12 4/4] gpio: xilinx: Utilize generic bitmap_get_value and _set_value

2020-11-10 Thread Syed Nayyar Waris
On Tue, Nov 10, 2020 at 6:05 PM William Breathitt Gray
 wrote:
>
> On Tue, Nov 10, 2020 at 11:02:43AM +0100, Michal Simek wrote:
> >
> >
> > On 09. 11. 20 18:31, William Breathitt Gray wrote:
> > > On Mon, Nov 09, 2020 at 07:22:20PM +0200, Andy Shevchenko wrote:
> > >> On Mon, Nov 09, 2020 at 12:11:40PM -0500, William Breathitt Gray wrote:
> > >>> On Mon, Nov 09, 2020 at 10:15:29PM +0530, Syed Nayyar Waris wrote:
> > >>>> On Mon, Nov 09, 2020 at 03:41:53PM +0100, Arnd Bergmann wrote:
> > >>
> > >> ...
> > >>
> > >>>>  static inline void bitmap_set_value(unsigned long *map,
> > >>>> -unsigned long value,
> > >>>> +unsigned long value, const size_t 
> > >>>> length,
> > >>>>  unsigned long start, unsigned 
> > >>>> long nbits)
> > >>>>  {
> > >>>>  const size_t index = BIT_WORD(start);
> > >>>> @@ -15,6 +15,10 @@ static inline void bitmap_set_value(unsigned long 
> > >>>> *map,
> > >>>>  } else {
> > >>>>  map[index + 0] &= ~BITMAP_FIRST_WORD_MASK(start);
> > >>>>  map[index + 0] |= value << offset;
> > >>>> +
> > >>>> +   if (index + 1 >= length)
> > >>>> +   __builtin_unreachable();
> > >>>> +
> > >>>>  map[index + 1] &= ~BITMAP_LAST_WORD_MASK(start + 
> > >>>> nbits);
> > >>>>  map[index + 1] |= value >> space;
> > >>>>  }
> > >>>
> > >>> Hi Syed,
> > >>>
> > >>> Let's rename 'length' to 'nbits' as Arnd suggested, and rename 'nbits'
> > >>> to value_width.
> > >>
> > >> length here is in longs. I guess this is the point of entire patch.
> > >
> > > Ah yes, this should become 'const unsigned long nbits' and represent the
> > > length of the bitmap in bits and not longs.

Hi William, Andy and All,

Thank You for reviewing. I was looking into the review comments and I
have a question on the above.

Actually, in bitmap_set_value(), the comparison is meant to be made
between 'index + 1' and 'length' (which is now renamed to 'nbits').
That is, the comparison would look as follows:
if (index + 1 >= nbits)

'index' is populated with BIT_WORD(start), so it is the actual index
into the bitmap array, while the previous mail suggests using 'nbits',
which represents the length of the bitmap in bits and not longs.

Isn't that comparing two different things: an array index (not a
length in bits) on the left-hand side and nbits (a length in bits) on
the right-hand side?

Have I misunderstood something? If so, please clarify.

Or do I have to first divide 'nbits' by BITS_PER_LONG and then compare
it with 'index + 1'? Something like this?
Regards
Syed Nayyar Waris
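
For illustration, the unit mismatch raised above can be sketched in
userspace as follows: the bit length is first converted to a word count,
and only then compared with the word index. The helper name
second_word_in_bounds() is hypothetical and the macros are simplified
stand-ins for the kernel ones. Note that BIT_WORD() rounds down, so this
sketch only counts whole words; a bitmap whose size is not a multiple of
BITS_PER_LONG would need a rounded-up count instead.

```c
#include <limits.h>

/* Userspace stand-ins for the kernel macros (assumptions, not kernel headers). */
#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define BIT_WORD(nr) ((nr) / BITS_PER_LONG)

/*
 * Hypothetical helper: returns 1 if the word following the one that
 * contains bit 'start' lies inside a bitmap of 'nbits' bits.
 * Both sides of the comparison are expressed in words, not bits.
 */
static int second_word_in_bounds(unsigned long start, unsigned long nbits)
{
	const unsigned long index = BIT_WORD(start);   /* word holding 'start' */
	const unsigned long length = BIT_WORD(nbits);  /* whole words in the map */

	return index + 1 < length;
}
```

For a two-word bitmap, a value starting in word 0 may spill into word 1,
while a value starting in word 1 has no following word to spill into.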

> > >
> > >> But to me sounds like it would be better to have simply 
> > >> bitmap_set_value64() /
> > >> bitmap_set_value32() with proper optimization done and forget about 
> > >> variadic
> > >> ones for now.
> > >
> > > The gpio-xilinx driver can have arbitrary sizes for width[0] and
> > > width[1], so unfortunately that means we don't know the start position
> > > nor the width of the value beforehand.
> >
> > Start position should be all the time zero. You can't configure this IP
> > to start from bit 2. Width can vary but start is IMHO all the time from
> > 0 bit.
> >
> > Thanks,
> > Michal
>
> Hi Michal,
>
> I'm referring to the mask creation, not the data bus transfer; see the
> implementation of the xgpio_set_multiple() function in linux-next for
> reference:
> <https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/drivers/gpio/gpio-xilinx.c?h=akpm>.
>
> To generate the old mask we call the following:
>
> bitmap_set_value(old, state[0], 0, width[0]);
> bitmap_set_value(old, state[1], width[0], width[1]);
>
> Here, width[0] and width[1] can vary, which makes the exact values of
> the start and nbits parameters unknown beforehand (although we do know
> they are within the bitmap boundary).
>
> Regardless, this is not an issue because we know the bitmap_set_value()
> is supposed to be called with valid values. We just need a way to hint
> to GCC that this is the case, without increasing the latency of the
> function -- which I think is possible if we use __builtin_unreachable()
> for the conditional path checking the index against the length of the
> bitmap.
>
> William Breathitt Gray
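
A minimal sketch of the hint described above, assuming GCC or Clang
(__builtin_unreachable() is a compiler builtin, not standard C). The
helper next_word() and the demo_words array are hypothetical. The
condition is a promise made by the caller rather than a runtime
validation: if it were ever true, the behavior would be undefined, which
is why the compiler is free to drop the comparison when optimizing.

```c
#include <stddef.h>

/* Hypothetical two-word bitmap used only for the example. */
static const unsigned long demo_words[2] = { 0x1UL, 0x2UL };

/*
 * Hypothetical helper: returns the word after map[index].
 * The caller guarantees index + 1 < length; the builtin tells the
 * compiler that the out-of-bounds path can never be taken, so it can
 * assume the access below stays inside the array.
 */
static unsigned long next_word(const unsigned long *map, size_t index,
			       size_t length)
{
	if (index + 1 >= length)
		__builtin_unreachable();

	return map[index + 1];
}
```

Calling next_word(demo_words, 0, 2) returns the second word; calling it
with index 1 would break the promise and must not be done.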


Re: [PATCH v12 4/4] gpio: xilinx: Utilize generic bitmap_get_value and _set_value

2020-11-09 Thread Syed Nayyar Waris
On Mon, Nov 09, 2020 at 03:41:53PM +0100, Arnd Bergmann wrote:
> On Mon, Nov 9, 2020 at 2:41 PM William Breathitt Gray
>  wrote:
> > On Mon, Nov 09, 2020 at 06:04:11PM +0530, Syed Nayyar Waris wrote:
> >
> > One of my concerns is that we're incurring the latency of two additional
> > conditional checks just to suppress a compiler warning about a case that
> > wouldn't occur in the actual use of bitmap_set_value(). I'm hoping
> > there's a way for us to suppress these warnings without adding onto the
> > latency of this function; given that bitmap_set_value() is intended to
> > be used in loops, conditionals here could significantly increase latency
> > in drivers.
> 
> At least for this caller, the size check would be a compile-time
> constant that can be eliminated.
> 
> > I wonder if array_index_nospec() might have the side effect of
> > suppressing these warnings for us. For example, would this work:
> >
> > static inline void bitmap_set_value(unsigned long *map,
> > unsigned long value,
> > unsigned long start, unsigned long 
> > nbits)
> > {
> > const unsigned long offset = start % BITS_PER_LONG;
> > const unsigned long ceiling = round_up(start + 1, BITS_PER_LONG);
> > const unsigned long space = ceiling - start;
> > size_t index = BIT_WORD(start);
> >
> > value &= GENMASK(nbits - 1, 0);
> >
> > if (space >= nbits) {
> > index = array_index_nospec(index, index + 1);
> >
> > map[index] &= ~(GENMASK(nbits - 1, 0) << offset);
> > map[index] |= value << offset;
> > } else {
> > index = array_index_nospec(index, index + 2);
> >
> > map[index + 0] &= ~BITMAP_FIRST_WORD_MASK(start);
> > map[index + 0] |= value << offset;
> > map[index + 1] &= ~BITMAP_LAST_WORD_MASK(start + nbits);
> > map[index + 1] |= value >> space;
> > }
> > }
> >
> > Or is this going to produce the same warning because we're not using an
> > explicit check against the map array size?
> 
> https://godbolt.org/z/fxnsG9
> 
> It still warns about the 'map[index + 1]' access: from all I can tell,
> gcc mainly complains because it cannot rule out that 'space < nbits',
> and then it knows the size of 'DECLARE_BITMAP(old, 64)' and finds
> that if 'index + 0' is correct, then 'index + 1' overflows that array.
> 
>   Arnd

Hi Arnd,

As suggested by William, I am sharing another solution to suppress the
compiler warning. Please let me know your views on the fix below. Thanks.

If it's alright, I shall submit a (new) v13 patchset soon. Let me know.

@@ -1,5 +1,5 @@
 static inline void bitmap_set_value(unsigned long *map,
-unsigned long value,
+unsigned long value, const size_t length,
 unsigned long start, unsigned long nbits)
 {
 const size_t index = BIT_WORD(start);
@@ -15,6 +15,10 @@ static inline void bitmap_set_value(unsigned long *map,
 } else {
 map[index + 0] &= ~BITMAP_FIRST_WORD_MASK(start);
 map[index + 0] |= value << offset;
+
+   if (index + 1 >= length)
+   __builtin_unreachable();
+
 map[index + 1] &= ~BITMAP_LAST_WORD_MASK(start + nbits);
 map[index + 1] |= value >> space;
 }




Re: [PATCH v12 4/4] gpio: xilinx: Utilize generic bitmap_get_value and _set_value

2020-11-09 Thread Syed Nayyar Waris
On Mon, Nov 9, 2020 at 8:09 PM William Breathitt Gray
 wrote:
>
> On Mon, Nov 09, 2020 at 08:41:28AM -0500, William Breathitt Gray wrote:
> > On Mon, Nov 09, 2020 at 06:04:11PM +0530, Syed Nayyar Waris wrote:
> > > On Sun, Nov 01, 2020 at 09:08:29PM +0100, Arnd Bergmann wrote:
> > > > On Sun, Nov 1, 2020 at 4:00 PM William Breathitt Gray
> > > >  wrote:
> > > > >
> > > > > On Thu, Oct 29, 2020 at 11:44:47PM +0100, Arnd Bergmann wrote:
> > > > > > On Sun, Oct 18, 2020 at 11:44 PM Syed Nayyar Waris 
> > > > > >  wrote:
> > > > > > >
> > > > > > > This patch reimplements the xgpio_set_multiple() function in
> > > > > > > drivers/gpio/gpio-xilinx.c to use the new generic functions:
> > > > > > > bitmap_get_value() and bitmap_set_value(). The code is now simpler
> > > > > > > to read and understand. Moreover, instead of looping for each bit
> > > > > > > in xgpio_set_multiple() function, now we can check each channel at
> > > > > > > a time and save cycles.
> > > > > >
> > > > > > This now causes -Wtype-limits warnings in linux-next with gcc-10:
> > > > >
> > > > > Hi Arnd,
> > > > >
> > > > > What version of gcc-10 are you running? I'm having trouble generating
> > > > > these warnings so I suspect I'm using a different version than you.
> > > >
> > > > I originally saw it with the binaries from
> > > > https://mirrors.edge.kernel.org/pub/tools/crosstool/, but I have
> > > > also been able to reproduce it with a minimal test case on the
> > > > binaries from godbolt.org, see https://godbolt.org/z/Wq8q4n
> > > >
> > > > > Let me first verify that I understand the problem correctly. The issue
> > > > > is the possibility of a stack smash in bitmap_set_value() when the
> > > > > value of start + nbits is larger than the length of the map bitmap
> > > > > memory region. This is because index (or index + 1) could be outside
> > > > > the range of the bitmap memory region passed in as map. Is my
> > > > > understanding correct here?
> > > >
> > > > Yes, that seems to be the case here.
> > > >
> > > > > In xgpio_set_multiple(), the variables width[0] and width[1] serve as
> > > > > possible start and nbits values for the bitmap_set_value() calls.
> > > > > Because width[0] and width[1] are unsigned int variables, GCC
> > > > > considers the possibility that the value of width[0]/width[1] might
> > > > > exceed the length of the bitmap memory region named old and thus
> > > > > result in a stack smash.
> > > > >
> > > > > I don't know if invalid width values are actually possible for the
> > > > > Xilinx gpio device, but let's err on the side of safety and assume
> > > > > this is actually a possibility. We should verify that the combined
> > > > > value of gpio_width[0] + gpio_width[1] does not exceed 64 bits; we can
> > > > > add a check for this in xgpio_probe() when we grab the gpio_width
> > > > > values.
> > > > >
> > > > > However, we're still left with the GCC warnings because GCC is not
> > > > > smart enough to know that we've already checked the boundary and
> > > > > width[0] and width[1] are valid values. I suspect we can avoid this
> > > > > warning if we refactor bitmap_set_value() to increment map separately
> > > > > and then set it:
> > > >
> > > > As I understand it, part of the problem is that gcc sees the possible
> > > > range as being constrained by the operations on 'start' and 'nbits',
> > > > in particular the shift in BIT_WORD() that put an upper bound on
> > > > the index, but then it sees that the upper bound is higher than the
> > > > upper bound of the array, i.e. element zero.
> > > >
> > > > I added a check
> > > >
> > > >   if (start >= 64 || start + size >= 64) return;
> > > >
> > > > in the godbolt.org testcase, which does help limit the start
> > > > index appropriately,

Re: [PATCH v12 4/4] gpio: xilinx: Utilize generic bitmap_get_value and _set_value

2020-11-09 Thread Syed Nayyar Waris
On Sun, Nov 01, 2020 at 09:08:29PM +0100, Arnd Bergmann wrote:
> On Sun, Nov 1, 2020 at 4:00 PM William Breathitt Gray
>  wrote:
> >
> > On Thu, Oct 29, 2020 at 11:44:47PM +0100, Arnd Bergmann wrote:
> > > On Sun, Oct 18, 2020 at 11:44 PM Syed Nayyar Waris  
> > > wrote:
> > > >
> > > > This patch reimplements the xgpio_set_multiple() function in
> > > > drivers/gpio/gpio-xilinx.c to use the new generic functions:
> > > > bitmap_get_value() and bitmap_set_value(). The code is now simpler
> > > > to read and understand. Moreover, instead of looping for each bit
> > > > in xgpio_set_multiple() function, now we can check each channel at
> > > > a time and save cycles.
> > >
> > > This now causes -Wtype-limits warnings in linux-next with gcc-10:
> >
> > Hi Arnd,
> >
> > What version of gcc-10 are you running? I'm having trouble generating
> > these warnings so I suspect I'm using a different version than you.
> 
> I originally saw it with the binaries from
> https://mirrors.edge.kernel.org/pub/tools/crosstool/, but I have
> also been able to reproduce it with a minimal test case on the
> binaries from godbolt.org, see https://godbolt.org/z/Wq8q4n
> 
> > Let me first verify that I understand the problem correctly. The issue
> > is the possibility of a stack smash in bitmap_set_value() when the value
> > of start + nbits is larger than the length of the map bitmap memory
> > region. This is because index (or index + 1) could be outside the range
> > of the bitmap memory region passed in as map. Is my understanding
> > correct here?
> 
> Yes, that seems to be the case here.
> 
> > In xgpio_set_multiple(), the variables width[0] and width[1] serve as
> > possible start and nbits values for the bitmap_set_value() calls.
> > Because width[0] and width[1] are unsigned int variables, GCC considers
> > the possibility that the value of width[0]/width[1] might exceed the
> > length of the bitmap memory region named old and thus result in a stack
> > smash.
> >
> > I don't know if invalid width values are actually possible for the
> > Xilinx gpio device, but let's err on the side of safety and assume this
> > is actually a possibility. We should verify that the combined value of
> > gpio_width[0] + gpio_width[1] does not exceed 64 bits; we can add a
> > check for this in xgpio_probe() when we grab the gpio_width values.
> >
> > However, we're still left with the GCC warnings because GCC is not smart
> > enough to know that we've already checked the boundary and width[0] and
> > > width[1] are valid values. I suspect we can avoid this warning if we
> > > refactor bitmap_set_value() to increment map separately and then set it:
> 
> As I understand it, part of the problem is that gcc sees the possible
> range as being constrained by the operations on 'start' and 'nbits',
> in particular the shift in BIT_WORD() that put an upper bound on
> the index, but then it sees that the upper bound is higher than the
> upper bound of the array, i.e. element zero.
> 
> I added a check
> 
>   if (start >= 64 || start + size >= 64) return;
> 
> in the godbolt.org testcase, which does help limit the start
> index appropriately, but it is not sufficient to let the compiler
> see that the 'if (space >= nbits) ' condition is guaranteed to
> be true for all values here.
> 
> > static inline void bitmap_set_value(unsigned long *map,
> > unsigned long value,
> > > unsigned long start, unsigned long nbits)
> > {
> > const unsigned long offset = start % BITS_PER_LONG;
> > const unsigned long ceiling = round_up(start + 1, BITS_PER_LONG);
> > const unsigned long space = ceiling - start;
> >
> > map += BIT_WORD(start);
> > value &= GENMASK(nbits - 1, 0);
> >
> > if (space >= nbits) {
> > *map &= ~(GENMASK(nbits - 1, 0) << offset);
> > *map |= value << offset;
> > } else {
> > *map &= ~BITMAP_FIRST_WORD_MASK(start);
> > *map |= value << offset;
> > map++;
> > *map &= ~BITMAP_LAST_WORD_MASK(start + nbits);
> > *map |= value >> space;
> > }
> > }
> >
> > This avoids adding a costly conditional check inside bitmap_set_value()
> > when almost all bitmap_set_valu

Re: [PATCH v11 0/4] Introduce the for_each_set_clump macro

2020-10-23 Thread Syed Nayyar Waris
On Wed, Oct 7, 2020 at 2:09 PM Linus Walleij  wrote:
>
> On Tue, Oct 6, 2020 at 11:20 AM Syed Nayyar Waris  
> wrote:
>
> > Since this patchset primarily affects GPIO drivers, would you like
> > to pick it up through your GPIO tree?
>
> Definitely will, once we are finished!
>
> I see Andy still has comments and we need more iterations.
> That is fine, because we are not in any hurry. Just keep posting
> it!
>
> Let's merge this for v5.11 when we are finished with it.
>
> Yours,
> Linus Walleij

Hi Linus,

Just an update: v12 of the patchset has been submitted (19 Oct).

Let me know if you have any questions regarding the v12 patchset. Thanks!

Regards
Syed Nayyar Waris


[PATCH v12 4/4] gpio: xilinx: Utilize generic bitmap_get_value and _set_value

2020-10-18 Thread Syed Nayyar Waris
This patch reimplements the xgpio_set_multiple() function in
drivers/gpio/gpio-xilinx.c to use the new generic functions
bitmap_get_value() and bitmap_set_value(). The code is now simpler
to read and understand. Moreover, instead of looping over each bit
in xgpio_set_multiple(), we can now process one channel at a time
and save cycles.

Cc: Bartosz Golaszewski 
Cc: Michal Simek 
Signed-off-by: Syed Nayyar Waris 
Signed-off-by: William Breathitt Gray 
---
Changes in v12:
 - Remove extra empty newline.

Changes in v11:
 - Change variable name 'flag' to 'flags'.

Changes in v10:
 - No change.

Changes in v9:
 - Remove looping of 'for_each_set_clump' and instead process two
   halves of a 64-bit bitmap separately or individually. Use normal spin_lock 
   call for second inner lock. And take the spin_lock_init call outside the 'if'
   condition in the 'probe' function of driver.

Changes in v8:
 - No change.

Changes in v7:
 - No change.

Changes in v6:
 - No change.

Changes in v5:
 - Minor change: Inline values '32' and '64' in code for better
   code readability.

Changes in v4:
 - Minor change: Inline values '32' and '64' in code for better
   code readability.

Changes in v3:
 - No change.

Changes in v2:
 - No change

 drivers/gpio/gpio-xilinx.c | 65 +++---
 1 file changed, 32 insertions(+), 33 deletions(-)

diff --git a/drivers/gpio/gpio-xilinx.c b/drivers/gpio/gpio-xilinx.c
index 67f9f82e0db0..3ba1a993c85e 100644
--- a/drivers/gpio/gpio-xilinx.c
+++ b/drivers/gpio/gpio-xilinx.c
@@ -138,37 +138,37 @@ static void xgpio_set_multiple(struct gpio_chip *gc, unsigned long *mask,
 {
unsigned long flags;
struct xgpio_instance *chip = gpiochip_get_data(gc);
-   int index = xgpio_index(chip, 0);
-   int offset, i;
-
-   spin_lock_irqsave(&chip->gpio_lock[index], flags);
-
-   /* Write to GPIO signals */
-   for (i = 0; i < gc->ngpio; i++) {
-   if (*mask == 0)
-   break;
-   /* Once finished with an index write it out to the register */
-   if (index !=  xgpio_index(chip, i)) {
-   xgpio_writereg(chip->regs + XGPIO_DATA_OFFSET +
-  index * XGPIO_CHANNEL_OFFSET,
-  chip->gpio_state[index]);
-   spin_unlock_irqrestore(&chip->gpio_lock[index], flags);
-   index =  xgpio_index(chip, i);
-   spin_lock_irqsave(&chip->gpio_lock[index], flags);
-   }
-   if (__test_and_clear_bit(i, mask)) {
-   offset =  xgpio_offset(chip, i);
-   if (test_bit(i, bits))
-   chip->gpio_state[index] |= BIT(offset);
-   else
-   chip->gpio_state[index] &= ~BIT(offset);
-   }
-   }
-
-   xgpio_writereg(chip->regs + XGPIO_DATA_OFFSET +
-  index * XGPIO_CHANNEL_OFFSET, chip->gpio_state[index]);
-
-   spin_unlock_irqrestore(&chip->gpio_lock[index], flags);
+   u32 *const state = chip->gpio_state;
+   unsigned int *const width = chip->gpio_width;
+
+   DECLARE_BITMAP(old, 64);
+   DECLARE_BITMAP(new, 64);
+   DECLARE_BITMAP(changed, 64);
+
+   spin_lock_irqsave(&chip->gpio_lock[0], flags);
+   spin_lock(&chip->gpio_lock[1]);
+
+   bitmap_set_value(old, state[0], 0, width[0]);
+   bitmap_set_value(old, state[1], width[0], width[1]);
+   bitmap_replace(new, old, bits, mask, gc->ngpio);
+
+   bitmap_set_value(old, state[0], 0, 32);
+   bitmap_set_value(old, state[1], 32, 32);
+   state[0] = bitmap_get_value(new, 0, width[0]);
+   state[1] = bitmap_get_value(new, width[0], width[1]);
+   bitmap_set_value(new, state[0], 0, 32);
+   bitmap_set_value(new, state[1], 32, 32);
+   bitmap_xor(changed, old, new, 64);
+
+   if (((u32 *)changed)[0])
+   xgpio_writereg(chip->regs + XGPIO_DATA_OFFSET,
+   state[0]);
+   if (((u32 *)changed)[1])
+   xgpio_writereg(chip->regs + XGPIO_DATA_OFFSET +
+   XGPIO_CHANNEL_OFFSET, state[1]);
+
+   spin_unlock(&chip->gpio_lock[1]);
+   spin_unlock_irqrestore(&chip->gpio_lock[0], flags);
 }
 
 /**
@@ -292,6 +292,7 @@ static int xgpio_probe(struct platform_device *pdev)
chip->gpio_width[0] = 32;
 
spin_lock_init(&chip->gpio_lock[0]);
+   spin_lock_init(&chip->gpio_lock[1]);
 
if (of_property_read_u32(np, "xlnx,is-dual", &is_dual))
is_dual = 0;
@@ -313,8 +314,6 @@ static int xgpio_probe(struct platform_device *pdev)
  

[PATCH v12 3/4] gpio: thunderx: Utilize for_each_set_clump macro

2020-10-18 Thread Syed Nayyar Waris
This patch reimplements the thunderx_gpio_set_multiple function in
drivers/gpio/gpio-thunderx.c to use the new for_each_set_clump macro.
Instead of looping over every bank in thunderx_gpio_set_multiple,
we can now skip banks that have no bits set in the mask and save cycles.

Cc: Robert Richter 
Cc: Bartosz Golaszewski 
Signed-off-by: Syed Nayyar Waris 
Signed-off-by: William Breathitt Gray 
---
Changes in v12:
 - No change.

Changes in v11:
 - No change.

Changes in v10:
 - No change.

Changes in v9:
 - No change.

Changes in v8:
 - No change.

Changes in v7:
 - No change.

Changes in v6:
 - No change.

Changes in v5:
 - No change.

Changes in v4:
 - Minor change: Inline value '64' in code for better code readability.

Changes in v3:
 - Change datatype of some variables from u64 to unsigned long
   in function thunderx_gpio_set_multiple.

Changes in v2:
 - No change.

 drivers/gpio/gpio-thunderx.c | 11 +++
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/drivers/gpio/gpio-thunderx.c b/drivers/gpio/gpio-thunderx.c
index 9f66deab46ea..58c9bb25a377 100644
--- a/drivers/gpio/gpio-thunderx.c
+++ b/drivers/gpio/gpio-thunderx.c
@@ -275,12 +275,15 @@ static void thunderx_gpio_set_multiple(struct gpio_chip 
*chip,
   unsigned long *bits)
 {
int bank;
-   u64 set_bits, clear_bits;
+   unsigned long set_bits, clear_bits, gpio_mask;
+   unsigned long offset;
+
struct thunderx_gpio *txgpio = gpiochip_get_data(chip);
 
-   for (bank = 0; bank <= chip->ngpio / 64; bank++) {
-   set_bits = bits[bank] & mask[bank];
-   clear_bits = ~bits[bank] & mask[bank];
+   for_each_set_clump(offset, gpio_mask, mask, chip->ngpio, 64) {
+   bank = offset / 64;
+   set_bits = bits[bank] & gpio_mask;
+   clear_bits = ~bits[bank] & gpio_mask;
writeq(set_bits, txgpio->register_base + (bank * GPIO_2ND_BANK) + GPIO_TX_SET);
writeq(clear_bits, txgpio->register_base + (bank * GPIO_2ND_BANK) + GPIO_TX_CLR);
}
-- 
2.26.2
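
For illustration only, the per-bank masking done in the loop above can be
sketched in user space as follows. This is not driver code: the struct and
function names are hypothetical, fixed 64-bit words stand in for the
bitmap, and the register writes are replaced by recording the values that
would be written. Banks whose mask word is zero are skipped, which is the
saving the commit message refers to.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of the values a driver would write for one bank. */
struct sk_bank_write {
	size_t bank;
	uint64_t set_bits;	/* lines to drive high */
	uint64_t clear_bits;	/* lines to drive low */
};

/*
 * Fill 'out' with one entry per 64-bit bank whose mask is nonzero and
 * return the number of entries. 'out' must have room for 'nbanks' entries.
 */
static size_t sk_collect_bank_writes(const uint64_t *mask,
				     const uint64_t *bits,
				     size_t nbanks,
				     struct sk_bank_write *out)
{
	size_t n = 0;

	assert(mask && bits && out);

	for (size_t bank = 0; bank < nbanks; bank++) {
		if (!mask[bank])
			continue;	/* nothing selected in this bank */

		out[n].bank = bank;
		out[n].set_bits = bits[bank] & mask[bank];
		out[n].clear_bits = ~bits[bank] & mask[bank];
		n++;
	}
	return n;
}
```

Only lines selected by the mask appear in either set_bits or clear_bits, so
unselected lines in a bank are left alone.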



[PATCH v12 1/4] bitops: Introduce the for_each_set_clump macro

2020-10-18 Thread Syed Nayyar Waris
This macro iterates over each group of bits (clump) with set bits
within a bitmap memory region. For each iteration, "start" is set to
the bit offset of the found clump, while the respective clump value is
stored to the location pointed to by "clump". Additionally, the
bitmap_get_value() and bitmap_set_value() functions are introduced to
respectively get and set a value of n bits in a bitmap memory region.
The value can have any size from 1 to BITS_PER_LONG bits. A size less
than 1 or more than BITS_PER_LONG causes undefined behaviour.
Moreover, when setting an n-bit value in the bitmap, if the value
would extend past a word boundary, it is split so that one portion is
stored in the current word and the remaining portion is stored in the
next higher word. The same applies when retrieving a value from the
bitmap.

Cc: Arnd Bergmann 
Signed-off-by: Syed Nayyar Waris 
Reviewed-by: Andy Shevchenko 
Signed-off-by: William Breathitt Gray 
---
Changes in v12:
 - Format and modify comments.
 - Optimize code using '<<' operator with GENMASK.

Changes in v11:
 - Document valid range of values that 'nbits' can take.

Changes in v10:
 - No change.

Changes in v9:
 - No change.

Changes in v8:
 - No change.

Changes in v7:
 - No change.

Changes in v6:
 - No change.

Changes in v5:
 - No change.

Changes in v4:
 - No change.

Changes in v3:
 - No change.

Changes in v2:
 - No change.

 include/asm-generic/bitops/find.h | 19 ++
 include/linux/bitmap.h| 61 +++
 include/linux/bitops.h| 13 +++
 lib/find_bit.c| 14 +++
 4 files changed, 107 insertions(+)

diff --git a/include/asm-generic/bitops/find.h b/include/asm-generic/bitops/find.h
index 9fdf21302fdf..4e6600759455 100644
--- a/include/asm-generic/bitops/find.h
+++ b/include/asm-generic/bitops/find.h
@@ -97,4 +97,23 @@ extern unsigned long find_next_clump8(unsigned long *clump,
 #define find_first_clump8(clump, bits, size) \
find_next_clump8((clump), (bits), (size), 0)
 
+/**
+ * find_next_clump - find next clump with set bits in a memory region
+ * @clump: location to store copy of found clump
+ * @addr: address to base the search on
+ * @size: bitmap size in number of bits
+ * @offset: bit offset at which to start searching
+ * @clump_size: clump size in bits
+ *
+ * Returns the bit offset for the next set clump; the found clump value is
+ * copied to the location pointed by @clump. If no bits are set, returns @size.
+ */
+extern unsigned long find_next_clump(unsigned long *clump,
+ const unsigned long *addr,
+ unsigned long size, unsigned long offset,
+ unsigned long clump_size);
+
+#define find_first_clump(clump, bits, size, clump_size) \
+   find_next_clump((clump), (bits), (size), 0, (clump_size))
+
 #endif /*_ASM_GENERIC_BITOPS_FIND_H_ */
diff --git a/include/linux/bitmap.h b/include/linux/bitmap.h
index 99058eb81042..2ee934484532 100644
--- a/include/linux/bitmap.h
+++ b/include/linux/bitmap.h
@@ -75,7 +75,11 @@
  *  bitmap_from_arr32(dst, buf, nbits)  Copy nbits from u32[] buf to dst
  *  bitmap_to_arr32(buf, src, nbits)    Copy nbits from buf to u32[] dst
  *  bitmap_get_value8(map, start)       Get 8bit value from map at start
+ *  bitmap_get_value(map, start, nbits)Get bit value of size
+ *  'nbits' from map at start
  *  bitmap_set_value8(map, value, start)Set 8bit value to map at start
+ *  bitmap_set_value(map, value, start, nbits) Set bit value of size 'nbits'
+ *  of map at start
  *
  * Note, bitmap_zero() and bitmap_fill() operate over the region of
  * unsigned longs, that is, bits behind bitmap till the unsigned long
@@ -563,6 +567,34 @@ static inline unsigned long bitmap_get_value8(const 
unsigned long *map,
return (map[index] >> offset) & 0xFF;
 }
 
+/**
+ * bitmap_get_value - get a value of n-bits from the memory region
+ * @map: address to the bitmap memory region
+ * @start: bit offset of the n-bit value
+ * @nbits: size of value in bits (must be between 1 and BITS_PER_LONG inclusive).
+ *
+ * Returns value of nbits located at the @start bit offset within the @map
+ * memory region.
+ */
+static inline unsigned long bitmap_get_value(const unsigned long *map,
+ unsigned long start,
+ unsigned long nbits)
+{
+   const size_t index = BIT_WORD(start);
+   const unsigned long offset = start % BITS_PER_LONG;
+   const unsigned long ceiling = round_up(start + 1, BITS_PER_LONG);
+   const unsigned long space = ceiling - start;
+   unsigned long value_low, value_high

[PATCH v12 2/4] lib/test_bitmap.c: Add for_each_set_clump test cases

2020-10-18 Thread Syed Nayyar Waris
The introduction of the generic for_each_set_clump macro needs test
cases to verify the implementation. This patch adds test cases for
scenarios in which clump sizes are 8 bits, 24 bits, 30 bits and 6 bits.
The cases cover situations where a clump is split at a word boundary
and where zeroes are present at the start and in the middle of the
bitmap.

Signed-off-by: Syed Nayyar Waris 
Reviewed-by: Andy Shevchenko 
Signed-off-by: William Breathitt Gray 
---
Changes in v12:
 - No change.

Changes in v11:
 - No change.

Changes in v10:
 - No change.

Changes in v9:
 - No change.

Changes in v8:
 - [Patch 2/4]: Minor change: Use '__initdata' for correct section mismatch
   in 'clump_test_data' array.

Changes in v7:
 - Minor changes: Use macro 'DECLARE_BITMAP()' and split 'struct'
   definition and test data.

Changes in v6:
 - Make 'for loop' inside 'test_for_each_set_clump' more succinct.

Changes in v5:
 - No change.

Changes in v4:
 - Use 'for' loop in test function of 'for_each_set_clump'.

Changes in v3:
 - No Change.

Changes in v2:
 - Unify different tests for 'for_each_set_clump'. Pass test data as
   function parameters.
 - Remove unnecessary bitmap_zero calls.

 lib/test_bitmap.c | 144 ++
 1 file changed, 144 insertions(+)

diff --git a/lib/test_bitmap.c b/lib/test_bitmap.c
index df903c53952b..cb2cf3858f93 100644
--- a/lib/test_bitmap.c
+++ b/lib/test_bitmap.c
@@ -155,6 +155,37 @@ static bool __init __check_eq_clump8(const char *srcfile, 
unsigned int line,
return true;
 }
 
+static bool __init __check_eq_clump(const char *srcfile, unsigned int line,
+   const unsigned int offset,
+   const unsigned int size,
+   const unsigned long *const clump_exp,
+   const unsigned long *const clump,
+   const unsigned long clump_size)
+{
+   unsigned long exp;
+
+   if (offset >= size) {
+   pr_warn("[%s:%u] bit offset for clump out-of-bounds: expected less than %u, got %u\n",
+   srcfile, line, size, offset);
+   return false;
+   }
+
+   exp = clump_exp[offset / clump_size];
+   if (!exp) {
+   pr_warn("[%s:%u] bit offset for zero clump: expected nonzero clump, got bit offset %u with clump value 0",
+   srcfile, line, offset);
+   return false;
+   }
+
+   if (*clump != exp) {
+   pr_warn("[%s:%u] expected clump value of 0x%lX, got clump value of 0x%lX",
+   srcfile, line, exp, *clump);
+   return false;
+   }
+
+   return true;
+}
+
 #define __expect_eq(suffix, ...)   \
({  \
int result = 0; \
@@ -172,6 +203,7 @@ static bool __init __check_eq_clump8(const char *srcfile, 
unsigned int line,
 #define expect_eq_pbl(...) __expect_eq(pbl, ##__VA_ARGS__)
 #define expect_eq_u32_array(...)   __expect_eq(u32_array, ##__VA_ARGS__)
 #define expect_eq_clump8(...)  __expect_eq(clump8, ##__VA_ARGS__)
+#define expect_eq_clump(...)   __expect_eq(clump, ##__VA_ARGS__)
 
 static void __init test_zero_clear(void)
 {
@@ -577,6 +609,28 @@ static void noinline __init test_mem_optimisations(void)
}
 }
 
+static const unsigned long clump_bitmap_data[] __initconst = {
+   0x38000201,
+   0x05ff0f38,
+   0xeffedcba,
+   0xabcd,
+   0x00aa,
+   0x00aa,
+   0x00ff,
+   0xaa00,
+   0xff00,
+   0x00aa,
+   0x,
+   0x,
+   0x,
+   0x0f00,
+   0x00ff,
+   0xaa00,
+   0xff00,
+   0x00aa,
+   0x0ac0,
+};
+
 static const unsigned char clump_exp[] __initconst = {
0x01,   /* 1 bit set */
0x02,   /* non-edge 1 bit set */
@@ -588,6 +642,95 @@ static const unsigned char clump_exp[] __initconst = {
0x05,   /* non-adjacent 2 bits set */
 };
 
+static const unsigned long clump_exp1[] __initconst = {
+   0x01,   /* 1 bit set */
+   0x02,   /* non-edge 1 bit set */
+   0x00,   /* zero bits set */
+   0x38,   /* 3 bits set across 4-bit boundary */
+   0x38,   /* Repeated clump */
+   0x0F,   /* 4 bits set */
+   0xFF,   /* all bits set */
+   0x05,   /* non-adjacent 2 bits set */
+};
+
+static const unsigned long clump_exp2[] __initconst = {
+   0xfedcba,   /* 24 bits */
+   0xabcdef,
+   0xaa,   /* Clump split between 2 words */
+   0x00,   /* zeroes in between */
+   0xaa,
+   0x00,
+  

[PATCH v12 0/4] Introduce the for_each_set_clump macro

2020-10-18 Thread Syed Nayyar Waris
This patchset introduces a new generic version of for_each_set_clump.
The previous version, for_each_set_clump8, used a fixed-size 8-bit
clump, but the new generic version can work with clumps of any size
between 1 and BITS_PER_LONG bits inclusive. A size less than 1 or more
than BITS_PER_LONG causes undefined behaviour. The patchset utilizes the
new macro in some GPIO drivers.

The earlier 8-bit for_each_set_clump8 facilitated a
for-loop syntax that iterates over a memory region entire groups of set
bits at a time.

For example, suppose you would like to iterate over a 32-bit integer 8
bits at a time, skipping over 8-bit groups with no set bit, where
 represents the current 8-bit group:

Example:1010   00110011
First loop: 1010   
Second loop:1010   00110011
Third loop:    00110011

Each iteration of the loop returns the next 8-bit group that has at
least one set bit.

But with the new for_each_set_clump the clump size can be different from 8 bits.
Moreover, a clump can be split at a word boundary in situations where the word
size is not a multiple of the clump size. The following examples show how the
new macro works for clump sizes of 24 bits and 6 bits.

Example 1:
clump size: 24 bits, Number of clumps (or ports): 10
bitmap stores the bit information from where successive clumps are retrieved.

 /* bitmap memory region */
0x00aaff00;  /* Most significant bits */
0xaaff;
0x00aa00aa;
0xabcdeffedcba;  /* Least significant bits */

Different iterations of for_each_set_clump:-
'offset' is the bit position and 'clump' is the 24 bit clump from the
above bitmap.
Iteration first:    offset: 0 clump: 0xfedcba
Iteration second:   offset: 24 clump: 0xabcdef
Iteration third:    offset: 48 clump: 0xaa
Iteration fourth:   offset: 96 clump: 0xaa
Iteration fifth:    offset: 144 clump: 0xff
Iteration sixth:    offset: 168 clump: 0xaa
Iteration seventh:  offset: 216 clump: 0xff
The loop ends because the bits remaining at the end (0x00aa) are fewer than
the clump size of 24 bits.

In the above example it can be seen that in the third iteration, the 24-bit
clump that was retrieved was split between bitmap[0] and bitmap[1]. This
example also shows that clumps of 24 zero bits, if present in between, are
skipped (preserving the previous for_each_set_clump8 behaviour).

Example 2:
clump size = 6 bits, Number of clumps (or ports) = 3.

 /* bitmap memory region */
0x00aaff00;  /* Most significant bits */
0xaaff;
0x0f00;
0x0ac0;  /* Least significant bits */

Different iterations of for_each_set_clump:
'offset' is the bit position and 'clump' is the 6 bit clump from the
above bitmap.
Iteration first:    offset: 6 clump: 0x2b
The loop ends because 6 * 3 = 18 bits have been traversed in the bitmap.
Here 6 * 3 is clump size * number of clumps.
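
To make the iteration above easier to try out, here is a small user-space
sketch of the idea. It is a simplified illustration and not the code from
this patchset: the sk_ names are hypothetical, the search steps clump by
clump rather than using find_next_bit(), the starting offset is assumed to
be a multiple of the clump size, and only whole clumps that fit inside
'size' bits are visited.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define SK_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Read 'nbits' (1..SK_BITS_PER_LONG) starting at bit 'start' of 'map'. */
static unsigned long sk_get_value(const unsigned long *map,
				  unsigned long start, unsigned long nbits)
{
	const size_t index = start / SK_BITS_PER_LONG;
	const unsigned long offset = start % SK_BITS_PER_LONG;
	const unsigned long space = SK_BITS_PER_LONG - offset;
	unsigned long value;

	assert(nbits >= 1 && nbits <= SK_BITS_PER_LONG);

	value = map[index] >> offset;
	if (space < nbits)	/* the clump continues in the next word */
		value |= map[index + 1] << space;

	return value & (~0UL >> (SK_BITS_PER_LONG - nbits));
}

/*
 * Return the offset of the first clump at or after 'offset' that contains
 * a set bit and store its value in '*clump'. Returns 'size' if there is
 * no such clump.
 */
static unsigned long sk_find_next_clump(unsigned long *clump,
					const unsigned long *map,
					unsigned long size,
					unsigned long offset,
					unsigned long clump_size)
{
	for (; offset + clump_size <= size; offset += clump_size) {
		unsigned long value = sk_get_value(map, offset, clump_size);

		if (value) {
			*clump = value;
			return offset;
		}
	}
	return size;
}

/*
 * Typical loop, visiting only nonzero clumps of 'csz' bits:
 *
 *	for (off = sk_find_next_clump(&clump, map, size, 0, csz);
 *	     off < size;
 *	     off = sk_find_next_clump(&clump, map, size, off + csz, csz))
 *		...
 */
```

With a least significant word of 0x0ac0, a size of 18 bits and a clump size
of 6 bits, this visits a single clump at offset 6 with value 0x2b, matching
Example 2.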

Changes in v12:
 - [Patch 1/4]: Format and modify comments.
 - [Patch 1/4]: Optimize code using '<<' operator with GENMASK.
 - [Patch 4/4]: Remove extra empty newline.

Changes in v11:
 - [Patch 1/4]: Document range of values 'nbits' can take.
 - [Patch 4/4]: Change variable name 'flag' to 'flags'.

Changes in v10:
 - Patchset based on v5.9-rc1.

Changes in v9:
 - [Patch 4/4]: Remove looping of 'for_each_set_clump' and instead process two 
   halves of a 64-bit bitmap separately or individually. Use normal spin_lock 
   call for second inner lock. And take the spin_lock_init call outside the 'if'
   condition in the probe function of driver.

Changes in v8:
 - [Patch 2/4]: Minor change: Use '__initdata' for correct section mismatch
   in 'clump_test_data' array.

Changes in v7:
 - [Patch 2/4]: Minor changes: Use macro 'DECLARE_BITMAP()' and split 'struct'
   definition and test data.

Changes in v6:
 - [Patch 2/4]: Make 'for loop' inside test_for_each_set_clump more
   succinct.

Changes in v5:
 - [Patch 4/4]: Minor change: Hardcode value for better code readability.

Changes in v4:
 - [Patch 2/4]: Use 'for' loop in test function of for_each_set_clump.
 - [Patch 3/4]: Minor change: Inline value for better code readability.
 - [Patch 4/4]: Minor change: Inline value for better code readability.

Changes in v3:
 - [Patch 3/4]: Change datatype of some variables from u64 to unsigned long
   in function thunderx_gpio_set_multiple.

Changes in v2:
 - [Patch 2/4]: Unify different tests for 'for_each_set_clump'. Pass test data
   as function parameters.
 - [Patch 2/4]: Remove unnecessary bitmap_zero calls.

Syed Nayyar Waris (4):
  bitops: Introduce the for_each_set_clump macro
  lib/test_bitmap.c: Add for_each_set_clump test cases
  g

Re: [PATCH v11 1/4] bitops: Introduce the for_each_set_clump macro

2020-10-16 Thread Syed Nayyar Waris
On Fri, Oct 16, 2020 at 2:46 PM Andy Shevchenko
 wrote:
>
> On Fri, Oct 16, 2020 at 04:23:05AM +0530, Syed Nayyar Waris wrote:
> > On Tue, Oct 6, 2020 at 4:56 PM Andy Shevchenko
> >  wrote:
> > > On Tue, Oct 06, 2020 at 02:52:16PM +0530, Syed Nayyar Waris wrote:
>
> ...
>
> > > > + return (map[index] >> offset) & GENMASK(nbits - 1, 0);
> > >
> > > Have you considered to use rather BIT{_ULL}(nbits) - 1?
> > > It maybe better for code generation.
> >
> > Yes I have considered using BIT{_ULL} in earlier versions of patchset.
> > It has a problem:
> >
> > This macro when used in both bitmap_get_value and
> > bitmap_set_value functions, it will give unexpected results when nbits or 
> > clump
> > size is BITS_PER_LONG (32 or 64 depending on arch).
> >
> > Actually when nbits (clump size) is 64 (BITS_PER_LONG is 64, for example),
> > (BIT(nbits) - 1)
> > gives a value of zero and when this zero is ANDed with any value, it
> > makes it full zero. This is unexpected, and incorrect calculation occurs.
> >
> > What actually happens is in the macro expansion of BIT(64), that is 1
> > << 64, the '1' overflows from leftmost bit position (most significant
> > bit) and re-enters at the rightmost bit position (least significant
> > bit), therefore 1 << 64 becomes '0x1', and when another '1' is
> > subtracted from this, the final result becomes 0.
> >
> > This is undefined behavior in the C standard (section 6.5.7 in the N1124)
>
> I see, indeed, for 64/32 it is like this.
>
> ...
>
> > Yes I have incorporated your suggestion to use the '<<' operator. Thank You.
>
> One side note, consider the use round_up() vs. roundup(). I don't remember
> which one is optimized to divisor being power of 2.

Yes, I have changed 'roundup' to 'round_up'. 'round_up' is optimized for
power-of-2 divisors. Thank you.

Syed Nayyar Waris


Re: [PATCH v11 1/4] bitops: Introduce the for_each_set_clump macro

2020-10-15 Thread Syed Nayyar Waris
On Tue, Oct 6, 2020 at 4:56 PM Andy Shevchenko
 wrote:
>
> On Tue, Oct 06, 2020 at 02:52:16PM +0530, Syed Nayyar Waris wrote:
> > This macro iterates for each group of bits (clump) with set bits,
> > within a bitmap memory region. For each iteration, "start" is set to
> > the bit offset of the found clump, while the respective clump value is
> > stored to the location pointed by "clump". Additionally, the
> > bitmap_get_value() and bitmap_set_value() functions are introduced to
> > respectively get and set a value of n-bits in a bitmap memory region.
> > The n-bits can have any size less than or equal to BITS_PER_LONG.
> > Moreover, during setting value of n-bit in bitmap, if a situation arise
> > that the width of next n-bit is exceeding the word boundary, then it
> > will divide itself such that some portion of it is stored in that word,
> > while the remaining portion is stored in the next higher word. Similar
> > situation occurs while retrieving the value from bitmap.
>
> ...
>
> > @@ -75,7 +75,11 @@
> >   *  bitmap_from_arr32(dst, buf, nbits)  Copy nbits from u32[] buf 
> > to dst
> >   *  bitmap_to_arr32(buf, src, nbits)Copy nbits from buf to 
> > u32[] dst
> >   *  bitmap_get_value8(map, start)   Get 8bit value from map at 
> > start
> > + *  bitmap_get_value(map, start, nbits)  Get bit value of size
> > + *   'nbits' from map at start
> >   *  bitmap_set_value8(map, value, start)Set 8bit value to map at 
> > start
> > + *  bitmap_set_value(map, value, start, nbits)   Set bit value of size 
> > 'nbits'
> > + *   of map at start
>
> Formatting here is done with solely spaces, no TABs.

Okay. Done

>
> ...
>
> > +/**
> > + * bitmap_get_value - get a value of n-bits from the memory region
> > + * @map: address to the bitmap memory region
> > + * @start: bit offset of the n-bit value
> > + * @nbits: size of value in bits (must be between 1 and BITS_PER_LONG 
> > inclusive).
>
>
> > + *   nbits less than 1 or more than BITS_PER_LONG causes undefined 
> > behaviour.
>
> Please, detach this from field description and move to a main description.

Okay. Done.
>
> > + *
> > + * Returns value of nbits located at the @start bit offset within the @map
> > + * memory region.
> > + */
>
> ...
>
> > + return (map[index] >> offset) & GENMASK(nbits - 1, 0);
>
> Have you considered to use rather BIT{_ULL}(nbits) - 1?
> It maybe better for code generation.

Yes, I considered using BIT{_ULL} in earlier versions of the patchset.
It has a problem:

When this macro is used in the bitmap_get_value and bitmap_set_value
functions, it gives unexpected results when nbits (the clump size) is
BITS_PER_LONG (32 or 64 depending on the arch).

For example, with BITS_PER_LONG equal to 64 and nbits (clump size) equal
to 64, (BIT(nbits) - 1) gives a value of zero, and ANDing any value with
that zero mask yields zero. The calculation is then silently wrong.

What happens in practice is that in the macro expansion of BIT(64), that
is 1 << 64, the '1' overflows from the leftmost bit position (most
significant bit) and re-enters at the rightmost bit position (least
significant bit), so 1 << 64 becomes '0x1', and when another '1' is
subtracted from this, the final result becomes 0.

Shifting by the full width of the type is undefined behavior in the C
standard (section 6.5.7 in N1124), so this result cannot be relied on.
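
To illustrate the point (a minimal userspace sketch, not code from the
patch; low_mask() is a hypothetical helper name and BITS_PER_LONG is
redefined locally), a mask that stays well defined for every nbits from
1 to BITS_PER_LONG can be built by shifting right instead of left, which
is the value GENMASK(nbits - 1, 0) is meant to produce:

```c
#include <assert.h>
#include <limits.h>

/* Local stand-in for the kernel's BITS_PER_LONG (assumption for this sketch). */
#define BITS_PER_LONG ((unsigned int)(sizeof(unsigned long) * CHAR_BIT))

/*
 * Hypothetical helper: return a mask with the low 'nbits' bits set.
 * Valid only for 1 <= nbits <= BITS_PER_LONG.
 */
static unsigned long low_mask(unsigned int nbits)
{
	assert(nbits >= 1 && nbits <= BITS_PER_LONG);

	/*
	 * The shift count is BITS_PER_LONG - nbits, which lies in
	 * 0..BITS_PER_LONG-1, so the shift is always defined. By contrast,
	 * (1UL << nbits) - 1 shifts by the full type width when
	 * nbits == BITS_PER_LONG, which is undefined.
	 */
	return ~0UL >> (BITS_PER_LONG - nbits);
}
```

With nbits equal to BITS_PER_LONG the shift count is 0 and the mask is
all ones, which is the case where BIT(nbits) - 1 breaks down.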

>
> ...
>
> > +/**
> > + * bitmap_set_value - set n-bit value within a memory region
> > + * @map: address to the bitmap memory region
> > + * @value: value of nbits
> > + * @start: bit offset of the n-bit value
> > + * @nbits: size of value in bits (must be between 1 and BITS_PER_LONG 
> > inclusive).
>
> > + *   nbits less than 1 or more than BITS_PER_LONG causes undefined 
> > behaviour.
>
> Please, detach this from field description and move to a main description.

Okay. Done

>
> > + */
>
> ...
>
> > + value &= GENMASK(nbits - 1, 0);
>
> Ditto.
>
> > + map[index] &= ~(GENMASK(nbits + offset - 1, offset));
>
> Last time I checked such GENMASK) use, it gave a lot of code when
> GENMASK(nbits - 1, 0) << offset works much better, but see also above.

Yes I have incorporated your suggestion to use the '<<' operator. Thank You.


>
> --
> With Best Regards,
> Andy Shevchenko
>
>


[PATCH v11 4/4] gpio: xilinx: Utilize generic bitmap_get_value and _set_value

2020-10-06 Thread Syed Nayyar Waris
This patch reimplements the xgpio_set_multiple function in
drivers/gpio/gpio-xilinx.c to use the new generic functions
bitmap_get_value and bitmap_set_value. The code is now simpler
to read and understand. Moreover, instead of looping over each bit
in the xgpio_set_multiple function, we can now handle one channel at
a time and save cycles.

Cc: Bartosz Golaszewski 
Cc: Michal Simek 
Signed-off-by: Syed Nayyar Waris 
Signed-off-by: William Breathitt Gray 
---
Changes in v11:
 - Change variable name 'flag' to 'flags'.

Changes in v10:
 - No change.

Changes in v9:
 - Remove looping of 'for_each_set_clump' and instead process two
   halves of a 64-bit bitmap separately or individually. Use normal spin_lock 
   call for second inner lock. And take the spin_lock_init call outside the 'if'
   condition in the 'probe' function of driver.

Changes in v8:
 - No change.

Changes in v7:
 - No change.

Changes in v6:
 - No change.

Changes in v5:
 - Minor change: Inline values '32' and '64' in code for better
   code readability.

Changes in v4:
 - Minor change: Inline values '32' and '64' in code for better
   code readability.

Changes in v3:
 - No change.

Changes in v2:
 - No change

 drivers/gpio/gpio-xilinx.c | 64 +++---
 1 file changed, 32 insertions(+), 32 deletions(-)

diff --git a/drivers/gpio/gpio-xilinx.c b/drivers/gpio/gpio-xilinx.c
index 67f9f82e0db0..f86bee271246 100644
--- a/drivers/gpio/gpio-xilinx.c
+++ b/drivers/gpio/gpio-xilinx.c
@@ -138,37 +138,37 @@ static void xgpio_set_multiple(struct gpio_chip *gc, 
unsigned long *mask,
 {
unsigned long flags;
struct xgpio_instance *chip = gpiochip_get_data(gc);
-   int index = xgpio_index(chip, 0);
-   int offset, i;
-
-   spin_lock_irqsave(&chip->gpio_lock[index], flags);
-
-   /* Write to GPIO signals */
-   for (i = 0; i < gc->ngpio; i++) {
-   if (*mask == 0)
-   break;
-   /* Once finished with an index write it out to the register */
-   if (index !=  xgpio_index(chip, i)) {
-   xgpio_writereg(chip->regs + XGPIO_DATA_OFFSET +
-  index * XGPIO_CHANNEL_OFFSET,
-  chip->gpio_state[index]);
-   spin_unlock_irqrestore(&chip->gpio_lock[index], flags);
-   index =  xgpio_index(chip, i);
-   spin_lock_irqsave(&chip->gpio_lock[index], flags);
-   }
-   if (__test_and_clear_bit(i, mask)) {
-   offset =  xgpio_offset(chip, i);
-   if (test_bit(i, bits))
-   chip->gpio_state[index] |= BIT(offset);
-   else
-   chip->gpio_state[index] &= ~BIT(offset);
-   }
-   }
-
-   xgpio_writereg(chip->regs + XGPIO_DATA_OFFSET +
-  index * XGPIO_CHANNEL_OFFSET, chip->gpio_state[index]);
-
-   spin_unlock_irqrestore(&chip->gpio_lock[index], flags);
+   u32 *const state = chip->gpio_state;
+   unsigned int *const width = chip->gpio_width;
+
+   DECLARE_BITMAP(old, 64);
+   DECLARE_BITMAP(new, 64);
+   DECLARE_BITMAP(changed, 64);
+
+   spin_lock_irqsave(&chip->gpio_lock[0], flags);
+   spin_lock(&chip->gpio_lock[1]);
+
+   bitmap_set_value(old, state[0], 0, width[0]);
+   bitmap_set_value(old, state[1], width[0], width[1]);
+   bitmap_replace(new, old, bits, mask, gc->ngpio);
+
+   bitmap_set_value(old, state[0], 0, 32);
+   bitmap_set_value(old, state[1], 32, 32);
+   state[0] = bitmap_get_value(new, 0, width[0]);
+   state[1] = bitmap_get_value(new, width[0], width[1]);
+   bitmap_set_value(new, state[0], 0, 32);
+   bitmap_set_value(new, state[1], 32, 32);
+   bitmap_xor(changed, old, new, 64);
+
+   if (((u32 *)changed)[0])
+   xgpio_writereg(chip->regs + XGPIO_DATA_OFFSET,
+   state[0]);
+   if (((u32 *)changed)[1])
+   xgpio_writereg(chip->regs + XGPIO_DATA_OFFSET +
+   XGPIO_CHANNEL_OFFSET, state[1]);
+
+   spin_unlock(&chip->gpio_lock[1]);
+   spin_unlock_irqrestore(&chip->gpio_lock[0], flags);
 }
 
 /**
@@ -292,6 +292,7 @@ static int xgpio_probe(struct platform_device *pdev)
chip->gpio_width[0] = 32;
 
spin_lock_init(&chip->gpio_lock[0]);
+   spin_lock_init(&chip->gpio_lock[1]);
 
if (of_property_read_u32(np, "xlnx,is-dual", &is_dual))
is_dual = 0;
@@ -314,7 +315,6 @@ static int xgpio_probe(struct platform_device *pdev)
 &chip->gpio_width[1]))
chip->gpio_width[1] = 32;
 
-   spin_lock_init(&chip->gpio_lock[1]);
}
 
chip->gc.base = -1;
-- 
2.26.2



[PATCH v11 3/4] gpio: thunderx: Utilize for_each_set_clump macro

2020-10-06 Thread Syed Nayyar Waris
This patch reimplements the thunderx_gpio_set_multiple function in
drivers/gpio/gpio-thunderx.c to use the new for_each_set_clump macro.
Instead of looping over every bank in the thunderx_gpio_set_multiple
function, we can now skip banks that have no bits set and save cycles.

Cc: Robert Richter 
Cc: Bartosz Golaszewski 
Signed-off-by: Syed Nayyar Waris 
Signed-off-by: William Breathitt Gray 
---
Changes in v11:
 - No change.

Changes in v10:
 - No change.

Changes in v9:
 - No change.

Changes in v8:
 - No change.

Changes in v7:
 - No change.

Changes in v6:
 - No change.

Changes in v5:
 - No change.

Changes in v4:
 - Minor change: Inline value '64' in code for better code readability.

Changes in v3:
 - Change datatype of some variables from u64 to unsigned long
   in function thunderx_gpio_set_multiple.

Changes in v2:
 - No change.

 drivers/gpio/gpio-thunderx.c | 11 +++
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/drivers/gpio/gpio-thunderx.c b/drivers/gpio/gpio-thunderx.c
index 9f66deab46ea..58c9bb25a377 100644
--- a/drivers/gpio/gpio-thunderx.c
+++ b/drivers/gpio/gpio-thunderx.c
@@ -275,12 +275,15 @@ static void thunderx_gpio_set_multiple(struct gpio_chip 
*chip,
   unsigned long *bits)
 {
int bank;
-   u64 set_bits, clear_bits;
+   unsigned long set_bits, clear_bits, gpio_mask;
+   unsigned long offset;
+
struct thunderx_gpio *txgpio = gpiochip_get_data(chip);
 
-   for (bank = 0; bank <= chip->ngpio / 64; bank++) {
-   set_bits = bits[bank] & mask[bank];
-   clear_bits = ~bits[bank] & mask[bank];
+   for_each_set_clump(offset, gpio_mask, mask, chip->ngpio, 64) {
+   bank = offset / 64;
+   set_bits = bits[bank] & gpio_mask;
+   clear_bits = ~bits[bank] & gpio_mask;
writeq(set_bits, txgpio->register_base + (bank * GPIO_2ND_BANK) 
+ GPIO_TX_SET);
writeq(clear_bits, txgpio->register_base + (bank * 
GPIO_2ND_BANK) + GPIO_TX_CLR);
}
-- 
2.26.2



[PATCH v11 2/4] lib/test_bitmap.c: Add for_each_set_clump test cases

2020-10-06 Thread Syed Nayyar Waris
The introduction of the generic for_each_set_clump macro needs test
cases to verify the implementation. This patch adds test cases for
scenarios in which clump sizes are 8 bits, 24 bits, 30 bits and 6 bits.
The cases cover situations where a clump is split at the word
boundary and where zeroes are present at the start and in the middle
of the bitmap.

Signed-off-by: Syed Nayyar Waris 
Reviewed-by: Andy Shevchenko 
Signed-off-by: William Breathitt Gray 
---
Changes in v11:
 - No change.

Changes in v10:
 - No change.

Changes in v9:
 - No change.

Changes in v8:
 - [Patch 2/4]: Minor change: Use '__initdata' for correct section mismatch
   in 'clump_test_data' array.

Changes in v7:
 - Minor changes: Use macro 'DECLARE_BITMAP()' and split 'struct'
   definition and test data.

Changes in v6:
 - Make 'for loop' inside 'test_for_each_set_clump' more succinct.

Changes in v5:
 - No change.

Changes in v4:
 - Use 'for' loop in test function of 'for_each_set_clump'.

Changes in v3:
 - No Change.

Changes in v2:
 - Unify different tests for 'for_each_set_clump'. Pass test data as
   function parameters.
 - Remove unnecessary bitmap_zero calls.

 lib/test_bitmap.c | 144 ++
 1 file changed, 144 insertions(+)

diff --git a/lib/test_bitmap.c b/lib/test_bitmap.c
index df903c53952b..cb2cf3858f93 100644
--- a/lib/test_bitmap.c
+++ b/lib/test_bitmap.c
@@ -155,6 +155,37 @@ static bool __init __check_eq_clump8(const char *srcfile, 
unsigned int line,
return true;
 }
 
+static bool __init __check_eq_clump(const char *srcfile, unsigned int line,
+   const unsigned int offset,
+   const unsigned int size,
+   const unsigned long *const clump_exp,
+   const unsigned long *const clump,
+   const unsigned long clump_size)
+{
+   unsigned long exp;
+
+   if (offset >= size) {
+   pr_warn("[%s:%u] bit offset for clump out-of-bounds: expected less than %u, got %u\n",
+   srcfile, line, size, offset);
+   return false;
+   }
+
+   exp = clump_exp[offset / clump_size];
+   if (!exp) {
+   pr_warn("[%s:%u] bit offset for zero clump: expected nonzero clump, got bit offset %u with clump value 0",
+   srcfile, line, offset);
+   return false;
+   }
+
+   if (*clump != exp) {
+   pr_warn("[%s:%u] expected clump value of 0x%lX, got clump value of 0x%lX",
+   srcfile, line, exp, *clump);
+   return false;
+   }
+
+   return true;
+}
+
 #define __expect_eq(suffix, ...)   \
({  \
int result = 0; \
@@ -172,6 +203,7 @@ static bool __init __check_eq_clump8(const char *srcfile, 
unsigned int line,
 #define expect_eq_pbl(...) __expect_eq(pbl, ##__VA_ARGS__)
 #define expect_eq_u32_array(...)   __expect_eq(u32_array, ##__VA_ARGS__)
 #define expect_eq_clump8(...)  __expect_eq(clump8, ##__VA_ARGS__)
+#define expect_eq_clump(...)   __expect_eq(clump, ##__VA_ARGS__)
 
 static void __init test_zero_clear(void)
 {
@@ -577,6 +609,28 @@ static void noinline __init test_mem_optimisations(void)
}
 }
 
+static const unsigned long clump_bitmap_data[] __initconst = {
+   0x38000201,
+   0x05ff0f38,
+   0xeffedcba,
+   0xabcd,
+   0x00aa,
+   0x00aa,
+   0x00ff,
+   0xaa00,
+   0xff00,
+   0x00aa,
+   0x,
+   0x,
+   0x,
+   0x0f00,
+   0x00ff,
+   0xaa00,
+   0xff00,
+   0x00aa,
+   0x0ac0,
+};
+
 static const unsigned char clump_exp[] __initconst = {
0x01,   /* 1 bit set */
0x02,   /* non-edge 1 bit set */
@@ -588,6 +642,95 @@ static const unsigned char clump_exp[] __initconst = {
0x05,   /* non-adjacent 2 bits set */
 };
 
+static const unsigned long clump_exp1[] __initconst = {
+   0x01,   /* 1 bit set */
+   0x02,   /* non-edge 1 bit set */
+   0x00,   /* zero bits set */
+   0x38,   /* 3 bits set across 4-bit boundary */
+   0x38,   /* Repeated clump */
+   0x0F,   /* 4 bits set */
+   0xFF,   /* all bits set */
+   0x05,   /* non-adjacent 2 bits set */
+};
+
+static const unsigned long clump_exp2[] __initconst = {
+   0xfedcba,   /* 24 bits */
+   0xabcdef,
+   0xaa,   /* Clump split between 2 words */
+   0x00,   /* zeroes in between */
+   0xaa,
+   0x00,
+   0xff,
+   0xaa,
+   0x0

[PATCH v11 1/4] bitops: Introduce the for_each_set_clump macro

2020-10-06 Thread Syed Nayyar Waris
This macro iterates for each group of bits (clump) with set bits,
within a bitmap memory region. For each iteration, "start" is set to
the bit offset of the found clump, while the respective clump value is
stored to the location pointed by "clump". Additionally, the
bitmap_get_value() and bitmap_set_value() functions are introduced to
respectively get and set a value of n-bits in a bitmap memory region.
The n-bits can have any size less than or equal to BITS_PER_LONG.
Moreover, when setting an n-bit value in a bitmap, if the value would
cross a word boundary, it is split so that the lower portion is stored
in that word and the remaining portion is stored in the next higher
word. The same split is handled when retrieving a value from a bitmap.

Cc: Arnd Bergmann 
Signed-off-by: Syed Nayyar Waris 
Reviewed-by: Andy Shevchenko 
Signed-off-by: William Breathitt Gray 
---
Changes in v11:
 - Document valid range of values that 'nbits' can take.

Changes in v10:
 - No change.

Changes in v9:
 - No change.

Changes in v8:
 - No change.

Changes in v7:
 - No change.

Changes in v6:
 - No change.

Changes in v5:
 - No change.

Changes in v4:
 - No change.

Changes in v3:
 - No change.

Changes in v2:
 - No change.

 include/asm-generic/bitops/find.h | 19 ++
 include/linux/bitmap.h| 63 +++
 include/linux/bitops.h| 13 +++
 lib/find_bit.c| 14 +++
 4 files changed, 109 insertions(+)

diff --git a/include/asm-generic/bitops/find.h 
b/include/asm-generic/bitops/find.h
index 9fdf21302fdf..4e6600759455 100644
--- a/include/asm-generic/bitops/find.h
+++ b/include/asm-generic/bitops/find.h
@@ -97,4 +97,23 @@ extern unsigned long find_next_clump8(unsigned long *clump,
 #define find_first_clump8(clump, bits, size) \
find_next_clump8((clump), (bits), (size), 0)
 
+/**
+ * find_next_clump - find next clump with set bits in a memory region
+ * @clump: location to store copy of found clump
+ * @addr: address to base the search on
+ * @size: bitmap size in number of bits
+ * @offset: bit offset at which to start searching
+ * @clump_size: clump size in bits
+ *
+ * Returns the bit offset for the next set clump; the found clump value is
+ * copied to the location pointed by @clump. If no bits are set, returns @size.
+ */
+extern unsigned long find_next_clump(unsigned long *clump,
+ const unsigned long *addr,
+ unsigned long size, unsigned long offset,
+ unsigned long clump_size);
+
+#define find_first_clump(clump, bits, size, clump_size) \
+   find_next_clump((clump), (bits), (size), 0, (clump_size))
+
 #endif /*_ASM_GENERIC_BITOPS_FIND_H_ */
diff --git a/include/linux/bitmap.h b/include/linux/bitmap.h
index 99058eb81042..6e0cc6877b68 100644
--- a/include/linux/bitmap.h
+++ b/include/linux/bitmap.h
@@ -75,7 +75,11 @@
  *  bitmap_from_arr32(dst, buf, nbits)  Copy nbits from u32[] buf to 
dst
  *  bitmap_to_arr32(buf, src, nbits)Copy nbits from buf to u32[] 
dst
  *  bitmap_get_value8(map, start)   Get 8bit value from map at 
start
+ *  bitmap_get_value(map, start, nbits)Get bit value of size
+ * 'nbits' from map at start
  *  bitmap_set_value8(map, value, start)Set 8bit value to map at start
+ *  bitmap_set_value(map, value, start, nbits) Set bit value of size 'nbits'
+ * of map at start
  *
  * Note, bitmap_zero() and bitmap_fill() operate over the region of
  * unsigned longs, that is, bits behind bitmap till the unsigned long
@@ -563,6 +567,35 @@ static inline unsigned long bitmap_get_value8(const 
unsigned long *map,
return (map[index] >> offset) & 0xFF;
 }
 
+/**
+ * bitmap_get_value - get a value of n-bits from the memory region
+ * @map: address to the bitmap memory region
+ * @start: bit offset of the n-bit value
+ * @nbits: size of value in bits (must be between 1 and BITS_PER_LONG 
inclusive).
+ * nbits less than 1 or more than BITS_PER_LONG causes undefined behaviour.
+ *
+ * Returns value of nbits located at the @start bit offset within the @map
+ * memory region.
+ */
+static inline unsigned long bitmap_get_value(const unsigned long *map,
+ unsigned long start,
+ unsigned long nbits)
+{
+   const size_t index = BIT_WORD(start);
+   const unsigned long offset = start % BITS_PER_LONG;
+   const unsigned long ceiling = roundup(start + 1, BITS_PER_LONG);
+   const unsigned long space = ceiling - start;
+   unsigned long value_low, value_high;
+
+   if (space >= nbits)
+   return (map[index] >> offset) & GE

[PATCH v11 0/4] Introduce the for_each_set_clump macro

2020-10-06 Thread Syed Nayyar Waris
Hello Linus,

Since this patchset primarily affects GPIO drivers, would you like
to pick it up through your GPIO tree?

This patchset introduces a new generic version of for_each_set_clump.
The previous version, for_each_set_clump8, used a fixed 8-bit clump
size, but the new generic version works with clumps of any size up to
and including BITS_PER_LONG. The patchset utilizes the new macro in
several GPIO drivers.

The earlier 8-bit for_each_set_clump8 facilitated a
for-loop syntax that iterates over a memory region entire groups of set
bits at a time.

For example, suppose you would like to iterate over a 32-bit integer 8
bits at a time, skipping over 8-bit groups with no set bit, where
XXXXXXXX represents the current 8-bit group:

    Example:        10111110 00000000 11111111 00110011
    First loop:     10111110 00000000 11111111 XXXXXXXX
    Second loop:    10111110 00000000 XXXXXXXX 00110011
    Third loop:     XXXXXXXX 00000000 11111111 00110011

Each iteration of the loop returns the next 8-bit group that has at
least one set bit.
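
The 8-bit behaviour described above can be sketched in plain userspace C
as follows. This is only an illustration: collect_set_clumps8() is a
hypothetical helper, not the kernel macro, and it works on a single
32-bit word rather than on a bitmap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Illustrative sketch (hypothetical helper, not the kernel macro):
 * walk a 32-bit word 8 bits at a time, starting from the least
 * significant group, and record only the groups that have at least
 * one set bit. Returns the number of groups recorded (0..4).
 */
static unsigned int collect_set_clumps8(uint32_t word,
					unsigned int offsets[4],
					uint8_t clumps[4])
{
	unsigned int n = 0;

	assert(offsets != NULL && clumps != NULL);

	for (unsigned int offset = 0; offset < 32; offset += 8) {
		uint8_t clump = (word >> offset) & 0xFF;

		if (!clump)
			continue;	/* skip groups with no set bit */

		offsets[n] = offset;
		clumps[n] = clump;
		n++;
	}

	return n;
}
```

For the word 0xBE00FF33 this records three groups, at bit offsets 0, 8
and 24; the all-zero group at offset 16 is skipped.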

With the new for_each_set_clump, the clump size can differ from 8 bits.
Moreover, a clump can be split at a word boundary when the word size is
not a multiple of the clump size. The following examples show how the
new macro works for clump sizes of 24 bits and 6 bits.

Example 1:
clump size: 24 bits, Number of clumps (or ports): 10
bitmap stores the bit information from where successive clumps are retrieved.

 /* bitmap memory region */
0x00aaff00;  /* Most significant bits */
0xaaff;
0x00aa00aa;
0xabcdeffedcba;  /* Least significant bits */

Different iterations of for_each_set_clump:-
'offset' is the bit position and 'clump' is the 24 bit clump from the
above bitmap.
Iteration first:offset: 0 clump: 0xfedcba
Iteration second:   offset: 24 clump: 0xabcdef
Iteration third:offset: 48 clump: 0xaa
Iteration fourth:   offset: 96 clump: 0xaa
Iteration fifth:offset: 144 clump: 0xff
Iteration sixth:offset: 168 clump: 0xaa
Iteration seventh:  offset: 216 clump: 0xff
The loop ends because the remaining bits (0x00aa) are fewer than the
clump size of 24 bits.

In the above example, the 24-bit clump retrieved in the third iteration
was split between bitmap[0] and bitmap[1]. The example also shows that
24-bit clumps of all zeroes in between were skipped (preserving the
previous for_each_set_clump8 behaviour).
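
How a clump that straddles two words can be read back is sketched below.
This is a simplified userspace illustration under stated assumptions,
not the code from patch 1/4: it uses fixed 64-bit words, and the names
get_value64() and mask64() are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define WORD_BITS 64u	/* fixed word size assumed for this sketch */

/* Mask with the low 'nbits' bits set; valid for 1 <= nbits <= 64. */
static uint64_t mask64(unsigned int nbits)
{
	assert(nbits >= 1 && nbits <= WORD_BITS);
	return ~UINT64_C(0) >> (WORD_BITS - nbits);
}

/*
 * Read an nbits-wide value starting at bit 'start' of 'map'.
 * If the value crosses a word boundary, the low part comes from
 * map[index] and the high part from map[index + 1].
 */
static uint64_t get_value64(const uint64_t *map, unsigned int start,
			    unsigned int nbits)
{
	const unsigned int index = start / WORD_BITS;
	const unsigned int offset = start % WORD_BITS;
	const unsigned int space = WORD_BITS - offset;	/* bits left in word */
	uint64_t low, high;

	if (space >= nbits)
		return (map[index] >> offset) & mask64(nbits);

	/* Here offset >= 1, so space is in 1..63 and both shifts are defined. */
	low = map[index] >> offset;			/* 'space' low bits */
	high = map[index + 1] & mask64(nbits - space);	/* remaining bits */
	return low | (high << space);
}
```

A 24-bit clump at offset 48 takes its low 16 bits from the top of the
first word and its high 8 bits from the bottom of the second word, which
is the same kind of split as in the third iteration above.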

Example 2:
clump size = 6 bits, Number of clumps (or ports) = 3.

 /* bitmap memory region */
0x00aaff00;  /* Most significant bits */
0xaaff;
0x0f00;
0x0ac0;  /* Least significant bits */

Different iterations of for_each_set_clump:
'offset' is the bit position and 'clump' is the 6 bit clump from the
above bitmap.
Iteration first:offset: 6 clump: 0x2b
Loop breaks because 6 * 3 = 18 bits traversed in bitmap.
Here 6 * 3 is clump size * no. of clumps.

Changes in v11:
 - [Patch 1/4]: Document range of values 'nbits' can take.
 - [Patch 4/4]: Change variable name 'flag' to 'flags'.

Changes in v10:
 - Patchset based on v5.9-rc1.

Changes in v9:
 - [Patch 4/4]: Remove looping of 'for_each_set_clump' and instead process two 
   halves of a 64-bit bitmap separately or individually. Use normal spin_lock 
   call for second inner lock. And take the spin_lock_init call outside the 'if'
   condition in the probe function of driver.

Changes in v8:
 - [Patch 2/4]: Minor change: Use '__initdata' for correct section mismatch
   in 'clump_test_data' array.

Changes in v7:
 - [Patch 2/4]: Minor changes: Use macro 'DECLARE_BITMAP()' and split 'struct'
   definition and test data.

Changes in v6:
 - [Patch 2/4]: Make 'for loop' inside test_for_each_set_clump more
   succinct.

Changes in v5:
 - [Patch 4/4]: Minor change: Hardcode value for better code readability.

Changes in v4:
 - [Patch 2/4]: Use 'for' loop in test function of for_each_set_clump.
 - [Patch 3/4]: Minor change: Inline value for better code readability.
 - [Patch 4/4]: Minor change: Inline value for better code readability.

Changes in v3:
 - [Patch 3/4]: Change datatype of some variables from u64 to unsigned long
   in function thunderx_gpio_set_multiple.

Changes in v2:
 - [Patch 2/4]: Unify different tests for 'for_each_set_clump'. Pass test data
   as function parameters.
 - [Patch 2/4]: Remove unnecessary bitmap_zero calls.

Syed Nayyar Waris (4):
  bitops: Introduce the for_each_set_clump macro
  lib/test_bitmap.c: Add for_each_set_clump test cases
  gpio: thunderx: Utilize for_each_set_clump macro
  gpio: xilinx: Utilize generic bitmap_get_value and _set_value

 drivers/gpio/gpio-thunder

Re: [PATCH v10 1/4] bitops: Introduce the for_each_set_clump macro

2020-10-03 Thread Syed Nayyar Waris
On Sat, Oct 3, 2020 at 6:32 PM Andy Shevchenko
 wrote:
>
> On Sat, Oct 3, 2020 at 3:56 PM William Breathitt Gray
>  wrote:
> > On Sat, Oct 03, 2020 at 03:45:04PM +0300, Andy Shevchenko wrote:
> > > On Sat, Oct 3, 2020 at 2:37 PM Syed Nayyar Waris  
> > > wrote:
> > > > On Sat, Oct 3, 2020 at 2:14 PM Andy Shevchenko
> > > >  wrote:
> > > > > On Sat, Oct 3, 2020 at 2:51 AM Syed Nayyar Waris 
> > > > >  wrote:
>
> ...
>
> > > > > > +   map[index] &= ~BITMAP_FIRST_WORD_MASK(start);
> > > > > > +   map[index] |= value << offset;
> > >
> > > Side note: I would prefer + 0 here and there, but it's up to you.

Andy, what do you mean by the above statement? Can you please clarify?

Thanks

> > >
> > > > > > +   map[index + 1] &= ~BITMAP_LAST_WORD_MASK(start + 
> > > > > > nbits);
> > > > > > +   map[index + 1] |= (value >> space);
> > >
> > > By the way, what about this in the case of start=0, nbits > 64?
> > > space == 64 -> UB.
> > >
> > > (And btw parentheses are redundant here)
> >
> > I think this is the same situation as before: we should document that
> > nbits must be between 1 and BITS_PER_LONG.
>
> At least documented, yes.
>
> --
> With Best Regards,
> Andy Shevchenko


Re: [PATCH v10 1/4] bitops: Introduce the for_each_set_clump macro

2020-10-03 Thread Syed Nayyar Waris
On Sat, Oct 3, 2020 at 2:14 PM Andy Shevchenko
 wrote:
>
> On Sat, Oct 3, 2020 at 2:51 AM Syed Nayyar Waris  wrote:
>
> Now I remember...
> This needs to be revisited.
>
> > This macro iterates for each group of bits (clump) with set bits,
> > within a bitmap memory region. For each iteration, "start" is set to
> > the bit offset of the found clump, while the respective clump value is
> > stored to the location pointed by "clump". Additionally, the
> > bitmap_get_value and bitmap_set_value functions are introduced to
>
> Mark functions like func() in the text as well.
Okay

>
> > respectively get and set a value of n-bits in a bitmap memory region.
> > The n-bits can have any size less than or equal to BITS_PER_LONG.
> > Moreover, during setting value of n-bit in bitmap, if a situation arise
> > that the width of next n-bit is exceeding the word boundary, then it
> > will divide itself such that some portion of it is stored in that word,
> > while the remaining portion is stored in the next higher word. Similar
> > situation occurs while retrieving value of n-bits from bitmap.
>
> retrieving the value
> from a bitmap
Okay

>
> ...
>
> > +/**
> > + * bitmap_get_value - get a value of n-bits from the memory region
> > + * @map: address to the bitmap memory region
> > + * @start: bit offset of the n-bit value
> > + * @nbits: size of value in bits
> > + *
> > + * Returns value of nbits located at the @start bit offset within the @map
> > + * memory region.
> > + */
> > +static inline unsigned long bitmap_get_value(const unsigned long *map,
> > + unsigned long start,
> > + unsigned long nbits)
> > +{
> > +   const size_t index = BIT_WORD(start);
> > +   const unsigned long offset = start % BITS_PER_LONG;
> > +   const unsigned long ceiling = roundup(start + 1, BITS_PER_LONG);
> > +   const unsigned long space = ceiling - start;
> > +   unsigned long value_low, value_high;
> > +
> > +   if (space >= nbits)
> > +   return (map[index] >> offset) & GENMASK(nbits - 1, 0);
>
> This is UB in GENMASK() when nbits == 0.

'nbits' specifies the width of the clump value, i.e. how many bits wide
the clump is. An 'nbits' of 0 would mean a zero-width clump, which is
meaningless. Valid values for 'nbits' run from 1 to BITS_PER_LONG; the
minimum is 1 because the smallest possible clump is 1 bit wide.

Let me know if I have misunderstood something.

>
> > +   else {
> > +   value_low = map[index] & BITMAP_FIRST_WORD_MASK(start);
> > +   value_high = map[index + 1] & BITMAP_LAST_WORD_MASK(start + 
> > nbits);
> > +   return (value_low >> offset) | (value_high << space);
> > +   }
> > +}
>
> ...
>
> > +/**
> > + * bitmap_set_value - set n-bit value within a memory region
> > + * @map: address to the bitmap memory region
> > + * @value: value of nbits
> > + * @start: bit offset of the n-bit value
> > + * @nbits: size of value in bits
> > + */
> > +static inline void bitmap_set_value(unsigned long *map,
> > +   unsigned long value,
> > +   unsigned long start, unsigned long 
> > nbits)
> > +{
> > +   const size_t index = BIT_WORD(start);
> > +   const unsigned long offset = start % BITS_PER_LONG;
> > +   const unsigned long ceiling = roundup(start + 1, BITS_PER_LONG);
> > +   const unsigned long space = ceiling - start;
>
> > +   value &= GENMASK(nbits - 1, 0);
>
> This is UB when nbits == 0.

Same as above: 'nbits' specifies the width of the clump value, i.e. how
many bits wide the clump is. An 'nbits' of 0 would mean a zero-width
clump, which is meaningless. Valid values for 'nbits' run from 1 to
BITS_PER_LONG; the minimum is 1 because the smallest possible clump is
1 bit wide.
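
To make the valid range concrete, here is a minimal userspace sketch of
the set side with the 1..64 precondition checked up front. It is an
illustration under stated assumptions, not the patch code: it uses fixed
64-bit words, and set_value64() is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

#define WORD_BITS 64u	/* fixed word size assumed for this sketch */

/*
 * Store the low 'nbits' bits of 'value' at bit 'start' of 'map'.
 * 'nbits' must be in 1..64; a value crossing a word boundary is split
 * between map[index] and map[index + 1].
 */
static void set_value64(uint64_t *map, uint64_t value,
			unsigned int start, unsigned int nbits)
{
	const unsigned int index = start / WORD_BITS;
	const unsigned int offset = start % WORD_BITS;
	const unsigned int space = WORD_BITS - offset;	/* bits left in word */
	uint64_t mask;

	/* Reject widths for which the shifts below would be undefined. */
	assert(nbits >= 1 && nbits <= WORD_BITS);

	mask = ~UINT64_C(0) >> (WORD_BITS - nbits);
	value &= mask;

	if (space >= nbits) {
		map[index] &= ~(mask << offset);
		map[index] |= value << offset;
	} else {
		/* Here offset >= 1, so space is in 1..63. */
		map[index] &= ~(~UINT64_C(0) << offset);
		map[index] |= value << offset;
		map[index + 1] &= ~(mask >> space);
		map[index + 1] |= value >> space;
	}
}
```

The assertion is only there to make the documented range explicit in
the sketch; the patch itself documents the range instead of checking it.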

>
> > +   if (space >= nbits) {
> > +   map[index] &= ~(GENMASK(nbits + offset - 1, offset));
>
> UB when nbits == 0 and start == 0.
>
> > +   map[index] |= value << offset;
> > +   } else {
> > +   

[PATCH v10 4/4] gpio: xilinx: Utilize generic bitmap_get_value and _set_value

2020-10-02 Thread Syed Nayyar Waris
This patch reimplements the xgpio_set_multiple function in
drivers/gpio/gpio-xilinx.c to use the new generic functions
bitmap_get_value and bitmap_set_value. The code is now simpler
to read and understand. Moreover, instead of looping over each bit
in the xgpio_set_multiple function, we can now handle one channel at
a time and save cycles.

Cc: Bartosz Golaszewski 
Cc: Michal Simek 
Signed-off-by: Syed Nayyar Waris 
Signed-off-by: William Breathitt Gray 
---
Changes in v10:
 - No change.

Changes in v9:
 - Remove looping of 'for_each_set_clump' and instead process two
   halves of a 64-bit bitmap separately or individually. Use normal spin_lock 
   call for second inner lock. And take the spin_lock_init call outside the 'if'
   condition in the 'probe' function of driver.

Changes in v8:
 - No change.

Changes in v7:
 - No change.

Changes in v6:
 - No change.

Changes in v5:
 - Minor change: Inline values '32' and '64' in code for better
   code readability.

Changes in v4:
 - Minor change: Inline values '32' and '64' in code for better
   code readability.

Changes in v3:
 - No change.

Changes in v2:
 - No change

 drivers/gpio/gpio-xilinx.c | 66 +++---
 1 file changed, 33 insertions(+), 33 deletions(-)

diff --git a/drivers/gpio/gpio-xilinx.c b/drivers/gpio/gpio-xilinx.c
index 67f9f82e0db0..48393d06fb55 100644
--- a/drivers/gpio/gpio-xilinx.c
+++ b/drivers/gpio/gpio-xilinx.c
@@ -136,39 +136,39 @@ static void xgpio_set(struct gpio_chip *gc, unsigned int 
gpio, int val)
 static void xgpio_set_multiple(struct gpio_chip *gc, unsigned long *mask,
   unsigned long *bits)
 {
-   unsigned long flags;
+   unsigned long flag;
struct xgpio_instance *chip = gpiochip_get_data(gc);
-   int index = xgpio_index(chip, 0);
-   int offset, i;
-
-   spin_lock_irqsave(&chip->gpio_lock[index], flags);
-
-   /* Write to GPIO signals */
-   for (i = 0; i < gc->ngpio; i++) {
-   if (*mask == 0)
-   break;
-   /* Once finished with an index write it out to the register */
-   if (index !=  xgpio_index(chip, i)) {
-   xgpio_writereg(chip->regs + XGPIO_DATA_OFFSET +
-  index * XGPIO_CHANNEL_OFFSET,
-  chip->gpio_state[index]);
-   spin_unlock_irqrestore(&chip->gpio_lock[index], flags);
-   index =  xgpio_index(chip, i);
-   spin_lock_irqsave(&chip->gpio_lock[index], flags);
-   }
-   if (__test_and_clear_bit(i, mask)) {
-   offset =  xgpio_offset(chip, i);
-   if (test_bit(i, bits))
-   chip->gpio_state[index] |= BIT(offset);
-   else
-   chip->gpio_state[index] &= ~BIT(offset);
-   }
-   }
-
-   xgpio_writereg(chip->regs + XGPIO_DATA_OFFSET +
-  index * XGPIO_CHANNEL_OFFSET, chip->gpio_state[index]);
-
-   spin_unlock_irqrestore(&chip->gpio_lock[index], flags);
+   u32 *const state = chip->gpio_state;
+   unsigned int *const width = chip->gpio_width;
+
+   DECLARE_BITMAP(old, 64);
+   DECLARE_BITMAP(new, 64);
+   DECLARE_BITMAP(changed, 64);
+
+   spin_lock_irqsave(&chip->gpio_lock[0], flag);
+   spin_lock(&chip->gpio_lock[1]);
+
+   bitmap_set_value(old, state[0], 0, width[0]);
+   bitmap_set_value(old, state[1], width[0], width[1]);
+   bitmap_replace(new, old, bits, mask, gc->ngpio);
+
+   bitmap_set_value(old, state[0], 0, 32);
+   bitmap_set_value(old, state[1], 32, 32);
+   state[0] = bitmap_get_value(new, 0, width[0]);
+   state[1] = bitmap_get_value(new, width[0], width[1]);
+   bitmap_set_value(new, state[0], 0, 32);
+   bitmap_set_value(new, state[1], 32, 32);
+   bitmap_xor(changed, old, new, 64);
+
+   if (((u32 *)changed)[0])
+   xgpio_writereg(chip->regs + XGPIO_DATA_OFFSET,
+   state[0]);
+   if (((u32 *)changed)[1])
+   xgpio_writereg(chip->regs + XGPIO_DATA_OFFSET +
+   XGPIO_CHANNEL_OFFSET, state[1]);
+
+   spin_unlock(&chip->gpio_lock[1]);
+   spin_unlock_irqrestore(&chip->gpio_lock[0], flag);
 }
 
 /**
@@ -292,6 +292,7 @@ static int xgpio_probe(struct platform_device *pdev)
chip->gpio_width[0] = 32;
 
spin_lock_init(&chip->gpio_lock[0]);
+   spin_lock_init(&chip->gpio_lock[1]);
 
if (of_property_read_u32(np, "xlnx,is-dual", &is_dual))
is_dual = 0;
@@ -314,7 +315,6 @@ static int xgpio_probe(struct platform_device *pdev)
  

[PATCH v10 3/4] gpio: thunderx: Utilize for_each_set_clump macro

2020-10-02 Thread Syed Nayyar Waris
This patch reimplements the thunderx_gpio_set_multiple function in
drivers/gpio/gpio-thunderx.c to use the new for_each_set_clump macro.
Instead of looping over every bank in the thunderx_gpio_set_multiple
function, we can now skip banks that have no bits set and save cycles.

Cc: Robert Richter 
Cc: Bartosz Golaszewski 
Signed-off-by: Syed Nayyar Waris 
Signed-off-by: William Breathitt Gray 
---
Changes in v10:
 - No change.

Changes in v9:
 - No change.

Changes in v8:
 - No change.

Changes in v7:
 - No change.

Changes in v6:
 - No change.

Changes in v5:
 - No change.

Changes in v4:
 - Minor change: Inline value '64' in code for better code readability.

Changes in v3:
 - Change datatype of some variables from u64 to unsigned long
   in function thunderx_gpio_set_multiple.

Changes in v2:
 - No change.

 drivers/gpio/gpio-thunderx.c | 11 +++
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/drivers/gpio/gpio-thunderx.c b/drivers/gpio/gpio-thunderx.c
index 9f66deab46ea..58c9bb25a377 100644
--- a/drivers/gpio/gpio-thunderx.c
+++ b/drivers/gpio/gpio-thunderx.c
@@ -275,12 +275,15 @@ static void thunderx_gpio_set_multiple(struct gpio_chip 
*chip,
   unsigned long *bits)
 {
int bank;
-   u64 set_bits, clear_bits;
+   unsigned long set_bits, clear_bits, gpio_mask;
+   unsigned long offset;
+
struct thunderx_gpio *txgpio = gpiochip_get_data(chip);
 
-   for (bank = 0; bank <= chip->ngpio / 64; bank++) {
-   set_bits = bits[bank] & mask[bank];
-   clear_bits = ~bits[bank] & mask[bank];
+   for_each_set_clump(offset, gpio_mask, mask, chip->ngpio, 64) {
+   bank = offset / 64;
+   set_bits = bits[bank] & gpio_mask;
+   clear_bits = ~bits[bank] & gpio_mask;
writeq(set_bits, txgpio->register_base + (bank * GPIO_2ND_BANK) + GPIO_TX_SET);
writeq(clear_bits, txgpio->register_base + (bank * GPIO_2ND_BANK) + GPIO_TX_CLR);
}
-- 
2.26.2



[PATCH v10 2/4] lib/test_bitmap.c: Add for_each_set_clump test cases

2020-10-02 Thread Syed Nayyar Waris
The introduction of the generic for_each_set_clump macro needs test
cases to verify the implementation. This patch adds test cases for
clump sizes of 8 bits, 24 bits, 30 bits and 6 bits. The cases cover
clumps that are split at a word boundary as well as bitmaps with
zeroes at the start and in the middle.

Signed-off-by: Syed Nayyar Waris 
Reviewed-by: Andy Shevchenko 
Signed-off-by: William Breathitt Gray 
---
Changes in v10:
 - No change.

Changes in v9:
 - No change.

Changes in v8:
 - [Patch 2/4]: Minor change: Use '__initdata' for correct section mismatch
   in 'clump_test_data' array.

Changes in v7:
 - Minor changes: Use macro 'DECLARE_BITMAP()' and split 'struct'
   definition and test data.

Changes in v6:
 - Make 'for loop' inside 'test_for_each_set_clump' more succinct.

Changes in v5:
 - No change.

Changes in v4:
 - Use 'for' loop in test function of 'for_each_set_clump'.

Changes in v3:
 - No Change.

Changes in v2:
 - Unify different tests for 'for_each_set_clump'. Pass test data as
   function parameters.
 - Remove unnecessary bitmap_zero calls.

 lib/test_bitmap.c | 144 ++
 1 file changed, 144 insertions(+)

diff --git a/lib/test_bitmap.c b/lib/test_bitmap.c
index df903c53952b..cb2cf3858f93 100644
--- a/lib/test_bitmap.c
+++ b/lib/test_bitmap.c
@@ -155,6 +155,37 @@ static bool __init __check_eq_clump8(const char *srcfile, 
unsigned int line,
return true;
 }
 
+static bool __init __check_eq_clump(const char *srcfile, unsigned int line,
+   const unsigned int offset,
+   const unsigned int size,
+   const unsigned long *const clump_exp,
+   const unsigned long *const clump,
+   const unsigned long clump_size)
+{
+   unsigned long exp;
+
+   if (offset >= size) {
+   pr_warn("[%s:%u] bit offset for clump out-of-bounds: expected less than %u, got %u\n",
+   srcfile, line, size, offset);
+   return false;
+   }
+
+   exp = clump_exp[offset / clump_size];
+   if (!exp) {
+   pr_warn("[%s:%u] bit offset for zero clump: expected nonzero clump, got bit offset %u with clump value 0",
+   srcfile, line, offset);
+   return false;
+   }
+
+   if (*clump != exp) {
+   pr_warn("[%s:%u] expected clump value of 0x%lX, got clump value of 0x%lX",
+   srcfile, line, exp, *clump);
+   return false;
+   }
+
+   return true;
+}
+
 #define __expect_eq(suffix, ...)   \
({  \
int result = 0; \
@@ -172,6 +203,7 @@ static bool __init __check_eq_clump8(const char *srcfile, 
unsigned int line,
 #define expect_eq_pbl(...) __expect_eq(pbl, ##__VA_ARGS__)
 #define expect_eq_u32_array(...)   __expect_eq(u32_array, ##__VA_ARGS__)
 #define expect_eq_clump8(...)  __expect_eq(clump8, ##__VA_ARGS__)
+#define expect_eq_clump(...)   __expect_eq(clump, ##__VA_ARGS__)
 
 static void __init test_zero_clear(void)
 {
@@ -577,6 +609,28 @@ static void noinline __init test_mem_optimisations(void)
}
 }
 
+static const unsigned long clump_bitmap_data[] __initconst = {
+   0x38000201,
+   0x05ff0f38,
+   0xeffedcba,
+   0xbbbbabcd,
+   0x000000aa,
+   0x000000aa,
+   0x00ff0000,
+   0xaaaaaa00,
+   0xff000000,
+   0x00aa0000,
+   0x00000000,
+   0x00000000,
+   0x00000000,
+   0x0f000000,
+   0x00ff0000,
+   0xaaaaaa00,
+   0xff000000,
+   0x00aa0000,
+   0x00000ac0,
+};
+
 static const unsigned char clump_exp[] __initconst = {
0x01,   /* 1 bit set */
0x02,   /* non-edge 1 bit set */
@@ -588,6 +642,95 @@ static const unsigned char clump_exp[] __initconst = {
0x05,   /* non-adjacent 2 bits set */
 };
 
+static const unsigned long clump_exp1[] __initconst = {
+   0x01,   /* 1 bit set */
+   0x02,   /* non-edge 1 bit set */
+   0x00,   /* zero bits set */
+   0x38,   /* 3 bits set across 4-bit boundary */
+   0x38,   /* Repeated clump */
+   0x0F,   /* 4 bits set */
+   0xFF,   /* all bits set */
+   0x05,   /* non-adjacent 2 bits set */
+};
+
+static const unsigned long clump_exp2[] __initconst = {
+   0xfedcba,   /* 24 bits */
+   0xabcdef,
+   0xaabbbb,   /* Clump split between 2 words */
+   0x000000,   /* zeroes in between */
+   0x0000aa,
+   0x000000,
+   0x0000ff,
+   0xaaaaaa,
+   0x000000,
+   0x0000ff,
+};

[PATCH v10 1/4] bitops: Introduce the for_each_set_clump macro

2020-10-02 Thread Syed Nayyar Waris
This macro iterates for each group of bits (clump) with set bits,
within a bitmap memory region. For each iteration, "start" is set to
the bit offset of the found clump, while the respective clump value is
stored to the location pointed by "clump". Additionally, the
bitmap_get_value and bitmap_set_value functions are introduced to
respectively get and set a value of n-bits in a bitmap memory region.
The n-bits can have any size less than or equal to BITS_PER_LONG.
Moreover, when setting an n-bit value, if the value would extend past
a word boundary, it is split so that the lower portion is stored in the
current word and the remaining portion is stored in the next higher
word. The same split applies when retrieving an n-bit value from the
bitmap.
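
The split on store can be sketched in userspace as follows. This is an
illustration under assumptions, not the code from the patch: it uses
fixed 64-bit words in place of unsigned long, and set_value() is a
hypothetical stand-in for bitmap_set_value().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define WORD_BITS 64u

/*
 * Store the low 'nbits' of 'value' at bit offset 'start'. If the field
 * does not fit in the current word, the low part goes into that word and
 * the rest into the next higher word. Neighbouring bits are preserved.
 */
static void set_value(uint64_t *map, uint64_t value, size_t start,
		      unsigned int nbits)
{
	const size_t index = start / WORD_BITS;
	const unsigned int offset = start % WORD_BITS;
	const unsigned int space = WORD_BITS - offset;	/* bits left in word */
	const uint64_t mask = nbits == WORD_BITS ?
			      ~UINT64_C(0) : (UINT64_C(1) << nbits) - 1;

	assert(nbits >= 1 && nbits <= WORD_BITS);

	value &= mask;

	/* Low part: clear the target bits in this word, then OR in the value. */
	map[index] = (map[index] & ~(mask << offset)) | (value << offset);

	/* High part: only when the field crosses the boundary (space is 1..63). */
	if (space < nbits)
		map[index + 1] = (map[index + 1] & ~(mask >> space)) |
				 (value >> space);
}
```

For example, storing the 24-bit value 0xabcdef at bit offset 48 of a
zeroed two-word map leaves 0xcdef in the top 16 bits of word 0 and 0xab
in the low 8 bits of word 1.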

Cc: Arnd Bergmann 
Signed-off-by: Syed Nayyar Waris 
Reviewed-by: Andy Shevchenko 
Signed-off-by: William Breathitt Gray 
---
Changes in v10:
 - No change.

Changes in v9:
 - No change.

Changes in v8:
 - No change.

Changes in v7:
 - No change.

Changes in v6:
 - No change.

Changes in v5:
 - No change.

Changes in v4:
 - No change.

Changes in v3:
 - No change.

Changes in v2:
 - No change.

 include/asm-generic/bitops/find.h | 19 ++
 include/linux/bitmap.h| 61 +++
 include/linux/bitops.h| 13 +++
 lib/find_bit.c| 14 +++
 4 files changed, 107 insertions(+)

diff --git a/include/asm-generic/bitops/find.h 
b/include/asm-generic/bitops/find.h
index 9fdf21302fdf..4e6600759455 100644
--- a/include/asm-generic/bitops/find.h
+++ b/include/asm-generic/bitops/find.h
@@ -97,4 +97,23 @@ extern unsigned long find_next_clump8(unsigned long *clump,
 #define find_first_clump8(clump, bits, size) \
find_next_clump8((clump), (bits), (size), 0)
 
+/**
+ * find_next_clump - find next clump with set bits in a memory region
+ * @clump: location to store copy of found clump
+ * @addr: address to base the search on
+ * @size: bitmap size in number of bits
+ * @offset: bit offset at which to start searching
+ * @clump_size: clump size in bits
+ *
+ * Returns the bit offset for the next set clump; the found clump value is
+ * copied to the location pointed by @clump. If no bits are set, returns @size.
+ */
+extern unsigned long find_next_clump(unsigned long *clump,
+ const unsigned long *addr,
+ unsigned long size, unsigned long offset,
+ unsigned long clump_size);
+
+#define find_first_clump(clump, bits, size, clump_size) \
+   find_next_clump((clump), (bits), (size), 0, (clump_size))
+
 #endif /*_ASM_GENERIC_BITOPS_FIND_H_ */
diff --git a/include/linux/bitmap.h b/include/linux/bitmap.h
index 99058eb81042..7ab2c65fc964 100644
--- a/include/linux/bitmap.h
+++ b/include/linux/bitmap.h
@@ -75,7 +75,11 @@
  *  bitmap_from_arr32(dst, buf, nbits)  Copy nbits from u32[] buf to dst
  *  bitmap_to_arr32(buf, src, nbits)Copy nbits from buf to u32[] dst
  *  bitmap_get_value8(map, start)   Get 8bit value from map at start
+ *  bitmap_get_value(map, start, nbits)Get bit value of size
+ * 'nbits' from map at start
  *  bitmap_set_value8(map, value, start)Set 8bit value to map at start
+ *  bitmap_set_value(map, value, start, nbits) Set bit value of size 'nbits'
+ * of map at start
  *
  * Note, bitmap_zero() and bitmap_fill() operate over the region of
  * unsigned longs, that is, bits behind bitmap till the unsigned long
@@ -563,6 +567,34 @@ static inline unsigned long bitmap_get_value8(const 
unsigned long *map,
return (map[index] >> offset) & 0xFF;
 }
 
+/**
+ * bitmap_get_value - get a value of n-bits from the memory region
+ * @map: address to the bitmap memory region
+ * @start: bit offset of the n-bit value
+ * @nbits: size of value in bits
+ *
+ * Returns value of nbits located at the @start bit offset within the @map
+ * memory region.
+ */
+static inline unsigned long bitmap_get_value(const unsigned long *map,
+ unsigned long start,
+ unsigned long nbits)
+{
+   const size_t index = BIT_WORD(start);
+   const unsigned long offset = start % BITS_PER_LONG;
+   const unsigned long ceiling = roundup(start + 1, BITS_PER_LONG);
+   const unsigned long space = ceiling - start;
+   unsigned long value_low, value_high;
+
+   if (space >= nbits)
+   return (map[index] >> offset) & GENMASK(nbits - 1, 0);
+   else {
+   value_low = map[index] & BITMAP_FIRST_WORD_MASK(start);
+   value_high = map[index + 1] & BITMAP_LAST_WORD_MASK(start + 
nbit

[PATCH v10 0/4] Introduce the for_each_set_clump macro

2020-10-02 Thread Syed Nayyar Waris
Hello Linus,

Since this patchset primarily affects GPIO drivers, would you like
to pick it up through your GPIO tree?

This patchset introduces a new generic version of for_each_set_clump.
The previous for_each_set_clump8 used a fixed 8-bit clump, but the new
generic version works with clumps of any size up to and including
BITS_PER_LONG. The patchset utilizes the new macro in several GPIO
drivers.

The earlier 8-bit for_each_set_clump8 facilitated a for-loop syntax
that iterates over a memory region, entire groups of set bits at a
time.

For example, suppose you would like to iterate over a 32-bit integer 8
bits at a time, skipping over 8-bit groups with no set bit, where
XXXXXXXX represents the current 8-bit group:

Example:        10111110 00000000 11111111 00110011
First loop:     10111110 00000000 11111111 XXXXXXXX
Second loop:    10111110 00000000 XXXXXXXX 00110011
Third loop:     XXXXXXXX 00000000 11111111 00110011

Each iteration of the loop returns the next 8-bit group that has at
least one set bit.

With the new for_each_set_clump, the clump size can differ from 8 bits.
Moreover, a clump can be split across a word boundary when the word size
is not a multiple of the clump size. The following examples show how the
new macro behaves for clump sizes of 24 bits and 6 bits.

Example 1:
clump size: 24 bits, Number of clumps (or ports): 10
bitmap stores the bit information from where successive clumps are retrieved.

 /* bitmap memory region */
0x00aa0000ff000000;  /* Most significant bits */
0xaaaaaa0000ff0000;
0x000000aa000000aa;
0xbbbbabcdeffedcba;  /* Least significant bits */

Different iterations of for_each_set_clump:
'offset' is the bit position and 'clump' is the 24 bit clump from the
above bitmap.
Iteration first:    offset: 0 clump: 0xfedcba
Iteration second:   offset: 24 clump: 0xabcdef
Iteration third:    offset: 48 clump: 0xaabbbb
Iteration fourth:   offset: 96 clump: 0x0000aa
Iteration fifth:    offset: 144 clump: 0x0000ff
Iteration sixth:    offset: 168 clump: 0xaaaaaa
Iteration seventh:  offset: 216 clump: 0x0000ff
Loop breaks because in the end the remaining bits (0x00aa) size was less
than clump size of 24 bits.

In the above example, the 24-bit clump retrieved in the third iteration
is split between bitmap[0] and bitmap[1]. The example also shows that
24-bit clumps of zeroes in between are skipped (preserving the previous
for_each_set_clump8 behaviour).
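
To make the word-boundary case concrete, the following is a simplified
userspace sketch, not the kernel implementation. It assumes 64-bit words
and that the Example 1 bitmap words are, from least to most significant,
0xbbbbabcdeffedcba, 0x000000aa000000aa, 0xaaaaaa0000ff0000 and
0x00aa0000ff000000. get_value() and next_clump() are hypothetical
stand-ins for bitmap_get_value() and find_next_clump(); the sketch walks
clump by clump, which may differ from how the kernel helper searches.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define WORD_BITS 64u

/* Read 'nbits' (1..64) at bit offset 'start'; the field may span two words. */
static uint64_t get_value(const uint64_t *map, size_t start,
			  unsigned int nbits)
{
	const size_t index = start / WORD_BITS;
	const unsigned int offset = start % WORD_BITS;
	const unsigned int space = WORD_BITS - offset;	/* bits left in word */
	const uint64_t mask = nbits == WORD_BITS ?
			      ~UINT64_C(0) : (UINT64_C(1) << nbits) - 1;
	uint64_t value;

	assert(nbits >= 1 && nbits <= WORD_BITS);

	value = map[index] >> offset;
	if (space < nbits)	/* crosses the boundary; space is 1..63 here */
		value |= map[index + 1] << space;

	return value & mask;
}

/*
 * Return the offset of the next clump with at least one set bit, at or
 * after 'offset' (a multiple of clump_size), or 'size' if there is none.
 * Only whole clumps that fit below 'size' are considered.
 */
static size_t next_clump(uint64_t *clump, const uint64_t *map, size_t size,
			 size_t offset, unsigned int clump_size)
{
	for (; offset + clump_size <= size; offset += clump_size) {
		*clump = get_value(map, offset, clump_size);
		if (*clump)
			return offset;
	}

	return size;
}
```

With that bitmap, a size of 240 bits (10 clumps) and a clump size of 24,
repeated calls to next_clump() yield offsets 0, 24, 48, 96, 144, 168 and
216. The clump at offset 48 takes its low 16 bits from word 0 and its
high 8 bits from word 1.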

Example 2:
clump size = 6 bits, Number of clumps (or ports) = 3.

 /* bitmap memory region */
0x00aa0000ff000000;  /* Most significant bits */
0xaaaaaa0000ff0000;
0x0f00000000000000;
0x0000000000000ac0;  /* Least significant bits */

Different iterations of for_each_set_clump:
'offset' is the bit position and 'clump' is the 6 bit clump from the
above bitmap.
Iteration first:offset: 6 clump: 0x2b
Loop breaks because 6 * 3 = 18 bits traversed in bitmap.
Here 6 * 3 is clump size * no. of clumps.

Changes in v10:
 - Patchset based on v5.9-rc1.

Changes in v9:
 - [Patch 4/4]: Remove looping of 'for_each_set_clump' and instead process two 
   halves of a 64-bit bitmap separately or individually. Use normal spin_lock 
   call for second inner lock. And take the spin_lock_init call outside the 'if'
   condition in the probe function of driver.

Changes in v8:
 - [Patch 2/4]: Minor change: Use '__initdata' for correct section mismatch
   in 'clump_test_data' array.

Changes in v7:
 - [Patch 2/4]: Minor changes: Use macro 'DECLARE_BITMAP()' and split 'struct'
   definition and test data.

Changes in v6:
 - [Patch 2/4]: Make 'for loop' inside test_for_each_set_clump more
   succinct.

Changes in v5:
 - [Patch 4/4]: Minor change: Hardcode value for better code readability.

Changes in v4:
 - [Patch 2/4]: Use 'for' loop in test function of for_each_set_clump.
 - [Patch 3/4]: Minor change: Inline value for better code readability.
 - [Patch 4/4]: Minor change: Inline value for better code readability.

Changes in v3:
 - [Patch 3/4]: Change datatype of some variables from u64 to unsigned long
   in function thunderx_gpio_set_multiple.

Changes in v2:
 - [Patch 2/4]: Unify different tests for 'for_each_set_clump'. Pass test data
   as function parameters.
 - [Patch 2/4]: Remove unnecessary bitmap_zero calls.

Syed Nayyar Waris (4):
  bitops: Introduce the for_each_set_clump macro
  lib/test_bitmap.c: Add for_each_set_clump test cases
  gpio: thunderx: Utilize for_each_set_clump macro
  gpio: xilinx: Utilize generic bitmap_get_value and _set_value

 drivers/gpio/gpio-thunderx.c  |  11 ++-
 drivers/gpio/gpio-xilinx.c|  66 +++---
 include/asm-generic/bitops/find.h |  19 
 include/linux/bitmap.h  

Re: [PATCH v9 0/4] Introduce the for_each_set_clump macro

2020-10-02 Thread Syed Nayyar Waris
On Wed, Sep 30, 2020 at 2:59 PM Linus Walleij  wrote:
>
> On Tue, Sep 29, 2020 at 3:14 PM Andy Shevchenko
>  wrote:
>
> > Linus, are you referencing to [3]? It was fixed in GENMASK()
> > implementation some time ago.
> > [3]: https://lore.kernel.org/lkml/202006171559.jsbgjxnw%25...@intel.com/
>
> Yup.
>
> I tried to apply the patches again now to test it but now patch 2
> needs to be rebased.
>
> Sorry for all the trouble!
>
> Syed can you rebase the patch set on v5.9-rc1 and resend as v10?

Sure Linus. I will send it as soon as possible.

Thanks
Syed Nayyar Waris


Re: [PATCH v9 0/4] Introduce the for_each_set_clump macro

2020-08-31 Thread Syed Nayyar Waris
On Thu, Jul 16, 2020 at 6:19 PM Linus Walleij  wrote:
>
> Hi Syed,
>
> sorry for taking so long. I was on vacation and a bit snowed
> under by work.
>
> On Sat, Jun 27, 2020 at 10:10 AM Syed Nayyar Waris  
> wrote:
>
> > Since this patchset primarily affects GPIO drivers, would you like
> > to pick it up through your GPIO tree?
>
> I have applied the patches to an immutable branch and pushed
> to kernelorg for testing (autobuilders will play with it I hope).
>
> If all works fine I will merge this into my devel branch for v5.9.
>
> It would be desirable if Andrew gave his explicit ACK on it too.
>
> Yours,
> Linus Walleij

Hi Linus,

This is a gentle reminder about the 'for_each_set_clump' patchset.
Please let me know whether it looks alright, or whether anything more
is needed from my side for it to be accepted.

Regards
Syed Nayyar Waris


Re: [PATCH v4 5/5] counter: 104-quad-8: Add IRQ support for the ACCES 104-QUAD-8

2020-07-30 Thread Syed Nayyar Waris
On Wed, Jul 22, 2020 at 1:06 AM William Breathitt Gray
 wrote:
>
> The LSI/CSI LS7266R1 chip provides programmable output via the FLG pins.
> When interrupts are enabled on the ACCES 104-QUAD-8, they occur whenever
> FLG1 is active. Four functions are available for the FLG1 signal: Carry,
> Compare, Carry-Borrow, and Index.
>
> Carry:
> Interrupt generated on active low Carry signal. Carry
> signal toggles every time the respective channel's
> counter overflows.
>
> Compare:
> Interrupt generated on active low Compare signal.
> Compare signal toggles every time respective channel's
> preset register is equal to the respective channel's
> counter.
>
> Carry-Borrow:
> Interrupt generated on active low Carry signal and
> active low Borrow signal. Carry signal toggles every
> time the respective channel's counter overflows. Borrow
> signal toggles every time the respective channel's
> counter underflows.
>
> Index:
> Interrupt generated on active high Index signal.
>
> The irq_trigger Count extension is introduced to allow the selection of
> the desired IRQ trigger function per channel. The irq_trigger_enable
> Count extension is introduced to allow the enablement of interrupts for
> a respective channel. Interrupts push Counter events as Event X, where
> 'X' is the respective channel whose FLG1 activated.
>
> This patch adds IRQ support for the ACCES 104-QUAD-8. The interrupt line
> numbers for the devices may be configured via the irq array module
> parameter.
>
> Cc: Syed Nayyar Waris 
> Signed-off-by: William Breathitt Gray 
> ---
>  .../ABI/testing/sysfs-bus-counter-104-quad-8  |  32 ++
>  drivers/counter/104-quad-8.c  | 283 +-
>  drivers/counter/Kconfig   |   6 +-
>  3 files changed, 249 insertions(+), 72 deletions(-)
>
> diff --git a/Documentation/ABI/testing/sysfs-bus-counter-104-quad-8 
> b/Documentation/ABI/testing/sysfs-bus-counter-104-quad-8
> index eac32180c40d..718f6199c71e 100644
> --- a/Documentation/ABI/testing/sysfs-bus-counter-104-quad-8
> +++ b/Documentation/ABI/testing/sysfs-bus-counter-104-quad-8
> @@ -1,3 +1,35 @@
> +What:  /sys/bus/counter/devices/counterX/countY/irq_trigger
> +KernelVersion: 5.9
> +Contact:   linux-...@vger.kernel.org
> +Description:
> +   IRQ trigger function for channel Y. Four trigger functions are
> +   available: carry, compare, carry-borrow, and index.
> +
> +   carry:
> +   Interrupt generated on active low Carry signal. Carry
> +   signal toggles every time channel Y counter overflows.
> +
> +   compare:
> +   Interrupt generated on active low Compare signal.
> +   Compare signal toggles every time channel Y preset
> +   register is equal to channel Y counter.
> +
> +   carry-borrow:
> +   Interrupt generated on active low Carry signal and
> +   active low Borrow signal. Carry signal toggles every
> +   time channel Y counter overflows. Borrow signal 
> toggles
> +   every time channel Y counter underflows.
> +
> +   index:
> +   Interrupt generated on active high Index signal.
> +
> +What:  /sys/bus/counter/devices/counterX/countY/irq_trigger_enable
> +KernelVersion: 5.9
> +Contact:   linux-...@vger.kernel.org
> +Description:
> +   Whether generation of interrupts is enabled for channel Y. 
> Valid
> +   attribute values are boolean.
> +
>  What:  /sys/bus/counter/devices/counterX/signalY/cable_fault
>  KernelVersion: 5.7
>  Contact:   linux-...@vger.kernel.org
> diff --git a/drivers/counter/104-quad-8.c b/drivers/counter/104-quad-8.c
> index 0f20920073d6..b43be2d5464d 100644
> --- a/drivers/counter/104-quad-8.c
> +++ b/drivers/counter/104-quad-8.c
> @@ -13,23 +13,30 @@
>  #include 
>  #include 
>  #include 
> +#include 
>  #include 
>  #include 
>  #include 
>  #include 
>  #include 
> +#include 
>
>  #define QUAD8_EXTENT 32
>
>  static unsigned int base[max_num_isa_dev(QUAD8_EXTENT)];
>  static unsigned int num_quad8;
> -module_param_array(base, uint, &num_quad8, 0);
> +module_param_hw_array(base, uint, ioport, &num_quad8, 0);
>  MODULE_PARM_DESC(base, "ACCES 104-QUAD-8

Re: [PATCH v9 0/4] Introduce the for_each_set_clump macro

2020-07-10 Thread Syed Nayyar Waris
On Sat, Jun 27, 2020 at 1:40 PM Syed Nayyar Waris  wrote:
>
> Hello Linus,
>
> Since this patchset primarily affects GPIO drivers, would you like
> to pick it up through your GPIO tree?
>
> This patchset introduces a new generic version of for_each_set_clump.
> The previous version of for_each_set_clump8 used a fixed size 8-bit
> clump, but the new generic version can work with clump of any size but
> less than or equal to BITS_PER_LONG. The patchset utilizes the new macro
> in several GPIO drivers.
>
> The earlier 8-bit for_each_set_clump8 facilitated a
> for-loop syntax that iterates over a memory region entire groups of set
> bits at a time.
>
> For example, suppose you would like to iterate over a 32-bit integer 8
> bits at a time, skipping over 8-bit groups with no set bit, where
> XXXXXXXX represents the current 8-bit group:
>
> Example:        10111110 00000000 11111111 00110011
> First loop:     10111110 00000000 11111111 XXXXXXXX
> Second loop:    10111110 00000000 XXXXXXXX 00110011
> Third loop:     XXXXXXXX 00000000 11111111 00110011
>
> Each iteration of the loop returns the next 8-bit group that has at
> least one set bit.
>
> But with the new for_each_set_clump the clump size can be different from 8 
> bits.
> Moreover, the clump can be split at word boundary in situations where word
> size is not multiple of clump size. Following are examples showing the working
> of new macro for clump sizes of 24 bits and 6 bits.
>
> Example 1:
> clump size: 24 bits, Number of clumps (or ports): 10
> bitmap stores the bit information from where successive clumps are retrieved.
>
>  /* bitmap memory region */
> 0x00aa0000ff000000;  /* Most significant bits */
> 0xaaaaaa0000ff0000;
> 0x000000aa000000aa;
> 0xbbbbabcdeffedcba;  /* Least significant bits */
>
> Different iterations of for_each_set_clump:-
> 'offset' is the bit position and 'clump' is the 24 bit clump from the
> above bitmap.
> Iteration first:    offset: 0 clump: 0xfedcba
> Iteration second:   offset: 24 clump: 0xabcdef
> Iteration third:    offset: 48 clump: 0xaabbbb
> Iteration fourth:   offset: 96 clump: 0x0000aa
> Iteration fifth:    offset: 144 clump: 0x0000ff
> Iteration sixth:    offset: 168 clump: 0xaaaaaa
> Iteration seventh:  offset: 216 clump: 0x0000ff
> Loop breaks because in the end the remaining bits (0x00aa) size was less
> than clump size of 24 bits.
>
> In above example it can be seen that in iteration third, the 24 bit clump
> that was retrieved was split between bitmap[0] and bitmap[1]. This example
> also shows that 24 bit zeroes if present in between, were skipped (preserving
> the previous for_each_set_macro8 behaviour).
>
> Example 2:
> clump size = 6 bits, Number of clumps (or ports) = 3.
>
>  /* bitmap memory region */
> 0x00aa0000ff000000;  /* Most significant bits */
> 0xaaaaaa0000ff0000;
> 0x0f00000000000000;
> 0x0000000000000ac0;  /* Least significant bits */
>
> Different iterations of for_each_set_clump:
> 'offset' is the bit position and 'clump' is the 6 bit clump from the
> above bitmap.
> Iteration first:offset: 6 clump: 0x2b
> Loop breaks because 6 * 3 = 18 bits traversed in bitmap.
> Here 6 * 3 is clump size * no. of clumps.
>
> Changes in v9:
>  - [Patch 4/4]: Remove looping of 'for_each_set_clump' and instead process two
>halves of a 64-bit bitmap separately or individually. Use normal spin_lock
>call for second inner lock. And take the spin_lock_init call outside the 
> 'if'
>condition in the probe function of driver.
>
> Changes in v8:
>  - [Patch 2/4]: Minor change: Use '__initdata' for correct section mismatch
>in 'clump_test_data' array.
>
> Changes in v7:
>  - [Patch 2/4]: Minor changes: Use macro 'DECLARE_BITMAP()' and split 'struct'
>definition and test data.
>
> Changes in v6:
>  - [Patch 2/4]: Make 'for loop' inside test_for_each_set_clump more
>succinct.
>
> Changes in v5:
>  - [Patch 4/4]: Minor change: Hardcode value for better code readability.
>
> Changes in v4:
>  - [Patch 2/4]: Use 'for' loop in test function of for_each_set_clump.
>  - [Patch 3/4]: Minor change: Inline value for better code readability.
>  - [Patch 4/4]: Minor change: Inline value for better code readability.
>
> Changes in v3:
>  - [Patch 3/4]: Change datatype of some variables from u64 to unsigned long
>in function thunderx_gpio_set_multiple.
>
> CHanges in v2:
>  - [Patch 2/4]: Unify different tests for 'for_each_set

[PATCH v9 4/4] gpio: xilinx: Utilize generic bitmap_get_value and _set_value.

2020-06-27 Thread Syed Nayyar Waris
This patch reimplements the xgpio_set_multiple function in
drivers/gpio/gpio-xilinx.c to use the new generic functions
bitmap_get_value and bitmap_set_value. The code is now simpler
to read and understand. Moreover, instead of looping over each bit
in xgpio_set_multiple, we now process one channel at a time, which
saves cycles.
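
The per-channel idea can be sketched in userspace roughly as below. This
is an illustration under assumptions rather than the driver code: both
channel states are packed into one 64-bit value (channel 0 in the low
bits, channel 1 directly above it), and struct channel_update,
low_mask() and update_channels() are hypothetical names. The write[]
flags stand in for the decision to call xgpio_writereg().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* New per-channel state, and whether each channel register needs a write. */
struct channel_update {
	uint32_t state[2];
	bool write[2];
};

static uint64_t low_mask(unsigned int nbits)
{
	return nbits >= 64 ? ~UINT64_C(0) : (UINT64_C(1) << nbits) - 1;
}

/*
 * Apply 'bits' under 'mask' to the packed state of both channels, then
 * split the result back per channel. A channel is flagged for writing
 * only if its state actually changed.
 */
static struct channel_update update_channels(const uint32_t old_state[2],
					     const unsigned int width[2],
					     uint64_t mask, uint64_t bits)
{
	struct channel_update u;
	uint64_t old, new_bits;

	assert(width[0] >= 1 && width[0] <= 32);
	assert(width[1] <= 32);

	/* Pack: channel 0 at bit 0, channel 1 starting at bit width[0]. */
	old = (old_state[0] & low_mask(width[0])) |
	      ((old_state[1] & low_mask(width[1])) << width[0]);

	/* Replace only the masked bits in a single step. */
	new_bits = (old & ~mask) | (bits & mask);

	u.state[0] = (uint32_t)(new_bits & low_mask(width[0]));
	u.state[1] = (uint32_t)((new_bits >> width[0]) & low_mask(width[1]));
	u.write[0] = u.state[0] != (old_state[0] & low_mask(width[0]));
	u.write[1] = u.state[1] != (old_state[1] & low_mask(width[1]));

	return u;
}
```

With two 8-bit channels, an old state of { 0xff, 0x00 } and a request to
set line 8, only channel 1 changes (to 0x01), so only its register would
be written.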
---
Changes in v9:
 - Remove looping of 'for_each_set_clump' and instead process two
   halves of a 64-bit bitmap separately or individually. Use normal spin_lock 
   call for second inner lock. And take the spin_lock_init call outside the 'if'
   condition in the 'probe' function of driver.

Changes in v8:
 - No change.

Changes in v7:
 - No change.

Changes in v6:
 - No change.

Changes in v5:
 - Minor change: Inline values '32' and '64' in code for better
   code readability.

Changes in v4:
 - Minor change: Inline values '32' and '64' in code for better
   code readability.

Changes in v3:
 - No change.

Changes in v2:
 - No change

 drivers/gpio/gpio-xilinx.c | 66 +++---
 1 file changed, 33 insertions(+), 33 deletions(-)

diff --git a/drivers/gpio/gpio-xilinx.c b/drivers/gpio/gpio-xilinx.c
index 67f9f82e0db0..48393d06fb55 100644
--- a/drivers/gpio/gpio-xilinx.c
+++ b/drivers/gpio/gpio-xilinx.c
@@ -136,39 +136,39 @@ static void xgpio_set(struct gpio_chip *gc, unsigned int 
gpio, int val)
 static void xgpio_set_multiple(struct gpio_chip *gc, unsigned long *mask,
   unsigned long *bits)
 {
-   unsigned long flags;
+   unsigned long flag;
struct xgpio_instance *chip = gpiochip_get_data(gc);
-   int index = xgpio_index(chip, 0);
-   int offset, i;
-
-   spin_lock_irqsave(&chip->gpio_lock[index], flags);
-
-   /* Write to GPIO signals */
-   for (i = 0; i < gc->ngpio; i++) {
-   if (*mask == 0)
-   break;
-   /* Once finished with an index write it out to the register */
-   if (index !=  xgpio_index(chip, i)) {
-   xgpio_writereg(chip->regs + XGPIO_DATA_OFFSET +
-  index * XGPIO_CHANNEL_OFFSET,
-  chip->gpio_state[index]);
-   spin_unlock_irqrestore(&chip->gpio_lock[index], flags);
-   index =  xgpio_index(chip, i);
-   spin_lock_irqsave(&chip->gpio_lock[index], flags);
-   }
-   if (__test_and_clear_bit(i, mask)) {
-   offset =  xgpio_offset(chip, i);
-   if (test_bit(i, bits))
-   chip->gpio_state[index] |= BIT(offset);
-   else
-   chip->gpio_state[index] &= ~BIT(offset);
-   }
-   }
-
-   xgpio_writereg(chip->regs + XGPIO_DATA_OFFSET +
-  index * XGPIO_CHANNEL_OFFSET, chip->gpio_state[index]);
-
-   spin_unlock_irqrestore(&chip->gpio_lock[index], flags);
+   u32 *const state = chip->gpio_state;
+   unsigned int *const width = chip->gpio_width;
+
+   DECLARE_BITMAP(old, 64);
+   DECLARE_BITMAP(new, 64);
+   DECLARE_BITMAP(changed, 64);
+
+   spin_lock_irqsave(&chip->gpio_lock[0], flag);
+   spin_lock(&chip->gpio_lock[1]);
+
+   bitmap_set_value(old, state[0], 0, width[0]);
+   bitmap_set_value(old, state[1], width[0], width[1]);
+   bitmap_replace(new, old, bits, mask, gc->ngpio);
+
+   bitmap_set_value(old, state[0], 0, 32);
+   bitmap_set_value(old, state[1], 32, 32);
+   state[0] = bitmap_get_value(new, 0, width[0]);
+   state[1] = bitmap_get_value(new, width[0], width[1]);
+   bitmap_set_value(new, state[0], 0, 32);
+   bitmap_set_value(new, state[1], 32, 32);
+   bitmap_xor(changed, old, new, 64);
+
+   if (((u32 *)changed)[0])
+   xgpio_writereg(chip->regs + XGPIO_DATA_OFFSET,
+   state[0]);
+   if (((u32 *)changed)[1])
+   xgpio_writereg(chip->regs + XGPIO_DATA_OFFSET +
+   XGPIO_CHANNEL_OFFSET, state[1]);
+
+   spin_unlock(&chip->gpio_lock[1]);
+   spin_unlock_irqrestore(&chip->gpio_lock[0], flag);
 }
 
 /**
@@ -292,6 +292,7 @@ static int xgpio_probe(struct platform_device *pdev)
chip->gpio_width[0] = 32;
 
spin_lock_init(&chip->gpio_lock[0]);
+   spin_lock_init(&chip->gpio_lock[1]);
 
if (of_property_read_u32(np, "xlnx,is-dual", &is_dual))
is_dual = 0;
@@ -314,7 +315,6 @@ static int xgpio_probe(struct platform_device *pdev)
 &chip->gpio_width[1]))
chip->gpio_width[1] = 32;
 
-   spin_lock_init(&chip->gpio_lock[1]);
}
 
chip->gc.base = -1;
-- 
2.26.2



[PATCH v9 3/4] gpio: thunderx: Utilize for_each_set_clump macro

2020-06-27 Thread Syed Nayyar Waris
This patch reimplements the thunderx_gpio_set_multiple function in
drivers/gpio/gpio-thunderx.c to use the new for_each_set_clump macro.
Instead of looping over every bank in thunderx_gpio_set_multiple, we
now skip any bank whose mask has no set bits, which saves cycles.

Cc: Robert Richter 
Cc: Bartosz Golaszewski 
Signed-off-by: Syed Nayyar Waris 
Signed-off-by: William Breathitt Gray 
---
Changes in v9:
 - No change.

Changes in v8:
 - No change.

Changes in v7:
 - No change.

Changes in v6:
 - No change.

Changes in v5:
 - No change.

Changes in v4:
 - Minor change: Inline value '64' in code for better code readability.

Changes in v3:
 - Change datatype of some variables from u64 to unsigned long
   in function thunderx_gpio_set_multiple.

Changes in v2:
 - No change.

 drivers/gpio/gpio-thunderx.c | 11 +++
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/drivers/gpio/gpio-thunderx.c b/drivers/gpio/gpio-thunderx.c
index 9f66deab46ea..58c9bb25a377 100644
--- a/drivers/gpio/gpio-thunderx.c
+++ b/drivers/gpio/gpio-thunderx.c
@@ -275,12 +275,15 @@ static void thunderx_gpio_set_multiple(struct gpio_chip 
*chip,
   unsigned long *bits)
 {
int bank;
-   u64 set_bits, clear_bits;
+   unsigned long set_bits, clear_bits, gpio_mask;
+   unsigned long offset;
+
struct thunderx_gpio *txgpio = gpiochip_get_data(chip);
 
-   for (bank = 0; bank <= chip->ngpio / 64; bank++) {
-   set_bits = bits[bank] & mask[bank];
-   clear_bits = ~bits[bank] & mask[bank];
+   for_each_set_clump(offset, gpio_mask, mask, chip->ngpio, 64) {
+   bank = offset / 64;
+   set_bits = bits[bank] & gpio_mask;
+   clear_bits = ~bits[bank] & gpio_mask;
writeq(set_bits, txgpio->register_base + (bank * GPIO_2ND_BANK) + GPIO_TX_SET);
writeq(clear_bits, txgpio->register_base + (bank * GPIO_2ND_BANK) + GPIO_TX_CLR);
}
-- 
2.26.2



[PATCH v9 2/4] lib/test_bitmap.c: Add for_each_set_clump test cases

2020-06-27 Thread Syed Nayyar Waris
The introduction of the generic for_each_set_clump macro needs test
cases to verify the implementation. This patch adds test cases for
clump sizes of 8 bits, 24 bits, 30 bits and 6 bits. The cases cover
clumps that are split at a word boundary as well as bitmaps with
zeroes at the start and in the middle.

Signed-off-by: Syed Nayyar Waris 
Reviewed-by: Andy Shevchenko 
Signed-off-by: William Breathitt Gray 
---
Changes in v9:
 - No change.

Changes in v8:
 - [Patch 2/4]: Minor change: Use '__initdata' for correct section mismatch
   in 'clump_test_data' array.

Changes in v7:
 - Minor changes: Use macro 'DECLARE_BITMAP()' and split 'struct'
   definition and test data.

Changes in v6:
 - Make 'for loop' inside 'test_for_each_set_clump' more succinct.

Changes in v5:
 - No change.

Changes in v4:
 - Use 'for' loop in test function of 'for_each_set_clump'.

Changes in v3:
 - No Change.

Changes in v2:
 - Unify different tests for 'for_each_set_clump'. Pass test data as
   function parameters.
 - Remove unnecessary bitmap_zero calls.

 lib/test_bitmap.c | 145 ++
 1 file changed, 145 insertions(+)

diff --git a/lib/test_bitmap.c b/lib/test_bitmap.c
index 6b13150667f5..78c0048870a6 100644
--- a/lib/test_bitmap.c
+++ b/lib/test_bitmap.c
@@ -155,6 +155,38 @@ static bool __init __check_eq_clump8(const char *srcfile, 
unsigned int line,
return true;
 }
 
+static bool __init __check_eq_clump(const char *srcfile, unsigned int line,
+   const unsigned int offset,
+   const unsigned int size,
+   const unsigned long *const clump_exp,
+   const unsigned long *const clump,
+   const unsigned long clump_size)
+{
+   unsigned long exp;
+
+   if (offset >= size) {
+   pr_warn("[%s:%u] bit offset for clump out-of-bounds: expected 
less than %u, got %u\n",
+   srcfile, line, size, offset);
+   return false;
+   }
+
+   exp = clump_exp[offset / clump_size];
+   if (!exp) {
+   pr_warn("[%s:%u] bit offset for zero clump: expected nonzero 
clump, got bit offset %u with clump value 0",
+   srcfile, line, offset);
+   return false;
+   }
+
+   if (*clump != exp) {
+   pr_warn("[%s:%u] expected clump value of 0x%lX, got clump value of 0x%lX",
+   srcfile, line, exp, *clump);
+   return false;
+   }
+
+   return true;
+}
+
+
 #define __expect_eq(suffix, ...)   \
({  \
int result = 0; \
@@ -172,6 +204,7 @@ static bool __init __check_eq_clump8(const char *srcfile, unsigned int line,
 #define expect_eq_pbl(...) __expect_eq(pbl, ##__VA_ARGS__)
 #define expect_eq_u32_array(...)   __expect_eq(u32_array, ##__VA_ARGS__)
 #define expect_eq_clump8(...)  __expect_eq(clump8, ##__VA_ARGS__)
+#define expect_eq_clump(...)   __expect_eq(clump, ##__VA_ARGS__)
 
 static void __init test_zero_clear(void)
 {
@@ -577,6 +610,28 @@ static void noinline __init test_mem_optimisations(void)
}
 }
 
+static const unsigned long clump_bitmap_data[] __initconst = {
+   0x38000201,
+   0x05ff0f38,
+   0xeffedcba,
+   0xabcd,
+   0x00aa,
+   0x00aa,
+   0x00ff,
+   0xaa00,
+   0xff00,
+   0x00aa,
+   0x,
+   0x,
+   0x,
+   0x0f00,
+   0x00ff,
+   0xaa00,
+   0xff00,
+   0x00aa,
+   0x0ac0,
+};
+
 static const unsigned char clump_exp[] __initconst = {
0x01,   /* 1 bit set */
0x02,   /* non-edge 1 bit set */
@@ -588,6 +643,95 @@ static const unsigned char clump_exp[] __initconst = {
0x05,   /* non-adjacent 2 bits set */
 };
 
+static const unsigned long clump_exp1[] __initconst = {
+   0x01,   /* 1 bit set */
+   0x02,   /* non-edge 1 bit set */
+   0x00,   /* zero bits set */
+   0x38,   /* 3 bits set across 4-bit boundary */
+   0x38,   /* Repeated clump */
+   0x0F,   /* 4 bits set */
+   0xFF,   /* all bits set */
+   0x05,   /* non-adjacent 2 bits set */
+};
+
+static const unsigned long clump_exp2[] __initconst = {
+   0xfedcba,   /* 24 bits */
+   0xabcdef,
+   0xaa,   /* Clump split between 2 words */
+   0x00,   /* zeroes in between */
+   0xaa,
+   0x00,
+   0xff,
+   0xaa,
+   0x00,
+   0xff,
+};
+
+static const

[PATCH v9 1/4] bitops: Introduce the for_each_set_clump macro

2020-06-27 Thread Syed Nayyar Waris
This macro iterates over each group of bits (clump) that has set bits
within a bitmap memory region. On each iteration, "start" is set to
the bit offset of the found clump, while the respective clump value is
stored to the location pointed to by "clump". Additionally, the
bitmap_get_value and bitmap_set_value functions are introduced to
respectively get and set an n-bit value in a bitmap memory region.
The value can have any width less than or equal to BITS_PER_LONG.
Moreover, when setting an n-bit value, if the value would extend past
a word boundary, it is split so that the low portion is stored in the
current word and the remaining portion in the next higher word. The
same split applies when retrieving an n-bit value from the bitmap.

Cc: Arnd Bergmann 
Signed-off-by: Syed Nayyar Waris 
Reviewed-by: Andy Shevchenko 
Signed-off-by: William Breathitt Gray 
---
Changes in v9:
 - No change.

Changes in v8:
 - No change.

Changes in v7:
 - No change.

Changes in v6:
 - No change.

Changes in v5:
 - No change.

Changes in v4:
 - No change.

Changes in v3:
 - No change.

Changes in v2:
 - No change.

 include/asm-generic/bitops/find.h | 19 ++
 include/linux/bitmap.h| 61 +++
 include/linux/bitops.h| 13 +++
 lib/find_bit.c| 14 +++
 4 files changed, 107 insertions(+)

diff --git a/include/asm-generic/bitops/find.h b/include/asm-generic/bitops/find.h
index 9fdf21302fdf..4e6600759455 100644
--- a/include/asm-generic/bitops/find.h
+++ b/include/asm-generic/bitops/find.h
@@ -97,4 +97,23 @@ extern unsigned long find_next_clump8(unsigned long *clump,
 #define find_first_clump8(clump, bits, size) \
find_next_clump8((clump), (bits), (size), 0)
 
+/**
+ * find_next_clump - find next clump with set bits in a memory region
+ * @clump: location to store copy of found clump
+ * @addr: address to base the search on
+ * @size: bitmap size in number of bits
+ * @offset: bit offset at which to start searching
+ * @clump_size: clump size in bits
+ *
+ * Returns the bit offset for the next set clump; the found clump value is
+ * copied to the location pointed by @clump. If no bits are set, returns @size.
+ */
+extern unsigned long find_next_clump(unsigned long *clump,
+ const unsigned long *addr,
+ unsigned long size, unsigned long offset,
+ unsigned long clump_size);
+
+#define find_first_clump(clump, bits, size, clump_size) \
+   find_next_clump((clump), (bits), (size), 0, (clump_size))
+
 #endif /*_ASM_GENERIC_BITOPS_FIND_H_ */
diff --git a/include/linux/bitmap.h b/include/linux/bitmap.h
index 99058eb81042..7ab2c65fc964 100644
--- a/include/linux/bitmap.h
+++ b/include/linux/bitmap.h
@@ -75,7 +75,11 @@
  *  bitmap_from_arr32(dst, buf, nbits)  Copy nbits from u32[] buf to dst
  *  bitmap_to_arr32(buf, src, nbits)Copy nbits from buf to u32[] dst
  *  bitmap_get_value8(map, start)   Get 8bit value from map at start
+ *  bitmap_get_value(map, start, nbits)Get bit value of size
+ * 'nbits' from map at start
  *  bitmap_set_value8(map, value, start)Set 8bit value to map at start
+ *  bitmap_set_value(map, value, start, nbits) Set bit value of size 'nbits'
+ * of map at start
  *
  * Note, bitmap_zero() and bitmap_fill() operate over the region of
  * unsigned longs, that is, bits behind bitmap till the unsigned long
@@ -563,6 +567,34 @@ static inline unsigned long bitmap_get_value8(const unsigned long *map,
return (map[index] >> offset) & 0xFF;
 }
 
+/**
+ * bitmap_get_value - get a value of n-bits from the memory region
+ * @map: address to the bitmap memory region
+ * @start: bit offset of the n-bit value
+ * @nbits: size of value in bits
+ *
+ * Returns value of nbits located at the @start bit offset within the @map
+ * memory region.
+ */
+static inline unsigned long bitmap_get_value(const unsigned long *map,
+ unsigned long start,
+ unsigned long nbits)
+{
+   const size_t index = BIT_WORD(start);
+   const unsigned long offset = start % BITS_PER_LONG;
+   const unsigned long ceiling = roundup(start + 1, BITS_PER_LONG);
+   const unsigned long space = ceiling - start;
+   unsigned long value_low, value_high;
+
+   if (space >= nbits)
+   return (map[index] >> offset) & GENMASK(nbits - 1, 0);
+   else {
+   value_low = map[index] & BITMAP_FIRST_WORD_MASK(start);
+   value_high = map[index + 1] & BITMAP_LAST_WORD_MASK(start + nbits);
+   return (value_low >> offset) 

[PATCH v9 0/4] Introduce the for_each_set_clump macro

2020-06-27 Thread Syed Nayyar Waris
Hello Linus,

Since this patchset primarily affects GPIO drivers, would you like
to pick it up through your GPIO tree?

This patchset introduces a new generic version of for_each_set_clump.
The previous version, for_each_set_clump8, used a fixed 8-bit clump
size, whereas the new generic version works with clumps of any size
less than or equal to BITS_PER_LONG. The patchset utilizes the new macro
in several GPIO drivers.

The earlier 8-bit for_each_set_clump8 facilitated a
for-loop syntax that iterates over a memory region entire groups of set
bits at a time.

For example, suppose you would like to iterate over a 32-bit integer 8
bits at a time, skipping over 8-bit groups with no set bit, where
 represents the current 8-bit group:

Example:1010   00110011
First loop: 1010   
Second loop:1010   00110011
Third loop:    00110011

Each iteration of the loop returns the next 8-bit group that has at
least one set bit.
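
The 8-bit behaviour described above can be sketched in plain user-space
C. This is only an illustration under stated assumptions:
next_clump8_u32() and collect_clumps8() are hypothetical names, not
kernel API, and the input word is an arbitrary value rather than the one
from the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Illustrative helper (not the kernel implementation): return the bit
 * offset of the next 8-bit group at or after 'offset' that has at least
 * one set bit, storing that group in *clump. Returns 32 when no further
 * group is found; *clump is left untouched in that case.
 */
static unsigned int next_clump8_u32(uint32_t word, unsigned int offset,
				    uint8_t *clump)
{
	assert(offset % 8 == 0);

	for (; offset < 32; offset += 8) {
		const uint8_t group = (word >> offset) & 0xFF;

		if (group) {
			*clump = group;
			return offset;
		}
	}
	return 32;
}

/* Collect every non-zero 8-bit group; returns how many were found. */
static size_t collect_clumps8(uint32_t word, unsigned int offsets[4],
			      uint8_t clumps[4])
{
	size_t n = 0;
	unsigned int offset = 0;
	uint8_t clump;

	while ((offset = next_clump8_u32(word, offset, &clump)) < 32) {
		offsets[n] = offset;
		clumps[n] = clump;
		n++;
		offset += 8;	/* continue with the following group */
	}
	return n;
}
```

For the word 0xAA00FF33, collect_clumps8() would report three groups:
0x33 at offset 0, 0xFF at offset 8 and 0xAA at offset 24. The all-zero
group at offset 16 is skipped.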

With the new for_each_set_clump, however, the clump size can differ from 8 bits.
Moreover, a clump can be split at a word boundary when the word size is not a
multiple of the clump size. The following examples show how the new macro
works for clump sizes of 24 bits and 6 bits.

Example 1:
clump size: 24 bits, Number of clumps (or ports): 10
The bitmap stores the bits from which successive clumps are retrieved.

 /* bitmap memory region */
0x00aaff00;  /* Most significant bits */
0xaaff;
0x00aa00aa;
0xabcdeffedcba;  /* Least significant bits */

Different iterations of for_each_set_clump:
'offset' is the bit position and 'clump' is the 24-bit clump from the
above bitmap.
Iteration first:    offset: 0    clump: 0xfedcba
Iteration second:   offset: 24   clump: 0xabcdef
Iteration third:    offset: 48   clump: 0xaa
Iteration fourth:   offset: 96   clump: 0xaa
Iteration fifth:    offset: 144  clump: 0xff
Iteration sixth:    offset: 168  clump: 0xaa
Iteration seventh:  offset: 216  clump: 0xff
Loop breaks because the remaining bits (0x00aa) at the end are fewer
than the clump size of 24 bits.

In the above example, the 24-bit clump retrieved in the third iteration
was split between bitmap[0] and bitmap[1]. The example also shows that
24-bit clumps of zeroes in between were skipped (preserving the previous
for_each_set_clump8 behaviour).
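
The word-boundary split can be sketched in user-space C as follows.
This is a minimal illustration, not the code from the patch: get_value()
and WORD_BITS are hypothetical names, the words are fixed at 64 bits,
and the data used in the note below is made up for the sketch.

```c
#include <assert.h>
#include <stdint.h>

#define WORD_BITS 64u

/*
 * Illustrative sketch: read an nbits-wide value (1..64) starting at bit
 * 'start' from an array of 64-bit words. The caller must make sure that
 * the following word is readable when the value straddles two words.
 */
static uint64_t get_value(const uint64_t *map, unsigned int start,
			  unsigned int nbits)
{
	const unsigned int index = start / WORD_BITS;
	const unsigned int offset = start % WORD_BITS;
	/* Bits available from 'start' to the end of its word: 1..64. */
	const unsigned int space = WORD_BITS - offset;
	uint64_t mask, value;

	assert(nbits >= 1 && nbits <= WORD_BITS);
	mask = nbits == WORD_BITS ? ~0ULL : (1ULL << nbits) - 1;

	value = map[index] >> offset;
	/* In this branch space < nbits <= 64, so the shift is below 64. */
	if (space < nbits)
		value |= map[index + 1] << space;

	return value & mask;
}
```

With map[0] = 0x1234abcdeffedcba and map[1] = 0xaa, get_value(map, 48, 24)
combines the top 16 bits of map[0] (0x1234) with the low 8 bits of
map[1] (0xaa) and yields 0xaa1234.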

Example 2:
clump size = 6 bits, Number of clumps (or ports) = 3.

 /* bitmap memory region */
0x00aaff00;  /* Most significant bits */
0xaaff;
0x0f00;
0x0ac0;  /* Least significant bits */

Different iterations of for_each_set_clump:
'offset' is the bit position and 'clump' is the 6-bit clump from the
above bitmap.
Iteration first:    offset: 6    clump: 0x2b
Loop breaks because 6 * 3 = 18 bits were traversed in the bitmap.
Here 6 * 3 is clump size * number of clumps.
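
The single iteration above can be reproduced with a small user-space
sketch. It assumes 64-bit words and only looks at the least significant
word (0x0ac0); clump_at() and next_set_clump() are hypothetical helpers
written for this illustration, not the functions added by the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative only: extract nbits (1..63) at bit 'start' from 64-bit words. */
static uint64_t clump_at(const uint64_t *map, unsigned int start,
			 unsigned int nbits)
{
	const unsigned int index = start / 64;
	const unsigned int offset = start % 64;
	uint64_t value;

	assert(nbits >= 1 && nbits < 64);

	value = map[index] >> offset;
	if (offset + nbits > 64)	/* clump straddles two words */
		value |= map[index + 1] << (64 - offset);

	return value & ((1ULL << nbits) - 1);
}

/*
 * Return the offset of the next non-zero clump at or after 'offset'
 * (a multiple of clump_size) and store it in *clump, or return 'size'
 * if none is left. Only whole clumps within the first 'size' bits are
 * considered.
 */
static unsigned int next_set_clump(uint64_t *clump, const uint64_t *map,
				   unsigned int size, unsigned int offset,
				   unsigned int clump_size)
{
	for (; offset + clump_size <= size; offset += clump_size) {
		const uint64_t value = clump_at(map, offset, clump_size);

		if (value) {
			*clump = value;
			return offset;
		}
	}
	return size;
}
```

Starting at offset 0 with size 18 and clump_size 6, the first call finds
clump 0x2b at offset 6 (bits 0 to 5 of 0x0ac0 are zero); a second call
starting at offset 12 returns 18, which ends the loop.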

Changes in v9:
 - [Patch 4/4]: Remove looping of 'for_each_set_clump' and instead process two 
   halves of a 64-bit bitmap separately or individually. Use normal spin_lock 
   call for second inner lock. And take the spin_lock_init call outside the 'if'
   condition in the probe function of driver.

Changes in v8:
 - [Patch 2/4]: Minor change: Use '__initdata' for correct section mismatch
   in 'clump_test_data' array.

Changes in v7:
 - [Patch 2/4]: Minor changes: Use macro 'DECLARE_BITMAP()' and split 'struct'
   definition and test data.

Changes in v6:
 - [Patch 2/4]: Make 'for loop' inside test_for_each_set_clump more
   succinct.

Changes in v5:
 - [Patch 4/4]: Minor change: Hardcode value for better code readability.

Changes in v4:
 - [Patch 2/4]: Use 'for' loop in test function of for_each_set_clump.
 - [Patch 3/4]: Minor change: Inline value for better code readability.
 - [Patch 4/4]: Minor change: Inline value for better code readability.

Changes in v3:
 - [Patch 3/4]: Change datatype of some variables from u64 to unsigned long
   in function thunderx_gpio_set_multiple.

Changes in v2:
 - [Patch 2/4]: Unify different tests for 'for_each_set_clump'. Pass test data
   as function parameters.
 - [Patch 2/4]: Remove unnecessary bitmap_zero calls.

Syed Nayyar Waris (4):
  bitops: Introduce the for_each_set_clump macro
  lib/test_bitmap.c: Add for_each_set_clump test cases
  gpio: thunderx: Utilize for_each_set_clump macro
  gpio: xilinx: Utilize generic bitmap_get_value and _set_value.

 drivers/gpio/gpio-thunderx.c  |  11 ++-
 drivers/gpio/gpio-xilinx.c|  66 +++---
 include/asm-generic/bitops/find.h |  19 
 include/linux/bitmap.h|  61 +
 include/linux/bitops.h|  13

Re: [PATCH v8 1/4] bitops: Introduce the for_each_set_clump macro

2020-06-20 Thread Syed Nayyar Waris
On Tue, Jun 16, 2020 at 1:44 PM Andy Shevchenko
 wrote:
>
> On Mon, Jun 15, 2020 at 06:21:18PM +0530, Syed Nayyar Waris wrote:
> > This macro iterates for each group of bits (clump) with set bits,
> > within a bitmap memory region. For each iteration, "start" is set to
> > the bit offset of the found clump, while the respective clump value is
> > stored to the location pointed by "clump". Additionally, the
> > bitmap_get_value and bitmap_set_value functions are introduced to
> > respectively get and set a value of n-bits in a bitmap memory region.
> > The n-bits can have any size less than or equal to BITS_PER_LONG.
> > Moreover, during setting value of n-bit in bitmap, if a situation arise
> > that the width of next n-bit is exceeding the word boundary, then it
> > will divide itself such that some portion of it is stored in that word,
> > while the remaining portion is stored in the next higher word. Similar
> > situation occurs while retrieving value of n-bits from bitmap.
>
> On the second view...
>
> > +static inline unsigned long bitmap_get_value(const unsigned long *map,
> > +   unsigned long start,
> > +   unsigned long nbits)
> > +{
> > + const size_t index = BIT_WORD(start);
> > + const unsigned long offset = start % BITS_PER_LONG;
>
> > + const unsigned long ceiling = roundup(start + 1, BITS_PER_LONG);
>
> This perhaps should use round_up()

I checked with 'round_up' and get the same values as with 'roundup'.
I verified this with the different clump tests.
Moreover, wherever 'space' evaluated to 64 with 'roundup', it also
evaluates to 64 with 'round_up'.

Further below ...

>
> > + const unsigned long space = ceiling - start;
>
> And I think I see a scenario to complain.
>
> If start == 0, then ceiling will be 64.
> space == 64. Not good.

Yes, you are right, when the 'start' is '0', then 'space' will be 64
(on arch where BITS_PER_LONG is 64).
But actually I want this to happen. I need 'space' to hold value 64
when 'start' is '0'. The reason is as follows:

Take bitmap_set_value() as an example. If nbits is 16 and 'start' is
zero, the 'if' branch inside bitmap_set_value() is taken because
space(64) >= nbits(16) is true. This branch handles the case where the
nbits-wide value falls completely within the first word and does not
have to be split into the next higher word of the bitmap.

This is what should happen, in my view. If 'space' were less than 64,
say 63 or 62, it would not correctly indicate the space remaining for
nbits to fill (bitmap_set_value) or to extract from
(bitmap_get_value).
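
To illustrate the point, here is a small user-space sketch of how
'space' behaves. It assumes 64-bit words; round_up_to(),
space_in_word() and fits_in_word() are hypothetical helpers written for
this mail, not kernel functions.

```c
#include <assert.h>

#define BITS_PER_WORD 64UL

/* Round x up to the next multiple of y (same idea as roundup()). */
static unsigned long round_up_to(unsigned long x, unsigned long y)
{
	return ((x + y - 1) / y) * y;
}

/* Number of bits from 'start' up to the end of the word containing it. */
static unsigned long space_in_word(unsigned long start)
{
	const unsigned long ceiling = round_up_to(start + 1, BITS_PER_WORD);
	const unsigned long space = ceiling - start;

	/* 'space' is 64 at the start of a word and 1 at its last bit. */
	assert(space >= 1 && space <= BITS_PER_WORD);
	return space;
}

/* Nonzero when an nbits-wide value at 'start' fits within a single word. */
static int fits_in_word(unsigned long start, unsigned long nbits)
{
	return space_in_word(start) >= nbits;
}
```

With these helpers, space_in_word(0) is 64 and fits_in_word(0, 16) is
true, so the single-word branch is taken. The split branch is only
reached when space < nbits <= 64, which means 'space' is at most 63
there.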

Kindly let me know if I have misunderstood something. Thanks!

>
> > + unsigned long value_low, value_high;
> > +
> > + if (space >= nbits)
> > + return (map[index] >> offset) & GENMASK(nbits - 1, 0);
> > + else {
> > + value_low = map[index] & BITMAP_FIRST_WORD_MASK(start);
> > + value_high = map[index + 1] & BITMAP_LAST_WORD_MASK(start + 
> > nbits);
> > + return (value_low >> offset) | (value_high << space);
> > + }
> > +}
>
> ...
>
> > +/**
> > + * bitmap_set_value - set n-bit value within a memory region
> > + * @map: address to the bitmap memory region
> > + * @value: value of nbits
> > + * @start: bit offset of the n-bit value
> > + * @nbits: size of value in bits
> > + */
> > +static inline void bitmap_set_value(unsigned long *map,
> > + unsigned long value,
> > + unsigned long start, unsigned long nbits)
> > +{
> > + const size_t index = BIT_WORD(start);
> > + const unsigned long offset = start % BITS_PER_LONG;
>
> > + const unsigned long ceiling = roundup(start + 1, BITS_PER_LONG);
> > + const unsigned long space = ceiling - start;
>
> Ditto for both lines.
>
> > + value &= GENMASK(nbits - 1, 0);
> > +
> > + if (space >= nbits) {
> > + map[index] &= ~(GENMASK(nbits + offset - 1, offset));
> > + map[index] |= value << offset;
> > + } else {
> > + map[index] &= ~BITMAP_FIRST_WORD_MASK(start);
> > + map[index] |= value << offset;
> > + map[index + 1] &= ~BITMAP_LAST_WORD_MASK(start + nbits);
> > + map[index + 1] |= (value >> space);
> > + }
> > +}
>
> --
> With Best Regards,
> Andy Shevchenko
>
>


Re: [PATCH v8 1/4] bitops: Introduce the for_each_set_clump macro

2020-06-20 Thread Syed Nayyar Waris
On Tue, Jun 16, 2020 at 1:44 PM Andy Shevchenko
 wrote:
>
> On Mon, Jun 15, 2020 at 06:21:18PM +0530, Syed Nayyar Waris wrote:
> > This macro iterates for each group of bits (clump) with set bits,
> > within a bitmap memory region. For each iteration, "start" is set to
> > the bit offset of the found clump, while the respective clump value is
> > stored to the location pointed by "clump". Additionally, the
> > bitmap_get_value and bitmap_set_value functions are introduced to
> > respectively get and set a value of n-bits in a bitmap memory region.
> > The n-bits can have any size less than or equal to BITS_PER_LONG.
> > Moreover, during setting value of n-bit in bitmap, if a situation arise
> > that the width of next n-bit is exceeding the word boundary, then it
> > will divide itself such that some portion of it is stored in that word,
> > while the remaining portion is stored in the next higher word. Similar
> > situation occurs while retrieving value of n-bits from bitmap.
>
> On the second view...
>
> > +static inline unsigned long bitmap_get_value(const unsigned long *map,
> > +   unsigned long start,
> > +   unsigned long nbits)
> > +{
> > + const size_t index = BIT_WORD(start);
> > + const unsigned long offset = start % BITS_PER_LONG;
>
> > + const unsigned long ceiling = roundup(start + 1, BITS_PER_LONG);
>
> This perhaps should use round_up()

Hi Andy. I will try round_up() and report back.
Further below ...

>
> > + const unsigned long space = ceiling - start;
>
> And I think I see a scenario to complain.
>
> If start == 0, then ceiling will be 64.
> space == 64. Not good.

Yes, you are right, when the 'start' is '0', then 'space' will be 64
(on arch where BITS_PER_LONG is 64).
But actually I want this to happen. I need 'space' to hold value 64
when 'start' is '0'. The reason is as follows:

Take bitmap_set_value() as an example. If nbits is 16 and 'start' is
zero, the 'if' branch inside bitmap_set_value() is taken because
space(64) >= nbits(16) is true. This branch handles the case where the
nbits-wide value falls completely within the first word and does not
have to be split into the next higher word of the bitmap.

This is what I want to happen. I will think more about this and let
you know further.

Kindly let me know if I have misunderstood something. Thanks!

>
> > + unsigned long value_low, value_high;
> > +
> > + if (space >= nbits)
> > + return (map[index] >> offset) & GENMASK(nbits - 1, 0);
> > + else {
> > + value_low = map[index] & BITMAP_FIRST_WORD_MASK(start);
> > + value_high = map[index + 1] & BITMAP_LAST_WORD_MASK(start + 
> > nbits);
> > + return (value_low >> offset) | (value_high << space);
> > + }
> > +}
>
> ...
>
> > +/**
> > + * bitmap_set_value - set n-bit value within a memory region
> > + * @map: address to the bitmap memory region
> > + * @value: value of nbits
> > + * @start: bit offset of the n-bit value
> > + * @nbits: size of value in bits
> > + */
> > +static inline void bitmap_set_value(unsigned long *map,
> > + unsigned long value,
> > + unsigned long start, unsigned long nbits)
> > +{
> > + const size_t index = BIT_WORD(start);
> > + const unsigned long offset = start % BITS_PER_LONG;
>
> > + const unsigned long ceiling = roundup(start + 1, BITS_PER_LONG);
> > + const unsigned long space = ceiling - start;
>
> Ditto for both lines.
>
> > + value &= GENMASK(nbits - 1, 0);
> > +
> > + if (space >= nbits) {
> > + map[index] &= ~(GENMASK(nbits + offset - 1, offset));
> > + map[index] |= value << offset;
> > + } else {
> > + map[index] &= ~BITMAP_FIRST_WORD_MASK(start);
> > + map[index] |= value << offset;
> > + map[index + 1] &= ~BITMAP_LAST_WORD_MASK(start + nbits);
> > + map[index + 1] |= (value >> space);
> > + }
> > +}
>
> --
> With Best Regards,
> Andy Shevchenko
>
>


Re: [PATCH v8 4/4] gpio: xilinx: Utilize for_each_set_clump macro

2020-06-19 Thread Syed Nayyar Waris
>
> Hi Syed,
>
> Thank you for the patch! Perhaps something to improve:
>
> [auto build test WARNING on 444fc5cde64330661bf59944c43844e7d4c2ccd8]
>
> url:
> https://github.com/0day-ci/linux/commits/Syed-Nayyar-Waris/Introduce-the-for_each_set_clump-macro/20200615-205729
> base:444fc5cde64330661bf59944c43844e7d4c2ccd8
> config: sparc64-randconfig-s032-20200615 (attached as .config)
> compiler: sparc64-linux-gcc (GCC) 9.3.0
> reproduce:
> # apt-get install sparse
> # sparse version: v0.6.2-rc1-3-g55607964-dirty
> # save the attached .config to linux build tree
> make W=1 C=1 ARCH=sparc64 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__'
>

>
>
> sparse warnings: (new ones prefixed by >>)
>
> >> include/linux/bitmap.h:639:45: sparse: sparse: shift too big (64) for type 
> >> unsigned long
> >> include/linux/bitmap.h:639:45: sparse: sparse: shift too big (64) for type 
> >> unsigned long
>include/linux/bitmap.h:594:63: sparse: sparse: shift too big (64) for type 
> unsigned long
> >> include/linux/bitmap.h:639:45: sparse: sparse: shift too big (64) for type 
> >> unsigned long
> >> include/linux/bitmap.h:638:17: sparse: sparse: invalid access past the end 
> >> of 'old' (8 8)
>

Hi All,

It seems that to reproduce this warning I have to use the sparc64
compiler. I have installed 'sparc64-linux-gnu-gcc' on my computer.
How and where do I specify that this compiler should be used for the
build?

I have downloaded config.gz (which contains the config file) and placed
it at the root of the Linux kernel tree, but the Makefile still has
'gcc' as the compiler. When I build, 'gcc' is used and not
'sparc64-linux-gnu-gcc'. I know I can manually change the Makefile to
use the sparc64 compiler, but I think there must be a more elegant way
to do this, perhaps using make menuconfig?

Kindly advise how I can reproduce the compiler warning.

Regards
Syed Nayyar Waris

> vim +639 include/linux/bitmap.h
>
> 169c474fb22d8a William Breathitt Gray 2019-12-04  613
> 803024b6c8a375 Syed Nayyar Waris  2020-06-15  614  /**
> 803024b6c8a375 Syed Nayyar Waris  2020-06-15  615   * bitmap_set_value - 
> set n-bit value within a memory region
> 803024b6c8a375 Syed Nayyar Waris  2020-06-15  616   * @map: address to 
> the bitmap memory region
> 803024b6c8a375 Syed Nayyar Waris  2020-06-15  617   * @value: value of 
> nbits
> 803024b6c8a375 Syed Nayyar Waris      2020-06-15  618   * @start: bit offset 
> of the n-bit value
> 803024b6c8a375 Syed Nayyar Waris  2020-06-15  619   * @nbits: size of 
> value in bits
> 803024b6c8a375 Syed Nayyar Waris  2020-06-15  620   */
> 803024b6c8a375 Syed Nayyar Waris  2020-06-15  621  static inline void 
> bitmap_set_value(unsigned long *map,
> 803024b6c8a375 Syed Nayyar Waris  2020-06-15  622     
>   unsigned long value,
> 803024b6c8a375 Syed Nayyar Waris      2020-06-15  623 
>   unsigned long start, unsigned long nbits)
> 803024b6c8a375 Syed Nayyar Waris  2020-06-15  624  {
> 803024b6c8a375 Syed Nayyar Waris  2020-06-15  625   const size_t index = 
> BIT_WORD(start);
> 803024b6c8a375 Syed Nayyar Waris  2020-06-15  626   const unsigned long 
> offset = start % BITS_PER_LONG;
> 803024b6c8a375 Syed Nayyar Waris  2020-06-15  627   const unsigned long 
> ceiling = roundup(start + 1, BITS_PER_LONG);
> 803024b6c8a375 Syed Nayyar Waris  2020-06-15  628   const unsigned long 
> space = ceiling - start;
> 803024b6c8a375 Syed Nayyar Waris  2020-06-15  629
> 803024b6c8a375 Syed Nayyar Waris  2020-06-15  630   value &= 
> GENMASK(nbits - 1, 0);
> 803024b6c8a375 Syed Nayyar Waris  2020-06-15  631
> 803024b6c8a375 Syed Nayyar Waris  2020-06-15  632   if (space >= nbits) {
> 803024b6c8a375 Syed Nayyar Waris  2020-06-15  633   map[index] &= 
> ~(GENMASK(nbits + offset - 1, offset));
> 803024b6c8a375 Syed Nayyar Waris  2020-06-15  634   map[index] |= 
> value << offset;
> 803024b6c8a375 Syed Nayyar Waris  2020-06-15  635   } else {
> 803024b6c8a375 Syed Nayyar Waris  2020-06-15  636   map[index] &= 
> ~BITMAP_FIRST_WORD_MASK(start);
> 803024b6c8a375 Syed Nayyar Waris  2020-06-15  637   map[index] |= 
> value << offset;
> 803024b6c8a375 Syed Nayyar Waris  2020-06-15 @638   map[index + 
> 1] &= ~BITMAP_LAST_WORD_MASK(start + nbits);
> 803024b6c8a375 Syed Nayyar Waris  2020-06-15 @639   map[index + 
> 1] |= (value >> space);
> 803024b6c8a375 Syed Nayyar Waris  2020-06-15  640   }
> 803024b6c8a375 Syed Nayyar Waris  2020-06-15  641  }
> 803024b6c8a375 Syed Nayyar Waris  2020-06-15  642
>
> :: The code at line 639 was first introduced by commit
> :: 803024b6c8a375ba9e9e9467595d7d52d4f6a38e bitops: Introduce the 
> for_each_set_clump macro
>
> :: TO: Syed Nayyar Waris 


Re: [PATCH v8 4/4] gpio: xilinx: Utilize for_each_set_clump macro

2020-06-15 Thread Syed Nayyar Waris
On Tue, Jun 16, 2020 at 1:39 AM kernel test robot  wrote:
>
> Hi Syed,
>
> Thank you for the patch! Perhaps something to improve:
>
> [auto build test WARNING on 444fc5cde64330661bf59944c43844e7d4c2ccd8]
>
> url:
> https://github.com/0day-ci/linux/commits/Syed-Nayyar-Waris/Introduce-the-for_each_set_clump-macro/20200615-205729
> base:444fc5cde64330661bf59944c43844e7d4c2ccd8
> config: sparc64-randconfig-s032-20200615 (attached as .config)
> compiler: sparc64-linux-gcc (GCC) 9.3.0
> reproduce:
> # apt-get install sparse
> # sparse version: v0.6.2-rc1-3-g55607964-dirty
> # save the attached .config to linux build tree
> make W=1 C=1 ARCH=sparc64 CF='-fdiagnostic-prefix -D__CHECK_ENDIAN__'
>
> If you fix the issue, kindly add following tag as appropriate
> Reported-by: kernel test robot 
>
>
> sparse warnings: (new ones prefixed by >>)
>
> >> include/linux/bitmap.h:639:45: sparse: sparse: shift too big (64) for type 
> >> unsigned long
> >> include/linux/bitmap.h:639:45: sparse: sparse: shift too big (64) for type 
> >> unsigned long
>include/linux/bitmap.h:594:63: sparse: sparse: shift too big (64) for type 
> unsigned long
> >> include/linux/bitmap.h:639:45: sparse: sparse: shift too big (64) for type 
> >> unsigned long
> >> include/linux/bitmap.h:638:17: sparse: sparse: invalid access past the end 
> >> of 'old' (8 8)
>
> vim +639 include/linux/bitmap.h
>
> 169c474fb22d8a William Breathitt Gray 2019-12-04  613
> 803024b6c8a375 Syed Nayyar Waris  2020-06-15  614  /**
> 803024b6c8a375 Syed Nayyar Waris  2020-06-15  615   * bitmap_set_value - 
> set n-bit value within a memory region
> 803024b6c8a375 Syed Nayyar Waris  2020-06-15  616   * @map: address to 
> the bitmap memory region
> 803024b6c8a375 Syed Nayyar Waris  2020-06-15  617   * @value: value of 
> nbits
> 803024b6c8a375 Syed Nayyar Waris  2020-06-15  618   * @start: bit offset 
> of the n-bit value
> 803024b6c8a375 Syed Nayyar Waris  2020-06-15  619   * @nbits: size of 
> value in bits
> 803024b6c8a375 Syed Nayyar Waris  2020-06-15  620   */
> 803024b6c8a375 Syed Nayyar Waris  2020-06-15  621  static inline void 
> bitmap_set_value(unsigned long *map,
> 803024b6c8a375 Syed Nayyar Waris      2020-06-15  622 
>       unsigned long value,
> 803024b6c8a375 Syed Nayyar Waris  2020-06-15  623     
>   unsigned long start, unsigned long nbits)
> 803024b6c8a375 Syed Nayyar Waris  2020-06-15  624  {
> 803024b6c8a375 Syed Nayyar Waris  2020-06-15  625   const size_t index = 
> BIT_WORD(start);
> 803024b6c8a375 Syed Nayyar Waris  2020-06-15  626   const unsigned long 
> offset = start % BITS_PER_LONG;
> 803024b6c8a375 Syed Nayyar Waris  2020-06-15  627   const unsigned long 
> ceiling = roundup(start + 1, BITS_PER_LONG);
> 803024b6c8a375 Syed Nayyar Waris  2020-06-15  628   const unsigned long 
> space = ceiling - start;
> 803024b6c8a375 Syed Nayyar Waris  2020-06-15  629
> 803024b6c8a375 Syed Nayyar Waris  2020-06-15  630   value &= 
> GENMASK(nbits - 1, 0);
> 803024b6c8a375 Syed Nayyar Waris  2020-06-15  631
> 803024b6c8a375 Syed Nayyar Waris      2020-06-15  632   if (space >= nbits) {
> 803024b6c8a375 Syed Nayyar Waris  2020-06-15  633   map[index] &= 
> ~(GENMASK(nbits + offset - 1, offset));
> 803024b6c8a375 Syed Nayyar Waris  2020-06-15  634   map[index] |= 
> value << offset;
> 803024b6c8a375 Syed Nayyar Waris  2020-06-15  635   } else {
> 803024b6c8a375 Syed Nayyar Waris  2020-06-15  636   map[index] &= 
> ~BITMAP_FIRST_WORD_MASK(start);
> 803024b6c8a375 Syed Nayyar Waris  2020-06-15  637   map[index] |= 
> value << offset;
> 803024b6c8a375 Syed Nayyar Waris      2020-06-15 @638   map[index + 
> 1] &= ~BITMAP_LAST_WORD_MASK(start + nbits);
> 803024b6c8a375 Syed Nayyar Waris  2020-06-15 @639   map[index + 
> 1] |= (value >> space);
> 803024b6c8a375 Syed Nayyar Waris  2020-06-15  640   }
> 803024b6c8a375 Syed Nayyar Waris  2020-06-15  641  }
> 803024b6c8a375 Syed Nayyar Waris  2020-06-15  642


Regarding the compilation warning reported above:

"sparse: shift too big (64) for type unsigned long" at line 639
"sparse: invalid access past the end of 'old' (8 8)" at line 638

Kindly refer to the code above, at these line numbers.

I am in the process of fixing this warning, but what would be the fix?
At the moment I can't think of a code change that makes the warning
disappear (especially at line 639). Can anyone please explain the
meaning of the warning in more detail?

By the way, this warning was not reported for the earlier v7 of the patchset.

Regards
Syed Nayyar Waris


[PATCH v8 4/4] gpio: xilinx: Utilize for_each_set_clump macro

2020-06-15 Thread Syed Nayyar Waris
This patch reimplements the xgpio_set_multiple function in
drivers/gpio/gpio-xilinx.c to use the new for_each_set_clump macro.
Instead of looping over each bit in the xgpio_set_multiple function,
we can now check one channel at a time and save cycles.

Cc: Bartosz Golaszewski 
Cc: Michal Simek 
Signed-off-by: Syed Nayyar Waris 
Signed-off-by: William Breathitt Gray 
---
Changes in v8:
 - No change.

Changes in v7:
 - No change.

Changes in v6:
 - No change.

Changes in v5:
 - Minor change: Inline values '32' and '64' in code for better
   code readability.

Changes in v4:
 - Minor change: Inline values '32' and '64' in code for better
   code readability.

Changes in v3:
 - No change.

Changes in v2:
 - No change.

 drivers/gpio/gpio-xilinx.c | 62 --
 1 file changed, 32 insertions(+), 30 deletions(-)

diff --git a/drivers/gpio/gpio-xilinx.c b/drivers/gpio/gpio-xilinx.c
index 67f9f82e0db0..e81092dea27e 100644
--- a/drivers/gpio/gpio-xilinx.c
+++ b/drivers/gpio/gpio-xilinx.c
@@ -136,39 +136,41 @@ static void xgpio_set(struct gpio_chip *gc, unsigned int gpio, int val)
 static void xgpio_set_multiple(struct gpio_chip *gc, unsigned long *mask,
   unsigned long *bits)
 {
-   unsigned long flags;
+   unsigned long flags[2];
struct xgpio_instance *chip = gpiochip_get_data(gc);
-   int index = xgpio_index(chip, 0);
-   int offset, i;
-
-   spin_lock_irqsave(&chip->gpio_lock[index], flags);
-
-   /* Write to GPIO signals */
-   for (i = 0; i < gc->ngpio; i++) {
-   if (*mask == 0)
-   break;
-   /* Once finished with an index write it out to the register */
-   if (index !=  xgpio_index(chip, i)) {
-   xgpio_writereg(chip->regs + XGPIO_DATA_OFFSET +
-  index * XGPIO_CHANNEL_OFFSET,
-  chip->gpio_state[index]);
-   spin_unlock_irqrestore(&chip->gpio_lock[index], flags);
-   index =  xgpio_index(chip, i);
-   spin_lock_irqsave(&chip->gpio_lock[index], flags);
-   }
-   if (__test_and_clear_bit(i, mask)) {
-   offset =  xgpio_offset(chip, i);
-   if (test_bit(i, bits))
-   chip->gpio_state[index] |= BIT(offset);
-   else
-   chip->gpio_state[index] &= ~BIT(offset);
-   }
+   u32 *const state = chip->gpio_state;
+   unsigned int *const width = chip->gpio_width;
+   unsigned long offset, clump;
+   size_t index;
+
+   DECLARE_BITMAP(old, 64);
+   DECLARE_BITMAP(new, 64);
+   DECLARE_BITMAP(changed, 64);
+
+   spin_lock_irqsave(&chip->gpio_lock[0], flags[0]);
+   spin_lock_irqsave(&chip->gpio_lock[1], flags[1]);
+
+   bitmap_set_value(old, state[0], 0, width[0]);
+   bitmap_set_value(old, state[1], width[0], width[1]);
+   bitmap_replace(new, old, bits, mask, gc->ngpio);
+
+   bitmap_set_value(old, state[0], 0, 32);
+   bitmap_set_value(old, state[1], 32, 32);
+   state[0] = bitmap_get_value(new, 0, width[0]);
+   state[1] = bitmap_get_value(new, width[0], width[1]);
+   bitmap_set_value(new, state[0], 0, 32);
+   bitmap_set_value(new, state[1], 32, 32);
+   bitmap_xor(changed, old, new, 64);
+
+   for_each_set_clump(offset, clump, changed, 64, 32) {
+   index = offset / 32;
+   xgpio_writereg(chip->regs + XGPIO_DATA_OFFSET +
+   index * XGPIO_CHANNEL_OFFSET,
+   state[index]);
}
 
-   xgpio_writereg(chip->regs + XGPIO_DATA_OFFSET +
-  index * XGPIO_CHANNEL_OFFSET, chip->gpio_state[index]);
-
-   spin_unlock_irqrestore(&chip->gpio_lock[index], flags);
+   spin_unlock_irqrestore(&chip->gpio_lock[1], flags[1]);
+   spin_unlock_irqrestore(&chip->gpio_lock[0], flags[0]);
 }
 
 /**
-- 
2.26.2



[PATCH v8 3/4] gpio: thunderx: Utilize for_each_set_clump macro

2020-06-15 Thread Syed Nayyar Waris
This patch reimplements the thunderx_gpio_set_multiple function in
drivers/gpio/gpio-thunderx.c to use the new for_each_set_clump macro.
Instead of looping over every bank in thunderx_gpio_set_multiple, the
function now skips banks whose mask has no bits set, which saves cycles.
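
As a rough userspace illustration of the bank-skipping idea (a sketch
only, not the driver code): the type bank_update and the function
plan_bank_writes are hypothetical names, and each bank is assumed to be
exactly one 64-bit word.

```c
#include <stddef.h>
#include <stdint.h>

/* One pending register update for a 64-bit GPIO bank (hypothetical type). */
struct bank_update {
	size_t bank;         /* bank index */
	uint64_t set_bits;   /* lines to drive high */
	uint64_t clear_bits; /* lines to drive low */
};

/*
 * Walk the per-bank mask words and record an update only for banks whose
 * mask has at least one bit set. Returns the number of updates written
 * to out[], which must have room for nbanks entries.
 */
static size_t plan_bank_writes(const uint64_t *mask, const uint64_t *bits,
			       size_t nbanks, struct bank_update *out)
{
	size_t n = 0;

	for (size_t bank = 0; bank < nbanks; bank++) {
		if (!mask[bank])
			continue; /* nothing requested for this bank */

		out[n].bank = bank;
		out[n].set_bits = bits[bank] & mask[bank];
		out[n].clear_bits = ~bits[bank] & mask[bank];
		n++;
	}
	return n;
}
```

A bank with an all-zero mask produces no entry, so a caller would issue
no register writes for it.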

Cc: Robert Richter 
Cc: Bartosz Golaszewski 
Signed-off-by: Syed Nayyar Waris 
Signed-off-by: William Breathitt Gray 
---
Changes in v8:
 - No change.

Changes in v7:
 - No change.

Changes in v6:
 - No change.

Changes in v5:
 - No change.

Changes in v4:
 - Minor change: Inline value '64' in code for better code readability.

Changes in v3:
 - Change datatype of some variables from u64 to unsigned long
   in function thunderx_gpio_set_multiple.

Changes in v2:
 - No change.

 drivers/gpio/gpio-thunderx.c | 11 +++
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/drivers/gpio/gpio-thunderx.c b/drivers/gpio/gpio-thunderx.c
index 9f66deab46ea..58c9bb25a377 100644
--- a/drivers/gpio/gpio-thunderx.c
+++ b/drivers/gpio/gpio-thunderx.c
@@ -275,12 +275,15 @@ static void thunderx_gpio_set_multiple(struct gpio_chip *chip,
   unsigned long *bits)
 {
int bank;
-   u64 set_bits, clear_bits;
+   unsigned long set_bits, clear_bits, gpio_mask;
+   unsigned long offset;
+
struct thunderx_gpio *txgpio = gpiochip_get_data(chip);
 
-   for (bank = 0; bank <= chip->ngpio / 64; bank++) {
-   set_bits = bits[bank] & mask[bank];
-   clear_bits = ~bits[bank] & mask[bank];
+   for_each_set_clump(offset, gpio_mask, mask, chip->ngpio, 64) {
+   bank = offset / 64;
+   set_bits = bits[bank] & gpio_mask;
+   clear_bits = ~bits[bank] & gpio_mask;
writeq(set_bits, txgpio->register_base + (bank * GPIO_2ND_BANK) + GPIO_TX_SET);
writeq(clear_bits, txgpio->register_base + (bank * GPIO_2ND_BANK) + GPIO_TX_CLR);
}
-- 
2.26.2



[PATCH v8 2/4] lib/test_bitmap.c: Add for_each_set_clump test cases

2020-06-15 Thread Syed Nayyar Waris
The introduction of the generic for_each_set_clump macro needs test
cases to verify the implementation. This patch adds test cases for
clump sizes of 8 bits, 24 bits, 30 bits and 6 bits. The cases cover
clumps that are split at a word boundary as well as bitmaps with zeroes
at the start and in the middle.

Signed-off-by: Syed Nayyar Waris 
Reviewed-by: Andy Shevchenko 
Signed-off-by: William Breathitt Gray 
---
Changes in v8:
 - [Patch 2/4]: Minor change: Use '__initdata' for correct section mismatch
   in 'clump_test_data' array.

Changes in v7:
 - Minor changes: Use macro 'DECLARE_BITMAP()' and split 'struct'
   definition and test data.

Changes in v6:
 - Make 'for loop' inside 'test_for_each_set_clump' more succinct.

Changes in v5:
 - No change.

Changes in v4:
 - Use 'for' loop in test function of 'for_each_set_clump'.

Changes in v3:
 - No Change.

Changes in v2:
 - Unify different tests for 'for_each_set_clump'. Pass test data as
   function parameters.
 - Remove unnecessary bitmap_zero calls.

 lib/test_bitmap.c | 145 ++
 1 file changed, 145 insertions(+)

diff --git a/lib/test_bitmap.c b/lib/test_bitmap.c
index 6b13150667f5..78c0048870a6 100644
--- a/lib/test_bitmap.c
+++ b/lib/test_bitmap.c
@@ -155,6 +155,38 @@ static bool __init __check_eq_clump8(const char *srcfile, unsigned int line,
return true;
 }
 
+static bool __init __check_eq_clump(const char *srcfile, unsigned int line,
+   const unsigned int offset,
+   const unsigned int size,
+   const unsigned long *const clump_exp,
+   const unsigned long *const clump,
+   const unsigned long clump_size)
+{
+   unsigned long exp;
+
+   if (offset >= size) {
+   pr_warn("[%s:%u] bit offset for clump out-of-bounds: expected less than %u, got %u\n",
+   srcfile, line, size, offset);
+   return false;
+   }
+
+   exp = clump_exp[offset / clump_size];
+   if (!exp) {
+   pr_warn("[%s:%u] bit offset for zero clump: expected nonzero clump, got bit offset %u with clump value 0",
+   srcfile, line, offset);
+   return false;
+   }
+
+   if (*clump != exp) {
+   pr_warn("[%s:%u] expected clump value of 0x%lX, got clump value of 0x%lX",
+   srcfile, line, exp, *clump);
+   return false;
+   }
+
+   return true;
+}
+
+
 #define __expect_eq(suffix, ...)   \
({  \
int result = 0; \
@@ -172,6 +204,7 @@ static bool __init __check_eq_clump8(const char *srcfile, unsigned int line,
 #define expect_eq_pbl(...) __expect_eq(pbl, ##__VA_ARGS__)
 #define expect_eq_u32_array(...)   __expect_eq(u32_array, ##__VA_ARGS__)
 #define expect_eq_clump8(...)  __expect_eq(clump8, ##__VA_ARGS__)
+#define expect_eq_clump(...)   __expect_eq(clump, ##__VA_ARGS__)
 
 static void __init test_zero_clear(void)
 {
@@ -577,6 +610,28 @@ static void noinline __init test_mem_optimisations(void)
}
 }
 
+static const unsigned long clump_bitmap_data[] __initconst = {
+   0x38000201,
+   0x05ff0f38,
+   0xeffedcba,
+   0xabcd,
+   0x00aa,
+   0x00aa,
+   0x00ff,
+   0xaa00,
+   0xff00,
+   0x00aa,
+   0x,
+   0x,
+   0x,
+   0x0f00,
+   0x00ff,
+   0xaa00,
+   0xff00,
+   0x00aa,
+   0x0ac0,
+};
+
 static const unsigned char clump_exp[] __initconst = {
0x01,   /* 1 bit set */
0x02,   /* non-edge 1 bit set */
@@ -588,6 +643,95 @@ static const unsigned char clump_exp[] __initconst = {
0x05,   /* non-adjacent 2 bits set */
 };
 
+static const unsigned long clump_exp1[] __initconst = {
+   0x01,   /* 1 bit set */
+   0x02,   /* non-edge 1 bit set */
+   0x00,   /* zero bits set */
+   0x38,   /* 3 bits set across 4-bit boundary */
+   0x38,   /* Repeated clump */
+   0x0F,   /* 4 bits set */
+   0xFF,   /* all bits set */
+   0x05,   /* non-adjacent 2 bits set */
+};
+
+static const unsigned long clump_exp2[] __initconst = {
+   0xfedcba,   /* 24 bits */
+   0xabcdef,
+   0xaa,   /* Clump split between 2 words */
+   0x00,   /* zeroes in between */
+   0xaa,
+   0x00,
+   0xff,
+   0xaa,
+   0x00,
+   0xff,
+};
+
+static const unsigned long clump_exp3[] __initconst = 

[PATCH v8 1/4] bitops: Introduce the for_each_set_clump macro

2020-06-15 Thread Syed Nayyar Waris
This macro iterates over each group of bits (clump) that has set bits
within a bitmap memory region. For each iteration, "start" is set to
the bit offset of the found clump, while the respective clump value is
stored to the location pointed to by "clump". Additionally, the
bitmap_get_value and bitmap_set_value functions are introduced to
respectively get and set an n-bit value in a bitmap memory region.
The value can have any size less than or equal to BITS_PER_LONG.
Moreover, when an n-bit value being set would cross a word boundary, it
is split so that the lower portion is stored in the current word and
the remaining portion is stored in the next higher word. The same
split applies when retrieving an n-bit value from the bitmap.
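
The word-boundary split described above can be sketched in plain
userspace C. This is an illustration under stated assumptions, not the
kernel implementation: WORD_BITS, low_mask, get_value and set_value are
hypothetical stand-ins for BITS_PER_LONG and the bitmap helpers.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define WORD_BITS (sizeof(unsigned long) * CHAR_BIT)

/* Mask with the low n bits set, valid for 1 <= n <= WORD_BITS. */
static unsigned long low_mask(unsigned long n)
{
	return ~0UL >> (WORD_BITS - n);
}

/* Read an nbits-wide value starting at bit offset start. */
static unsigned long get_value(const unsigned long *map, unsigned long start,
			       unsigned long nbits)
{
	const size_t index = start / WORD_BITS;
	const unsigned long offset = start % WORD_BITS;
	const unsigned long space = WORD_BITS - offset; /* bits left in this word */
	unsigned long low, high;

	assert(nbits >= 1 && nbits <= WORD_BITS);

	if (space >= nbits)
		return (map[index] >> offset) & low_mask(nbits);

	/* Value straddles two words: low part here, high part in the next. */
	low = map[index] >> offset;
	high = map[index + 1] & low_mask(nbits - space);
	return low | (high << space);
}

/* Write an nbits-wide value at bit offset start, leaving other bits intact. */
static void set_value(unsigned long *map, unsigned long value,
		      unsigned long start, unsigned long nbits)
{
	const size_t index = start / WORD_BITS;
	const unsigned long offset = start % WORD_BITS;
	const unsigned long space = WORD_BITS - offset;

	assert(nbits >= 1 && nbits <= WORD_BITS);
	value &= low_mask(nbits);

	if (space >= nbits) {
		map[index] &= ~(low_mask(nbits) << offset);
		map[index] |= value << offset;
		return;
	}

	/* Split: the low 'space' bits go here, the rest into the next word. */
	map[index] &= low_mask(offset);
	map[index] |= value << offset;
	map[index + 1] &= ~low_mask(nbits - space);
	map[index + 1] |= value >> space;
}
```

Writing an 8-bit value four bits below a word boundary, for instance,
leaves its low nibble in the top of one word and its high nibble in the
bottom of the next; reading it back with the same start and width
reassembles the original value.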

Cc: Arnd Bergmann 
Signed-off-by: Syed Nayyar Waris 
Reviewed-by: Andy Shevchenko 
Signed-off-by: William Breathitt Gray 
---
Changes in v8:
 - No change.

Changes in v7:
 - No change.

Changes in v6:
 - No change.

Changes in v5:
 - No change.

Changes in v4:
 - No change.

Changes in v3:
 - No change.

Changes in v2:
 - No change.

 include/asm-generic/bitops/find.h | 19 ++
 include/linux/bitmap.h| 61 +++
 include/linux/bitops.h| 13 +++
 lib/find_bit.c| 14 +++
 4 files changed, 107 insertions(+)

diff --git a/include/asm-generic/bitops/find.h b/include/asm-generic/bitops/find.h
index 9fdf21302fdf..4e6600759455 100644
--- a/include/asm-generic/bitops/find.h
+++ b/include/asm-generic/bitops/find.h
@@ -97,4 +97,23 @@ extern unsigned long find_next_clump8(unsigned long *clump,
 #define find_first_clump8(clump, bits, size) \
find_next_clump8((clump), (bits), (size), 0)
 
+/**
+ * find_next_clump - find next clump with set bits in a memory region
+ * @clump: location to store copy of found clump
+ * @addr: address to base the search on
+ * @size: bitmap size in number of bits
+ * @offset: bit offset at which to start searching
+ * @clump_size: clump size in bits
+ *
+ * Returns the bit offset for the next set clump; the found clump value is
+ * copied to the location pointed by @clump. If no bits are set, returns @size.
+ */
+extern unsigned long find_next_clump(unsigned long *clump,
+ const unsigned long *addr,
+ unsigned long size, unsigned long offset,
+ unsigned long clump_size);
+
+#define find_first_clump(clump, bits, size, clump_size) \
+   find_next_clump((clump), (bits), (size), 0, (clump_size))
+
 #endif /*_ASM_GENERIC_BITOPS_FIND_H_ */
diff --git a/include/linux/bitmap.h b/include/linux/bitmap.h
index 99058eb81042..7ab2c65fc964 100644
--- a/include/linux/bitmap.h
+++ b/include/linux/bitmap.h
@@ -75,7 +75,11 @@
  *  bitmap_from_arr32(dst, buf, nbits)  Copy nbits from u32[] buf to dst
  *  bitmap_to_arr32(buf, src, nbits)Copy nbits from buf to u32[] dst
  *  bitmap_get_value8(map, start)   Get 8bit value from map at start
+ *  bitmap_get_value(map, start, nbits)Get bit value of size
+ * 'nbits' from map at start
  *  bitmap_set_value8(map, value, start)Set 8bit value to map at start
+ *  bitmap_set_value(map, value, start, nbits) Set bit value of size 'nbits'
+ * of map at start
  *
  * Note, bitmap_zero() and bitmap_fill() operate over the region of
  * unsigned longs, that is, bits behind bitmap till the unsigned long
@@ -563,6 +567,34 @@ static inline unsigned long bitmap_get_value8(const unsigned long *map,
return (map[index] >> offset) & 0xFF;
 }
 
+/**
+ * bitmap_get_value - get a value of n-bits from the memory region
+ * @map: address to the bitmap memory region
+ * @start: bit offset of the n-bit value
+ * @nbits: size of value in bits
+ *
+ * Returns value of nbits located at the @start bit offset within the @map
+ * memory region.
+ */
+static inline unsigned long bitmap_get_value(const unsigned long *map,
+ unsigned long start,
+ unsigned long nbits)
+{
+   const size_t index = BIT_WORD(start);
+   const unsigned long offset = start % BITS_PER_LONG;
+   const unsigned long ceiling = roundup(start + 1, BITS_PER_LONG);
+   const unsigned long space = ceiling - start;
+   unsigned long value_low, value_high;
+
+   if (space >= nbits)
+   return (map[index] >> offset) & GENMASK(nbits - 1, 0);
+   else {
+   value_low = map[index] & BITMAP_FIRST_WORD_MASK(start);
+   value_high = map[index + 1] & BITMAP_LAST_WORD_MASK(start + nbits);
+   return (value_low >> offset) | (value_high << spa

[PATCH v8 0/4] Introduce the for_each_set_clump macro

2020-06-15 Thread Syed Nayyar Waris
Hello Linus,

Since this patchset primarily affects GPIO drivers, would you like
to pick it up through your GPIO tree?

This patchset introduces a new generic version of for_each_set_clump.
The previous version, for_each_set_clump8, used a fixed 8-bit clump
size, but the new generic version works with clumps of any size up to
and including BITS_PER_LONG. The patchset uses the new macro in several
GPIO drivers.

The earlier 8-bit for_each_set_clump8 provided a for-loop syntax that
iterates over a memory region one entire group of set bits at a time.

For example, suppose you would like to iterate over a 32-bit integer 8
bits at a time, skipping over 8-bit groups with no set bit, where
XXXXXXXX represents the current 8-bit group:

Example:        10111110 00000000 11111111 00110011
First loop:     10111110 00000000 11111111 XXXXXXXX
Second loop:    10111110 00000000 XXXXXXXX 00110011
Third loop:     XXXXXXXX 00000000 11111111 00110011

Each iteration of the loop returns the next 8-bit group that has at
least one set bit.
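
The 8-bit behaviour can be sketched in userspace C as follows. This is
only an illustration of the iteration order and the skipping of empty
groups; set_clumps8 is a hypothetical helper, not the kernel macro.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Collect the bit offset and value of every 8-bit group of 'word' that
 * has at least one set bit, lowest group first. Returns the number of
 * groups found; offsets[] and clumps[] need room for 4 entries each.
 */
static size_t set_clumps8(uint32_t word, unsigned int offsets[4],
			  uint8_t clumps[4])
{
	size_t n = 0;

	for (unsigned int offset = 0; offset < 32; offset += 8) {
		uint8_t clump = (word >> offset) & 0xff;

		if (!clump)
			continue; /* all-zero group: skipped */

		offsets[n] = offset;
		clumps[n] = clump;
		n++;
	}
	return n;
}
```

For an assumed input of 0xBE00FF33, the helper reports three groups, at
offsets 0, 8 and 24; the all-zero group at offset 16 is never visited.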

With the new for_each_set_clump, the clump size is no longer fixed at 8 bits.
Moreover, a clump can be split across a word boundary when the word size is
not a multiple of the clump size. The following examples show how the new
macro works for clump sizes of 24 bits and 6 bits.

Example 1:
clump size: 24 bits, Number of clumps (or ports): 10
bitmap stores the bit information from where successive clumps are retrieved.

 /* bitmap memory region */
0x00aaff00;  /* Most significant bits */
0xaaff;
0x00aa00aa;
0xabcdeffedcba;  /* Least significant bits */

Different iterations of for_each_set_clump:-
'offset' is the bit position and 'clump' is the 24 bit clump from the
above bitmap.
Iteration first:offset: 0 clump: 0xfedcba
Iteration second:   offset: 24 clump: 0xabcdef
Iteration third:offset: 48 clump: 0xaa
Iteration fourth:   offset: 96 clump: 0xaa
Iteration fifth:offset: 144 clump: 0xff
Iteration sixth:offset: 168 clump: 0xaa
Iteration seventh:  offset: 216 clump: 0xff
The loop ends because the remaining bits (0x00aa) are fewer than the
clump size of 24 bits.

In the above example, the 24-bit clump retrieved in the third iteration
was split between bitmap[0] and bitmap[1]. The example also shows that
24-bit clumps of all zeroes in between are skipped (preserving the
previous for_each_set_clump8 behaviour).
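
A minimal userspace sketch of this iteration, assuming a clump size of
at most one word: read_clump and next_set_clump are hypothetical names,
the clump is gathered bit by bit for clarity rather than speed, and the
search simply stops once fewer than clump_size bits remain. It is not
the kernel's find_next_clump.

```c
#include <limits.h>
#include <stddef.h>

#define WORD_BITS (sizeof(unsigned long) * CHAR_BIT)

/* Gather clump_size bits starting at bit 'start', one bit at a time. */
static unsigned long read_clump(const unsigned long *map, unsigned long start,
				unsigned long clump_size)
{
	unsigned long value = 0;

	for (unsigned long i = 0; i < clump_size; i++) {
		unsigned long bit = start + i;

		if ((map[bit / WORD_BITS] >> (bit % WORD_BITS)) & 1UL)
			value |= 1UL << i;
	}
	return value;
}

/*
 * Return the offset of the first nonzero clump at or after 'offset'
 * (which must be a multiple of clump_size), storing its value in *clump.
 * Returns 'size' when no further clump with set bits fits in the bitmap.
 */
static unsigned long next_set_clump(unsigned long *clump,
				    const unsigned long *map,
				    unsigned long size, unsigned long offset,
				    unsigned long clump_size)
{
	for (; offset + clump_size <= size; offset += clump_size) {
		*clump = read_clump(map, offset, clump_size);
		if (*clump)
			return offset;
	}
	return size;
}
```

A caller would start at offset 0 and, after each hit, search again from
the returned offset plus the clump size until the function returns
'size'. Clumps that straddle two words need no special handling here
because each bit is located independently.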

Example 2:
clump size = 6 bits, Number of clumps (or ports) = 3.

 /* bitmap memory region */
0x00aaff00;  /* Most significant bits */
0xaaff;
0x0f00;
0x0ac0;  /* Least significant bits */

Different iterations of for_each_set_clump:
'offset' is the bit position and 'clump' is the 6 bit clump from the
above bitmap.
Iteration first:offset: 6 clump: 0x2b
Loop breaks because 6 * 3 = 18 bits traversed in bitmap.
Here 6 * 3 is clump size * no. of clumps.

Changes in v8:
 - [Patch 2/4]: Minor change: Use '__initdata' for correct section mismatch
   in 'clump_test_data' array.

Changes in v7:
 - [Patch 2/4]: Minor changes: Use macro 'DECLARE_BITMAP()' and split 'struct'
   definition and test data.

Changes in v6:
 - [Patch 2/4]: Make 'for loop' inside test_for_each_set_clump more
   succinct.

Changes in v5:
 - [Patch 4/4]: Minor change: Hardcode value for better code readability.

Changes in v4:
 - [Patch 2/4]: Use 'for' loop in test function of for_each_set_clump.
 - [Patch 3/4]: Minor change: Inline value for better code readability.
 - [Patch 4/4]: Minor change: Inline value for better code readability.

Changes in v3:
 - [Patch 3/4]: Change datatype of some variables from u64 to unsigned long
   in function thunderx_gpio_set_multiple.

Changes in v2:
 - [Patch 2/4]: Unify different tests for 'for_each_set_clump'. Pass test data
   as function parameters.
 - [Patch 2/4]: Remove unnecessary bitmap_zero calls.

Syed Nayyar Waris (4):
  bitops: Introduce the for_each_set_clump macro
  lib/test_bitmap.c: Add for_each_set_clump test cases
  gpio: thunderx: Utilize for_each_set_clump macro
  gpio: xilinx: Utilize for_each_set_clump macro

 drivers/gpio/gpio-thunderx.c  |  11 ++-
 drivers/gpio/gpio-xilinx.c|  62 ++---
 include/asm-generic/bitops/find.h |  19 
 include/linux/bitmap.h|  61 +
 include/linux/bitops.h|  13 +++
 lib/find_bit.c|  14 +++
 lib/test_bitmap.c | 145 ++
 7 files changed, 291 insertions(+), 34 deletions(-)


base-commit: 444fc5cde64330661bf59944c43844e7d4c2ccd8
-- 
2.26.2



Re: [PATCH v7 0/4] Introduce the for_each_set_clump macro

2020-06-15 Thread Syed Nayyar Waris
On Mon, May 25, 2020 at 3:06 PM Bartosz Golaszewski
 wrote:
>
> On Sun, 24 May 2020 at 07:00, Syed Nayyar Waris wrote:
> >
> > Hello Linus,
> >
> > Since this patchset primarily affects GPIO drivers, would you like
> > to pick it up through your GPIO tree?
> >
> > This patchset introduces a new generic version of for_each_set_clump.
> > The previous version of for_each_set_clump8 used a fixed size 8-bit
> > clump, but the new generic version can work with clump of any size but
> > less than or equal to BITS_PER_LONG. The patchset utilizes the new macro
> > in several GPIO drivers.
> >
> > The earlier 8-bit for_each_set_clump8 facilitated a
> > for-loop syntax that iterates over a memory region entire groups of set
> > bits at a time.
> >
>
> The GPIO part looks good to me. Linus: how do we go about merging it
> given the bitops dependency?
>
> Bart

A minor change has been made in patch [2/4] to fix a compilation warning.
Kindly refer to patchset v8 in future.

Thanks
Syed Nayyar Waris


Re: [PATCH v5 1/3] counter: 104-quad-8: Add lock guards - generic interface

2020-06-06 Thread Syed Nayyar Waris
On Sun, Jun 7, 2020 at 9:39 AM William Breathitt Gray
 wrote:
>
> On Sun, Jun 07, 2020 at 09:28:40AM +0530, Syed Nayyar Waris wrote:
> > On Sat, Apr 4, 2020 at 7:36 PM Jonathan Cameron  wrote:
> > >
> > > On Mon, 30 Mar 2020 23:54:32 +0530
> > > Syed Nayyar Waris  wrote:
> > >
> > > > Hi Jonathan
> > > >
> > > > >Looks good.  I'm not sure right now which tree I'll take this through
> > > > >(depends on whether it looks like we'll get an rc8 and hence I can sneak
> > > > >it in for the coming merge window or not).
> > > > >
> > > > >So poke me if I seem to have forgotten to apply this in a week or so.
> > > >
> > > > Gentle Reminder.
> > > > Thanks !
> > > > Syed Nayyar Waris
> > >
> > > Thanks.  I've applied it to the fixes-togreg branch of iio.git which will go
> > > upstream after the merge window closes.
> > >
> > > Thanks,
> > >
> > > Jonathan
> > >
> >
> > Hi Jonathan,
> >
> > I think only the patch [1/3] has been applied. Patches [2/3] and [3/3] have not.
> >
> > The three patches were:
> > https://lore.kernel.org/patchwork/patch/1210135/
> > https://lore.kernel.org/patchwork/patch/1210136/
> > https://lore.kernel.org/patchwork/patch/1210137/
> >
> > The last 2 patches need to be applied, I think.
> >
> > Regards
> > Syed Nayyar Waris
>
> Just a heads-up: the relevant bugs are present in the 5.7 release so it
> would be prudent to tag those two patches with respective Fixes lines.
>
> William Breathitt Gray

For reference, the 'Fixes' tags are listed below:
For patch [2/3]: counter: 104-quad-8: Add lock guards - differential encoder.
Fixes: bbef69e088c3 ("counter: 104-quad-8: Support Differential
Encoder Cable Status")

For patch [3/3]: counter: 104-quad-8: Add lock guards - filter clock prescaler.
Fixes: 9b74dddf79be ("counter: 104-quad-8: Support Filter Clock Prescaler")

I have replied on the v5 patches [2/3] and [3/3] with the (above)
'Fixes' tags. I have added the tags in the message.

I think that was what you meant.

Regards
Syed Nayyar Waris

