Re: [PATCH] fbcon: use default if cursor blink interval is not valid

2016-05-19 Thread Scot Doyle
On Thu, 19 May 2016, David Daney wrote:
> On 05/18/2016 09:21 PM, Scot Doyle wrote:
> > Two current [1] and three previous [2] systems locked during boot
> > because the cursor flash timer was set using an ops->cur_blink_jiffies
> > value of 0. Previous patches attempted to solve the problem by moving
> > variable initialization earlier in the setup sequence [2].
> > 
> > Use the normal cursor blink default interval of 200 ms if
> > ops->cur_blink_jiffies is not in the range specified in commit
> > bd63364caa8d. Since invalid values are not used, specific system
> > initialization timings should not cause lockups.
> > 
> 
> This patch just papers over the problem that you yourself introduced in commit
> bd63364caa8d ("vt: add cursor blink interval escape sequence").
> 
> As you know, I have a patch that fixes the problem at the source:
> https://lkml.org/lkml/2016/5/17/455
> 
> I don't like the idea of silently ignoring bad values passed in from other
> code (drivers/tty/vt/vt.c), and much less doing the check for bad values each
> time the timer expires rather than just once, where the bad value is first
> introduced.
> 
> I think it would be preferable to WARN() at the site the bad value is
> introduced, so that we can easily find the real source of the problem.
> Initialize cur_blink_jiffies to a sane default value, then if something
> attempts to set it to a value that would cause soft lockup, WARN and refuse to
> change it.

I agree this approach would be cleaner and am willing to give it a try
by submitting an alternative patch and ack'ing yours. Thanks for taking 
the time to critique my proposal.



Re: [PATCH] fbcon: use default if cursor blink interval is not valid

2016-05-19 Thread Scot Doyle
On Thu, 19 May 2016, David Daney wrote:
> On 05/18/2016 09:21 PM, Scot Doyle wrote:
> > Two current [1] and three previous [2] systems locked during boot
> > because the cursor flash timer was set using an ops->cur_blink_jiffies
> > value of 0. Previous patches attempted to solve the problem by moving
> > variable initialization earlier in the setup sequence [2].
> > 
> > Use the normal cursor blink default interval of 200 ms if
> > ops->cur_blink_jiffies is not in the range specified in commit
> > bd63364caa8d. Since invalid values are not used, specific system
> > initialization timings should not cause lockups.
> > 
> 
> This patch just papers over the problem that you yourself introduced in commit
> bd63364caa8d ("vt: add cursor blink interval escape sequence").
> 
> As you know, I have a patch that fixes the problem at the source:
> https://lkml.org/lkml/2016/5/17/455
> 
> I don't like the idea of silently ignoring bad values passed in from other
> code (drivers/tty/vt/vt.c), and much less doing the check for bad values each
> time the timer expires rather than just once, where the bad value is first
> introduced.
> 
> I think it would be preferable to WARN() at the site the bad value is
> introduced, so that we can easily find the real source of the problem.
> Initialize cur_blink_jiffies to a sane default value, then if something
> attempts to set it to a value that would cause soft lockup, WARN and refuse to
> change it.

I agree this approach would be cleaner and am willing to give it a try
by submitting an alternative patch and ack'ing yours. Thanks for taking 
the time to critique my proposal.



Re: [PATCH] fbcon: use default if cursor blink interval is not valid

2016-05-19 Thread David Daney

On 05/18/2016 09:21 PM, Scot Doyle wrote:

Two current [1] and three previous [2] systems locked during boot
because the cursor flash timer was set using an ops->cur_blink_jiffies
value of 0. Previous patches attempted to solve the problem by moving
variable initialization earlier in the setup sequence [2].

Use the normal cursor blink default interval of 200 ms if
ops->cur_blink_jiffies is not in the range specified in commit
bd63364caa8d. Since invalid values are not used, specific system
initialization timings should not cause lockups.



This patch just papers over the problem that you yourself introduced in 
commit bd63364caa8d ("vt: add cursor blink interval escape sequence").


As you know, I have a patch that fixes the problem at the source:
https://lkml.org/lkml/2016/5/17/455

I don't like the idea of silently ignoring bad values passed in from 
other code (drivers/tty/vt/vt.c), and much less doing the check for bad 
values each time the timer expires rather than just once, where the bad 
value is first introduced.


I think it would be preferable to WARN() at the site the bad value is 
introduced, so that we can easily find the real source of the problem. 
Initialize cur_blink_jiffies to a sane default value, then if something 
attempts to set it to a value that would cause soft lockup, WARN and 
refuse to change it.


Also there is a stylistic issue...



[1] https://bugs.launchpad.net/bugs/1574814
[2] see commits: 2a17d7e80f1d, f235f664a8af, a1e533ec07d5

Signed-off-by: Scot Doyle 
Cc:  [v4.2]
---
  drivers/video/console/fbcon.c | 17 +
  1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/drivers/video/console/fbcon.c b/drivers/video/console/fbcon.c
index 6e92917..da61d87 100644
--- a/drivers/video/console/fbcon.c
+++ b/drivers/video/console/fbcon.c
@@ -396,13 +396,23 @@ static void fb_flashcursor(struct work_struct *work)
console_unlock();
  }

+static int cursor_blink_jiffies(int candidate)
+{
+   if (candidate >= msecs_to_jiffies(50) &&
+   candidate <= msecs_to_jiffies(USHRT_MAX))
+   return candidate;
+   else
+   return HZ / 5;


You use msecs_to_jiffies() is several places, then here open code the 
division.  Please use  msecs_to_jiffies(), that is it's intended job.




+}
+
  static void cursor_timer_handler(unsigned long dev_addr)
  {
struct fb_info *info = (struct fb_info *) dev_addr;
struct fbcon_ops *ops = info->fbcon_par;

queue_work(system_power_efficient_wq, >queue);
-   mod_timer(>cursor_timer, jiffies + ops->cur_blink_jiffies);
+   mod_timer(>cursor_timer, jiffies +
+   cursor_blink_jiffies(ops->cur_blink_jiffies));
  }

  static void fbcon_add_cursor_timer(struct fb_info *info)
@@ -417,7 +427,8 @@ static void fbcon_add_cursor_timer(struct fb_info *info)

init_timer(>cursor_timer);
ops->cursor_timer.function = cursor_timer_handler;
-   ops->cursor_timer.expires = jiffies + ops->cur_blink_jiffies;
+   ops->cursor_timer.expires = jiffies +
+   cursor_blink_jiffies(ops->cur_blink_jiffies);
ops->cursor_timer.data = (unsigned long ) info;
add_timer(>cursor_timer);
ops->flags |= FBCON_FLAGS_CURSOR_TIMER;
@@ -709,7 +720,6 @@ static int con2fb_acquire_newinfo(struct vc_data *vc, 
struct fb_info *info,
}

if (!err) {
-   ops->cur_blink_jiffies = HZ / 5;
info->fbcon_par = ops;

if (vc)
@@ -957,7 +967,6 @@ static const char *fbcon_startup(void)
ops->currcon = -1;
ops->graphics = 1;
ops->cur_rotate = -1;
-   ops->cur_blink_jiffies = HZ / 5;
info->fbcon_par = ops;
p->con_rotate = initial_rotation;
set_blitting_type(vc, info);





Re: [PATCH] fbcon: use default if cursor blink interval is not valid

2016-05-19 Thread David Daney

On 05/18/2016 09:21 PM, Scot Doyle wrote:

Two current [1] and three previous [2] systems locked during boot
because the cursor flash timer was set using an ops->cur_blink_jiffies
value of 0. Previous patches attempted to solve the problem by moving
variable initialization earlier in the setup sequence [2].

Use the normal cursor blink default interval of 200 ms if
ops->cur_blink_jiffies is not in the range specified in commit
bd63364caa8d. Since invalid values are not used, specific system
initialization timings should not cause lockups.



This patch just papers over the problem that you yourself introduced in 
commit bd63364caa8d ("vt: add cursor blink interval escape sequence").


As you know, I have a patch that fixes the problem at the source:
https://lkml.org/lkml/2016/5/17/455

I don't like the idea of silently ignoring bad values passed in from 
other code (drivers/tty/vt/vt.c), and much less doing the check for bad 
values each time the timer expires rather than just once, where the bad 
value is first introduced.


I think it would be preferable to WARN() at the site the bad value is 
introduced, so that we can easily find the real source of the problem. 
Initialize cur_blink_jiffies to a sane default value, then if something 
attempts to set it to a value that would cause soft lockup, WARN and 
refuse to change it.


Also there is a stylistic issue...



[1] https://bugs.launchpad.net/bugs/1574814
[2] see commits: 2a17d7e80f1d, f235f664a8af, a1e533ec07d5

Signed-off-by: Scot Doyle 
Cc:  [v4.2]
---
  drivers/video/console/fbcon.c | 17 +
  1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/drivers/video/console/fbcon.c b/drivers/video/console/fbcon.c
index 6e92917..da61d87 100644
--- a/drivers/video/console/fbcon.c
+++ b/drivers/video/console/fbcon.c
@@ -396,13 +396,23 @@ static void fb_flashcursor(struct work_struct *work)
console_unlock();
  }

+static int cursor_blink_jiffies(int candidate)
+{
+   if (candidate >= msecs_to_jiffies(50) &&
+   candidate <= msecs_to_jiffies(USHRT_MAX))
+   return candidate;
+   else
+   return HZ / 5;


You use msecs_to_jiffies() is several places, then here open code the 
division.  Please use  msecs_to_jiffies(), that is it's intended job.




+}
+
  static void cursor_timer_handler(unsigned long dev_addr)
  {
struct fb_info *info = (struct fb_info *) dev_addr;
struct fbcon_ops *ops = info->fbcon_par;

queue_work(system_power_efficient_wq, >queue);
-   mod_timer(>cursor_timer, jiffies + ops->cur_blink_jiffies);
+   mod_timer(>cursor_timer, jiffies +
+   cursor_blink_jiffies(ops->cur_blink_jiffies));
  }

  static void fbcon_add_cursor_timer(struct fb_info *info)
@@ -417,7 +427,8 @@ static void fbcon_add_cursor_timer(struct fb_info *info)

init_timer(>cursor_timer);
ops->cursor_timer.function = cursor_timer_handler;
-   ops->cursor_timer.expires = jiffies + ops->cur_blink_jiffies;
+   ops->cursor_timer.expires = jiffies +
+   cursor_blink_jiffies(ops->cur_blink_jiffies);
ops->cursor_timer.data = (unsigned long ) info;
add_timer(>cursor_timer);
ops->flags |= FBCON_FLAGS_CURSOR_TIMER;
@@ -709,7 +720,6 @@ static int con2fb_acquire_newinfo(struct vc_data *vc, 
struct fb_info *info,
}

if (!err) {
-   ops->cur_blink_jiffies = HZ / 5;
info->fbcon_par = ops;

if (vc)
@@ -957,7 +967,6 @@ static const char *fbcon_startup(void)
ops->currcon = -1;
ops->graphics = 1;
ops->cur_rotate = -1;
-   ops->cur_blink_jiffies = HZ / 5;
info->fbcon_par = ops;
p->con_rotate = initial_rotation;
set_blitting_type(vc, info);





Re: [PATCH] fbcon: use default if cursor blink interval is not valid

2016-05-19 Thread Ming Lei
On Thu, May 19, 2016 at 10:22 PM, Scot Doyle  wrote:
> On Thu, 19 May 2016, Pavel Machek wrote:
>> Hi!
>>
>> > Two current [1] and three previous [2] systems locked during boot
>> > because the cursor flash timer was set using an ops->cur_blink_jiffies
>> > value of 0. Previous patches attempted to solve the problem by moving
>> > variable initialization earlier in the setup sequence [2].
>> >
>> > Use the normal cursor blink default interval of 200 ms if
>> > ops->cur_blink_jiffies is not in the range specified in commit
>> > bd63364caa8d. Since invalid values are not used, specific system
>> > initialization timings should not cause lockups.
>> >
>> > [1] https://bugs.launchpad.net/bugs/1574814
>> > [2] see commits: 2a17d7e80f1d, f235f664a8af, a1e533ec07d5
>>
>> Acked-by: Pavel Machek 
>>
>> >  static void cursor_timer_handler(unsigned long dev_addr)
>> >  {
>> > struct fb_info *info = (struct fb_info *) dev_addr;
>> > struct fbcon_ops *ops = info->fbcon_par;
>> >
>> > queue_work(system_power_efficient_wq, >queue);
>> > -   mod_timer(>cursor_timer, jiffies + ops->cur_blink_jiffies);
>> > +   mod_timer(>cursor_timer, jiffies +
>> > +   cursor_blink_jiffies(ops->cur_blink_jiffies));
>> >  }
>> >
>> >  static void fbcon_add_cursor_timer(struct fb_info *info)
>>
>> And actually... perhaps mod_timer should have some check for too low
>> timeouts..?
>>
>> WARN_ON?
>>   Pavel
>
>
> Interesting idea. I applied this patch to a couple systems and
> receive the same warning on both:

If 'jiffies' is passed to mod_timer() for same timer unusually OR
mod_timer() isn't called from the timer handler, it shoudln't cause
soft lockup.

In the case of fbcon, 'jiffies' is always passed to mod_timer() and
mod_timer() is called from the timer handler meantime, that is a
real lockup.

>
> diff --git a/kernel/time/timer.c b/kernel/time/timer.c
> index 73164c3..f6c0b69 100644
> --- a/kernel/time/timer.c
> +++ b/kernel/time/timer.c
> @@ -788,6 +788,7 @@ __mod_timer(struct timer_list *timer, unsigned long 
> expires,
>
> timer_stats_timer_set_start_info(timer);
> BUG_ON(!timer->function);
> +   WARN_ONCE(expires == jiffies, "timer should expire in the future");
>
> base = lock_timer_base(timer, );
>
> --
>
> [2.060474] [ cut here ]
> [2.061613] WARNING: CPU: 0 PID: 164 at kernel/time/timer.c:791 
> mod_timer+0x233/0x240
> [2.062740] timer should expire in the future
> [2.062757] CPU: 0 PID: 164 Comm: kworker/0:2 Not tainted 4.6.0+ #7
> [2.065870] Hardware name: Toshiba Leon, BIOS  12/04/2013
> [2.067828] Workqueue: events_power_efficient hub_init_func3
> [2.069762]   88007443bbb8 8139932b 
> 88007443bc08
> [2.071701]   88007443bbf8 8112e57c 
> 0317
> [2.073655]  88007486a0b0 fffea2da 88007486a000 
> 0202
> [2.075594] Call Trace:
> [2.077503]  [] dump_stack+0x4d/0x72
> [2.079426]  [] __warn+0xcc/0xf0
> [2.081325]  [] warn_slowpath_fmt+0x4f/0x60
> [2.083212]  [] ? find_next_bit+0x15/0x20
> [2.085022]  [] ? cpumask_next_and+0x2f/0x40
> [2.086696]  [] mod_timer+0x233/0x240
> [2.088362]  [] usb_hcd_submit_urb+0x3f2/0x8c0
> [2.090026]  [] ? urb_destroy+0x24/0x30
> [2.091698]  [] ? insert_work+0x58/0xb0
> [2.093349]  [] usb_submit_urb+0x287/0x530
> [2.094985]  [] hub_activate+0x1fd/0x5d0
> [2.096625]  [] ? finish_task_switch+0x78/0x1f0
> [2.098268]  [] hub_init_func3+0x1a/0x20
> [2.099908]  [] process_one_work+0x140/0x3e0
> [2.101539]  [] worker_thread+0x4e/0x480
> [2.103173]  [] ? process_one_work+0x3e0/0x3e0
> [2.104790]  [] ? process_one_work+0x3e0/0x3e0
> [2.106259]  [] kthread+0xc9/0xe0
> [2.107731]  [] ret_from_fork+0x22/0x40
> [2.109215]  [] ? __kthread_parkme+0x70/0x70
> [2.110704] ---[ end trace 3519886a1a990d99 ]---
>
> mod_timer is called from over a thousand places. Should timers always
> expire in the future?
>


Re: [PATCH] fbcon: use default if cursor blink interval is not valid

2016-05-19 Thread Ming Lei
On Thu, May 19, 2016 at 10:22 PM, Scot Doyle  wrote:
> On Thu, 19 May 2016, Pavel Machek wrote:
>> Hi!
>>
>> > Two current [1] and three previous [2] systems locked during boot
>> > because the cursor flash timer was set using an ops->cur_blink_jiffies
>> > value of 0. Previous patches attempted to solve the problem by moving
>> > variable initialization earlier in the setup sequence [2].
>> >
>> > Use the normal cursor blink default interval of 200 ms if
>> > ops->cur_blink_jiffies is not in the range specified in commit
>> > bd63364caa8d. Since invalid values are not used, specific system
>> > initialization timings should not cause lockups.
>> >
>> > [1] https://bugs.launchpad.net/bugs/1574814
>> > [2] see commits: 2a17d7e80f1d, f235f664a8af, a1e533ec07d5
>>
>> Acked-by: Pavel Machek 
>>
>> >  static void cursor_timer_handler(unsigned long dev_addr)
>> >  {
>> > struct fb_info *info = (struct fb_info *) dev_addr;
>> > struct fbcon_ops *ops = info->fbcon_par;
>> >
>> > queue_work(system_power_efficient_wq, >queue);
>> > -   mod_timer(>cursor_timer, jiffies + ops->cur_blink_jiffies);
>> > +   mod_timer(>cursor_timer, jiffies +
>> > +   cursor_blink_jiffies(ops->cur_blink_jiffies));
>> >  }
>> >
>> >  static void fbcon_add_cursor_timer(struct fb_info *info)
>>
>> And actually... perhaps mod_timer should have some check for too low
>> timeouts..?
>>
>> WARN_ON?
>>   Pavel
>
>
> Interesting idea. I applied this patch to a couple systems and
> receive the same warning on both:

If 'jiffies' is passed to mod_timer() for same timer unusually OR
mod_timer() isn't called from the timer handler, it shoudln't cause
soft lockup.

In the case of fbcon, 'jiffies' is always passed to mod_timer() and
mod_timer() is called from the timer handler meantime, that is a
real lockup.

>
> diff --git a/kernel/time/timer.c b/kernel/time/timer.c
> index 73164c3..f6c0b69 100644
> --- a/kernel/time/timer.c
> +++ b/kernel/time/timer.c
> @@ -788,6 +788,7 @@ __mod_timer(struct timer_list *timer, unsigned long 
> expires,
>
> timer_stats_timer_set_start_info(timer);
> BUG_ON(!timer->function);
> +   WARN_ONCE(expires == jiffies, "timer should expire in the future");
>
> base = lock_timer_base(timer, );
>
> --
>
> [2.060474] [ cut here ]
> [2.061613] WARNING: CPU: 0 PID: 164 at kernel/time/timer.c:791 
> mod_timer+0x233/0x240
> [2.062740] timer should expire in the future
> [2.062757] CPU: 0 PID: 164 Comm: kworker/0:2 Not tainted 4.6.0+ #7
> [2.065870] Hardware name: Toshiba Leon, BIOS  12/04/2013
> [2.067828] Workqueue: events_power_efficient hub_init_func3
> [2.069762]   88007443bbb8 8139932b 
> 88007443bc08
> [2.071701]   88007443bbf8 8112e57c 
> 0317
> [2.073655]  88007486a0b0 fffea2da 88007486a000 
> 0202
> [2.075594] Call Trace:
> [2.077503]  [] dump_stack+0x4d/0x72
> [2.079426]  [] __warn+0xcc/0xf0
> [2.081325]  [] warn_slowpath_fmt+0x4f/0x60
> [2.083212]  [] ? find_next_bit+0x15/0x20
> [2.085022]  [] ? cpumask_next_and+0x2f/0x40
> [2.086696]  [] mod_timer+0x233/0x240
> [2.088362]  [] usb_hcd_submit_urb+0x3f2/0x8c0
> [2.090026]  [] ? urb_destroy+0x24/0x30
> [2.091698]  [] ? insert_work+0x58/0xb0
> [2.093349]  [] usb_submit_urb+0x287/0x530
> [2.094985]  [] hub_activate+0x1fd/0x5d0
> [2.096625]  [] ? finish_task_switch+0x78/0x1f0
> [2.098268]  [] hub_init_func3+0x1a/0x20
> [2.099908]  [] process_one_work+0x140/0x3e0
> [2.101539]  [] worker_thread+0x4e/0x480
> [2.103173]  [] ? process_one_work+0x3e0/0x3e0
> [2.104790]  [] ? process_one_work+0x3e0/0x3e0
> [2.106259]  [] kthread+0xc9/0xe0
> [2.107731]  [] ret_from_fork+0x22/0x40
> [2.109215]  [] ? __kthread_parkme+0x70/0x70
> [2.110704] ---[ end trace 3519886a1a990d99 ]---
>
> mod_timer is called from over a thousand places. Should timers always
> expire in the future?
>


Re: [PATCH] fbcon: use default if cursor blink interval is not valid

2016-05-19 Thread Scot Doyle
On Thu, 19 May 2016, Pavel Machek wrote:
> Hi!
> 
> > Two current [1] and three previous [2] systems locked during boot
> > because the cursor flash timer was set using an ops->cur_blink_jiffies
> > value of 0. Previous patches attempted to solve the problem by moving
> > variable initialization earlier in the setup sequence [2].
> > 
> > Use the normal cursor blink default interval of 200 ms if
> > ops->cur_blink_jiffies is not in the range specified in commit
> > bd63364caa8d. Since invalid values are not used, specific system
> > initialization timings should not cause lockups.
> > 
> > [1] https://bugs.launchpad.net/bugs/1574814
> > [2] see commits: 2a17d7e80f1d, f235f664a8af, a1e533ec07d5
> 
> Acked-by: Pavel Machek 
> 
> >  static void cursor_timer_handler(unsigned long dev_addr)
> >  {
> > struct fb_info *info = (struct fb_info *) dev_addr;
> > struct fbcon_ops *ops = info->fbcon_par;
> >  
> > queue_work(system_power_efficient_wq, >queue);
> > -   mod_timer(>cursor_timer, jiffies + ops->cur_blink_jiffies);
> > +   mod_timer(>cursor_timer, jiffies +
> > +   cursor_blink_jiffies(ops->cur_blink_jiffies));
> >  }
> >  
> >  static void fbcon_add_cursor_timer(struct fb_info *info)
> 
> And actually... perhaps mod_timer should have some check for too low
> timeouts..?
> 
> WARN_ON?
>   Pavel


Interesting idea. I applied this patch to a couple systems and 
receive the same warning on both:

diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index 73164c3..f6c0b69 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -788,6 +788,7 @@ __mod_timer(struct timer_list *timer, unsigned long expires,
 
timer_stats_timer_set_start_info(timer);
BUG_ON(!timer->function);
+   WARN_ONCE(expires == jiffies, "timer should expire in the future");
 
base = lock_timer_base(timer, );
 
--

[2.060474] [ cut here ]
[2.061613] WARNING: CPU: 0 PID: 164 at kernel/time/timer.c:791 
mod_timer+0x233/0x240
[2.062740] timer should expire in the future
[2.062757] CPU: 0 PID: 164 Comm: kworker/0:2 Not tainted 4.6.0+ #7
[2.065870] Hardware name: Toshiba Leon, BIOS  12/04/2013
[2.067828] Workqueue: events_power_efficient hub_init_func3
[2.069762]   88007443bbb8 8139932b 
88007443bc08
[2.071701]   88007443bbf8 8112e57c 
0317
[2.073655]  88007486a0b0 fffea2da 88007486a000 
0202
[2.075594] Call Trace:
[2.077503]  [] dump_stack+0x4d/0x72
[2.079426]  [] __warn+0xcc/0xf0
[2.081325]  [] warn_slowpath_fmt+0x4f/0x60
[2.083212]  [] ? find_next_bit+0x15/0x20
[2.085022]  [] ? cpumask_next_and+0x2f/0x40
[2.086696]  [] mod_timer+0x233/0x240
[2.088362]  [] usb_hcd_submit_urb+0x3f2/0x8c0
[2.090026]  [] ? urb_destroy+0x24/0x30
[2.091698]  [] ? insert_work+0x58/0xb0
[2.093349]  [] usb_submit_urb+0x287/0x530
[2.094985]  [] hub_activate+0x1fd/0x5d0
[2.096625]  [] ? finish_task_switch+0x78/0x1f0
[2.098268]  [] hub_init_func3+0x1a/0x20
[2.099908]  [] process_one_work+0x140/0x3e0
[2.101539]  [] worker_thread+0x4e/0x480
[2.103173]  [] ? process_one_work+0x3e0/0x3e0
[2.104790]  [] ? process_one_work+0x3e0/0x3e0
[2.106259]  [] kthread+0xc9/0xe0
[2.107731]  [] ret_from_fork+0x22/0x40
[2.109215]  [] ? __kthread_parkme+0x70/0x70
[2.110704] ---[ end trace 3519886a1a990d99 ]---

mod_timer is called from over a thousand places. Should timers always 
expire in the future?



Re: [PATCH] fbcon: use default if cursor blink interval is not valid

2016-05-19 Thread Scot Doyle
On Thu, 19 May 2016, Pavel Machek wrote:
> Hi!
> 
> > Two current [1] and three previous [2] systems locked during boot
> > because the cursor flash timer was set using an ops->cur_blink_jiffies
> > value of 0. Previous patches attempted to solve the problem by moving
> > variable initialization earlier in the setup sequence [2].
> > 
> > Use the normal cursor blink default interval of 200 ms if
> > ops->cur_blink_jiffies is not in the range specified in commit
> > bd63364caa8d. Since invalid values are not used, specific system
> > initialization timings should not cause lockups.
> > 
> > [1] https://bugs.launchpad.net/bugs/1574814
> > [2] see commits: 2a17d7e80f1d, f235f664a8af, a1e533ec07d5
> 
> Acked-by: Pavel Machek 
> 
> >  static void cursor_timer_handler(unsigned long dev_addr)
> >  {
> > struct fb_info *info = (struct fb_info *) dev_addr;
> > struct fbcon_ops *ops = info->fbcon_par;
> >  
> > queue_work(system_power_efficient_wq, >queue);
> > -   mod_timer(>cursor_timer, jiffies + ops->cur_blink_jiffies);
> > +   mod_timer(>cursor_timer, jiffies +
> > +   cursor_blink_jiffies(ops->cur_blink_jiffies));
> >  }
> >  
> >  static void fbcon_add_cursor_timer(struct fb_info *info)
> 
> And actually... perhaps mod_timer should have some check for too low
> timeouts..?
> 
> WARN_ON?
>   Pavel


Interesting idea. I applied this patch to a couple systems and 
receive the same warning on both:

diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index 73164c3..f6c0b69 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -788,6 +788,7 @@ __mod_timer(struct timer_list *timer, unsigned long expires,
 
timer_stats_timer_set_start_info(timer);
BUG_ON(!timer->function);
+   WARN_ONCE(expires == jiffies, "timer should expire in the future");
 
base = lock_timer_base(timer, );
 
--

[2.060474] [ cut here ]
[2.061613] WARNING: CPU: 0 PID: 164 at kernel/time/timer.c:791 
mod_timer+0x233/0x240
[2.062740] timer should expire in the future
[2.062757] CPU: 0 PID: 164 Comm: kworker/0:2 Not tainted 4.6.0+ #7
[2.065870] Hardware name: Toshiba Leon, BIOS  12/04/2013
[2.067828] Workqueue: events_power_efficient hub_init_func3
[2.069762]   88007443bbb8 8139932b 
88007443bc08
[2.071701]   88007443bbf8 8112e57c 
0317
[2.073655]  88007486a0b0 fffea2da 88007486a000 
0202
[2.075594] Call Trace:
[2.077503]  [] dump_stack+0x4d/0x72
[2.079426]  [] __warn+0xcc/0xf0
[2.081325]  [] warn_slowpath_fmt+0x4f/0x60
[2.083212]  [] ? find_next_bit+0x15/0x20
[2.085022]  [] ? cpumask_next_and+0x2f/0x40
[2.086696]  [] mod_timer+0x233/0x240
[2.088362]  [] usb_hcd_submit_urb+0x3f2/0x8c0
[2.090026]  [] ? urb_destroy+0x24/0x30
[2.091698]  [] ? insert_work+0x58/0xb0
[2.093349]  [] usb_submit_urb+0x287/0x530
[2.094985]  [] hub_activate+0x1fd/0x5d0
[2.096625]  [] ? finish_task_switch+0x78/0x1f0
[2.098268]  [] hub_init_func3+0x1a/0x20
[2.099908]  [] process_one_work+0x140/0x3e0
[2.101539]  [] worker_thread+0x4e/0x480
[2.103173]  [] ? process_one_work+0x3e0/0x3e0
[2.104790]  [] ? process_one_work+0x3e0/0x3e0
[2.106259]  [] kthread+0xc9/0xe0
[2.107731]  [] ret_from_fork+0x22/0x40
[2.109215]  [] ? __kthread_parkme+0x70/0x70
[2.110704] ---[ end trace 3519886a1a990d99 ]---

mod_timer is called from over a thousand places. Should timers always 
expire in the future?



Re: [PATCH] fbcon: use default if cursor blink interval is not valid

2016-05-19 Thread Pavel Machek
Hi!

> Two current [1] and three previous [2] systems locked during boot
> because the cursor flash timer was set using an ops->cur_blink_jiffies
> value of 0. Previous patches attempted to solve the problem by moving
> variable initialization earlier in the setup sequence [2].
> 
> Use the normal cursor blink default interval of 200 ms if
> ops->cur_blink_jiffies is not in the range specified in commit
> bd63364caa8d. Since invalid values are not used, specific system
> initialization timings should not cause lockups.
> 
> [1] https://bugs.launchpad.net/bugs/1574814
> [2] see commits: 2a17d7e80f1d, f235f664a8af, a1e533ec07d5

Acked-by: Pavel Machek 

>  static void cursor_timer_handler(unsigned long dev_addr)
>  {
>   struct fb_info *info = (struct fb_info *) dev_addr;
>   struct fbcon_ops *ops = info->fbcon_par;
>  
>   queue_work(system_power_efficient_wq, >queue);
> - mod_timer(>cursor_timer, jiffies + ops->cur_blink_jiffies);
> + mod_timer(>cursor_timer, jiffies +
> + cursor_blink_jiffies(ops->cur_blink_jiffies));
>  }
>  
>  static void fbcon_add_cursor_timer(struct fb_info *info)

And actually... perhaps mod_timer should have some check for too low
timeouts..?

WARN_ON?
Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


Re: [PATCH] fbcon: use default if cursor blink interval is not valid

2016-05-19 Thread Pavel Machek
Hi!

> Two current [1] and three previous [2] systems locked during boot
> because the cursor flash timer was set using an ops->cur_blink_jiffies
> value of 0. Previous patches attempted to solve the problem by moving
> variable initialization earlier in the setup sequence [2].
> 
> Use the normal cursor blink default interval of 200 ms if
> ops->cur_blink_jiffies is not in the range specified in commit
> bd63364caa8d. Since invalid values are not used, specific system
> initialization timings should not cause lockups.
> 
> [1] https://bugs.launchpad.net/bugs/1574814
> [2] see commits: 2a17d7e80f1d, f235f664a8af, a1e533ec07d5

Acked-by: Pavel Machek 

>  static void cursor_timer_handler(unsigned long dev_addr)
>  {
>   struct fb_info *info = (struct fb_info *) dev_addr;
>   struct fbcon_ops *ops = info->fbcon_par;
>  
>   queue_work(system_power_efficient_wq, >queue);
> - mod_timer(>cursor_timer, jiffies + ops->cur_blink_jiffies);
> + mod_timer(>cursor_timer, jiffies +
> + cursor_blink_jiffies(ops->cur_blink_jiffies));
>  }
>  
>  static void fbcon_add_cursor_timer(struct fb_info *info)

And actually... perhaps mod_timer should have some check for too low
timeouts..?

WARN_ON?
Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


Re: [PATCH] fbcon: use default if cursor blink interval is not valid

2016-05-19 Thread Ming Lei
On Thu, May 19, 2016 at 12:21 PM, Scot Doyle  wrote:
> Two current [1] and three previous [2] systems locked during boot
> because the cursor flash timer was set using an ops->cur_blink_jiffies
> value of 0. Previous patches attempted to solve the problem by moving
> variable initialization earlier in the setup sequence [2].
>
> Use the normal cursor blink default interval of 200 ms if
> ops->cur_blink_jiffies is not in the range specified in commit
> bd63364caa8d. Since invalid values are not used, specific system
> initialization timings should not cause lockups.
>
> [1] https://bugs.launchpad.net/bugs/1574814
> [2] see commits: 2a17d7e80f1d, f235f664a8af, a1e533ec07d5
>
> Signed-off-by: Scot Doyle 
> Cc:  [v4.2]

Tested-by: Ming Lei 

> ---
>  drivers/video/console/fbcon.c | 17 +
>  1 file changed, 13 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/video/console/fbcon.c b/drivers/video/console/fbcon.c
> index 6e92917..da61d87 100644
> --- a/drivers/video/console/fbcon.c
> +++ b/drivers/video/console/fbcon.c
> @@ -396,13 +396,23 @@ static void fb_flashcursor(struct work_struct *work)
> console_unlock();
>  }
>
> +static int cursor_blink_jiffies(int candidate)
> +{
> +   if (candidate >= msecs_to_jiffies(50) &&
> +   candidate <= msecs_to_jiffies(USHRT_MAX))
> +   return candidate;
> +   else
> +   return HZ / 5;
> +}
> +
>  static void cursor_timer_handler(unsigned long dev_addr)
>  {
> struct fb_info *info = (struct fb_info *) dev_addr;
> struct fbcon_ops *ops = info->fbcon_par;
>
> queue_work(system_power_efficient_wq, >queue);
> -   mod_timer(>cursor_timer, jiffies + ops->cur_blink_jiffies);
> +   mod_timer(>cursor_timer, jiffies +
> +   cursor_blink_jiffies(ops->cur_blink_jiffies));
>  }
>
>  static void fbcon_add_cursor_timer(struct fb_info *info)
> @@ -417,7 +427,8 @@ static void fbcon_add_cursor_timer(struct fb_info *info)
>
> init_timer(>cursor_timer);
> ops->cursor_timer.function = cursor_timer_handler;
> -   ops->cursor_timer.expires = jiffies + ops->cur_blink_jiffies;
> +   ops->cursor_timer.expires = jiffies +
> +   cursor_blink_jiffies(ops->cur_blink_jiffies);
> ops->cursor_timer.data = (unsigned long ) info;
> add_timer(>cursor_timer);
> ops->flags |= FBCON_FLAGS_CURSOR_TIMER;
> @@ -709,7 +720,6 @@ static int con2fb_acquire_newinfo(struct vc_data *vc, 
> struct fb_info *info,
> }
>
> if (!err) {
> -   ops->cur_blink_jiffies = HZ / 5;
> info->fbcon_par = ops;
>
> if (vc)
> @@ -957,7 +967,6 @@ static const char *fbcon_startup(void)
> ops->currcon = -1;
> ops->graphics = 1;
> ops->cur_rotate = -1;
> -   ops->cur_blink_jiffies = HZ / 5;
> info->fbcon_par = ops;
> p->con_rotate = initial_rotation;
> set_blitting_type(vc, info);
> --
> 2.1.4
>


Re: [PATCH] fbcon: use default if cursor blink interval is not valid

2016-05-19 Thread Ming Lei
On Thu, May 19, 2016 at 12:21 PM, Scot Doyle  wrote:
> Two current [1] and three previous [2] systems locked during boot
> because the cursor flash timer was set using an ops->cur_blink_jiffies
> value of 0. Previous patches attempted to solve the problem by moving
> variable initialization earlier in the setup sequence [2].
>
> Use the normal cursor blink default interval of 200 ms if
> ops->cur_blink_jiffies is not in the range specified in commit
> bd63364caa8d. Since invalid values are not used, specific system
> initialization timings should not cause lockups.
>
> [1] https://bugs.launchpad.net/bugs/1574814
> [2] see commits: 2a17d7e80f1d, f235f664a8af, a1e533ec07d5
>
> Signed-off-by: Scot Doyle 
> Cc:  [v4.2]

Tested-by: Ming Lei 

> ---
>  drivers/video/console/fbcon.c | 17 +
>  1 file changed, 13 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/video/console/fbcon.c b/drivers/video/console/fbcon.c
> index 6e92917..da61d87 100644
> --- a/drivers/video/console/fbcon.c
> +++ b/drivers/video/console/fbcon.c
> @@ -396,13 +396,23 @@ static void fb_flashcursor(struct work_struct *work)
> console_unlock();
>  }
>
> +static int cursor_blink_jiffies(int candidate)
> +{
> +   if (candidate >= msecs_to_jiffies(50) &&
> +   candidate <= msecs_to_jiffies(USHRT_MAX))
> +   return candidate;
> +   else
> +   return HZ / 5;
> +}
> +
>  static void cursor_timer_handler(unsigned long dev_addr)
>  {
> struct fb_info *info = (struct fb_info *) dev_addr;
> struct fbcon_ops *ops = info->fbcon_par;
>
> queue_work(system_power_efficient_wq, >queue);
> -   mod_timer(>cursor_timer, jiffies + ops->cur_blink_jiffies);
> +   mod_timer(>cursor_timer, jiffies +
> +   cursor_blink_jiffies(ops->cur_blink_jiffies));
>  }
>
>  static void fbcon_add_cursor_timer(struct fb_info *info)
> @@ -417,7 +427,8 @@ static void fbcon_add_cursor_timer(struct fb_info *info)
>
> init_timer(>cursor_timer);
> ops->cursor_timer.function = cursor_timer_handler;
> -   ops->cursor_timer.expires = jiffies + ops->cur_blink_jiffies;
> +   ops->cursor_timer.expires = jiffies +
> +   cursor_blink_jiffies(ops->cur_blink_jiffies);
> ops->cursor_timer.data = (unsigned long ) info;
> add_timer(>cursor_timer);
> ops->flags |= FBCON_FLAGS_CURSOR_TIMER;
> @@ -709,7 +720,6 @@ static int con2fb_acquire_newinfo(struct vc_data *vc, 
> struct fb_info *info,
> }
>
> if (!err) {
> -   ops->cur_blink_jiffies = HZ / 5;
> info->fbcon_par = ops;
>
> if (vc)
> @@ -957,7 +967,6 @@ static const char *fbcon_startup(void)
> ops->currcon = -1;
> ops->graphics = 1;
> ops->cur_rotate = -1;
> -   ops->cur_blink_jiffies = HZ / 5;
> info->fbcon_par = ops;
> p->con_rotate = initial_rotation;
> set_blitting_type(vc, info);
> --
> 2.1.4
>


Re: [PATCH] fbcon: use default if cursor blink interval is not valid

2016-05-19 Thread Jeremy Kerr
Hi Scot,

> Use the normal cursor blink default interval of 200 ms if
> ops->cur_blink_jiffies is not in the range specified in commit
> bd63364caa8d. Since invalid values are not used, specific system
> initialization timings should not cause lockups.

This fixes an issue we're seeing with the ast driver on OpenPOWER
machines, thanks!

Acked-by: Jeremy Kerr 

Cheers,


Jeremy


Re: [PATCH] fbcon: use default if cursor blink interval is not valid

2016-05-19 Thread Jeremy Kerr
Hi Scot,

> Use the normal cursor blink default interval of 200 ms if
> ops->cur_blink_jiffies is not in the range specified in commit
> bd63364caa8d. Since invalid values are not used, specific system
> initialization timings should not cause lockups.

This fixes an issue we're seeing with the ast driver on OpenPOWER
machines, thanks!

Acked-by: Jeremy Kerr 

Cheers,


Jeremy


[PATCH] fbcon: use default if cursor blink interval is not valid

2016-05-18 Thread Scot Doyle
Two current [1] and three previous [2] systems locked during boot
because the cursor flash timer was set using an ops->cur_blink_jiffies
value of 0. Previous patches attempted to solve the problem by moving
variable initialization earlier in the setup sequence [2].

Use the normal cursor blink default interval of 200 ms if
ops->cur_blink_jiffies is not in the range specified in commit
bd63364caa8d. Since invalid values are not used, specific system
initialization timings should not cause lockups.

[1] https://bugs.launchpad.net/bugs/1574814
[2] see commits: 2a17d7e80f1d, f235f664a8af, a1e533ec07d5

Signed-off-by: Scot Doyle 
Cc:  [v4.2]
---
 drivers/video/console/fbcon.c | 17 +
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/drivers/video/console/fbcon.c b/drivers/video/console/fbcon.c
index 6e92917..da61d87 100644
--- a/drivers/video/console/fbcon.c
+++ b/drivers/video/console/fbcon.c
@@ -396,13 +396,23 @@ static void fb_flashcursor(struct work_struct *work)
console_unlock();
 }
 
+static int cursor_blink_jiffies(int candidate)
+{
+   if (candidate >= msecs_to_jiffies(50) &&
+   candidate <= msecs_to_jiffies(USHRT_MAX))
+   return candidate;
+   else
+   return HZ / 5;
+}
+
 static void cursor_timer_handler(unsigned long dev_addr)
 {
struct fb_info *info = (struct fb_info *) dev_addr;
struct fbcon_ops *ops = info->fbcon_par;
 
queue_work(system_power_efficient_wq, >queue);
-   mod_timer(>cursor_timer, jiffies + ops->cur_blink_jiffies);
+   mod_timer(>cursor_timer, jiffies +
+   cursor_blink_jiffies(ops->cur_blink_jiffies));
 }
 
 static void fbcon_add_cursor_timer(struct fb_info *info)
@@ -417,7 +427,8 @@ static void fbcon_add_cursor_timer(struct fb_info *info)
 
init_timer(>cursor_timer);
ops->cursor_timer.function = cursor_timer_handler;
-   ops->cursor_timer.expires = jiffies + ops->cur_blink_jiffies;
+   ops->cursor_timer.expires = jiffies +
+   cursor_blink_jiffies(ops->cur_blink_jiffies);
ops->cursor_timer.data = (unsigned long ) info;
add_timer(>cursor_timer);
ops->flags |= FBCON_FLAGS_CURSOR_TIMER;
@@ -709,7 +720,6 @@ static int con2fb_acquire_newinfo(struct vc_data *vc, 
struct fb_info *info,
}
 
if (!err) {
-   ops->cur_blink_jiffies = HZ / 5;
info->fbcon_par = ops;
 
if (vc)
@@ -957,7 +967,6 @@ static const char *fbcon_startup(void)
ops->currcon = -1;
ops->graphics = 1;
ops->cur_rotate = -1;
-   ops->cur_blink_jiffies = HZ / 5;
info->fbcon_par = ops;
p->con_rotate = initial_rotation;
set_blitting_type(vc, info);
-- 
2.1.4



[PATCH] fbcon: use default if cursor blink interval is not valid

2016-05-18 Thread Scot Doyle
Two current [1] and three previous [2] systems locked during boot
because the cursor flash timer was set using an ops->cur_blink_jiffies
value of 0. Previous patches attempted to solve the problem by moving
variable initialization earlier in the setup sequence [2].

Use the normal cursor blink default interval of 200 ms if
ops->cur_blink_jiffies is not in the range specified in commit
bd63364caa8d. Since invalid values are not used, specific system
initialization timings should not cause lockups.

[1] https://bugs.launchpad.net/bugs/1574814
[2] see commits: 2a17d7e80f1d, f235f664a8af, a1e533ec07d5

Signed-off-by: Scot Doyle 
Cc:  [v4.2]
---
 drivers/video/console/fbcon.c | 17 +
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/drivers/video/console/fbcon.c b/drivers/video/console/fbcon.c
index 6e92917..da61d87 100644
--- a/drivers/video/console/fbcon.c
+++ b/drivers/video/console/fbcon.c
@@ -396,13 +396,23 @@ static void fb_flashcursor(struct work_struct *work)
console_unlock();
 }
 
+static int cursor_blink_jiffies(int candidate)
+{
+   if (candidate >= msecs_to_jiffies(50) &&
+   candidate <= msecs_to_jiffies(USHRT_MAX))
+   return candidate;
+   else
+   return HZ / 5;
+}
+
 static void cursor_timer_handler(unsigned long dev_addr)
 {
struct fb_info *info = (struct fb_info *) dev_addr;
struct fbcon_ops *ops = info->fbcon_par;
 
queue_work(system_power_efficient_wq, >queue);
-   mod_timer(>cursor_timer, jiffies + ops->cur_blink_jiffies);
+   mod_timer(>cursor_timer, jiffies +
+   cursor_blink_jiffies(ops->cur_blink_jiffies));
 }
 
 static void fbcon_add_cursor_timer(struct fb_info *info)
@@ -417,7 +427,8 @@ static void fbcon_add_cursor_timer(struct fb_info *info)
 
init_timer(>cursor_timer);
ops->cursor_timer.function = cursor_timer_handler;
-   ops->cursor_timer.expires = jiffies + ops->cur_blink_jiffies;
+   ops->cursor_timer.expires = jiffies +
+   cursor_blink_jiffies(ops->cur_blink_jiffies);
ops->cursor_timer.data = (unsigned long ) info;
add_timer(>cursor_timer);
ops->flags |= FBCON_FLAGS_CURSOR_TIMER;
@@ -709,7 +720,6 @@ static int con2fb_acquire_newinfo(struct vc_data *vc, 
struct fb_info *info,
}
 
if (!err) {
-   ops->cur_blink_jiffies = HZ / 5;
info->fbcon_par = ops;
 
if (vc)
@@ -957,7 +967,6 @@ static const char *fbcon_startup(void)
ops->currcon = -1;
ops->graphics = 1;
ops->cur_rotate = -1;
-   ops->cur_blink_jiffies = HZ / 5;
info->fbcon_par = ops;
p->con_rotate = initial_rotation;
set_blitting_type(vc, info);
-- 
2.1.4