>>> On 08.01.18 at 17:07, <mar...@c-home.cz> wrote:
> On Mon, 8 Jan 2018, Jan Beulich wrote:
>>>>> On 07.01.18 at 13:34, <mar...@c-home.cz> wrote:
>>> (XEN) ----[ Xen-4.10.0-vgpu  x86_64  debug=n   Not tainted ]----
>>
>> The -vgpu tag makes me wonder whether you have any patches in
>> your tree on top of plain 4.10.0 (or 4.10-staging). Also the debug=n
>> above ...
> 
> 4.10.0 + 11 patches to make nvidia/vgpu work 
> (https://github.com/xenserver/xen-4.7.pg).
> debug=n because xen's modified debug build process.
> 
>>> (XEN)    [<ffff82d08026ae60>] __find_next_bit+0x10/0x80
>>> (XEN)    [<ffff82d080253180>] cpufreq_ondemand.c#do_dbs_timer+0x160/0x220
>>> (XEN)    [<ffff82d0802c7c0e>] mwait-idle.c#mwait_idle+0x23e/0x340
>>> (XEN)    [<ffff82d08026fa56>] domain.c#idle_loop+0x86/0xc0
>>
>> ... makes this call trace unreliable. But even with a reliable call
>> trace, analysis of the crash would be helped if you made
>> available the xen-syms (or xen.efi, depending on how you boot)
>> somewhere.
> 
> xen-syms - http://www.uschovna.cz/en/zasilka/UDP5LVE2679CGBIS-4YV/ 

Thanks. Looks to be a race between a timer in the governor and
the CPUs being brought down. In general the governor is supposed
to be disabled in the course of CPUs being brought down, so first
of all I wonder whether you have some daemon running which
sends management requests to the CPUfreq driver in Xen. Such a
daemon should of course be disabled by the system shutdown
scripts. Otherwise please try the attached debugging patch -
maybe we can see something from its output.

Jan

--- unstable.orig/xen/drivers/cpufreq/cpufreq.c 2017-09-12 12:39:58.310556379 +0200
+++ unstable/xen/drivers/cpufreq/cpufreq.c      2018-01-09 17:21:09.659208437 +0100
@@ -352,6 +352,8 @@ int cpufreq_del_cpu(unsigned int cpu)
 
     /* for HW_ALL, stop gov for each core of the _PSD domain */
     /* for SW_ALL & SW_ANY, stop gov for the 1st core of the _PSD domain */
+printk("cpufreq: del CPU%u (%u,%lx,%lu,%lx)\n", cpu,//temp
+       hw_all, cpufreq_dom->map->bits[0], perf->domain_info.num_processors, policy->cpus->bits[0]);//temp
     if (hw_all || (cpumask_weight(cpufreq_dom->map) ==
                    perf->domain_info.num_processors))
         __cpufreq_governor(policy, CPUFREQ_GOV_STOP);
--- unstable.orig/xen/drivers/cpufreq/cpufreq_ondemand.c        2017-09-12 12:39:58.310556379 +0200
+++ unstable/xen/drivers/cpufreq/cpufreq_ondemand.c     2018-01-09 17:16:07.633604995 +0100
@@ -218,6 +218,9 @@ int cpufreq_governor_dbs(struct cpufreq_
 
     switch (event) {
     case CPUFREQ_GOV_START:
+if(system_state > SYS_STATE_active) {//temp
+ printk("dbs: start CPU%u [%pS]\n", cpu, __builtin_return_address(0));
+}
         if ((!cpu_online(cpu)) || (!policy->cur))
             return -EINVAL;
 
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel