On 2013-03-14 22:05, Greg KH wrote:
> On Thu, Mar 14, 2013 at 09:51:34PM +0800, Shuge wrote:
>> Hi all,
>>
>> When the kernel printks too many log messages, a CPU fails to come
>> online. The problem is as follows. For example, cpu0 brings up cpu1:
>>
>> a. cpu0 calls cpu_up():
>>
>>    cpu_up()
>>      -> _cpu_up()
>>           -> __cpu_notify(CPU_UP_PREPARE)
>>           -> __cpu_up()
>>                -> boot_secondary()
>>  #             -> wait_for_completion_timeout(&cpu_running,
>>                                               msecs_to_jiffies(1000))
>>                -> if (!cpu_online(cpu)) {
>>                       pr_crit("CPU%u: failed to come online\n", cpu);
>>                       ret = -EIO;
>>                   }
>>           -> cpu_notify(CPU_ONLINE)
>>
>> b. cpu1 enters the kernel:
>>
>>    secondary_start_kernel()
>>  @   -> printk("CPU%u: Booted secondary processor\n", cpu)
>>  *   -> calibrate_delay()
>>      -> set_cpu_online()
>>      -> complete(cpu_running)
>>      -> cpumask_set_cpu()
>>
>> When cpu0 reaches mark #, it waits for cpu1 to complete cpu_running
>> and set itself online. Normally this succeeds. But if __log_buf is
>> very large, or other threads keep writing to it, and cpu1 reaches
>> mark @ or * at that moment, cpu1 gets busy outputting the buffer.
>> That takes more than 1s, and since cpu1 has not yet joined the
>> scheduler, cpu0's wait times out.
>>
>> Reading printk.c, I found that can_use_console(), which is called by
>> console_trylock_for_printk(), always returns true, because
>> have_callable_console() always returns true if the console driver
>> sets the CON_ANYTIME flag.
>>
>> I think a CPU should not output __log_buf while it is coming online,
>> even if have_callable_console() is true.
>>
>> /*
>>  * Can we actually use the console at this time on this cpu?
>>  *
>>  * Console drivers may assume that per-cpu resources have
>>  * been allocated. So unless they're explicitly marked as
>>  * being able to cope (CON_ANYTIME) don't call them until
>>  * this CPU is officially up.
>>  */
>> static inline int can_use_console(unsigned int cpu)
>> {
>>         return cpu_online(cpu) || have_callable_console();
>> }
>>
>> In can_use_console(), why is it || rather than &&?
>>
>> Kernel Version: 3.3.0
>
> Why such an old and obsolete kernel version? Please try this on 3.8;
> lots of work has gone into the printk area that should have solved
> this issue.
>
> greg k-h
I looked at printk.c in version 3.9; it still checks console_trylock_for_printk() to decide whether to call console_unlock(). In vprintk_emit(), cpu1 can therefore still end up executing console_unlock() while it is coming online. Once a CPU that is coming online starts outputting the buffer, nothing interrupts it until the buffer is empty. But we cannot guarantee that nobody keeps writing to __log_buf. That is dangerous! I think the solution is to prevent a CPU from using the console while it is coming online.
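
To make the || versus && question concrete, here is a minimal user-space sketch of the check. It is not kernel code: the two boolean parameters stand in for cpu_online(cpu) and have_callable_console(), and the names can_use_console_and() and can_use_console_online_only() are hypothetical, introduced only for this illustration.

```c
#include <stdbool.h>

/*
 * User-space model only. cpu_is_online stands in for cpu_online(cpu),
 * has_anytime_console for have_callable_console().
 */

/* The check as quoted above: either condition is enough. */
bool can_use_console_current(bool cpu_is_online, bool has_anytime_console)
{
    return cpu_is_online || has_anytime_console;
}

/* Hypothetical variant: replace || with &&, as asked in the first mail. */
bool can_use_console_and(bool cpu_is_online, bool has_anytime_console)
{
    return cpu_is_online && has_anytime_console;
}

/*
 * Hypothetical variant: never use the console on a CPU that is not yet
 * online, regardless of CON_ANYTIME (the behaviour proposed above).
 */
bool can_use_console_online_only(bool cpu_is_online, bool has_anytime_console)
{
    (void)has_anytime_console; /* deliberately ignored in this variant */
    return cpu_is_online;
}
```

With a CON_ANYTIME console registered, the current check lets a CPU that is not yet online flush the buffer, which is the case described above. Note that, in this model, a plain && would also refuse the console on an online CPU whenever no CON_ANYTIME console is registered, so the online-only variant is probably closer to what I am proposing.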

