Hi Greg & Arnd, On Tue, Jun 27, 2017 at 08:29:17AM +0200, Greg Kroah-Hartman wrote: > > A: Because it messes up the order in which people normally read text. > Q: Why is top-posting such a bad thing? > A: Top-posting. > Q: What is the most annoying thing in e-mail? > > A: No. > Q: Should I include quotations after my reply? > > http://daringfireball.net/2007/07/on_top
We are so sorry for any troubles to you. I will take the role of Zhongping to continue discussion here. > > On Tue, Jun 27, 2017 at 02:02:13AM +0000, Zhongping Tan (谭中平) wrote: > > Ok, firstly we need to discuss the list usage, for list head we need > > do initialization, but for list node we don't need initialization at > > all. > > I don't understand, why is your misc driver touching that field at all? > Do I need to go and make it "private" somehow? There maybe some mis-understanding caused by not very well english expression. I'll clarify what had happended to us. > > > And for misc_list head, we use LIST_HEAD to define and initialize > > it. So I don't know why we put INIT_LIST_HEAD(&misc->list) in function > > misc_register, any bugs when without it? > > Again, what is wrong with the code today? What driver is this causing > problems for? It maybe a little bit long story. In recent, our test team report a crash of non-stop reboot test on a mobile phone after 46 hours. After some investigation, we got the information as following, &misc_list = 0xFFFFFFFF822AC780 -> ( next_=_0xFFFFFFFFA0087158 -> ( next = 0xFFFFFFFFA0087158 -> ( next = 0xFFFFFFFFA0087158, prev = 0xFFFFFFFFA0087158 -> ( next = 0xFFFFFFFFA0087158, prev = 0xFFFFFFFFA0087158)), prev = 0xFFFFFFFFA0087158 -> ( next = 0xFFFFFFFFA0087158, prev = 0xFFFFFFFFA0087158)), prev = 0xFFFF88007BF55618) it seems that misc_list fall into a loop which cause surfaceflinger (a service from Android) dead. And we got futher more information after that, (struct miscdevice*)0xFFFFFFFFA0087140 = 0xFFFFFFFFA0087140 -> ( minor = 20, name = 0xFFFFFFFFA008606D -> "fm", fops = 0xFFFFFFFFA0086700 -> ( owner = 0xFFFFFFFFA0087480, llseek = 0x0, read = 0xFFFFFFFFA0081860, write = 0x0, read_iter = 0x0, write_iter = 0x0, iterate = 0x0, poll = 0x0, unlocked_ioctl = 0xFFFFFFFFA00832C0, compat_ioctl = 0xFFFFFFFFA0083870, mmap = 0x0, open = 0xFFFFFFFFA0081B80, flush = 0x0, release = 0xFFFFFFFFA00838C0, fsync = 0x0, aio_fsync = 0x0, fasync = 0x0, lock = 0x0, sendpage = 0x0, get_unmapped_area = 0x0, check_flags = 0x0, flock = 0x0, splice_write = 0x0, splice_read = 0x0, setlease = 0x0, fallocate = 0x0, show_fdinfo = 0x0), list = ( next = 0xFFFFFFFFA0087158, prev = 0xFFFFFFFFA0087158), parent = 0x0, this_device = 0xFFFF88007BD32000, groups = 0x0, nodename = 0x0, mode = 0) We found the device is "fm". We highly suspect that fm driver call misc_register twice and reinitialize list to make ->pre & ->next pointing to himself. Meanwhile, we checked fm driver and found nothing obviously wrong in the code. Consider that this is a crash after 46 hours continuous power-on/off, it maybe caused by some special cases we are hard to know for now. We think it might make some sence to add protection code into misc_register() at first. Hope this could help to understand our situation. We'll try to provide any detail inforamtion about this if necessary. Thanks, Orson > > thanks, > > greg k-h