Re: Re: Re: EXT4 panic at jbd2_journal_put_journal_head() in 3.9+

2013-05-13 Thread Eunbong Song
Hi, I just wonder. Is there no problem with endianness. I mean usually bit field is defined with __BIG_ENDIAN_BITFIELD or __LITTLE_ENDIAN_BITFIELD. But b_jlist and b_modified is defined with no pad. It seems to be good but i just want to make sure. Thanks. 2013/5/13 Dmitry Monakhov: > On Mon, 13

Re: Re: Re: EXT4 panic at jbd2_journal_put_journal_head() in 3.9+

2013-05-13 Thread Dmitry Monakhov
On Mon, 13 May 2013 19:26:34 +0800, Zheng Liu wrote: > On Mon, May 13, 2013 at 09:53:25AM +, EUNBONG SONG wrote: > > > > > > > Hi all, > > > > > First of all I couldn't reproduce this regression in my sand box. So > > > the following speculation is only my guess. I suspect that the commit

Re: Re: Re: EXT4 panic at jbd2_journal_put_journal_head() in 3.9+

2013-05-13 Thread Zheng Liu
On Mon, May 13, 2013 at 09:53:25AM +, EUNBONG SONG wrote: > > > > Hi all, > > > First of all I couldn't reproduce this regression in my sand box. So > > the following speculation is only my guess. I suspect that the commit > > (ae4647fb) isn't root cause. It just uncover a potential bug t

Re: Re: Re: EXT4 panic at jbd2_journal_put_journal_head() in 3.9+

2013-05-13 Thread EUNBONG SONG
> Hi all, > First of all I couldn't reproduce this regression in my sand box. So > the following speculation is only my guess. I suspect that the commit > (ae4647fb) isn't root cause. It just uncover a potential bug that has > been there for a long time. I look at the code, and found two > s

Re: Re: EXT4 panic at jbd2_journal_put_journal_head() in 3.9+

2013-05-13 Thread Zheng Liu
On Sun, May 12, 2013 at 07:04:45PM -0700, Tony Luck wrote: > On Sat, May 11, 2013 at 12:52 AM, Dmitry Monakhov > wrote: > > What was page_size and fsblock size? > > CONFIG_IA64_PAGE_SIZE_64KB=y > > fsblock size is whatever is the default for SLES11SP2 on ia64 - which > tool will tell me? > >

Re: Re: EXT4 panic at jbd2_journal_put_journal_head() in 3.9+

2013-05-13 Thread EUNBONG SONG
Hi, I have some problem to boot with 3.10-rc1. So i will test with e0fd9affeb64088eff407dfc98bbd3a5c17ea479. The commit message is as follow. commit e0fd9affeb64088eff407dfc98bbd3a5c17ea479 Merge: 3d15b79 ea9627c Author: Linus Torvalds Date: Wed May 8 15:29:48 2013 -0700 Merge tag 'rdma

Re: EXT4 panic at jbd2_journal_put_journal_head() in 3.9+

2013-05-12 Thread Tony Luck
The 3.10-rc1 with ae4647fb765467 reverted is still running OK. At 3 hours now (only marginally longer than the 2.5 hours that one of the "bad" runs during the bisect managed). So I'm about 30% sure that we have a winner at the moment. I'll leave it running and check again in the morning. This peng

Re: EXT4 panic at jbd2_journal_put_journal_head() in 3.9+

2013-05-12 Thread Sidorov, Andrei
Hi, Bitfields are likely to be implemented using read-modify-write semantics. Modifications of either b_jlist or b_jmodified must be done under lock since they share same uint. I guess this lock is missing somewhere. Regards, Andrei. On 12.05.2013 20:07, Theodore Ts'o wrote: > On Sun, May 12, 20

Re: EXT4 panic at jbd2_journal_put_journal_head() in 3.9+

2013-05-12 Thread Mike Galbraith
On Sun, 2013-05-12 at 23:36 -0400, Theodore Ts'o wrote: > On Sun, May 12, 2013 at 08:11:59PM -0700, Tony Luck wrote: > > > > My best guess as to why this commit causes problems is that there are places > > where updates to individual fields in this structure used to be independent > > because the

Re: Re: EXT4 panic at jbd2_journal_put_journal_head() in 3.9+

2013-05-12 Thread EUNBONG SONG
> Hi, > Bitfields are likely to be implemented using read-modify-write semantics. > Modifications of either b_jlist or b_jmodified must be done under lock > since they share same uint. I guess this lock is missing somewhere. Hi, I agree with you. b_jlist and b_jmodified share the same unit. I t

Re: EXT4 panic at jbd2_journal_put_journal_head() in 3.9+

2013-05-12 Thread Theodore Ts'o
On Sun, May 12, 2013 at 08:11:59PM -0700, Tony Luck wrote: > > My best guess as to why this commit causes problems is that there are places > where updates to individual fields in this structure used to be independent > because they were to whole words. Now we have bitfields there are races > bet

Re: Re: Re: EXT4 panic at jbd2_journal_put_journal_head() in 3.9+

2013-05-12 Thread Tony Luck
On Sun, May 12, 2013 at 7:21 PM, EUNBONG SONG wrote: > Hi, my git bisect result is same yours. And i reported that to community > yesterday. Ah. Good to have some confirmation (I was never sure how long to keep running before deciding that a test was "good". My slowest "bad" test took about 2.5

Re: EXT4 panic at jbd2_journal_put_journal_head() in 3.9+

2013-05-12 Thread Theodore Ts'o
On Sun, May 12, 2013 at 07:04:45PM -0700, Tony Luck wrote: > My git bisect finally completed and points a finger at: > > commit ae4647fb7654676fc44a97e86eb35f9f06b99f66 > Author: Jan Kara > Date: Fri Apr 12 00:03:42 2013 -0400 > > jbd2: reduce journal_head size > > Remove unused t_

Re: Re: Re: EXT4 panic at jbd2_journal_put_journal_head() in 3.9+

2013-05-12 Thread EUNBONG SONG
> CONFIG_IA64_PAGE_SIZE_64KB=y > fsblock size is whatever is the default for SLES11SP2 on ia64 - which > tool will tell me? > My git bisect finally completed and points a finger at: > bisect> git bisect good > ae4647fb7654676fc44a97e86eb35f9f06b99f66 is first bad commit > commit ae4647fb765

Re: Re: EXT4 panic at jbd2_journal_put_journal_head() in 3.9+

2013-05-12 Thread Tony Luck
On Sat, May 11, 2013 at 12:52 AM, Dmitry Monakhov wrote: > What was page_size and fsblock size? CONFIG_IA64_PAGE_SIZE_64KB=y fsblock size is whatever is the default for SLES11SP2 on ia64 - which tool will tell me? My git bisect finally completed and points a finger at: bisect> git bisect g

Re: Re: EXT4 panic at jbd2_journal_put_journal_head() in 3.9+

2013-05-11 Thread Dmitry Monakhov
On Fri, 10 May 2013 10:27:58 -0700, Tony Luck wrote: > I think I have the same (or highly similar) thing happening on ia64. What was page_size and fsblock size? > > Similarities: seeing assertions fail for b_transaction > Differences: I only have ext3 filesystems mo

Re: EXT4 panic at jbd2_journal_put_journal_head() in 3.9+

2013-05-10 Thread David Daney
On 05/10/2013 12:27 PM, Theodore Ts'o wrote: Hmm, since you seem to be able to reproduce the problem reliably, any chance you can try bisecting the problem? I've looked at the commits that touch fs/jbd2 and nothing is jumping out at me. Also, how many CPUs do you have in your system, and what kin

Re: EXT4 panic at jbd2_journal_put_journal_head() in 3.9+

2013-05-10 Thread Theodore Ts'o
Hmm, since you seem to be able to reproduce the problem reliably, any chance you can try bisecting the problem? I've looked at the commits that touch fs/jbd2 and nothing is jumping out at me. Also, how many CPUs do you have in your system, and what kind of storage device were you using when you wer

Re: Re: EXT4 panic at jbd2_journal_put_journal_head() in 3.9+

2013-05-10 Thread Tony Luck
I think I have the same (or highly similar) thing happening on ia64. Similarities: seeing assertions fail for b_transaction Differences: I only have ext3 filesystems mounted, no ext4 See attached trace. I'm pretty certain that the highly unhelpful bugcheck! 0 [1] comes from the J_

Re: Re: EXT4 panic at jbd2_journal_put_journal_head() in 3.9+

2013-05-09 Thread EUNBONG SONG
> Can you give us the full crash message, (i.e., the panic, the BUG, > WARN, the registers, etc.), and not the stack trace? > - Ted Hi, Ted Actually i try to find the crash point. And i confirmed crash point is in __journal_remove_journal_head() function. I added some debu

Re: EXT4 panic at jbd2_journal_put_journal_head() in 3.9+

2013-05-09 Thread Theodore Ts'o
On Thu, May 09, 2013 at 07:59:30AM +, EUNBONG SONG wrote: > > I got a message as below every time i ran iozone test. > > > [ 4876.293124] [] show_stack+0x68/0x80 > [ 4876.309411] [] notifier_call_chain+0x5c/0xa8 > [ 4876.315245] [] __atomic_notifier_call_chain+0x3c/0x58 > [ 4876.321860] []