Sure. It will need to be tested appropriately.
On Thu, Sep 10, 2015 at 4:49 AM, Joseph Qi wrote:
> Hi Junxiao & Sunil,
> Your comments would be appreciated.
>
> Thanks,
> Joseph
>
> On 2015/9/6 21:11, Joseph Qi wrote:
> > Comments for dlm_dispatch_work is described below:
> > /* Worker function
This is because you are specifying a 128k cluster size. Refer to man
mkfs.ocfs2 for more.
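As a rough illustration of why a large cluster size inflates 'du' totals, here is a minimal sketch. It assumes every non-empty file occupies a whole number of clusters and ignores inline data, sparse regions and metadata. The helper name allocated_bytes is made up for this example and is not part of any tool.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: bytes of disk space consumed by a file of
 * file_size bytes when space is handed out in whole clusters. */
uint64_t allocated_bytes(uint64_t file_size, uint64_t cluster_size)
{
	assert(cluster_size > 0);
	/* Round the cluster count up, then scale back to bytes. */
	uint64_t clusters = (file_size + cluster_size - 1) / cluster_size;
	return clusters * cluster_size;
}
```

Under these assumptions a 1-byte file costs 131072 bytes with a 128k cluster size but 4096 bytes with 4k, so a tree of many small files reports a much larger total.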
On Mar 17, 2015 8:04 PM, "Umarzuki Mochlis" wrote:
> Hi,
>
> What I meant by total size is output of 'du -hs'
>
> I can see output of fdisk on mpath1 of ocfs2 LUN similar to logical
> volume of ext4 partitio
What is the output of the commands? The protocol is supposed to do the
unlocking on its own. See what it is blocked on. It could be that the node
that has the lock cannot unlock it because it cannot flush the journal to
disk.
On Tue, Sep 9, 2014 at 7:55 PM, Guozhonghua wrote:
> Hi All:
>
>
>
>
not.
Sunil
On Tue, Aug 26, 2014 at 6:57 PM, Xue jiufei wrote:
> Hi, Sunil
> On 2014/8/26 1:13, Sunil Mushran wrote:
> > On Sun, Aug 24, 2014 at 11:05 PM, Joseph Qi wrote:
> >
> > On 2014/8/25 13:45, Sunil Mushran wrote:
> >
On Sun, Aug 24, 2014 at 11:05 PM, Joseph Qi wrote:
> On 2014/8/25 13:45, Sunil Mushran wrote:
> > Please could you expand on that.
> >
> In our scenario, one node can mount multiple volumes across the
> cluster.
> For instance, N1 has mounted ocfs2 volumes say volume1,
Please could you expand on that.
On Aug 24, 2014 10:42 PM, "Joseph Qi" wrote:
> On 2014/8/25 13:00, Sunil Mushran wrote:
> > Functions in dlmdomain.c are only triggered during mount. So they cannot
> trigger the deadlock as described above in this thread. I would leave the
Functions in dlmdomain.c are only triggered during mount. So they cannot
trigger the deadlock as described above in this thread. I would leave them
as is.
On Aug 24, 2014 7:06 PM, "Xue jiufei" wrote:
> Hi Sunil,
> On 2014/8/23 1:08, Sunil Mushran wrote:
> > Allocs
Allocs made via GFP_NOFS, by definition, should not trigger any reclaim
from the fs.
So this situation should never arise. That's why all allocs in the dlm have
NOFS.
___
Ocfs2-devel mailing list
Ocfs2-devel@oss.oracle.com
https://oss.oracle.com/mailman/l
Acked-by: Sunil Mushran
You may want to re-add removed MODULE_DESCRIPTION with a short blurb in
some existing files.
On Tue, Nov 26, 2013 at 3:37 PM, Goldwyn Rodrigues wrote:
> The versioning information is confusing for end-users. The numbers
> are stuck at 1.5.0 while the tools v
You may want to do the same for the version file in dlm, dlmfs, etc.
On Tue, Nov 26, 2013 at 11:28 AM, Goldwyn Rodrigues wrote:
> The versioning information is confusing for end-users. The numbers
> are stuck at 1.5.0 when the tools have moved to 1.8.3.
>
> I suggest removing the versioning syst
It is encountering SCSI errors reading the device. Fixing that will fix
the issue.
If you want to stop the logging, I don't believe there is a method right
now. But it could be trivially added.
Allow user to disable mlog(ML_ERROR) logging.
On Thu, Oct 31, 2013 at 7:38 PM, Guozhonghua wrote:
>
So it's a test issue. The utility assumes the fs allocates in 4K units.
That's why it only works when clustersize is 4K.
On Thu, Aug 8, 2013 at 8:09 AM, David Weber wrote:
> Am Donnerstag, 8. August 2013, 07:30:27 schrieb Sunil Mushran:
> > Interesting. Please can you p
Interesting. Please can you print the inode disk using the command below. The
file path is minus the mounted dir.
debugfs.ocfs2 -R "stat /relative/path/to/file" /dev/DEVICE
It is saying that the fs has allocated a block when it did not need to. It
could be that the test utility does not handle
I see no need for a separate function. Just do
} else if (res->owner == DLM_LOCK_RES_OWNER_UNKNOWN) {
	if (test_bit(node, res->refmap))
		dlm_lockres_clear_refmap_bit(dlm, res, node);
}
On Thu, Aug 1, 2013 at 5:05 AM, Xue jiufei wrote:
> Function dlm_do_local_recovery_cleanup(
ect the actual situation.
>
> IMO, if there is no space both in the local_alloc and global_bitmap,
> it should steal space from other nodes local_alloc.
>
> Thx,
> Younger.
>
> > On Wed, Jul 31, 2013 at 08:33:48PM -0700, Sunil Mushran wrote:
> >> Because it makes no sense.
Because it makes no sense. Unlike inode/extent allocs, local_alloc is a
temporary cache. If you fail to allocate, you fallback to the global bitmap.
On Sat, Jul 27, 2013 at 3:27 AM, Younger Liu wrote:
> Hi,
> While analyzing ocfs2 block allocation, I found:
> When claiming space from inode_
What's the reasoning behind this patch?
On Jul 31, 2013, at 3:51 AM, Guozhonghua wrote:
> Hi,
>
> I find some code may be not correct as reviewing the heart beat code and test
> that.
> The heart beat writing onto disk.
> I have another question that why not encapsulate the o2hb_wait_on_io in
A general purpose file system requires one to manage over a million locks
concurrently. So performance is the main reason.
On Mon, Jul 1, 2013 at 6:22 PM, Jensen wrote:
> Hi Mark, Sunil, Jeff and Joel,
> Do you know why? Thanks.
>
> Jensen.
> 2013-7-2
>
> On 2013/6/29 11:27, Jensen wrote:
>
NAK. Current code looks ok.
On Fri, Jun 28, 2013 at 1:49 PM, Andrew Morton wrote:
>
> Folks, 3.10 is nigh. Could we please have some review and test of this
> patch?
>
>
> From: Younger Liu
> Subject: ocfs2: should call ocfs2_journal_access_di() before
> ocfs2_delete_entry() in ocfs2_orphan_de
Acked-by: Sunil Mushran
On Fri, Jun 28, 2013 at 1:47 PM, Andrew Morton wrote:
> On Sun, 23 Jun 2013 18:39:16 +0800 Jeff Liu wrote:
>
> > Hi Jiufei,
> >
> > On 06/20/2013 07:13 PM, Xue jiufei wrote:
> >
> > > Function dlmlock_master() returns DLM_REC
The question is whether this change is required for a real problem or not. If
so, what is the logic that gets tripped up by this behaviour?
On Thu, Jun 27, 2013 at 3:08 PM, Andrew Morton wrote:
> On Wed, 26 Jun 2013 20:34:19 -0700 Sunil Mushran
> wrote:
>
> > AFAIR, this behavior
AFAIR, this behavior has been there since day 1 and changing it will impact
performance negatively. I would recommend against making this change for
one app.
On Wed, Jun 26, 2013 at 6:50 PM, shencanquan wrote:
> On 2013/6/27 9:25, Andrew Morton wrote:
>
> > On Thu, 27 Jun 2013 09:19:52 +0800 sh
Acked-by: Sunil Mushran
On Tue, May 14, 2013 at 12:08 AM, Joseph Qi wrote:
> Last time we found there is a lock/unlock bug in ocfs2_file_aio_write,
> and then we did a thoroughly search for all lock resources in
> ocfs2_inode_info, including rw, inode and open lockres and found t
True. The function could do with a little bit of cleanup. Feel free to send
a patch.
On Sun, May 19, 2013 at 7:49 PM, Joseph Qi wrote:
> On 2013/5/19 10:25, Joseph Qi wrote:
> > On 2013/5/18 21:26, Sunil Mushran wrote:
> >> The first node that gets the lock will do the ac
Acked-by: Sunil Mushran
On Mon, May 20, 2013 at 8:06 AM, Goldwyn Rodrigues wrote:
> While removing a non-empty directory, the kernel dumps a message:
> (rmdir,21743,1):ocfs2_unlink:953 ERROR: status = -39
>
> Suppress the error message from being printed in the dmesg so users
>
Acked-by: Sunil Mushran
On Tue, May 21, 2013 at 7:44 PM, shencanquan wrote:
> On 2013/5/22 10:38, xiaowei.hu wrote:
> > if there is error happen in , for example EIO in
> > __ocfs2_prepare_orphan_dir, ocfs2_prep_new_orphaned_file will release
> > the inode_
Acked-by: Sunil Mushran
On Mon, May 20, 2013 at 2:36 AM, Joseph Qi wrote:
> Below 3 functions have already been declared in dlmcommon.h, so we have
> no need to declare them again in dlmrecovery.c.
> dlm_complete_recovery_thread
> dlm_launch_recovery_thread
> dlm_kick_
Acked-by: Sunil Mushran
On Wed, May 22, 2013 at 8:50 AM, Joseph Qi wrote:
> In dlm_request_all_locks, ret is type enum. But o2net_send_message
> returns a type int value. Then it will never run into the following
> error branch. So we should change the ret type from enum to int.
>
The first node that gets the lock will do the actual recovery. The others will
get the lock and see a clean journal and skip the recovery. A thread should
never error out if it fails to get the lock. It should try and try again.
On May 17, 2013, at 11:27 PM, Joseph Qi wrote:
> Hi,
> Once there
Resending as my reply bounced.
On Thu, May 9, 2013 at 10:01 AM, Sunil Mushran wrote:
> A better fix is to _not_ disconnect on o2net timeout once a connection has
> been cleanly established. Only disconnect on o2hb timeout.
>
> The reconnects are a problem as we could lose pac
A better fix is to _not_ disconnect on o2net timeout once a connection has
been cleanly established. Only disconnect on o2hb timeout.
The reconnects are a problem as we could lose packets and not be aware of it
leading to o2dlm hangs.
IOW, this patch looks to be papering over one specific problem
It should almost never trigger. ocfs2_inode_lock() should always succeed and
only return after it has gotten the required lock.
On Wed, May 8, 2013 at 12:38 PM, Andrew Morton wrote:
> On Mon, 6 May 2013 22:43:39 +0800 Joseph Qi wrote:
>
> > In ocfs2_file_aio_write, it does ocfs2_rw_lock first a
Looks good to me.
Acked-by: Sunil Mushran
On Mon, May 6, 2013 at 7:43 AM, Joseph Qi wrote:
> In ocfs2_file_aio_write, it does ocfs2_rw_lock first and then
> ocfs2_inode_lock. But if ocfs2_inode_lock failed, it goes to out_sems
> without unlocking rw lock. This will cause
ockres, if we have already sent some locks, say
> DLM_MAX_MIGRATABLE_LOCKS for the first time, and then
> dlm_send_mig_lockres_msg failed because of network down, it will redo
> it. During the redo_bucket, the lockres can be hashed and migrated
> again.
>
> On 2013/5/3 1:19, Sunil
Do you know under what conditions it creates a new lock when it should not?
This code should only trigger if the lockres is/was mastered on another node,
meaning this node will not know about the new lock. So that code should never
trigger.
1949 if (lock->ml.cookie
[ 1481.620253] o2hb: Unable to stabilize heartbeart on region
1352E2692E704EEB8040E5B8FF560997 (vdb)
What this means is that the device is suspect. o2hb writes are not hitting
the disk. vdb is accepting and
acknowledging the write but spitting out something else during the next
read. Heartbeat de
Use latest tools. 1.8.x. It includes a new tool 'o2cb' that allows removal of
nodes, global heartbeat, etc.
On Jan 6, 2013, at 3:48 AM, Gihan Munasinghe wrote:
> Hi
>
> I am trying to manage the ocfs2 cluster set up using the o2cb_ctl tool
> to create cluster add nodes etc.
> But I see that
NAK.
hb_task is a local variable that is not even accessed after kthread_stop().
The oops is in kthread_stop(). Points to a problem with get/put in
task_struct.
Not an ocfs2 issue.
On Mon, Dec 3, 2012 at 7:18 PM, wrote:
> From: "Xiaowei.Hu"
>
> Pid: 4508, comm:
> mkfs.ocfs2 Not tainted 2.6.
On Wed, Aug 22, 2012 at 8:44 PM, Tao Ma wrote:
> I guess the final solution will be WRITE_FUA, and I see btrfs uses it to
> write out the superblock. It will be handled differently by the
> underlying block layer so that it will not be in the elevator queue. It
> should work but I am not sure whe
On Wed, Aug 22, 2012 at 9:01 PM, Jie Liu wrote:
> BTW, Sunil mentioned there already has an IO priority patch set but not
> yet merged. However, I only searched
> an old posts back to 2006 at:
> http://www.digipedia.pl/usenet/thread/11947/7120/
>
> Am I missing something?
>
No, I said the code
Yes. WRITE_SYNC should be good. Not FUA.
Also, you may want to look into using io priorities. The code is all there.
Just needs activation.
On Wed, Aug 22, 2012 at 10:13 AM, srinivas eeda wrote:
>
> On 8/22/2012 7:17 AM, Jie Liu wrote:
>
> Hi All,
>
> These days, I am investigating an issue rega
Acked-by: Sunil Mushran
On Wed, Aug 22, 2012 at 2:38 AM, Jeff Liu wrote:
> Not sure if this patch does make sense or not, but it could make the
> signature of those routines
> in a consistent manner with others for heartbeating.
>
> CC: Sunil Mushran
> Signed-off-by: Jie
On Tue, Aug 30, 2011 at 02:14:04PM -0700, Sunil Mushran wrote:
> > Thanks. I'll fix the two.
> >
> > On 08/25/2011 06:01 PM, Dan Carpenter wrote:
> > >Hello Sunil Mushran,
> > >
> > >1dfecf810e0e: "ocfs2/cluster: Clean up messages in o2net
On Tue, Aug 14, 2012 at 11:28 PM, Xue jiufei wrote:
>
> Sorry, I haven't described it clearly.
>
> We trigger the BUG() in dlmrecovery.c:1923.
>
> Lockres had copied lvb from previous valid locks and then met with
> another lock with the EX level.
>
> 1907if (!dlm_lvb
Acked-by: Sunil Mushran
On Mon, Aug 13, 2012 at 7:06 PM, Xue jiufei wrote:
> We trigger a bug in __dlm_lockres_reserve_ast() when we parallel umount
> 4 nodes. The situation is as follows:
> 1) Node A migrate all lockres it owned(eg. lockres A) to other nodes say
> node B whe
On Mon, Aug 13, 2012 at 7:03 PM, Xue jiufei wrote:
> A parallel umount on 4 nodes triggered a bug in
> dlm_process_recovery_data(). Here’s the situation:
> Receiving MIG_LOCKRES message, A node processes the locks in migratable
> lockres. It copies lvb from migratable lockres when processing t
From: Sunil Mushran
Commit ea022dfb3c2a4680483b00eb2fecc9fc4f6091d1 was missing a var init.
Reported-and-Tested-by: Vincent Etienne
Signed-off-by: Sunil Mushran
---
fs/ocfs2/symlink.c |2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/fs/ocfs2/symlink.c b/fs/ocfs2
Thanks for your help.
On Fri, Aug 3, 2012 at 12:22 AM, Vincent ETIENEN wrote:
>
>
> Le 02/08/2012 23:08, Sunil Mushran a écrit :
>
> On Thu, Aug 2, 2012 at 12:28 PM, Vincent ETIENNE wrote:
>
>> Hi
>>
>> based on current git ( commit 1a9b4993b70fb18847169027
On Thu, Aug 2, 2012 at 12:28 PM, Vincent ETIENNE wrote:
> Hi
>
> based on current git ( commit 1a9b4993b70fb1884716902774dc9025b457760d )
> and reverting commit ea022dfb3c2a4680483b00eb2fecc9fc4f6091d1
>
> commit ea022dfb3c2a4680483b00eb2fecc9fc4f6091d1
> Author: Al Viro
> Date: Thu May 3 10
On Fri, Jul 27, 2012 at 1:32 PM, Mark Fasheh wrote:
> On Thu, Jul 26, 2012 at 04:05:05PM +0300, Dan Carpenter wrote:
> > My static checker complains that this is called with a spin_lock held
> > in dlm_master_requery_handler() from dlmrecovery.c. Probably the reason
> > we have not received any
The fallocate() oops is probably the same that is fixed by this patch.
https://oss.oracle.com/git/?p=smushran/linux-2.6.git;a=commit;h=a2118b301104a24381b414bc93371d666fe8d43a
Is in the list of patches that are ready to be pushed.
https://oss.oracle.com/git/?p=smushran/linux-2.6.git;a=shortlog;h=m
On Wed, Jul 11, 2012 at 1:51 AM, Joel Becker wrote:
> On Wed, Jul 11, 2012 at 02:49:56PM +0800, Junxiao Bi wrote:
> > Signed-off-by: Junxiao Bi
> > ---
> > fs/ocfs2/dlm/dlmmaster.c |4 +++-
> > 1 file changed, 3 insertions(+), 1 deletion(-)
> >
> > diff --git a/fs/ocfs2/dlm/dlmmaster.c b/fs
On Tue, Jul 17, 2012 at 12:10 AM, Junxiao Bi wrote:
> In the target node of the dlm lock migration, the logic to find
> the local dlm lock is wrong, it shouldn't change the loop variable
> "lock" in the list_for_each_entry loop. This will cause a NULL-pointer
> accessing crash.
>
> Signed-off-by:
https://oss.oracle.com/git/?p=smushran/linux-2.6.git;a=shortlog;h=mw-3.4-mar15
I had prepared some patches some time ago that could be pushed to mainline.
Though some patches may need to be removed as they look to be in this list.
On Fri, Jul 6, 2012 at 12:44 AM, Joel Becker wrote:
> Linus et al
On Tue, May 29, 2012 at 5:41 PM, Xiaowei wrote:
> On 05/30/2012 06:09 AM, Sunil Mushran wrote:
> I would suggest exploring adding this in dlm hb down event. Checking live
> map all
> over the place is hacky. We do it more than we should right now. Let's not
> add to the
On Thu, May 24, 2012 at 10:53 PM, wrote:
>
> diff --git a/fs/ocfs2/dlm/dlmrecovery.c b/fs/ocfs2/dlm/dlmrecovery.c
> index 01ebfd0..62659e8 100644
> --- a/fs/ocfs2/dlm/dlmrecovery.c
> +++ b/fs/ocfs2/dlm/dlmrecovery.c
> @@ -555,6 +555,7 @@ static int dlm_remaster_locks(struct dlm_ctxt *dlm, u8
> de
exceeds the total bit count. In each instance
the bitmap is correct. Only the free bit count is incorrect.
This patch checks the current bit value and increments the free bit count
only if the bit was previously set. It also prints information to allow
us to debug further.
Signed-off-by: Sunil Mushran
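The check described in the patch text above can be sketched in userspace as follows. This is a minimal model, not the patch itself: struct bitmap_group and release_bit are hypothetical names, and the debug printing mentioned in the description is left out.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical stand-in for a bitmap plus its cached free-bit count. */
struct bitmap_group {
	unsigned long *bits;	/* 1 = allocated, 0 = free */
	unsigned int total_bits;
	unsigned int free_bits;
};

/* Clear one bit and bump free_bits only if the bit was actually set.
 * Returns true if the bit was set before the call. */
bool release_bit(struct bitmap_group *grp, unsigned int bit)
{
	assert(bit < grp->total_bits);
	unsigned long mask = 1UL << (bit % BITS_PER_WORD);
	unsigned long *word = &grp->bits[bit / BITS_PER_WORD];
	bool was_set = (*word & mask) != 0;

	if (was_set) {
		*word &= ~mask;
		grp->free_bits++;
	}
	/* else: the bit was already free, so the count stays untouched */
	return was_set;
}
```

Releasing the same bit twice leaves the bitmap unchanged and, with this check, leaves the free count unchanged too, so the count cannot drift past the total.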
Signed-off-by: Sunil Mushran
---
fs/ocfs2/mmap.h | 18 ++
fs/ocfs2/ocfs2_trace.h | 19 +++
fs/ocfs2/quota.h| 16 ++--
fs/ocfs2/quota_global.c | 20 ++--
fs/ocfs2/quota_local.c | 19 +--
5
Commit b0d4f817ba5de8adb875ace594554a96d7737710 introduced dlm->track_lock
to protect operations on dlm->tracking_list. But it was still using the
older lock (dlm->spin_lock) to add new resources to the list.
Signed-off-by: Sunil Mushran
---
fs/ocfs2/dlm/dlmmaster.c |4 ++--
1 file
fallocate() was oopsing on ocfs2 because we were passing in a
NULL file pointer.
Signed-off-by: Sunil Mushran
---
fs/ocfs2/file.c |2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/fs/ocfs2/file.c b/fs/ocfs2/file.c
index 061591a..8f30e74 100644
--- a/fs/ocfs2/file.c
+++ b
This patch silences this message.
Signed-off-by: Sunil Mushran
---
fs/ocfs2/quota_global.c |2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/fs/ocfs2/quota_global.c b/fs/ocfs2/quota_global.c
index b24eab3..2a3f12c 100644
--- a/fs/ocfs2/quota_global.c
+++ b/fs/ocfs2
nlink_t was replaced as per the suggestion in the following link.
https://lkml.org/lkml/2012/2/2/577
Reported-by: Al Viro
Signed-off-by: Sunil Mushran
---
fs/ocfs2/namei.c |6 +++---
1 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/fs/ocfs2/namei.c b/fs/ocfs2/namei.c
index
serialization accounting for writes. This patch
separates the handler functions to avoid this issue.
Signed-off-by: Sunil Mushran
---
fs/ocfs2/aops.c | 44
1 files changed, 32 insertions(+), 12 deletions(-)
diff --git a/fs/ocfs2/aops.c b/fs/ocfs2/aops.c
averting the case in which lock is set to NULL.
Reported-by: Julia Lawall
Signed-off-by: Sunil Mushran
---
fs/ocfs2/dlm/dlmrecovery.c | 10 +-
1 files changed, 5 insertions(+), 5 deletions(-)
diff --git a/fs/ocfs2/dlm/dlmrecovery.c b/fs/ocfs2/dlm/dlmrecovery.c
index 01ebfd0..c881be6
Patch fixes some possible null pointer dereferences that were detected by the
static code analyser, smatch.
Reported-by: Dan Carpenter
Signed-off-by: Sunil Mushran
---
fs/ocfs2/cluster/tcp.c | 10 +-
1 files changed, 5 insertions(+), 5 deletions(-)
diff --git a/fs/ocfs2/cluster
ask_list = {
> next = 0x8105d2a28360,
> prev = 0x8105d2a28360
> }
> },
> l_debug_list = {
> next = 0x81033cabad08,
> prev = 0x8105d2a284a8
> },
> l_lock_num_prmode = 0,
> l_lock_num_exmode = 0,
> l_lock_num_prmode_failed = 0,
> l_lock_num_exmode_failed = 0,
pin lock,here A doesn't
> hold spin locks,
> then it start to execute the proxy ast handler , process bast request
> from nodeB,
> then dlmthread flushed the bast, after this node A start to queue its
> ast in ocfs2_dlm_lock() function.
>
> Thanks,
> Xiaowei
> On 02/2
Moreover, what is lockres_clear_pending doing in 1.4? That code
is not meant for 1.4. It fixes a problem associated with fsdlm.
It was left out of 1.4 for a reason.
Meaning this bug was introduced by the patch that introduced this
one in 1.4.
On 02/20/2012 10:12 PM, xiaowei...@oracle.com wrote:
>
> bast queued and flushed,before the ast was queued
Unlikely with o2dlm. dlmthread always sends ASTs before BASTs.
Can you recreate the entire lockres? A full dump may yield more
information.
Sunil
On 02/20/2012 10:12 PM, xiaowei...@oracle.com wrote:
> I am trying to fix bug13611997,CT's machi
On 02/13/2012 12:29 PM, Dan Carpenter wrote:
> On Mon, Feb 13, 2012 at 12:04:09PM -0800, Joel Becker wrote:
>> On Mon, Feb 13, 2012 at 04:50:47PM +0300, Dan Carpenter wrote:
>>> If "ret" is NULL, then "hs" is also NULL, so there is no need to free
>>> it. config_group_init_type_name() can't fail i
On 02/13/2012 12:08 PM, Dan Carpenter wrote:
> On Mon, Feb 13, 2012 at 11:39:27AM -0800, Sunil Mushran wrote:
>> hmm... I would say NAK because config_group_init_type_name() could
>> change in the future. And there is nothing wrong with the current
>> code.
>
> The e
hmm... I would say NAK because config_group_init_type_name() could
change in the future. And there is nothing wrong with the current
code.
On 02/13/2012 05:50 AM, Dan Carpenter wrote:
> If "ret" is NULL, then "hs" is also NULL, so there is no need to free
> it. config_group_init_type_name() can't
Signed-off-by: Sunil Mushran
On 02/08/2012 10:42 PM, Jeff Liu wrote:
> Hello,
>
> Since ENXIO only means "offset beyond EOF" for SEEK_DATA/SEEK_HOLE,
> Hence we should return the internal error unchanged if ocfs2_inode_lock() or
> ocfs2_get_clusters_nocache() call
On 02/04/2012 12:04 PM, Eric Sandeen wrote:
> Now that ext4, xfs, & ocfs2 can support punch hole, a tool to
> "re-sparsify" a file by punching out ranges of 0s might be in order.
>
> I whipped this up fast, it probably has bugs & off-by-ones but thought
> I'd send it out. It's not terribly efficie
sob
On 01/30/2012 09:51 PM, Srinivas Eeda wrote:
> When ocfs2dc thread holds dc_task_lock spinlock and receives soft IRQ it
> deadlock itself trying to get same spinlock in ocfs2_wake_downconvert_thread.
> Below is the stack snippet.
>
> The patch disables interrupts when acquiring dc_task_lock sp
Comments inlined.
On 01/28/2012 06:13 PM, Srinivas Eeda wrote:
> When ocfs2dc thread holds dc_task_lock spinlock and receives soft IRQ for
> I/O completion it deadlock itself trying to get same spinlock in
> ocfs2_wake_downconvert_thread
>
> The patch disables interrupts when acquiring dc_task_loc
We've seen this too. The problem happens because of the patch added to delay
dropping of the dentry locks (first patch below). The other two are related.
It was added to avoid a deadlock in quotas but adds problems of its own.
Srini has studied this issue and may be able to expand on this. The quic
Joel,
Please pull 6 patches (bug fixes) from the following repo.
git://oss.oracle.com/git/smushran/linux-2.6.git mw-3.3-jan13
BTW, not sure if I emailed before but we have to roll back 3 patches
related to deletes. These patches were added to fix deadlocks with
quotas. Well, it has just broken
On 12/20/2011 06:49 PM, Wengang Wang wrote:
> This is a fix on dlm_clean_master_list()
>
> During the hash table browsing, we remove mle from hash table then free
> the memory on the last reference. So we have to use a _safe() version
> of the browsing function when doing that.
>
> This fixes Orabu
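The point about needing a _safe() iterator when entries are freed during traversal can be shown with a small userspace model. This is only a sketch: struct node, push, remove_even and count are hypothetical names on a plain singly linked list, not the kernel hlist API or the mle hash table.

```c
#include <assert.h>
#include <stdlib.h>

struct node {
	int key;
	struct node *next;
};

/* Push a new node on the front; returns the new head. On allocation
 * failure the list is returned unchanged. */
struct node *push(struct node *head, int key)
{
	struct node *n = malloc(sizeof(*n));
	if (!n)
		return head;
	n->key = key;
	n->next = head;
	return n;
}

/* Free every node with an even key. 'next' is saved before the current
 * node may be freed, which is what the _safe() iterators do. */
struct node *remove_even(struct node *head)
{
	struct node **link = &head;
	struct node *cur, *next;

	for (cur = head; cur; cur = next) {
		next = cur->next;	/* grab it before cur can go away */
		if (cur->key % 2 == 0) {
			*link = next;	/* unlink, then free */
			free(cur);
		} else {
			link = &cur->next;
		}
	}
	return head;
}

/* Number of nodes in the list. */
int count(const struct node *head)
{
	int n = 0;
	assert(n == 0);
	for (; head; head = head->next)
		n++;
	return n;
}
```

Reading cur->next in the loop step after free(cur) would touch freed memory; saving it up front is the whole difference between the plain and the _safe() forms.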
Acked-by: Sunil Mushran
On 12/05/2011 09:18 PM, Tao Ma wrote:
> On 12/06/2011 12:57 PM, Noboru Iwamatsu wrote:
>> Under heavy I/O load, writing the disk heartbeat can be forced
>> to wait for minutes, and this causes the node to be fenced.
>>
>> This patch tries to u
NCOMPAT_REFCOUNT_TREE,
> + OCFS2_FEATURE_RO_COMPAT_UNWRITTEN}, /* FS_VMSTORE */
>
> These options are the ones that, when choosing for vmstore, are
> enabled by default. Is this correct?
>
> Thanks.
>
> Att.
> Artur Baruchi
>
>
>
> On Wed, Nov 16, 2011 at 3:26 PM, Sunil Mushran
> wrote:
>
fstype is a handy way to format the volume with parameters that are thought
to be useful for that use-case. The result of this is printed during format by
way of the parameters selected. man mkfs.ocfs2 has a blurb about the features
it enabled by default.
On 11/16/2011 08:45 AM, Artur Baruchi wrot
I think it got lost in the shuffle. We had decided to use the list_for_each().
The code is simpler to understand than the other proposed fix.
Joel, do you want me to send a patch?
On 11/02/2011 12:39 AM, Dan Carpenter wrote:
> What ever happened with this? The bug is still there in the latest
>
Joel, Please add this to the linux-next branch.
Original Message
Subject:[Ocfs2-devel] [PATCH] ocfs2: Add a missing journal credit in
ocfs2_link_credits() -v2
Date: Wed, 19 Oct 2011 09:34:19 +0800
From: xiaowei...@oracle.com
To: ocfs2-devel@oss.oracle.com
CC:
On 10/14/2011 01:57 AM, Wengang Wang wrote:
> Problem reproduced(against mainline) with the above patch applied. Also with
> the hacking
> patch(attached).
>
> testcase is attached.
>
> (kworker/u:2,14465,1):dlm_assert_master_handler:1828 ERROR: DIE! Mastery
> assert from 0, but current owner is 1
http://oss.oracle.com/git/?p=jlbec/linux-2.6.git;a=commitdiff;h=ff0a522e7db79625aa27a433467eb94c5e255718
Are you sure you have this patch?
On 10/13/2011 05:19 PM, Wengang Wang wrote:
> 2.6.18-128.
>
> thanks,
> wengang.
> On 11-10-13 16:37, Sunil Mushran wrote:
>> whic
which kernel?
On 10/13/2011 04:35 PM, Wengang Wang wrote:
> On 11-10-13 09:09, Sunil Mushran wrote:
>> The last email you said it reproduced. Now you say it did not.
>> I'm confused.
> Oh? Did I. If I did, I meant it had reproductions in different customers's
>
In the last email you said it reproduced. Now you say it did not.
I'm confused.
On 10/12/2011 07:13 PM, Wengang Wang wrote:
> On 11-10-12 19:11, Sunil Mushran wrote:
>> That's what ovm does. Have you reproduced it with ovm3 kernel?
>>
> No, I have no reproductions.
>
>
That's what ovm does. Have you reproduced it with ovm3 kernel?
On 10/12/2011 07:07 PM, Wengang Wang wrote:
> On 11-10-13 09:51, Wengang Wang wrote:
>> On 11-10-12 18:47, Sunil Mushran wrote:
>>> I meant master_request (not query). We set refmap _before_
>>> asse
I meant master_request (not query). We set refmap _before_
asserting. So that should not happen.
On 10/12/2011 06:02 PM, Wengang Wang wrote:
> Hi Sunil,
>
> On 11-10-12 17:32, Sunil Mushran wrote:
>> So you are saying a lockres can get purged before the node is asserting
>>
So you are saying a lockres can get purged before the node is asserting
master to other nodes?
The main place where we dispatch assert is during master_query.
There we set refmap before dispatching. Meaning refmap will protect
us from purging.
But I think it could happen in master_requery, which
Acked-by: Sunil Mushran
On 10/12/2011 12:22 AM, Wengang Wang wrote:
> There are three cases found that in error cases, journal transactions are not
> committed nor aborted. We should take care of these case by committing the
> transactions. Otherwise, there would left a journal handle w
The first two are ok. Have a comment for the last one.
On 09/25/2011 02:13 AM, Wengang Wang wrote:
> Commit transactions in error cases.
>
> There are three cases found that in error cases, journal transactions are not
> committed nor aborted. We should take care of these case by committing the
>