Re: [Ocfs2-devel] [RFC] Doubt about dlm_worker

2015-09-10 Thread Sunil Mushran
Sure. It will need to be tested appropriately. On Thu, Sep 10, 2015 at 4:49 AM, Joseph Qi wrote: > Hi Junxiao & Sunil, > Your comments would be appreciated. > > Thanks, > Joseph > > On 2015/9/6 21:11, Joseph Qi wrote: > > Comments for dlm_dispatch_work is described below: > > /* Worker function

Re: [Ocfs2-devel] [Ocfs2-users] size increase

2015-03-17 Thread Sunil Mushran
This is because you are specifying a 128k cluster size. Refer to man mkfs.ocfs2 for more. On Mar 17, 2015 8:04 PM, "Umarzuki Mochlis" wrote: > Hi, > > What I meant by total size is output of 'du -hs' > > I can see output of fdisk on mpath1 of ocfs2 LUN similar to logical > volume of ext4 partitio
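A rough way to see the effect of the cluster size on 'du' output: space is handed out in whole clusters, so a file's on-disk footprint is presumably its length rounded up to the next cluster boundary. The helper below is a hypothetical sketch (allocated_bytes is not an ocfs2 or mkfs.ocfs2 function) and ignores inline data and sparse files.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: round a file length up to a whole number of
 * clusters. cluster_size must be a non-zero power of two, e.g. 4096
 * or 131072 (128k).
 */
uint64_t allocated_bytes(uint64_t file_size, uint64_t cluster_size)
{
    /* Reject zero and non-power-of-two cluster sizes. */
    assert(cluster_size != 0 && (cluster_size & (cluster_size - 1)) == 0);

    if (file_size == 0)
        return 0;

    /* Classic round-up-to-power-of-two-multiple idiom. */
    return (file_size + cluster_size - 1) & ~(cluster_size - 1);
}
```

Under this model a 1-byte file occupies one full 128k cluster, which is why a volume holding many small files reports far more usage with a 128k cluster size than with 4k.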

Re: [Ocfs2-devel] [Ocfs2-users] How to unlock a bloked resource? Thanks

2014-09-10 Thread Sunil Mushran
What is the output of the commands? The protocol is supposed to do the unlocking on its own. See what it is blocked on. It could be that the node that has the lock cannot unlock it because it cannot flush the journal to disk. On Tue, Sep 9, 2014 at 7:55 PM, Guozhonghua wrote: > Hi All: > > > >

Re: [Ocfs2-devel] A deadlock when system do not has sufficient memory

2014-08-27 Thread Sunil Mushran
not. Sunil On Tue, Aug 26, 2014 at 6:57 PM, Xue jiufei wrote: > Hi, Sunil > On 2014/8/26 1:13, Sunil Mushran wrote: > > On Sun, Aug 24, 2014 at 11:05 PM, Joseph Qi wrote: > > > > On 2014/8/25 13:45, Sunil Mushran wrote: > >

Re: [Ocfs2-devel] A deadlock when system do not has sufficient memory

2014-08-25 Thread Sunil Mushran
On Sun, Aug 24, 2014 at 11:05 PM, Joseph Qi wrote: > On 2014/8/25 13:45, Sunil Mushran wrote: > > Please could you expand on that. > > > In our scenario, one node can mount multiple volumes across the > cluster. > For instance, N1 has mounted ocfs2 volumes say volume1,

Re: [Ocfs2-devel] A deadlock when system do not has sufficient memory

2014-08-24 Thread Sunil Mushran
Please could you expand on that. On Aug 24, 2014 10:42 PM, "Joseph Qi" wrote: > On 2014/8/25 13:00, Sunil Mushran wrote: > > Functions in dlmdomain.c are only triggered during mount. So they cannot > trigger the deadlock as described above in this thread. I would leave the

Re: [Ocfs2-devel] A deadlock when system do not has sufficient memory

2014-08-24 Thread Sunil Mushran
Functions in dlmdomain.c are only triggered during mount. So they cannot trigger the deadlock as described above in this thread. I would leave them as is. On Aug 24, 2014 7:06 PM, "Xue jiufei" wrote: > Hi Sunil, > On 2014/8/23 1:08, Sunil Mushran wrote: > > Allocs

Re: [Ocfs2-devel] A deadlock when system do not has sufficient memory

2014-08-22 Thread Sunil Mushran
Allocs made via GFP_NOFS, by definition, should not trigger any reclaim from the fs. So this situation should never arise. That's why all allocs in the dlm have NOFS.

Re: [Ocfs2-devel] [PATCH] Remove versioning information

2013-11-26 Thread Sunil Mushran
Acked-by: Sunil Mushran You may want to re-add removed MODULE_DESCRIPTION with a short blurb in some existing files. On Tue, Nov 26, 2013 at 3:37 PM, Goldwyn Rodrigues wrote: > The versioning information is confusing for end-users. The numbers > are stuck at 1.5.0 while the tools v

Re: [Ocfs2-devel] [PATCH] Remove versioning information

2013-11-26 Thread Sunil Mushran
You may want to do the same for the version file in dlm, dlmfs, etc. On Tue, Nov 26, 2013 at 11:28 AM, Goldwyn Rodrigues wrote: > The versioning information is confusing for end-users. The numbers > are stuck at 1.5.0 when the tools have moved to 1.8.3. > > I suggest removing the versioning syst

Re: [Ocfs2-devel] [Ocfs2-users] How to break out the unstop loop in the recovery thread? Thanks a lot.

2013-11-01 Thread Sunil Mushran
It is encountering SCSI errors reading the device. Fixing that will fix the issue. If you want to stop the logging, I don't believe there is a method right now. But it could be trivially added. Allow user to disable mlog(ML_ERROR) logging. On Thu, Oct 31, 2013 at 7:38 PM, Guozhonghua wrote: >

Re: [Ocfs2-devel] FIEMAP problem

2013-08-08 Thread Sunil Mushran
So it's a test issue. The utility assumes the fs allocates in 4K units. That's why it only works when clustersize is 4K. On Thu, Aug 8, 2013 at 8:09 AM, David Weber wrote: > On Thursday, 8 August 2013, 07:30:27, Sunil Mushran wrote: > > Interesting. Please can you p

Re: [Ocfs2-devel] FIEMAP problem

2013-08-08 Thread Sunil Mushran
Interesting. Please can you print the inode disk using the command below. The file path is minus the mounted dir. debugfs.ocfs2 -R "stat /relative/path/to/file" /dev/DEVICE It is saying that the fs has allocated a block when it did not need to. It could be that the test utility does not handle

Re: [Ocfs2-devel] [PATCH] ocfs2: force clean refmap when doing local recovery cleanup

2013-08-01 Thread Sunil Mushran
I see no need for a separate function. Just do } else if (res->owner == DLM_LOCK_RES_OWNER_UNKNOWN) { if (test_bit(node, res->refmap)) dlm_lockres_clear_refmap_bit(dlm, res, node); } On Thu, Aug 1, 2013 at 5:05 AM, Xue jiufei wrote: > Function dlm_do_local_recovery_cleanup(

Re: [Ocfs2-devel] Why ocfs2 haven't implemented "steal" for local_alloc system files?

2013-08-01 Thread Sunil Mushran
ect the actual situation. > > IMO, if there is no space both in the local_alloc and global_bitmap, > it should steal space from other nodes local_alloc. > > Thx, > Younger. > > > On Wed, Jul 31, 2013 at 08:33:48PM -0700, Sunil Mushran wrote: > >> Because it makes no sense.

Re: [Ocfs2-devel] Why ocfs2 haven't implemented "steal" for local_alloc system files?

2013-07-31 Thread Sunil Mushran
Because it makes no sense. Unlike inode/extent allocs, local_alloc is a temporary cache. If you fail to allocate, you fallback to the global bitmap. On Sat, Jul 27, 2013 at 3:27 AM, Younger Liu wrote: > Hi, > While analyzing ocfs2 block allocation, I found: > When claiming space from inode_

Re: [Ocfs2-devel] Heart beat source code review and test, founding it may be not correct. Is the changes OK, requesting reviews and advices.

2013-07-31 Thread Sunil Mushran
What's the reasoning behind this patch? On Jul 31, 2013, at 3:51 AM, Guozhonghua wrote: > Hi, > > I find some code may be not correct as reviewing the heart beat code and test > that. > The heart beat writing onto disk. > I have another question that why not encapsulate the o2hb_wait_on_io in

Re: [Ocfs2-devel] why oracle give up dlm by disk on ocfs2? because performance?

2013-07-01 Thread Sunil Mushran
A general purpose file system requires one to manage over a million locks concurrently. So performance is the main reason. On Mon, Jul 1, 2013 at 6:22 PM, Jensen wrote: > Hi Mark, sunil, jeff an Joel, >Do you know why ? Thanks. > > Jensen. > 2013-7-2 > > On 2013/6/29 11:27, Jensen wrote: >

Re: [Ocfs2-devel] [PATCH] ocfs2: should call ocfs2_journal_access_di() before ocfs2_delete_entry() in ocfs2_orphan_del()

2013-06-28 Thread Sunil Mushran
NAK. Current code looks ok. On Fri, Jun 28, 2013 at 1:49 PM, Andrew Morton wrote: > > Folks, 3.10 is nigh. Could we please have some review and test of this > patch? > > > From: Younger Liu > Subject: ocfs2: should call ocfs2_journal_access_di() before > ocfs2_delete_entry() in ocfs2_orphan_de

Re: [Ocfs2-devel] [PATCH] ocfs2: dlmlock_master should return DLM_NORMAL after adding lock to blocked list

2013-06-28 Thread Sunil Mushran
Acked-by: Sunil Mushran On Fri, Jun 28, 2013 at 1:47 PM, Andrew Morton wrote: > On Sun, 23 Jun 2013 18:39:16 +0800 Jeff Liu wrote: > > > Hi Jiufei, > > > > On 06/20/2013 07:13 PM, Xue jiufei wrote: > > > > > Function dlmlock_master() returns DLM_REC

Re: [Ocfs2-devel] [PATCH] ocfs2: llseek requires to ocfs2 inode lock for the file in SEEK_END

2013-06-27 Thread Sunil Mushran
The question is whether this change is required for a real problem or not. If so, what is the logic that gets tripped up by this behaviour? On Thu, Jun 27, 2013 at 3:08 PM, Andrew Morton wrote: > On Wed, 26 Jun 2013 20:34:19 -0700 Sunil Mushran > wrote: > > > AFAIR, this behavior

Re: [Ocfs2-devel] [PATCH] ocfs2: llseek requires to ocfs2 inode lock for the file in SEEK_END

2013-06-26 Thread Sunil Mushran
AFAIR, this behavior has been there since day 1 and changing it will impact performance negatively. I would recommend against making this change for one app. On Wed, Jun 26, 2013 at 6:50 PM, shencanquan wrote: > On 2013/6/27 9:25, Andrew Morton wrote: > > > On Thu, 27 Jun 2013 09:19:52 +0800 sh

Re: [Ocfs2-devel] [PATCH v2] ocfs2: goto out_unlock if ocfs2_get_clusters_nocache failed in ocfs2_fiemap

2013-05-22 Thread Sunil Mushran
Acked-by: Sunil Mushran On Tue, May 14, 2013 at 12:08 AM, Joseph Qi wrote: > Last time we found there is a lock/unlock bug in ocfs2_file_aio_write, > and then we did a thoroughly search for all lock resources in > ocfs2_inode_info, including rw, inode and open lockres and found t

Re: [Ocfs2-devel] ocfs2: Question for ocfs2_recovery_thread

2013-05-22 Thread Sunil Mushran
True. The function could do with a little bit of cleanup. Feel free to send a patch. On Sun, May 19, 2013 at 7:49 PM, Joseph Qi wrote: > On 2013/5/19 10:25, Joseph Qi wrote: > > On 2013/5/18 21:26, Sunil Mushran wrote: > >> The first node that gets the lock will do the ac

Re: [Ocfs2-devel] [PATCH] Remove unecessary ERROR when removing non-empty directory

2013-05-22 Thread Sunil Mushran
Acked-by: Sunil Mushran On Mon, May 20, 2013 at 8:06 AM, Goldwyn Rodrigues wrote: > While removing a non-empty directory, the kernel dumps a message: > (rmdir,21743,1):ocfs2_unlink:953 ERROR: status = -39 > > Suppress the error message from being printed in the dmesg so users &

Re: [Ocfs2-devel] [PATCH] ocfs2_prep_new_orphaned_file should return ret

2013-05-22 Thread Sunil Mushran
Acked-by: Sunil Mushran On Tue, May 21, 2013 at 7:44 PM, shencanquan wrote: > On 2013/5/22 10:38, xiaowei.hu wrote: > > if there is error happen in , for example EIO in > > __ocfs2_prepare_orphan_dir, ocfs2_prep_new_orphaned_file will release > > the inode_

Re: [Ocfs2-devel] [PATCH] clean up duplicate declaration in dlmrecovery.c

2013-05-22 Thread Sunil Mushran
Acked-by: Sunil Mushran On Mon, May 20, 2013 at 2:36 AM, Joseph Qi wrote: > Below 3 functions have already been declared in dlmcommon.h, so we have > no need to declare them again in dlmrecovery.c. > dlm_complete_recovery_thread > dlm_launch_recovery_thread > dlm_kick_

Re: [Ocfs2-devel] [PATCH] ret should be int instead of enum in dlm_request_all_locks

2013-05-22 Thread Sunil Mushran
Acked-by: Sunil Mushran On Wed, May 22, 2013 at 8:50 AM, Joseph Qi wrote: > In dlm_request_all_locks, ret is type enum. But o2net_send_message > returns a type int value. Then it will never run into the following > error branch. So we should change the ret type from enum to int. &

Re: [Ocfs2-devel] ocfs2: Question for ocfs2_recovery_thread

2013-05-18 Thread Sunil Mushran
The first node that gets the lock will do the actual recovery. The others will get the lock and see a clean journal and skip the recovery. A thread should never error out if it fails to get the lock. It should try and try again. On May 17, 2013, at 11:27 PM, Joseph Qi wrote: > Hi, > Once there
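The rule described here (retry the lock until it is granted, then replay only if the journal is still dirty) can be sketched in user space as below. Everything in it is a hypothetical stand-in: struct sim_slot, sim_lock and recover_slot are illustration names, not ocfs2 functions, and a real thread would sleep between lock attempts rather than spin.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one node's journal slot and its cluster lock. */
struct sim_slot {
    int  lock_failures_left;  /* how many lock attempts will still fail */
    bool journal_dirty;       /* true until some node replays the journal */
    int  replays;             /* number of replays performed */
};

/* Simulated lock attempt: 0 on success, -1 on a transient failure. */
static int sim_lock(struct sim_slot *s)
{
    if (s->lock_failures_left > 0) {
        s->lock_failures_left--;
        return -1;
    }
    return 0;
}

/* Returns true if this caller did the recovery, false if it found a clean journal. */
bool recover_slot(struct sim_slot *s)
{
    assert(s != NULL);

    /* Never error out on a lock failure: try and try again. */
    while (sim_lock(s) != 0)
        ;

    if (!s->journal_dirty)
        return false;          /* another node already recovered it */

    s->journal_dirty = false;  /* stands in for replaying the journal */
    s->replays++;
    return true;
}
```

The first caller to get the lock sees a dirty journal and replays it; every later caller gets the lock, finds the journal clean and returns without doing any work.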

Re: [Ocfs2-devel] Patch request reviews, for node reconnecting with other nodes whose node number is little than local, thanks a lot.

2013-05-09 Thread Sunil Mushran
Resending as my reply bounced. On Thu, May 9, 2013 at 10:01 AM, Sunil Mushran wrote: > A better fix is to _not_ disconnect on o2net timeout once a connection has > been > cleanly established. Only disconnect on o2hb timeout. > > The reconnects are a problem as we could lose pac

Re: [Ocfs2-devel] Patch request reviews, for node reconnecting with other nodes whose node number is little than local, thanks a lot.

2013-05-09 Thread Sunil Mushran
A better fix is to _not_ disconnect on o2net timeout once a connection has been cleanly established. Only disconnect on o2hb timeout. The reconnects are a problem as we could lose packets and not be aware of it leading to o2dlm hangs. IOW, this patch looks to be papering over one specific problem

Re: [Ocfs2-devel] [PATCH] ocfs2: unlock rw lock if inode lock failed

2013-05-08 Thread Sunil Mushran
It should almost never trigger. ocfs2_inode_lock() should always succeed and only return after it has gotten the required lock. On Wed, May 8, 2013 at 12:38 PM, Andrew Morton wrote: > On Mon, 6 May 2013 22:43:39 +0800 Joseph Qi wrote: > > > In ocfs2_file_aio_write, it does ocfs2_rw_lock first a

Re: [Ocfs2-devel] [PATCH] ocfs2: unlock rw lock if inode lock failed

2013-05-06 Thread Sunil Mushran
Looks good to me. Acked-by: Sunil Mushran On Mon, May 6, 2013 at 7:43 AM, Joseph Qi wrote: > In ocfs2_file_aio_write, it does ocfs2_rw_lock first and then > ocfs2_inode_lock. But if ocfs2_inode_lock failed, it goes to out_sems > without unlocking rw lock. This will cause

Re: [Ocfs2-devel] [PATCH v2] ocfs2: fix possible memory leak in dlm_process_recovery_data

2013-05-03 Thread Sunil Mushran
ockres, if we have already sent some locks, say > DLM_MAX_MIGRATABLE_LOCKS for the first time, and then > dlm_send_mig_lockres_msg failed because of network down, it will redo > it. During the redo_bucket, the lockres can be hashed and migrated > again. > > On 2013/5/3 1:19, Sunil

Re: [Ocfs2-devel] [PATCH v2] ocfs2: fix possible memory leak in dlm_process_recovery_data

2013-05-02 Thread Sunil Mushran
Do you know under what conditions it creates a new lock when it should not? This code should only trigger if the lockres is/was mastered on another node. Meaning this node will not know about the newlock. Meaning that code should never trigger. 1949 if (lock->ml.cookie

Re: [Ocfs2-devel] [OCFS2] Crash at o2net_shutdown_sc()

2013-03-01 Thread Sunil Mushran
[ 1481.620253] o2hb: Unable to stabilize heartbeart on region 1352E2692E704EEB8040E5B8FF560997 (vdb) What this means is that the device is suspect. o2hb writes are not hitting the disk. vdb is accepting and acknowledging the write but spitting out something else during the next read. Heartbeat de

Re: [Ocfs2-devel] o2cb_ctl -D option

2013-01-05 Thread Sunil Mushran
Use latest tools. 1.8.x. It includes a new tool 'o2cb' that allows removal of nodes, global heartbeat, etc. On Jan 6, 2013, at 3:48 AM, Gihan Munasinghe wrote: > Hi > > I am trying to manage the ocfs2 cluster set up using the o2cb_ctl tool > to create cluster add nodes etc. > But I see that

Re: [Ocfs2-devel] [PATCH] mkfs.ocfs2 null pointer dereference. -- resend

2012-12-04 Thread Sunil Mushran
NAK. hb_task is a local variable that is not even accessed after kthread_stop(). The oops is in kthread_stop(). Points to a problem with get/put in task_struct. Not an ocfs2 issue. On Mon, Dec 3, 2012 at 7:18 PM, wrote: > From: "Xiaowei.Hu" > > Pid: 4508, comm: > mkfs.ocfs2 Not tainted 2.6.

Re: [Ocfs2-devel] RFC: OCFS2 heartbeat improvements

2012-08-23 Thread Sunil Mushran
On Wed, Aug 22, 2012 at 8:44 PM, Tao Ma wrote: > I guess the final solution will be WRITE_FUA, and I see btrfs uses it to > write out the superblock. It will be handled differently by the > underlying block layer so that it will not be in the elevator queue. It > should work but I am not sure whe

Re: [Ocfs2-devel] RFC: OCFS2 heartbeat improvements

2012-08-23 Thread Sunil Mushran
On Wed, Aug 22, 2012 at 9:01 PM, Jie Liu wrote: > BTW, Sunil mentioned there already has an IO priority patch set but not > yet merged. However, I only searched > an old posts back to 2006 at: > http://www.digipedia.pl/usenet/thread/11947/7120/ > > Am I missing something? > No, I said the code

Re: [Ocfs2-devel] RFC: OCFS2 heartbeat improvements

2012-08-22 Thread Sunil Mushran
Yes. WRITE_SYNC should be good. Not FUA. Also, you may want to look into using io priorities. The code is all there. Just needs activation. On Wed, Aug 22, 2012 at 10:13 AM, srinivas eeda wrote: > > On 8/22/2012 7:17 AM, Jie Liu wrote: > > Hi All, > > These days, I am investigating an issue rega

Re: [Ocfs2-devel] [PATCH 2/4] ocfs2: s/o2hb_hearbeat_xxx/o2hb_heartbeat_xxx/g at heartbeat.c

2012-08-22 Thread Sunil Mushran
Acked-by: Sunil Mushran On Wed, Aug 22, 2012 at 2:38 AM, Jeff Liu wrote: > Not sure if this patch does make sense or not, but it could make the > signature of those routines > in a consistent manner with others for heartbeating. > > CC: Sunil Mushran > Signed-off-by: Jie

Re: [Ocfs2-devel] ocfs2/cluster: Clean up messages in o2net

2012-08-15 Thread Sunil Mushran
On Tue, Aug 30, 2011 at 02:14:04PM -0700, Sunil Mushran wrote: > > Thanks. I'll fix the two. > > > > On 08/25/2011 06:01 PM, Dan Carpenter wrote: > > >Hello Sunil Mushran, > > > > > >1dfecf810e0e: "ocfs2/cluster: Clean up messages in o2net

Re: [Ocfs2-devel] [PATCH] ocfs2: skip locks in the blocked list

2012-08-15 Thread Sunil Mushran
On Tue, Aug 14, 2012 at 11:28 PM, Xue jiufei wrote: > > Sorry, I haven't described it clearly. > > We trigger the BUG() in dlmrecovery.c:1923. > > Lockres had copied lvb from previous valid locks and then met > another lock with the EX level. > > 1907 if (!dlm_lvb

Re: [Ocfs2-devel] [PATCH] ocfs2: delay migration when the lockres is in migration state

2012-08-14 Thread Sunil Mushran
Acked-by: Sunil Mushran On Mon, Aug 13, 2012 at 7:06 PM, Xue jiufei wrote: > We trigger a bug in __dlm_lockres_reserve_ast() when we parallel umount > 4 nodes. The situation is as follows: > 1) Node A migrate all lockres it owned(eg. lockres A) to other nodes say > node B whe

Re: [Ocfs2-devel] [PATCH] ocfs2: skip locks in the blocked list

2012-08-14 Thread Sunil Mushran
On Mon, Aug 13, 2012 at 7:03 PM, Xue jiufei wrote: > A parallel umount on 4 nodes triggered a bug in > dlm_process_recovery_data(). Here’s the situation: > Receiving MIG_LOCKRES message, A node processes the locks in migratable > lockres. It copies lvb from migratable lockres when processing t

[Ocfs2-devel] [PATCH] ocfs2: Fix oops in ocfs2_fast_symlink_readpage() code path

2012-08-03 Thread sunil . mushran
From: Sunil Mushran Commit ea022dfb3c2a4680483b00eb2fecc9fc4f6091d1 was missing a var init. Reported-and-Tested-by: Vincent Etienne Signed-off-by: Sunil Mushran --- fs/ocfs2/symlink.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/fs/ocfs2/symlink.c b/fs/ocfs2

Re: [Ocfs2-devel] kernel BUG at fs/buffer.c:2886! Linux 3.5.0

2012-08-03 Thread Sunil Mushran
Thanks for your help. On Fri, Aug 3, 2012 at 12:22 AM, Vincent ETIENNE wrote: > > > On 02/08/2012 23:08, Sunil Mushran wrote: > > On Thu, Aug 2, 2012 at 12:28 PM, Vincent ETIENNE wrote: > >> Hi >> >> based on current git ( commit 1a9b4993b70fb18847169027

Re: [Ocfs2-devel] kernel BUG at fs/buffer.c:2886! Linux 3.5.0

2012-08-02 Thread Sunil Mushran
On Thu, Aug 2, 2012 at 12:28 PM, Vincent ETIENNE wrote: > Hi > > based on current git ( commit 1a9b4993b70fb1884716902774dc9025b457760d ) > and reverting commit ea022dfb3c2a4680483b00eb2fecc9fc4f6091d1 > > commit ea022dfb3c2a4680483b00eb2fecc9fc4f6091d1 > Author: Al Viro > Date: Thu May 3 10

Re: [Ocfs2-devel] [patch] ocfs2/dlm: use GFP_ATOMIC inside a spin_lock

2012-07-30 Thread Sunil Mushran
On Fri, Jul 27, 2012 at 1:32 PM, Mark Fasheh wrote: > On Thu, Jul 26, 2012 at 04:05:05PM +0300, Dan Carpenter wrote: > > My static checker complains that this is called with a spin_lock held > > in dlm_master_requery_handler() from dlmrecovery.c. Probably the reason > > we have not received any

Re: [Ocfs2-devel] kernel BUG at fs/buffer.c:2886! Linux 3.5.0

2012-07-30 Thread Sunil Mushran
The fallocate() oops is probably the same one that is fixed by this patch. https://oss.oracle.com/git/?p=smushran/linux-2.6.git;a=commit;h=a2118b301104a24381b414bc93371d666fe8d43a It is in the list of patches that are ready to be pushed. https://oss.oracle.com/git/?p=smushran/linux-2.6.git;a=shortlog;h=m

Re: [Ocfs2-devel] [PATCH] ocfs2: break useless while loop

2012-07-19 Thread Sunil Mushran
On Wed, Jul 11, 2012 at 1:51 AM, Joel Becker wrote: > On Wed, Jul 11, 2012 at 02:49:56PM +0800, Junxiao Bi wrote: > > Signed-off-by: Junxiao Bi > > --- > > fs/ocfs2/dlm/dlmmaster.c |4 +++- > > 1 file changed, 3 insertions(+), 1 deletion(-) > > > > diff --git a/fs/ocfs2/dlm/dlmmaster.c b/fs

Re: [Ocfs2-devel] [PATCH] ocfs2: fix dlm lock migration crash

2012-07-17 Thread Sunil Mushran
On Tue, Jul 17, 2012 at 12:10 AM, Junxiao Bi wrote: > In the target node of the dlm lock migration, the logic to find > the local dlm lock is wrong, it shouldn't change the loop variable > "lock" in the list_for_each_entry loop. This will cause a NULL-pointer > accessing crash. > > Signed-off-by:

Re: [Ocfs2-devel] [GIT PULL] ocfs2 fixes for 3.5-rc5

2012-07-17 Thread Sunil Mushran
https://oss.oracle.com/git/?p=smushran/linux-2.6.git;a=shortlog;h=mw-3.4-mar15 I had prepared some patches sometime ago that could be pushed to mainline. Though some patches may need to be removed as they look to be in this list. On Fri, Jul 6, 2012 at 12:44 AM, Joel Becker wrote: > Linus et al

Re: [Ocfs2-devel] [PATCH] Fix waiting status race condition in dlm recovery

2012-05-30 Thread Sunil Mushran
On Tue, May 29, 2012 at 5:41 PM, Xiaowei wrote: > On 05/30/2012 06:09 AM, Sunil Mushran wrote: > I would suggest exploring adding this in dlm hb down event. Checking live > map all > over the place is hacky. We do it more than we should right now. Let's not > add to the

Re: [Ocfs2-devel] [PATCH] Fix waiting status race condition in dlm recovery

2012-05-29 Thread Sunil Mushran
On Thu, May 24, 2012 at 10:53 PM, wrote: > > diff --git a/fs/ocfs2/dlm/dlmrecovery.c b/fs/ocfs2/dlm/dlmrecovery.c > index 01ebfd0..62659e8 100644 > --- a/fs/ocfs2/dlm/dlmrecovery.c > +++ b/fs/ocfs2/dlm/dlmrecovery.c > @@ -555,6 +555,7 @@ static int dlm_remaster_locks(struct dlm_ctxt *dlm, u8 > de

[Ocfs2-devel] [PATCH 6/9] ocfs2: Tighten free bit calculation in the global bitmap

2012-03-01 Thread Sunil Mushran
exceeds the total bit count. In each instance the bitmap is correct. Only the free bit count is incorrect. This patch checks the current bit value and increments the free bit count only if the bit was previously set. It also prints information to allow us to debug further. Signed-off-by: Sunil Mushran
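The idea in this patch description (count a bit as freed only if it was actually set beforehand) can be illustrated with a small user-space sketch. The function name clear_bits_counting and the plain unsigned long bitmap are assumptions for illustration; the real patch works on ocfs2's on-disk group bitmaps and is not reproduced here.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)

/*
 * Clear `count` bits starting at bit `start` and return how many of
 * them were set before the call. Bits that were already clear are not
 * counted, so freeing the same range twice cannot inflate the caller's
 * free-bit count.
 */
size_t clear_bits_counting(unsigned long *bitmap, size_t start, size_t count)
{
    size_t freed = 0;

    assert(bitmap != NULL);

    for (size_t i = start; i < start + count; i++) {
        unsigned long *word = &bitmap[i / BITS_PER_WORD];
        unsigned long mask = 1UL << (i % BITS_PER_WORD);

        if (*word & mask) {   /* only previously set bits count */
            *word &= ~mask;
            freed++;
        }
    }
    return freed;
}
```

A caller would add the return value, rather than `count`, to its free-bit total, which keeps that total from exceeding the number of bits in the bitmap.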

[Ocfs2-devel] [PATCH 2/9] ocfs2: Add missing copyright in few files

2012-03-01 Thread Sunil Mushran
Signed-off-by: Sunil Mushran --- fs/ocfs2/mmap.h | 18 ++ fs/ocfs2/ocfs2_trace.h | 19 +++ fs/ocfs2/quota.h| 16 ++-- fs/ocfs2/quota_global.c | 20 ++-- fs/ocfs2/quota_local.c | 19 +-- 5

[Ocfs2-devel] [PATCH 4/9] ocfs2/dlm: Use dlm->track_lock when adding resource to the tracking list

2012-03-01 Thread Sunil Mushran
Commit b0d4f817ba5de8adb875ace594554a96d7737710 introduced dlm->track_lock to protect operations on dlm->tracking_list. But it was still using the older lock (dlm->spin_lock) to add new resources to the list. Signed-off-by: Sunil Mushran --- fs/ocfs2/dlm/dlmmaster.c |4 ++-- 1 file

[Ocfs2-devel] [PATCH 7/9] ocfs2: Fix oops in fallocate()

2012-03-01 Thread Sunil Mushran
fallocate() was oopsing on ocfs2 because we were passing in a NULL file pointer. Signed-off-by: Sunil Mushran --- fs/ocfs2/file.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/fs/ocfs2/file.c b/fs/ocfs2/file.c index 061591a..8f30e74 100644 --- a/fs/ocfs2/file.c +++ b

[Ocfs2-devel] [PATCH 3/9] ocfs2: Silence message in ocfs2_global_read_info()

2012-03-01 Thread Sunil Mushran
This patch silences this message. Signed-off-by: Sunil Mushran --- fs/ocfs2/quota_global.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/fs/ocfs2/quota_global.c b/fs/ocfs2/quota_global.c index b24eab3..2a3f12c 100644 --- a/fs/ocfs2/quota_global.c +++ b/fs/ocfs2

[Ocfs2-devel] [PATCH 8/9] ocfs2: Replace nlink_t with unsigned int

2012-03-01 Thread Sunil Mushran
nlink_t was replaced as per the suggestion in the following link. https://lkml.org/lkml/2012/2/2/577 Reported-by: Al Viro Signed-off-by: Sunil Mushran --- fs/ocfs2/namei.c |6 +++--- 1 files changed, 3 insertions(+), 3 deletions(-) diff --git a/fs/ocfs2/namei.c b/fs/ocfs2/namei.c index

[Ocfs2-devel] [PATCH 9/9] ocfs2: Fix tiny race in unaligned aio+dio

2012-03-01 Thread Sunil Mushran
serialization accounting for writes. This patch separates the handler functions to avoid this issue. Signed-off-by: Sunil Mushran --- fs/ocfs2/aops.c | 44 1 files changed, 32 insertions(+), 12 deletions(-) diff --git a/fs/ocfs2/aops.c b/fs/ocfs2/aops.c

[Ocfs2-devel] [PATCH 5/9] ocfs2/dlm: Fix list traversal in dlm_process_recovery_data()

2012-03-01 Thread Sunil Mushran
averting the case in which lock is set to NULL. Reported-by: Julia Lawall Signed-off-by: Sunil Mushran --- fs/ocfs2/dlm/dlmrecovery.c | 10 +- 1 files changed, 5 insertions(+), 5 deletions(-) diff --git a/fs/ocfs2/dlm/dlmrecovery.c b/fs/ocfs2/dlm/dlmrecovery.c index 01ebfd0..c881be6

[Ocfs2-devel] [PATCH 1/9] ocfs2/cluster: Fix possible null pointer dereference

2012-03-01 Thread Sunil Mushran
Patch fixes some possible null pointer dereferences that were detected by the static code analyser, smatch. Reported-by: Dan Carpenter Signed-off-by: Sunil Mushran --- fs/ocfs2/cluster/tcp.c | 10 +- 1 files changed, 5 insertions(+), 5 deletions(-) diff --git a/fs/ocfs2/cluster

Re: [Ocfs2-devel] Race condition between OCFS2 downconvert thread and ocfs2 cluster lock.

2012-02-21 Thread Sunil Mushran
ask_list = { > next = 0x8105d2a28360, > prev = 0x8105d2a28360 > } > }, > l_debug_list = { > next = 0x81033cabad08, > prev = 0x8105d2a284a8 > }, > l_lock_num_prmode = 0, > l_lock_num_exmode = 0, > l_lock_num_prmode_failed = 0, > l_lock_num_exmode_failed = 0,

Re: [Ocfs2-devel] Race condition between OCFS2 downconvert thread and ocfs2 cluster lock.

2012-02-21 Thread Sunil Mushran
pin lock,here A doesn't > hold spin locks, > then it start to execute the proxy ast handler , process bast request > from nodeB, > then dlmthread flushed the bast, after this node A start to queue its > ast in ocfs2_dlm_lock() function. > > Thanks, > Xiaowei > On 02/2

Re: [Ocfs2-devel] Race condition between OCFS2 downconvert thread and ocfs2 cluster lock.

2012-02-21 Thread Sunil Mushran
Moreover what is lockres_clear_pending doing in 1.4. That code is not meant for 1.4. It fixes a problem associated with fsdlm. It was left out of 1.4 for a reason. Meaning this bug was introduced by the patch that introduced this one in 1.4. On 02/20/2012 10:12 PM, xiaowei...@oracle.com wrote: >

Re: [Ocfs2-devel] Race condition between OCFS2 downconvert thread and ocfs2 cluster lock.

2012-02-21 Thread Sunil Mushran
> bast queued and flushed,before the ast was queued Unlikely with o2dlm. dlmthread always sends ASTs before BASTs. Can you recreate the entire lockres? A full dump may yield more information. Sunil On 02/20/2012 10:12 PM, xiaowei...@oracle.com wrote: > I am trying to fix bug13611997,CT's machi

Re: [Ocfs2-devel] [patch] ocfs2: cleanup error handling in o2hb_alloc_hb_set()

2012-02-13 Thread Sunil Mushran
On 02/13/2012 12:29 PM, Dan Carpenter wrote: > On Mon, Feb 13, 2012 at 12:04:09PM -0800, Joel Becker wrote: >> On Mon, Feb 13, 2012 at 04:50:47PM +0300, Dan Carpenter wrote: >>> If "ret" is NULL, then "hs" is also NULL, so there is no need to free >>> it. config_group_init_type_name() can't fail i

Re: [Ocfs2-devel] [patch] ocfs2: cleanup error handling in o2hb_alloc_hb_set()

2012-02-13 Thread Sunil Mushran
On 02/13/2012 12:08 PM, Dan Carpenter wrote: > On Mon, Feb 13, 2012 at 11:39:27AM -0800, Sunil Mushran wrote: >> hmm... I would say NAK because config_group_init_type_name() could >> change in the future. And there is nothing wrong with the current >> code. > > The e

Re: [Ocfs2-devel] [patch] ocfs2: cleanup error handling in o2hb_alloc_hb_set()

2012-02-13 Thread Sunil Mushran
hmm... I would say NAK because config_group_init_type_name() could change in the future. And there is nothing wrong with the current code. On 02/13/2012 05:50 AM, Dan Carpenter wrote: > If "ret" is NULL, then "hs" is also NULL, so there is no need to free > it. config_group_init_type_name() can't

Re: [Ocfs2-devel] [PATCH] ocfs2: for SEEK_DATA/SEEK_HOLE, return internal error unchanged if ocfs2_get_clusters_nocache() or ocfs2_inode_lock() call failed.

2012-02-09 Thread Sunil Mushran
Signed-off-by: Sunil Mushran On 02/08/2012 10:42 PM, Jeff Liu wrote: > Hello, > > Since ENXIO only means "offset beyond EOF" for SEEK_DATA/SEEK_HOLE, > Hence we should return the internal error unchanged if ocfs2_inode_lock() or > ocfs2_get_clusters_nocache() call

Re: [Ocfs2-devel] sparsify - utility to punch out blocks of 0s in a file

2012-02-06 Thread Sunil Mushran
On 02/04/2012 12:04 PM, Eric Sandeen wrote: > Now that ext4, xfs, & ocfs2 can support punch hole, a tool to > "re-sparsify" a file by punching out ranges of 0s might be in order. > > I whipped this up fast, it probably has bugs & off-by-ones but thought > I'd send it out. It's not terribly efficie

Re: [Ocfs2-devel] [PATCH 1/1] ocfs2: use spinlock irqsave for downconvert lock.patch

2012-01-31 Thread Sunil Mushran
sob On 01/30/2012 09:51 PM, Srinivas Eeda wrote: > When ocfs2dc thread holds dc_task_lock spinlock and receives soft IRQ it > deadlock itself trying to get same spinlock in ocfs2_wake_downconvert_thread. > Below is the stack snippet. > > The patch disables interrupts when acquiring dc_task_lock sp

Re: [Ocfs2-devel] [PATCH 1/1] ocfs2: use spinlock irqsave for downconvert lock.patch

2012-01-30 Thread Sunil Mushran
Comments inlined. On 01/28/2012 06:13 PM, Srinivas Eeda wrote: > When ocfs2dc thread holds dc_task_lock spinlock and receives soft IRQ for > I/O completion it deadlock itself trying to get same spinlock in > ocfs2_wake_downconvert_thread > > The patch disables interrupts when acquiring dc_task_loc

[Ocfs2-devel] [PATCH 1/1] ocfs2: Fix oops in fallocate()

2012-01-30 Thread Sunil Mushran
fallocate() was oopsing on ocfs2 because we were passing in a NULL file pointer. Signed-off-by: Sunil Mushran --- fs/ocfs2/file.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/fs/ocfs2/file.c b/fs/ocfs2/file.c index 061591a..8f30e74 100644 --- a/fs/ocfs2/file.c +++ b

Re: [Ocfs2-devel] Question about incorrect free bits setting

2012-01-18 Thread Sunil Mushran
We've seen this too. The problem happens because of the patch added to delay dropping of the dentry locks (first patch below). The other two are related. It was added to avoid a deadlock in quotas but adds problems of its own. Srini has studied this issue and may be able to expand on this. The quic

[Ocfs2-devel] pull request

2012-01-13 Thread Sunil Mushran
Joel, Please pull 6 patches (bug fixes) from the following repo. git://oss.oracle.com/git/smushran/linux-2.6.git mw-3.3-jan13 BTW, not sure if I emailed before but we have to rollback 3 patches related to deletes. These patches were added to fix deadlocks with quotas. Well, it has just broken

Re: [Ocfs2-devel] [PATCH][ocfs2/dlm] fix dlm_clean_master_list

2011-12-22 Thread Sunil Mushran
On 12/20/2011 06:49 PM, Wengang Wang wrote: > This is a fix on dlm_clean_master_list() > > During the hash table browsing, we remove mle from hash table then free > the memory on the last reference. So we have to use a _safe() version > of the browsing function when doing that. > > This fixes Orabu
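The reason a _safe() style traversal is needed when entries may be freed mid-walk can be shown with a plain singly linked list. This is only a user-space sketch with hypothetical names (struct node, put_all); it is not the dlm master list code and does not use the kernel list or hlist macros.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical reference-counted list entry. */
struct node {
    int refs;
    struct node *next;
};

/*
 * Drop one reference from every entry and free those that reach zero.
 * The successor is read before the entry can be freed, which is the
 * same precaution the _safe() iterators take.
 * Returns the number of entries freed.
 */
size_t put_all(struct node **head)
{
    size_t freed = 0;
    struct node **link = head;

    assert(head != NULL);

    while (*link != NULL) {
        struct node *cur = *link;
        struct node *next = cur->next;  /* saved while cur is still valid */

        if (--cur->refs == 0) {
            *link = next;               /* unlink, then free */
            free(cur);
            freed++;
        } else {
            link = &cur->next;
        }
    }
    return freed;
}
```

Reading cur->next after free(cur), as a non-safe iterator effectively does when it advances, would touch freed memory.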

Re: [Ocfs2-devel] [PATCH] ocfs2: submit disk heartbeat bio using WRITE_SYNC

2011-12-08 Thread Sunil Mushran
Acked-by: Sunil Mushran On 12/05/2011 09:18 PM, Tao Ma wrote: > On 12/06/2011 12:57 PM, Noboru Iwamatsu wrote: >> Under heavy I/O load, writing the disk heartbeat can be forced >> to wait for minutes, and this causes the node to be fenced. >> >> This patch tries to u

[Ocfs2-devel] [PATCH 3/6] ocfs2: Silence message in ocfs2_global_read_info()

2011-11-17 Thread Sunil Mushran
This patch silences this message. Signed-off-by: Sunil Mushran --- fs/ocfs2/quota_global.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/fs/ocfs2/quota_global.c b/fs/ocfs2/quota_global.c index b24eab3..2a3f12c 100644 --- a/fs/ocfs2/quota_global.c +++ b/fs/ocfs2

[Ocfs2-devel] [PATCH 1/6] ocfs2/cluster: Fix possible null pointer dereference

2011-11-17 Thread Sunil Mushran
Patch fixes some possible null pointer dereferences that were detected by the static code analyser, smatch. Reported-by: Dan Carpenter Signed-off-by: Sunil Mushran --- fs/ocfs2/cluster/tcp.c | 10 +- 1 files changed, 5 insertions(+), 5 deletions(-) diff --git a/fs/ocfs2/cluster

[Ocfs2-devel] [PATCH 6/6] ocfs2: Tighten free bit calculation in the global bitmap

2011-11-17 Thread Sunil Mushran
exceeds the total bit count. In each instance the bitmap is correct. Only the free bit count is incorrect. This patch checks the current bit value and increments the free bit count only if the bit was previously set. It also prints information to allow us to debug further. Signed-off-by: Sunil Mushran
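
The check described above (only bump the free count if the bit was actually set) can be sketched roughly as follows. This is a userspace illustration, not the ocfs2 code: `struct bitmap_group`, `GROUP_BITS`, `group_alloc_bit()` and `group_release_bit()` are hypothetical names chosen for the example.

```c
#include <assert.h>
#include <limits.h>

#define GROUP_BITS 64   /* hypothetical group size, for illustration only */

/* Hypothetical bitmap group: a bit array plus a cached free-bit count. */
struct bitmap_group {
    unsigned char bits[GROUP_BITS / CHAR_BIT];
    unsigned int total_bits;
    unsigned int free_bits;
};

/* Return 1 if the bit is set (allocated), 0 if it is clear (free). */
static int group_test_bit(const struct bitmap_group *g, unsigned int bit)
{
    return (g->bits[bit / CHAR_BIT] >> (bit % CHAR_BIT)) & 1;
}

/* Allocate a bit. Returns 1 on change, 0 if already set, -1 if out of range. */
static int group_alloc_bit(struct bitmap_group *g, unsigned int bit)
{
    if (bit >= g->total_bits)
        return -1;
    if (group_test_bit(g, bit))
        return 0;
    g->bits[bit / CHAR_BIT] |= (unsigned char)(1u << (bit % CHAR_BIT));
    g->free_bits--;
    return 1;
}

/*
 * Release a bit. The free count is incremented only if the bit was
 * previously set, so releasing the same bit twice cannot push
 * free_bits past total_bits.
 */
static int group_release_bit(struct bitmap_group *g, unsigned int bit)
{
    if (bit >= g->total_bits)
        return -1;
    if (!group_test_bit(g, bit))
        return 0;        /* already clear: do not count it again */
    g->bits[bit / CHAR_BIT] &= (unsigned char)~(1u << (bit % CHAR_BIT));
    g->free_bits++;
    assert(g->free_bits <= g->total_bits);
    return 1;
}
```

In this sketch a second release of the same bit returns 0 and leaves the count untouched, which is also a natural place to log debugging information.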

[Ocfs2-devel] [PATCH 4/6] ocfs2/dlm: Use track_lock when manipulating tracking_list

2011-11-17 Thread Sunil Mushran
Commit b0d4f817ba5de8adb875ace594554a96d7737710 introduced dlm->track_lock to protect operations on dlm->tracking_list. But it was still using the older lock (dlm->spin_lock) to add new resources to the list. Signed-off-by: Sunil Mushran --- fs/ocfs2/dlm/dlmmaster.c |4 ++-- 1 file

[Ocfs2-devel] [PATCH 2/6] ocfs2: Add missing copyright in few files

2011-11-17 Thread Sunil Mushran
Signed-off-by: Sunil Mushran --- fs/ocfs2/mmap.h | 18 ++ fs/ocfs2/ocfs2_trace.h | 19 +++ fs/ocfs2/quota.h| 16 ++-- fs/ocfs2/quota_global.c | 20 ++-- fs/ocfs2/quota_local.c | 19 +-- 5

[Ocfs2-devel] [PATCH 5/6] ocfs2/dlm: Fix list traversal in dlm_process_recovery_data

2011-11-17 Thread Sunil Mushran
averting the case in which lock is set to NULL. Reported-by: Julia Lawall Signed-off-by: Sunil Mushran --- fs/ocfs2/dlm/dlmrecovery.c | 10 +- 1 files changed, 5 insertions(+), 5 deletions(-) diff --git a/fs/ocfs2/dlm/dlmrecovery.c b/fs/ocfs2/dlm/dlmrecovery.c index 01ebfd0..c881be6

Re: [Ocfs2-devel] vmstore option - mkfs

2011-11-16 Thread Sunil Mushran
NCOMPAT_REFCOUNT_TREE, > + OCFS2_FEATURE_RO_COMPAT_UNWRITTEN}, /* FS_VMSTORE */ > > These options are the ones that, when choosing for vmstore, are > enabled by default. Is this correct? > > Thanks. > > Att. > Artur Baruchi > > > > On Wed, Nov 16, 2011 at 3:26 PM, Sunil Mushran > wrote: >

Re: [Ocfs2-devel] vmstore option - mkfs

2011-11-16 Thread Sunil Mushran
fstype is a handy way to format the volume with parameters that are thought to be useful for that use-case. The result of this is printed during format by way of the parameters selected. man mkfs.ocfs2 has a blurb about the features it enabled by default. On 11/16/2011 08:45 AM, Artur Baruchi wrot

Re: [Ocfs2-devel] [PATCH 1/2] fs/ocfs2/dlm: Eliminate update of list_for_each_entry loop cursor

2011-11-02 Thread Sunil Mushran
I think it got lost in the shuffle. We had decided to use the list_for_each(). The code is simpler to understand than the other proposed fix. Joel, do you want me to send a patch? On 11/02/2011 12:39 AM, Dan Carpenter wrote: > What ever happened with this? The bug is still there in the latest >

[Ocfs2-devel] Fwd: [PATCH] ocfs2: Add a missing journal credit in ocfs2_link_credits() -v2

2011-10-19 Thread Sunil Mushran
Joel, Please add this to the linux-next branch. Original Message Subject: [Ocfs2-devel] [PATCH] ocfs2: Add a missing journal credit in ocfs2_link_credits() -v2 Date: Wed, 19 Oct 2011 09:34:19 +0800 From: xiaowei...@oracle.com To: ocfs2-devel@oss.oracle.com CC:

Re: [Ocfs2-devel] avoid being purged when queued for assert_master

2011-10-14 Thread Sunil Mushran
On 10/14/2011 01:57 AM, Wengang Wang wrote: > Problem reproduced(against mainline) with the above patch applied. Also with > the hacking > patch(attached). > > testcase is attached. > > (kworker/u:2,14465,1):dlm_assert_master_handler:1828 ERROR: DIE! Mastery > assert from 0, but current owner is 1

Re: [Ocfs2-devel] avoid being purged when queued for assert_master

2011-10-13 Thread Sunil Mushran
http://oss.oracle.com/git/?p=jlbec/linux-2.6.git;a=commitdiff;h=ff0a522e7db79625aa27a433467eb94c5e255718 Are you sure you have this patch? On 10/13/2011 05:19 PM, Wengang Wang wrote: > 2.6.18-128. > > thanks, > wengang. > On 11-10-13 16:37, Sunil Mushran wrote: >> whic

Re: [Ocfs2-devel] avoid being purged when queued for assert_master

2011-10-13 Thread Sunil Mushran
which kernel? On 10/13/2011 04:35 PM, Wengang Wang wrote: > On 11-10-13 09:09, Sunil Mushran wrote: >> The last email you said it reproduced. Now you say it did not. >> I'm confused. > Oh? Did I. If I did, I meant it had reproductions in different customers' >

Re: [Ocfs2-devel] avoid being purged when queued for assert_master

2011-10-13 Thread Sunil Mushran
The last email you said it reproduced. Now you say it did not. I'm confused. On 10/12/2011 07:13 PM, Wengang Wang wrote: > On 11-10-12 19:11, Sunil Mushran wrote: >> That's what ovm does. Have you reproduced it with ovm3 kernel? >> > No, I have no reproductions. >

Re: [Ocfs2-devel] avoid being purged when queued for assert_master

2011-10-12 Thread Sunil Mushran
That's what ovm does. Have you reproduced it with ovm3 kernel? On 10/12/2011 07:07 PM, Wengang Wang wrote: > On 11-10-13 09:51, Wengang Wang wrote: >> On 11-10-12 18:47, Sunil Mushran wrote: >>> I meant master_request (not query). We set refmap _before_ >>> asse

Re: [Ocfs2-devel] avoid being purged when queued for assert_master

2011-10-12 Thread Sunil Mushran
I meant master_request (not query). We set refmap _before_ asserting. So that should not happen. On 10/12/2011 06:02 PM, Wengang Wang wrote: > Hi Sunil, > > On 11-10-12 17:32, Sunil Mushran wrote: >> So you are saying a lockres can get purged before the node is asserting >>

Re: [Ocfs2-devel] avoid being purged when queued for assert_master

2011-10-12 Thread Sunil Mushran
So you are saying a lockres can get purged before the node is asserting master to other nodes? The main place where we dispatch assert is during master_query. There we set refmap before dispatching. Meaning refmap will protect us from purging. But I think it could happen in master_requery, which

Re: [Ocfs2-devel] [PATCH] ocfs2: Commit transactions in error cases -v2

2011-10-12 Thread Sunil Mushran
Acked-by: Sunil Mushran On 10/12/2011 12:22 AM, Wengang Wang wrote: > There are three cases found that in error cases, journal transactions are not > committed nor aborted. We should take care of these case by committing the > transactions. Otherwise, there would left a journal handle w
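
The general shape of the fix being acked here, making sure every error path after a transaction is started still commits it, might look like the sketch below. It is a userspace illustration only: `struct txn`, `txn_start()`, `txn_commit()`, `do_update()` and the `open_handles` counter are hypothetical and are not the ocfs2 or jbd2 API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a journal handle (not the jbd2 handle_t). */
struct txn {
    int open;
};

/* Number of handles that were started but not yet committed. */
static int open_handles;

static void txn_start(struct txn *t)
{
    t->open = 1;
    open_handles++;
}

static void txn_commit(struct txn *t)
{
    if (t->open) {
        t->open = 0;
        open_handles--;
    }
    assert(open_handles >= 0);
}

/*
 * fail_step selects where the simulated error occurs:
 * 0 = no error, 1 = first step fails, 2 = second step fails.
 * Every exit after txn_start() goes through out_commit, so no handle
 * is left open on an error path.
 */
static int do_update(int fail_step)
{
    struct txn t;
    int status = 0;

    txn_start(&t);

    if (fail_step == 1) {
        status = -1;
        goto out_commit;    /* not a bare return: the handle is open */
    }

    if (fail_step == 2) {
        status = -2;
        goto out_commit;
    }

out_commit:
    txn_commit(&t);
    return status;
}
```

A bare `return status;` in either error branch would leave `open_handles` non-zero, which is the kind of leaked handle the patch description is about.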

Re: [Ocfs2-devel] [PATCH] ocfs2: Commit transactions in error cases.

2011-10-11 Thread Sunil Mushran
The first two are ok. Have a comment for the last one. On 09/25/2011 02:13 AM, Wengang Wang wrote: > Commit transactions in error cases. > > There are three cases found that in error cases, journal transactions are not > committed nor aborted. We should take care of these case by committing the >
