[Pvfs2-developers] disabling ncache
Hey gang,

One of the ROMIO tests (noncontig_coll2) started failing around the time we committed the revamped ncache. If I disable the ncache by setting NCACHE_DEFAULT_TIMEOUT_MSECS to zero, the test passes again.

Sam and I have a couple of solutions, but none of them is satisfying, given that we need to coordinate any API changes with ROMIO:

- To complement PVFS2_SET_DEBUGMASK, add a PVFS2_NCACHE_TIMEOUT environment variable (this one might be the winner).
- Add a new routine to the pvfs2 API that modifies the ncache timeout values. This would impose a ROMIO pre-req on pvfs2-1.whatever-is-next.
- Ignore the problem. It only shows up if one set of processes open/write/read/close a file, the file is deleted, and a different set (or the same set in a different order) open/write/read/close the file. Most places that care to set ROMIO's cb_config_list hint aren't going to change it when they delete and re-create a file.

I think I'm going to go with the environment variable solution, since if ROMIO happens to link against an older PVFS2, nothing bad will happen (as long as they don't go crazy with cb_config_list).

Does anybody else have any suggestions?

==rob

--
Rob Latham
Mathematics and Computer Science Division    A215 0178 EA2D B059 8CDF
Argonne National Labs, IL USA                B29D F333 664A 4280 315B
_______________________________________________
Pvfs2-developers mailing list
Pvfs2-developers@beowulf-underground.org
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers
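For illustration, here is a minimal sketch of how the proposed PVFS2_NCACHE_TIMEOUT environment variable could be read. It is not the PVFS2 implementation. The helper names parse_timeout_msecs and ncache_timeout_from_env are hypothetical, and the value given to NCACHE_DEFAULT_TIMEOUT_MSECS is a placeholder, since the thread does not state the real default. The sketch assumes the variable holds a non-negative integer number of milliseconds and that zero disables the cache, as described above.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Placeholder value for illustration only; the real default is defined
 * in the PVFS2 source and is not given in this thread. */
#define NCACHE_DEFAULT_TIMEOUT_MSECS 3000

/* Hypothetical helper: parse a timeout string in milliseconds.
 * Returns fallback_msecs when the string is missing, empty, negative,
 * out of range, or not a plain decimal integer. */
static int parse_timeout_msecs(const char *val, int fallback_msecs)
{
    char *end = NULL;
    long msecs;

    assert(fallback_msecs >= 0);

    if (val == NULL || *val == '\0')
        return fallback_msecs;

    errno = 0;
    msecs = strtol(val, &end, 10);
    if (errno != 0 || *end != '\0' || msecs < 0 || msecs > INT_MAX)
        return fallback_msecs;

    return (int) msecs;
}

/* Hypothetical helper: look up the proposed environment variable.
 * A result of 0 would mean "ncache disabled". */
static int ncache_timeout_from_env(void)
{
    return parse_timeout_msecs(getenv("PVFS2_NCACHE_TIMEOUT"),
                               NCACHE_DEFAULT_TIMEOUT_MSECS);
}
```

With this approach an application linked against an older library simply ignores the variable, which matches the compatibility argument made for the environment-variable option.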
Re: [Pvfs2-developers] disabling ncache
On Mon, Aug 21, 2006 at 01:27:39PM -0500, Robert Latham wrote:
> Hey gang,
>
> One of the ROMIO tests (noncontig_coll2) started failing around the time we committed the revamped ncache. If I disable the ncache by setting NCACHE_DEFAULT_TIMEOUT_MSECS to zero, the test passes again.
>
> Sam and I have a couple of solutions, but none of them is satisfying, given that we need to coordinate any API changes with ROMIO:
>
> - To complement PVFS2_SET_DEBUGMASK, add a PVFS2_NCACHE_TIMEOUT environment variable (this one might be the winner)

Right, sorry, that's the PVFS2_DEBUGMASK environment variable (not pvfs2-set-debugmask, the utility).

==rob
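The delete-and-re-create scenario from the first message can be modeled with a toy, single-entry name cache. This is only a sketch under assumptions: the thread does not confirm the root cause of the test failure, the code is not the PVFS2 ncache, and every name in it (toy_ncache_insert, toy_ncache_lookup, toy_resolve_after_recreate, the handle values 100 and 200) is made up for illustration. It shows how a cached name-to-handle mapping could outlive a delete and re-create done elsewhere, and why a timeout of zero would sidestep that.

```c
#include <assert.h>
#include <string.h>

/* Toy single-entry name cache; NOT the PVFS2 ncache implementation. */
struct toy_ncache_entry {
    char name[64];
    unsigned long handle;
    long expires_ms;    /* absolute expiry time on a simulated clock */
    int valid;
};

static struct toy_ncache_entry toy_cache;

/* Cache name -> handle until now_ms + timeout_ms.
 * A timeout of zero (or less) means nothing is cached. */
static void toy_ncache_insert(const char *name, unsigned long handle,
                              long now_ms, long timeout_ms)
{
    assert(name != NULL);
    if (timeout_ms <= 0)
        return;
    strncpy(toy_cache.name, name, sizeof(toy_cache.name) - 1);
    toy_cache.name[sizeof(toy_cache.name) - 1] = '\0';
    toy_cache.handle = handle;
    toy_cache.expires_ms = now_ms + timeout_ms;
    toy_cache.valid = 1;
}

/* Returns 1 and stores the cached handle on a hit, 0 on a miss. */
static int toy_ncache_lookup(const char *name, long now_ms,
                             unsigned long *handle)
{
    assert(name != NULL && handle != NULL);
    if (!toy_cache.valid || now_ms >= toy_cache.expires_ms ||
        strcmp(toy_cache.name, name) != 0)
        return 0;
    *handle = toy_cache.handle;
    return 1;
}

/* Simulated sequence: this client resolves "testfile" to handle 100 at
 * t=0; other processes then delete and re-create the file, so the
 * authoritative handle becomes 200; this client resolves the name
 * again at t=10. Returns the handle the client ends up using. */
static unsigned long toy_resolve_after_recreate(long timeout_ms)
{
    unsigned long server_handle = 100;
    unsigned long h = 0;

    toy_cache.valid = 0;
    toy_ncache_insert("testfile", server_handle, 0, timeout_ms);

    server_handle = 200;    /* delete + re-create happened elsewhere */

    if (toy_ncache_lookup("testfile", 10, &h))
        return h;           /* cache hit: possibly stale */
    return server_handle;   /* cache miss: ask the server */
}
```

In this model, toy_resolve_after_recreate(3000) returns the old handle 100, while toy_resolve_after_recreate(0) returns the current handle 200.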
Re: [Pvfs2-developers] disabling ncache
Hey Rob,

Shall we first try to diagnose the problem a little bit? Can you describe the problem a little bit more?

Thanks,
Murali
Re: [Pvfs2-developers] disabling ncache
I doubt that this is the problem, but I should mention that although the main ncache patch was accepted, we are still missing the sys-rename update in trunk (I just noticed on Friday). It is the second of the two patches mentioned in this thread:

http://www.beowulf-underground.org/pipermail/pvfs2-developers/2006-August/002390.html

I am attaching it to this email just in case. I seriously doubt that the MPI program is using rename, but we may as well get this in there :)

If the ncache really needs to be disabled, I don't really have a preference. The env variable is probably the easiest.

-Phil

Index: pvfs2_src/src/client/sysint/sys-rename.sm
===================================================================
--- pvfs2_src/src/client/sysint/sys-rename.sm (revision 2102)
+++ pvfs2_src/src/client/sysint/sys-rename.sm (revision 2103)
@@ -911,7 +911,20 @@
     {
         PINT_acache_invalidate(sm_p->object_ref);
     }
+    else
+    {
+        gossip_debug(GOSSIP_CLIENT_DEBUG,
+            "rename state: updating ncache with entry [%s] "
+            "ref.handle=%llu ref.fsid=%d\n",
+            sm_p->u.rename.entries[1],
+            llu(sm_p->u.rename.refns[0].handle),
+            sm_p->u.rename.parent_refns[0].fs_id);
+        PINT_ncache_update((const char*) sm_p->u.rename.entries[1],
+            (const PVFS_object_ref*) &(sm_p->u.rename.refns[0]),
+            (const PVFS_object_ref*) &(sm_p->u.rename.parent_refns[0]));
+    }
+
     PINT_SM_GETATTR_STATE_CLEAR(sm_p->getattr);
 
     sm_p->op_complete = 1;
 