Hi Chris
Christopher Dwan wrote:
> I've asked this here before, but I'm hoping that there have been some
> changes or updates and I missed the email:
>
> What's the state of the art in terms of configuring permissions on
> BLAST target files (specifically the MBF files) so that multiple users
> can run searches against targets in a shared directory, make use of
> cached index and sequence files, and not encounter the dreaded:
>
>   Error opening /common/data/nt.mbf
>   open: Permission denied
>   Fatal Error:
This is very hard, as you do not know a priori whether
a) two mpiblast jobs are from users in the same group, or
b) two mpiblast jobs using a database named "qwerty" actually want to
use the same database. You would need to store hash signatures per
db-chunk.
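The hash-signature idea could be sketched roughly like this (a hypothetical illustration, not mpiblast code): key the cache on a content digest of each chunk, so two databases that happen to share a name cannot collide.

```python
import hashlib
import os


def chunk_signature(path, block=1 << 20):
    """Return a SHA-256 digest of one database chunk file."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while True:
            data = f.read(block)
            if not data:
                break
            h.update(data)
    return h.hexdigest()


def cache_key(db_name, chunk_path):
    """Key cached chunks by content, not by the user-chosen name,
    so two different databases both called "qwerty" get distinct
    cache entries."""
    return f"{db_name}-{chunk_signature(chunk_path)}"
```

The digest pass adds I/O cost per chunk, but it only needs to run once when a chunk first lands in the cache.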
> From perusing the list archives I see the following options:
> --------------------------------------------------------------------------------
> * Make a custom BLAST target directory for every user (not enough disk
>   space, doesn't scale)
This is the only solution I know of that will work. Of course, the other
issue is that if someone starts mpiblasting a really huge db, you have
effectively denied service on the compute nodes to other jobs due to the
disk-cache issue.
> * Hack MPIBlast code to change the file permissions on the MBF files
>   (seems reasonable, but there must be some reason why it's not already
>   done. Probably concerns about users stepping on each other's jobs).
This is something I ran into ~3 years ago with early versions. Users
would stamp all over others' runs, and complain that the cluster was broken.
There is no satisfactory solution other than isolation. Isolation
effectively defeats caching. If we break out the file distribution (or
more accurately, the chunk distribution) from mpiblast, you may be able
to have another lower layer work on it, but that layer would need to be
designed.
An alternative (a very bad one, for a number of reasons) is to put all
users in the same group and give everyone group read/write permission on
the cache directory. You want to avoid any scheme like this, both
because users can step on each other and because such a scheme can only
function with essentially no access control at all.
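For concreteness, the shared-group scheme usually amounts to a setgid, group-writable cache directory. A sketch of that setup (paths and group are placeholders) makes the problem visible: every member of the group can overwrite every cached file.

```python
import os
import stat


def make_shared_cache(path, gid):
    """Create a group-writable, setgid cache directory (mode 2770).
    Files created inside inherit the directory's group, so every
    member of gid can read AND clobber every cached chunk -- the
    lack of access control warned about above."""
    os.makedirs(path, exist_ok=True)
    os.chown(path, -1, gid)  # -1 leaves the owner unchanged; may need privilege
    os.chmod(path, stat.S_IRWXU | stat.S_IRWXG | stat.S_ISGID)
```

Nothing in this layout stops user A's job from truncating user B's half-written MBF file, which is exactly why isolation keeps coming up as the only safe answer.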
> * Install pre and post scripts in my DRM to erase cached files after
>   each job is done (why have a cache at all?)
The cache matters for long queries: it speeds up local access. With
pre/post cleanup scripts, you are effectively caching per job.
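A per-job cache with a DRM epilogue might look like the sketch below. The paths and environment variables are assumptions (e.g. SGE exports JOB_ID; other schedulers use different names), not anything mpiblast ships.

```python
import os
import shutil


def job_cache_dir(base="/tmp/mpiblast-cache"):
    """Per-user, per-job cache path. JOB_ID is assumed to be
    exported by the DRM (e.g. SGE's $JOB_ID)."""
    user = os.environ.get("USER", "unknown")
    job = os.environ.get("JOB_ID", "nojob")
    return os.path.join(base, f"{user}-{job}")


def epilogue(base="/tmp/mpiblast-cache"):
    """DRM post-script: remove this job's cache so the next job
    starts clean -- which is exactly why there is no reuse across
    jobs."""
    shutil.rmtree(job_cache_dir(base), ignore_errors=True)
```

The prologue would create `job_cache_dir()` and point the job's cache there; the trade-off is that every job pays the full fragment-copy cost again.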
> Is there another option that I'm missing here? How do folks handle
> this at large-ish installations with multiple users who each may run
> mpiblast jobs against shared targets?
Separate cache locations per user. Really fast access to shared fs.
Really fast and large local fs.
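Per-user cache locations can be wired up through each user's .ncbirc. The sketch below assumes an [mpiBLAST] section with Shared/Local keys, which may differ across mpiblast versions -- check your installation's docs before relying on it.

```python
import configparser
import getpass
import os


def write_ncbirc(path, shared="/common/data", local_base="/tmp"):
    """Write an .ncbirc that keeps Shared storage on the shared fs
    and points Local (cache) storage at a per-user directory on a
    fast local fs. The [mpiBLAST] Shared/Local layout is assumed
    here, not guaranteed by this sketch."""
    local = os.path.join(local_base, getpass.getuser(), "mpiblast")
    os.makedirs(local, exist_ok=True)
    cfg = configparser.ConfigParser()
    cfg.optionxform = str  # preserve key case when writing
    cfg["mpiBLAST"] = {"Shared": shared, "Local": local}
    with open(path, "w") as f:
        cfg.write(f)
    return local
```

Because each user's Local directory is theirs alone, the permission collisions disappear; the cost is that popular databases get cached once per user instead of once per node.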
> -Chris Dwan
-------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc. Do you grep through log
files
for problems? Stop! Download the new AJAX search engine that makes
searching your log files as easy as surfing the web. DOWNLOAD SPLUNK!
http://sel.as-us.falkag.net/sel?cmd=lnk&kid=103432&bid=230486&dat=121642
_______________________________________________
Mpiblast-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/mpiblast-users
--
Joseph Landman, Ph.D
Founder and CEO
Scalable Informatics LLC,
email: [EMAIL PROTECTED]
web : http://www.scalableinformatics.com
phone: +1 734 786 8423
fax : +1 734 786 8452 or +1 866 888 3112
cell : +1 734 612 4615