On Oct 6, 2006, at 12:19 PM, Sam Lang wrote:
On Oct 6, 2006, at 2:01 AM, Julian Martin Kunkel wrote:
Hi,
This way callers wouldn't be able to muck with the internals of the
hint struct.
Ok, I will definitely do this.
As I said, I prefer letting the hint struct be defined
externally and requiring an array of them to the system interfaces.
It seems to match what we have throughout the rest of PVFS.
If the pvfs2-team thinks that is the way to go I will do so, even
if I prefer the list for the system interface. So I think the array wins?
Setting specific IO servers for a file is interesting for research
purposes, but kind of breaks the encapsulation and abstraction that
pvfs tries to provide. We've already got a distribution parameter
that we pass in to PVFS_sys_create when appropriate.
I don't know that we want to go around telling users to set an
environment variable (a separate interface, remember) for specific
behavior that they've requested. I'd rather figure out why the dist
parameter isn't providing the bits of functionality that they need,
and solve their problem from that angle.
I vaguely remember having
this discussion about specifying datafile handles through the dist
parameter, and there being some concerns, but I can't remember the
details. Can you remind me why that doesn't work?
The dist parameter is opaque to the client interface and should
only be interpreted by the distribution. However, at the distribution
layer the hostnames are not available; also, it would be available only to
distributions which interpret the given hostnames.
The specific dist implementation is certainly opaque from the
client interface, but we do provide functions to set parameters on
a particular distribution. What I would envision is something
along these lines:
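struct PINT_dist_server_settable_indices {
    int count;      /* number of entries in servers */
    int * servers;  /* indices of the IO servers to use */
};
...
PVFS_sys_dist mydist;
/* an int * cannot take a brace list, so keep the indices in an array */
int server_list[] = {2, 4, 6, 8};
struct PINT_dist_server_settable_indices indices =
{
    4,
    server_list
};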
...
mydist.name = "server-settable-dist";
PVFS_sys_dist_setparam(&mydist, "servers", &indices);
....
PVFS_sys_create(..., &mydist, ...);
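To make the idea above concrete, here is a minimal, compilable sketch of how such a setparam call might keep its own copy of the index list. Everything in it is a stand-in for illustration: struct settable_indices, struct mock_dist and mock_dist_setparam are hypothetical names, not the real PVFS_sys_dist or PVFS_sys_dist_setparam definitions.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins; these are not the real PVFS2 definitions. */
struct settable_indices {
    int count;      /* number of entries in servers */
    int *servers;   /* indices of the IO servers to use */
};

struct mock_dist {
    const char *name;
    void *params;   /* owned by the distribution once set */
};

/* Store a private copy of the caller's index list in dist->params.
 * Returns 0 on success, -1 on bad input or allocation failure. */
static int mock_dist_setparam(struct mock_dist *dist, const char *param,
                              const void *value)
{
    const struct settable_indices *in = value;
    struct settable_indices *copy;

    if (strcmp(param, "servers") != 0 || in == NULL ||
        in->servers == NULL || in->count <= 0)
        return -1;

    copy = malloc(sizeof(*copy));
    if (copy == NULL)
        return -1;
    copy->servers = malloc((size_t)in->count * sizeof(int));
    if (copy->servers == NULL) {
        free(copy);
        return -1;
    }
    memcpy(copy->servers, in->servers, (size_t)in->count * sizeof(int));
    copy->count = in->count;

    /* Replace any previously stored list. */
    if (dist->params != NULL) {
        struct settable_indices *old = dist->params;
        free(old->servers);
        free(old);
    }
    dist->params = copy;
    return 0;
}
```

Copying the list means the caller's array can go out of scope after the call, which matters if the distribution is only consulted later, at create time.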
The server-settable-dist would be implemented to store the indices
for the IO servers in the params field of the PVFS_sys_dist
structure. I used server indices, because the PINT_dist_*
interfaces allow for that (through the PINT_request_file_data
struct), but it's a bit ugly and probably confusing to the user. We
could change that though, and use something like server aliases or
hostnames, passing them through to the distribution parameters
instead of indices. This would require a change to the
distribution method calls, and the
Also the create sm has to
be modified a lot to use the new distribution facilities.
It certainly would. The client create state machine is a little
broken in this respect anyway though. It normally just gets the
list of IO servers from the cached server config file and does a
create request to all of them. There's special case handling right
now for the directory hints on the parent to see if the number of
datafile handles should be fewer than the number available.
It seems like that could be generalized to get a list of server
indices from the distribution with a distribution method that
returns that info. We already store a distribution in the
directory hints anyway, so we could probably just throw away that
dfile_count parameter. In any case, I would be in favor of a
change to the create sm that allows the distribution to optionally
specify the server indices, and fall back to the cached config
lookup otherwise.
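A rough sketch of that selection step, assuming a hypothetical per-distribution callback that may return a list of server indices. The names get_server_indices_fn, list_indices and choose_create_servers are made up for illustration and are not part of the existing PINT_dist method table or the create state machine.

```c
#include <stddef.h>

/* Hypothetical callback: a distribution that wants to pick servers fills
 * `out` with up to `max` indices and returns how many it wrote; one with
 * no preference returns 0. */
typedef int (*get_server_indices_fn)(const void *params, int *out, int max);

/* Example callback: params points at an int array terminated by -1. */
static int list_indices(const void *params, int *out, int max)
{
    const int *list = params;
    int n = 0;

    while (list != NULL && n < max && list[n] >= 0) {
        out[n] = list[n];
        n++;
    }
    return n;
}

/* Fill `out` with the server indices a create should target and return
 * the count, or -1 if the distribution named an unknown server. Falls
 * back to every server in the cached config (indices 0..num_cached-1)
 * when the distribution does not choose. */
static int choose_create_servers(get_server_indices_fn get_indices,
                                 const void *params,
                                 int num_cached, int *out, int max)
{
    int n = 0;
    int i;

    if (get_indices != NULL)
        n = get_indices(params, out, max);

    if (n > 0) {
        if (n > max)
            return -1;
        /* Reject indices the cached config does not know about. */
        for (i = 0; i < n; i++) {
            if (out[i] < 0 || out[i] >= num_cached)
                return -1;
        }
        return n;
    }

    /* Fallback: use all cached servers. */
    n = num_cached < max ? num_cached : max;
    for (i = 0; i < n; i++)
        out[i] = i;
    return n;
}
```

With this shape, a distribution without the callback behaves exactly as before, so existing distributions would not need to change.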
Of course, adding another array to the distribution struct, common to all
distributions, would solve the issue, but I would not prefer this solution.
No that's gross.
With the hint there is a common infrastructure for all available
distributions, and it is also an uncommon case to set the
distribution.
Right, I don't like the idea of 'overriding' a distribution's
behavior with a hint. Conceptually it seems like the distribution
should take care of this, and if it doesn't, we should fix it so it
does.
Also, the server list hint gets a little ugly across sysint calls,
since the list doesn't actually get stored anywhere for that file
(does it?). With the distribution, we only have to specify it once
at creation time, instead of passing it along to all future IO
calls on that file.
Actually not sure what I was thinking there. The datafile handles of
course get stored in the dh keyval.
-sam
-sam
Wasn't the hint intended to be useful for research purposes?
Thanks for the discussion and the time you spent,
Julian
_______________________________________________
Pvfs2-developers mailing list
Pvfs2-developers@beowulf-underground.org
http://www.beowulf-underground.org/mailman/listinfo/pvfs2-developers