Re: [9fans] Different representations of the same
On Wed, 2009-06-17 at 09:54 +0100, Charles Forsyth wrote:
> >The only drawback so far seems to be the fact that if one
> >needs flexibility, then every file becomes a subdirectory.
> >Not that it is scary or anything, but it smells too much
> >of resource forks (or maybe I'm just too easily scared).
>
> it's the other way round: they ought to have represented
> collections of related data and metadata using directories
> instead of inventing rubbish like resource forks.

Having thought of this some more, I believe you're absolutely right. Now, the *only* thing that you don't get if you go that route is a read/write on a "default" file representation.

Thanks,
Roman.
Re: [9fans] Different representations of the same
On Jun 17, 2009, at 1:54 AM, Charles Forsyth wrote:
> > The only drawback so far seems to be the fact that if one
> > needs flexibility, then every file becomes a subdirectory.
> > Not that it is scary or anything, but it smells too much
> > of resource forks (or maybe I'm just too easily scared).
>
> it's the other way round: they ought to have represented
> collections of related data and metadata using directories
> instead of inventing rubbish like resource forks.

They sort-of-kind-of got there in Mac OS X with `application bundles' that are just specially named directories with some canonical contents. It'll take another major systems change for them to wean themselves off of resource forks entirely, I expect.

*chad
Re: [9fans] Different representations of the same
>The only drawback so far seems to be the fact that if one
>needs flexibility, then every file becomes a subdirectory.
>Not that it is scary or anything, but it smells too much
>of resource forks (or maybe I'm just too easily scared).

it's the other way round: they ought to have represented collections of related data and metadata using directories instead of inventing rubbish like resource forks.
Re: [9fans] Different representations of the same
On Sat, 2009-06-13 at 05:43 +0200, lu...@proxima.alt.za wrote:
> > Sure, but if *each* file can have more than one representation then
> > where's the best place for the ctl thing to be? In each subdirectory?
> > At the top of the hierarchy (accepting the full path names, of course)?
>
> Well, assume you have a canonical representation for a given file, I'd
> have the ctl file in the same directory. You'd then use a command
> that includes the basename as well as the representation selector to
> create the new entry. If the representation directory already exists,
> then the file is added to whatever is already there, otherwise the
> directory is created first:
>
> ; ls /n/synthetic
> /n/synthetic/ctl
> /n/synthetic/image.canonical
> ; echo GIF image.canonical > /n/synthetic/ctl
> ; ls /n/synthetic
> /n/synthetic/ctl
> /n/synthetic/gif
> /n/synthetic/image.canonical
> ; ls /n/synthetic/gif
> /n/synthetic/gif/image.canonical # sic
>
> If you need additional depth to the directory, then I think you ought
> to be looking to upas/fs and how it manipulates its directory for
> further hints.

Hm. This looks more complex to me than essentially turning every file into a "subdirectory" of sorts full of names for representations.

Thanks,
Roman.
Re: [9fans] Different representations of the same
On Fri, 2009-06-12 at 21:56 -0400, erik quanstrom wrote:
> > > I thought you might want a "ctl" file into which you write the
> > > representation you want and that magically creates a new file or
> > > directory.
> >
> > Sure, but if *each* file can have more than one representation then
> > where's the best place for the ctl thing to be? In each subdirectory?
> > At the top of the hierarchy (accepting the full path names, of course)?
> [...]
> > I'm simply asking for the best practices.
>
> generally, plan 9 fs pick a canonical format if they can get
> away with it.
>
> the right answer is going to depend very strongly on one's
> exact constraints.
>
> if the client knows he's always going to want pngs, it might
> be good to pass png as the spec, as in
> ; mount /srv/media /n/media png
> i suppose this would be a lot like content negotiation.

Exactly! From where I stand it seems that a particular representation has to be either part of the name, or it has to hide in an "invisible" part of the protocol.

The benefits of having the representation spec be part of the name are obvious -- you always know what you're asking for, plus you can explicitly list representations if there's more than one. The only drawback so far seems to be the fact that if one needs flexibility, then every file becomes a subdirectory. Not that it is scary or anything, but it smells too much of resource forks (or maybe I'm just too easily scared).

> there are some really wild examples running about. hackery in
> upas that will serve up x(\.$e)? for any $e in certain circumstances.
> (see upas/ned/nedmail.c:/^plumb.) the reason for this is so that
> upas/fs can serve up content with type unknown to upas/fs so
> that it can match plumbing rules that expect file extensions.
> the alternative would have been to add an explicitly declared filetype to
> the plumb messages and adding a ruletype to the plumb files.
> i suppose the idea was to not break everyone's plumbing rules.

Interesting...

Thanks,
Roman.
Re: [9fans] Different representations of the same
> Sure, but if *each* file can have more than one representation then
> where's the best place for the ctl thing to be? In each subdirectory?
> At the top of the hierarchy (accepting the full path names, of course)?

Well, assume you have a canonical representation for a given file, I'd have the ctl file in the same directory. You'd then use a command that includes the basename as well as the representation selector to create the new entry. If the representation directory already exists, then the file is added to whatever is already there, otherwise the directory is created first:

	; ls /n/synthetic
	/n/synthetic/ctl
	/n/synthetic/image.canonical
	; echo GIF image.canonical > /n/synthetic/ctl
	; ls /n/synthetic
	/n/synthetic/ctl
	/n/synthetic/gif
	/n/synthetic/image.canonical
	; ls /n/synthetic/gif
	/n/synthetic/gif/image.canonical # sic

If you need additional depth to the directory, then I think you ought to be looking to upas/fs and how it manipulates its directory for further hints.

Whether this is any better than content negotiation may be a judgement call. I'll read the wikipedia entry later, thank you for pointing me to it.

++L
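The ctl scheme described above can be sketched as a toy in-memory model. This is only an illustration in Python, not a 9P file server; the class name `SyntheticDir` and its methods are hypothetical, and the only behaviour taken from the message is that writing `GIF image.canonical` to ctl creates the `gif` directory if needed and adds the file to it.

```python
class SyntheticDir:
    """Toy stand-in for /n/synthetic (hypothetical; not a real 9P server)."""

    def __init__(self, canonical_files):
        # file name -> canonical contents
        self.files = dict(canonical_files)
        # representation directory name -> set of file names present in it
        self.reps = {}

    def write_ctl(self, command):
        """Handle a write to ctl of the form '<REPRESENTATION> <basename>'."""
        fields = command.split()
        if len(fields) != 2:
            raise ValueError("usage: REPRESENTATION basename")
        rep, name = fields[0].lower(), fields[1]
        if name not in self.files:
            raise FileNotFoundError(name)
        # Create the representation directory on first use, then add the file.
        self.reps.setdefault(rep, set()).add(name)

    def ls(self, path=""):
        """List the top level (path == '') or one representation directory."""
        if path == "":
            return sorted(["ctl"] + list(self.reps) + list(self.files))
        return sorted(f"{path}/{name}" for name in self.reps[path])


fs = SyntheticDir({"image.canonical": b"raw frame"})
before = fs.ls()                      # only ctl and the canonical file
fs.write_ctl("GIF image.canonical")   # like: echo GIF image.canonical > ctl
after = fs.ls()                       # now includes the gif directory
gif_listing = fs.ls("gif")
```

A second write naming the same representation reuses the existing directory, which matches the "added to whatever is already there" rule in the message.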
Re: [9fans] Different representations of the same
> > I thought you might want a "ctl" file into which you write the
> > representation you want and that magically creates a new file or
> > directory.
>
> Sure, but if *each* file can have more than one representation then
> where's the best place for the ctl thing to be? In each subdirectory?
> At the top of the hierarchy (accepting the full path names, of course)?
[...]
> I'm simply asking for the best practices.

generally, plan 9 fs pick a canonical format if they can get away with it.

the right answer is going to depend very strongly on one's exact constraints.

if the client knows he's always going to want pngs, it might be good to pass png as the spec, as in
	; mount /srv/media /n/media png
i suppose this would be a lot like content negotiation.

there are some really wild examples running about. hackery in upas that will serve up x(\.$e)? for any $e in certain circumstances. (see upas/ned/nedmail.c:/^plumb.) the reason for this is so that upas/fs can serve up content with type unknown to upas/fs so that it can match plumbing rules that expect file extensions. the alternative would have been to add an explicitly declared filetype to the plumb messages and adding a ruletype to the plumb files. i suppose the idea was to not break everyone's plumbing rules.

- erik
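To make the mount-spec idea concrete: the spec argument of mount travels to the server as the attach name, so a server could pick the representation once per attach and serve the same file names with different contents. The sketch below is an assumption-laden toy in Python; `MediaTree`, `attach` and the tagging "encoders" are hypothetical names, and no real image encoding is done.

```python
# Placeholder "encoders": they only tag the bytes, they do not produce real images.
ENCODERS = {
    "png": lambda raw: b"PNG:" + raw,
    "gif": lambda raw: b"GIF:" + raw,
}


class MediaTree:
    """File tree as seen through one attach; the spec fixes the representation."""

    def __init__(self, frames, spec):
        if spec not in ENCODERS:
            raise ValueError(f"unknown representation: {spec!r}")
        self.frames = frames
        self.spec = spec

    def read(self, name):
        # Same name in every tree; the contents depend on the spec given at attach.
        return ENCODERS[self.spec](self.frames[name])


def attach(frames, spec="png"):
    """Stand-in for the attach that `mount /srv/media /n/media png` would trigger."""
    return MediaTree(frames, spec)


frames = {"frame": b"raw"}
png_tree = attach(frames, "png")
gif_tree = attach(frames, "gif")
```

The trade-off is the one raised elsewhere in the thread: the representation is chosen out of band, so it is not visible in the name and cannot be listed with ls.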
Re: [9fans] Different representations of the same
On Thu, 2009-06-11 at 06:49 +0200, lu...@proxima.alt.za wrote:
> > but at that point it becomes no more appealing than the content
> > negotiation techniques of HTTP.
>
> I thought you might want a "ctl" file into which you write the
> representation you want and that magically creates a new file or
> directory.

Sure, but if *each* file can have more than one representation then where's the best place for the ctl thing to be? In each subdirectory? At the top of the hierarchy (accepting the full path names, of course)?

> Or use a "clone" style protocol which is more suitable for
> the automatic creation of new entities.

"clone" doesn't quite work for me in REST world (not that it can't be made to work, it is just complicated).

> Of course, you may specifically want to go for a totally different
> approach, in which case I plead guilty to not understanding the exact
> nature of the solution you're seeking.

I'm simply asking for the best practices. Also, as I admitted in my original email, I'm not really implementing this in 9P. So I have an option that is native to the protocol I'm using: content negotiation (http://en.wikipedia.org/wiki/Content_negotiation)

Now, since 9P doesn't have that I was simply wondering what would be the agreed upon wisdom to have the same functionality _cleanly_ implemented in a 9P based synthetic filesystem.

Thanks,
Roman.
Re: [9fans] Different representations of the same
> but at that point it becomes no more appealing than the content
> negotiation techniques of HTTP.

I thought you might want a "ctl" file into which you write the representation you want and that magically creates a new file or directory. Or use a "clone" style protocol which is more suitable for the automatic creation of new entities.

The "mntgen" approach is clever, but I find the Schroedinger Cat nature of it a little daunting.

Of course, you may specifically want to go for a totally different approach, in which case I plead guilty to not understanding the exact nature of the solution you're seeking.

++L
Re: [9fans] Different representations of the same file/resource in a synthetic FS
On Tue, 09/06/2009 at 11:27 -0600, andrey mirtchovski wrote:
> I think I've mentioned this before, but on a few of my synthetic file
> systems here I'm using what you describe to slice a database by
> specific orderings. For example, I have a (long) list of resources
> which I'm managing in a particular environment each has an owner,
> type, status and a few static data containers. It's all backed by a
> relational database, but the file server presents different "slices"
> of that to external users, where the directory structure is rigidly
> defined as:
>
> /
> available/
> by-type/
> by-owner/
> inuse/
> ...
>
> with all data to fill the directories being dynamically pulled from
> the database.

This looks like a slightly different use case than what I'm worried about. Mainly it seems that you don't really have to deal with the representations of the same resource, your problem is how to group these resources in a reasonable fashion. Essentially you're mapping a relational database to a tree hierarchy. In your case, the sooner you have the fork of by-this/ by-that/ in your hierarchy -- the better.

My case is a flip side of that. In fact, my worst case scenario is that I can't really predict all the representations of existing resources down the road, thus it appears that I have to push that part of a filename as close to an actual file as possible: /long-path/file. I'm almost tempted to consider "virtual extensions":

	/long-path/file ## default representation
	/long-path/file.gif
	/long-path/file.pdf

but at that point it becomes no more appealing than the content negotiation techniques of HTTP.

Thanks,
Roman.
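The "virtual extensions" idea amounts to splitting a requested name into a resource and a representation at lookup time. A minimal Python sketch follows; the set of known representations, the `"default"` label and the function name `resolve` are assumptions made for illustration, not anything specified in the thread.

```python
import posixpath

# Hypothetical set of representations the server knows how to produce.
KNOWN_REPRESENTATIONS = {"gif", "pdf", "png"}


def resolve(path, default="default"):
    """Split a requested path into (resource path, representation)."""
    directory, name = posixpath.split(path)
    stem, ext = posixpath.splitext(name)
    rep = ext[1:].lower()
    if rep in KNOWN_REPRESENTATIONS:
        # A recognised virtual extension selects that representation.
        return posixpath.join(directory, stem), rep
    # No extension, or one the server does not treat as a representation.
    return path, default
```

For example, `resolve("/long-path/file.gif")` yields the resource `/long-path/file` with representation `gif`, while `resolve("/long-path/file")` falls back to the default. One visible weakness of the scheme is that a resource whose real name ends in a known extension becomes ambiguous.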
Re: [9fans] Different representations of the same file/resource in a synthetic FS
> On Tue, Jun 9, 2009 at 1:16 PM, erik quanstrom wrote:
> >> still a hash. i'm not doing anything particularly clever for speed,
> >> and it shows in places.
>
> I lied a bit here: in some cases, for example where a particular query
> would involve going through several (up to a couple of thousand) files
> and subdirectories to compose, i provide a single file that gives me
> that information much faster and in only a fraction of the 9p queries
> it normally would. it's by no means a general solution to the
> speed problem, but it does get me the data 30-50 times faster...
>
> but i digress...

interestingly, i considered mentioning the old upas/fs trick of the info file, which i believe is approximately the same hack. the "xxx" hack i recently added to the file hash table of upas/fs is somewhat the mirror image of the info file.

i suppose that in the language of standardized assessment tests technique is to hack as 30x is to 5%.

- erik
Re: [9fans] Different representations of the same file/resource in a synthetic FS
On Tue, Jun 9, 2009 at 1:16 PM, erik quanstrom wrote:
>> still a hash. i'm not doing anything particularly clever for speed,
>> and it shows in places.

I lied a bit here: in some cases, for example where a particular query would involve going through several (up to a couple of thousand) files and subdirectories to compose, i provide a single file that gives me that information much faster and in only a fraction of the 9p queries it normally would. it's by no means a general solution to the speed problem, but it does get me the data 30-50 times faster...

but i digress...
Re: [9fans] Different representations of the same file/resource in a synthetic FS
> still a hash. i'm not doing anything particularly clever for speed,
> and it shows in places. listing large directories is the slowest
> operation by far, as it would be for most cases where several thousand
> "stat" structures would have to be dynamically created for each entry
> in a directory. i'm not pre-generating anything however, so in daily
> use, where each client knows exactly where to go, i'm not seeing
> slowdowns.

thanks!

> not that i'm worried: we recently discovered a few misconfigured
> clusters around here (names withheld) that were using ldap and no
> local nameservice caching. each stat on those boxes would take 0.05 ms
> (instead of 0.005) to complete because it needed to contact a server
> for username lookup. the wait became unbearable above a number of
> thousands of files in a particular directory, so people finally
> started complaining after waiting for minutes for 'ls -l' to finish.
> things could be way worse, i guess :)

i suppose we could all be forced to reimplement vi for the apollo landing computer.

- erik
Re: [9fans] Different representations of the same file/resource in a synthetic FS
> how are the resultant files looked up? it turns out that generating
> the file hash table was the single most expensive operation for
> upas/fs, given mailboxes with ~10k messages.
> (http://9fans.net/archive/2009/05/106)

still a hash. i'm not doing anything particularly clever for speed, and it shows in places. listing large directories is the slowest operation by far, as it would be for most cases where several thousand "stat" structures would have to be dynamically created for each entry in a directory. i'm not pre-generating anything however, so in daily use, where each client knows exactly where to go, i'm not seeing slowdowns.

not that i'm worried: we recently discovered a few misconfigured clusters around here (names withheld) that were using ldap and no local nameservice caching. each stat on those boxes would take 0.05 ms (instead of 0.005) to complete because it needed to contact a server for username lookup. the wait became unbearable above a number of thousands of files in a particular directory, so people finally started complaining after waiting for minutes for 'ls -l' to finish. things could be way worse, i guess :)
Re: [9fans] Different representations of the same file/resource in a synthetic FS
On Tue Jun 9 14:15:29 EDT 2009, n...@lsub.org wrote:
> With mail2fs I leave messages alone and use all kinds of mail lists
> that contain just relative paths to actual messages. Perhaps nupas
> could do the same.
>

i think that essential strategy is a winner. upas would use .idx files. so the general plan would be to never delete messages. they're in the dump anyway. just delete them from the index.

so that seems simple, why isn't that done already? first, i should point out that there are some complicating assumptions i have. our heavy users are receiving about 1000 messages a day after spam filtering. it would be pretty easy to accumulate a million messages. also, it's important for imap support to be able to scan several hundred mailboxes in short order.

hopefully that's enough context to understand the two basic problems i see:

1. the mdir format is limited by the underlying fs in the number of messages that can be efficiently stored. (my previous tests with ken's fs on a pIII machine showed that 100k was a lower upper bound.) a million-message index would be ~600mb on-disk.

2. upas/fs needs memory proportional to the number of messages in the mailbox. and clients need to read directories that are sized in proportion to the number of messages.

#1 seems straightforward to fix. #2 seems more fundamental.

- erik
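The "never delete messages, just delete them from the index" plan can be illustrated with a small model. This is a hedged Python sketch only: the `Mailbox` class and its method names are hypothetical, and it says nothing about the actual .idx file format used by upas.

```python
class Mailbox:
    """Toy model: an append-only message store plus a mutable index of paths."""

    def __init__(self):
        self.store = {}   # relative path -> message text; entries are never removed
        self.index = []   # relative paths currently visible in this mailbox

    def deliver(self, relpath, message):
        self.store[relpath] = message
        self.index.append(relpath)

    def delete(self, relpath):
        # Deleting touches only the index; the stored message stays where it is.
        self.index.remove(relpath)

    def messages(self):
        return [self.store[path] for path in self.index]


box = Mailbox()
box.deliver("200906/1", "first")
box.deliver("200906/2", "second")
box.delete("200906/1")
```

After the delete, the mailbox shows one message while the store still holds two. Because an index is just a list of relative paths, several indexes over one store would give the "different views of the same giant pile" mentioned elsewhere in the thread.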
Re: [9fans] Different representations of the same file/resource in a synthetic FS
With mail2fs I leave messages alone and use all kinds of mail lists that contain just relative paths to actual messages. Perhaps nupas could do the same.

On 09/06/2009, at 20:11, quans...@quanstro.net wrote:

> On Tue Jun 9 13:28:55 EDT 2009, mirtchov...@gmail.com wrote:
> > I think I've mentioned this before, but on a few of my synthetic file
> > systems here I'm using what you describe to slice a database by
> > specific orderings. For example, I have a (long) list of resources
> > which I'm managing in a particular environment each has an owner,
> > type, status and a few static data containers. It's all backed by a
> > relational database, but the file server presents different "slices"
> > of that to external users, where the directory structure is rigidly
> > defined [...]
>
> this is definitely a problem for upas/fs. it would be nice, for example, for upas/fs to have the option of sorting mailboxes in various ways. imap4d's requirements are not the same as nedmail's. by thread, by date, by order of arrival are all useful sortings. and of course, given the ability to manage giant piles of messages reasonably efficiently, it's tempting to replace the idea of different boxes with different views of the same giant pile of messages.
>
> how are the resultant files looked up? it turns out that generating the file hash table was the single most expensive operation for upas/fs, given mailboxes with ~10k messages. (http://9fans.net/archive/2009/05/106)
>
> - erik

[/mail/box/nemo/msgs/200906/41850]
Re: [9fans] Different representations of the same file/resource in a synthetic FS
On Tue Jun 9 13:28:55 EDT 2009, mirtchov...@gmail.com wrote:
> I think I've mentioned this before, but on a few of my synthetic file
> systems here I'm using what you describe to slice a database by
> specific orderings. For example, I have a (long) list of resources
> which I'm managing in a particular environment each has an owner,
> type, status and a few static data containers. It's all backed by a
> relational database, but the file server presents different "slices"
> of that to external users, where the directory structure is rigidly
> defined [...]

this is definitely a problem for upas/fs. it would be nice, for example, for upas/fs to have the option of sorting mailboxes in various ways. imap4d's requirements are not the same as nedmail's. by thread, by date, by order of arrival are all useful sortings. and of course, given the ability to manage giant piles of messages reasonably efficiently, it's tempting to replace the idea of different boxes with different views of the same giant pile of messages.

how are the resultant files looked up? it turns out that generating the file hash table was the single most expensive operation for upas/fs, given mailboxes with ~10k messages. (http://9fans.net/archive/2009/05/106)

- erik
Re: [9fans] Different representations of the same file/resource in a synthetic FS
On Tue, Jun 9, 2009 at 1:14 PM, Roman V Shaposhnik wrote:
> Let's assume a classical example (modified slightly to fit 9P):
> a synthetic filesystem that serves images from a web cam.
> The very same frame can be asked for in different formats
> (.gif, .png, .pdf, etc.). Is serving
> gif/frame
> png/frame
> ...
> pdf/frame
> and relying on reading
> ///
> for the list of "supported" representations really better
> than what HTTP content negotiation offers?
>

Plan 9 does this a bit, in that you can ask a special file in /net for how to dial a certain host across all protocols. You can then pick the one that suits you, and get instructions on how to use that proto inside /net. I think it's a good use.
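The "special file in /net" is presumably the connection server's /net/cs; that is an assumption, since the message does not name it. A client writes a dial string such as net!host!service to it and reads back one line per usable protocol, each giving a clone file and an address. The Python sketch below only imitates that exchange with a hypothetical lookup table; the host name, address and the function `cs_query` are invented for illustration.

```python
# Hypothetical name and service tables; a real connection server consults ndb.
HOSTS = {"example": "192.0.2.10"}
SERVICES = {"tcp": {"echo": 7}, "udp": {"echo": 7}}
PROTOCOLS = ["tcp", "udp"]


def cs_query(dialstring):
    """Return 'clone-file address' lines for a dial string like net!host!service."""
    net, host, service = dialstring.split("!")
    # The meta-name "net" means "any protocol that offers the service".
    protocols = PROTOCOLS if net == "net" else [net]
    answers = []
    for proto in protocols:
        port = SERVICES.get(proto, {}).get(service)
        if port is not None and host in HOSTS:
            answers.append(f"/net/{proto}/clone {HOSTS[host]}!{port}")
    return answers


all_ways = cs_query("net!example!echo")
tcp_only = cs_query("tcp!example!echo")
```

The parallel with representations is that the client asks one question by name, gets the list of available alternatives back as ordinary file contents, and then chooses among them itself.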
Re: [9fans] Different representations of the same file/resource in a synthetic FS
I think I've mentioned this before, but on a few of my synthetic file systems here I'm using what you describe to slice a database by specific orderings. For example, I have a (long) list of resources which I'm managing in a particular environment; each has an owner, type, status and a few static data containers. It's all backed by a relational database, but the file server presents different "slices" of that to external users, where the directory structure is rigidly defined as:

	/
	available/
	by-type/
	by-owner/
	inuse/
	...

with all data to fill the directories being dynamically pulled from the database.

in this particular case it saves me having to implement a generic SQL query mechanism, which is unsafe, as well as pushing the complexity of knowing the underlying database structure onto the clients. in the end, clients only know how to navigate to a particular resource and 'reserve' or 'release' it.

this scheme could potentially be extended (at least in my case) to match your "user-defined sets" by simply enumerating every unique column in the database as subdirectories. a user defines a "subset" of all available nodes of a particular type foo which have owner bar by cd-ing to /available/by-type/foo/by-owner/bar. (I admit this is a bit hasty, so perhaps not what you're really after)...

the reason i'm sticking with a file system is that if i have to do this for many different resources which can't be easily stuck in the same database, I'd have to design a protocol in order to avoid replicating everything, and if I'm going to design a protocol I may as well use 9p, something that's simple and I'm familiar with.
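The path /available/by-type/foo/by-owner/bar can be read as a chain of filters over the resource table. The following Python sketch shows that reading under stated assumptions: the sample rows, the column names and the function `walk` are hypothetical, an in-memory list stands in for the relational database, and paths that stop on a by-<column> directory are rejected rather than listed, which a real server would handle.

```python
# Hypothetical rows standing in for the relational database.
RESOURCES = [
    {"name": "n1", "owner": "bar", "type": "foo", "status": "available"},
    {"name": "n2", "owner": "baz", "type": "foo", "status": "available"},
    {"name": "n3", "owner": "bar", "type": "foo", "status": "inuse"},
    {"name": "n4", "owner": "bar", "type": "qux", "status": "available"},
]


def walk(path):
    """Map a path such as /available/by-type/foo/by-owner/bar to resource names."""
    parts = [p for p in path.split("/") if p]
    if not parts:
        # The root lists the status slices.
        return sorted({r["status"] for r in RESOURCES})
    status, rest = parts[0], parts[1:]
    if len(rest) % 2:
        raise ValueError("path ends between a by-<column> name and its value")
    matches = [r for r in RESOURCES if r["status"] == status]
    # Each by-<column>/<value> pair narrows the current set of rows.
    for key, value in zip(rest[0::2], rest[1::2]):
        if not key.startswith("by-"):
            raise ValueError(f"expected by-<column>, got {key!r}")
        column = key[len("by-"):]
        matches = [r for r in matches if r.get(column) == value]
    return sorted(r["name"] for r in matches)
```

Since clients can only combine fixed by-<column> components with values, they never submit query text of their own, which is the safety point made in the message about avoiding a generic SQL mechanism.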