Re: [vfs] parsing uri
As you might have seen, I implemented the plugin-resolve stuff. Now we can extend VFS by simply dropping a jar into the classpath: if we find a /META-INF/vfs-plugins.xml in it, it is added. That way we could keep the VFS core slim and provide extension jars to allow whatever we can think of. I think this is a good compromise.

Is this the point of view that I am trying to compromise with? We should add everything to vfs that seems at least remotely useful, or if not useful then at least somewhat cool. And if at some point something we added is no longer either useful or cool, we still keep it around to keep vfs backward compatible. Did I get this right?

My point of view is: we should clearly and explicitly define the scope of vfs to be an excellent API to filesystems in general in a heterogeneous and distributed environment. We should write an elegant, logically correct and well-documented piece of software to do that. And make it extremely robust.

So the compromise is this (please confirm): we make all providers pluggable, so that there is the vfs-core with maybe one provider for logical testing of the core, and a bunch of provider plugins nicely packaged so that you can just grab the ones you need and ignore the rest. And the core will not get any extra quirks just because it would be nice when doing something with hibernate through vfs.

So, yes, I think this could be a good compromise between the conservatives (me) and the liberals (them). Note that politically I am liberal but logically I am conservative.

But as I already said, think of accessing your mail folder through an imap provider and your mail content through a mime provider, e.g. mime:imap://[EMAIL PROTECTED]/INBOX/mail9012718!/part1.txt Sooner or later, this might happen ... and why not - it's cool, isn't it?

Our ideas of coolness slightly differ. My idea of coolness would be that the imap protocol were better defined and more to the point (pop3 was much better in this). I remember the times when I was planning on accessing Lotus Notes through its imap interface, which supposedly could give you a hierarchical representation of Notes databases. Here's a new cool provider idea for vfs: a Lotus Notes provider that uses the imap service of Notes. This way you could nicely present Notes documents and forms as files and folders. And write a few books about the possible semantics. Plus, since Notes is commercial, one could actually make a few dollars out of it. So you get the point. Not all that glitters is gold.

But sometimes all we really want and need is just the glitter :)

- rami

- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]
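The drop-in-a-jar discovery discussed in this message can be sketched with the standard class loader API. This is only a minimal sketch under stated assumptions, not the VFS implementation: the class name PluginDescriptorFinder and its method are hypothetical, and only the resource path META-INF/vfs-plugins.xml comes from the message. Parsing the descriptors and registering providers is left out.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;

// Hypothetical helper: finds every plugin descriptor visible on the classpath.
final class PluginDescriptorFinder {
    // ClassLoader resource names carry no leading slash.
    static final String DESCRIPTOR = "META-INF/vfs-plugins.xml";

    static List<URL> findDescriptors(ClassLoader loader) {
        List<URL> found = new ArrayList<>();
        try {
            // getResources returns one URL per jar (or directory) that
            // contains the descriptor, so each dropped-in jar shows up once.
            Enumeration<URL> resources = loader.getResources(DESCRIPTOR);
            while (resources.hasMoreElements()) {
                found.add(resources.nextElement());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return found;
    }
}
```

A caller would pass something like Thread.currentThread().getContextClassLoader() and then parse each returned URL; a core with no extension jars on the classpath simply gets an empty list.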
Re: [vfs] parsing uri
And sometimes we just find some spare minutes and would like to experiment a little bit

I have absolutely nothing against experimenting and coding all kinds of weird and useful things for your own purposes. I was just talking about what should be included in vfs.

even if the result is glitter, maybe over time it becomes gold.

The difference between gold and glitter is that the good feeling that gold gives lasts long. With glitter it only lasts a moment.

YES! And I hope my progress so far has made clear that I definitely would NOT put any quirks into VFS. In fact I would say I am VERY conservative in the stuff I do.

12 points ... Mario.

- rami
Re: [vfs] parsing uri
Your assumption about the servers used is correct. Now why uml or vmware: it is a pain to set up all this stuff and keep it in sync with any junit changes. With uml or vmware I can provide an image that one can simply drop onto one's box and start up the tests. So no security problem, just to simplify the installation.

Sounds good. Vmware is the best around IMHO. I have only used a cracked copy, so I don't know about their goodwill towards open source dudes :)

Just as a side note: I think it is not the responsibility of VFS to ensure that it runs with different server implementations. The libraries used should handle this. Though we should do what we can to help them find problems with exotic platforms.

Good point. Agreed.

I am not at home now, I will send one later.

Take your time. I don't pay anything for this :)

Tempfs uses the DefaultFileReplicator to handle its content.

So where are the files stored? Do they get deleted when vfs closes? Or when the jvm closes? What if the jvm crashes?

- url provider bothers me because it kind of duplicates vfs. And it DUPLICATES the effort of vfs (http, ftp, jar ...)

Now you get emotional ;-) It's better to integrate than to rule out. We also provide a method to wrap VFS into a URLConnection.

I was not emotional. I was rational. Now that I have been sipping some Italian red wine I am ready to get emotional. What do you mean by integration? Integrate into what? The point is that it does not offer any capabilities that are not already provided by vfs. So it does not give any further integrative possibilities. What it does give is undocumented features that duplicate documented features. And it (probably) does not work with all implementations of the Java API. And the whole project of accessing any url with some api (like the URLConnection API) is doomed to fail, because url is such a broad concept that there will be cases of url that fit the API VERY badly. I mean, you can point to anything with a URL (that is where the universal comes from). And you cannot have a meaningful api to ANYTHING. URIs and URLs are about universal naming in the world of computers (and the internet specifically). APIs tend to go beyond naming.

Further, this let's-embrace-everything attitude will take vfs into the world of yet another universal whatever. And the evolution is like this: a lot of good things and features are provided that are trendy at the moment. When the system becomes too messy to understand, it is forgotten.

Virtual filesystem can mean anything because of the magic word virtual. But I wish this would be just a filesystem that can integrate different kinds of filesystems on the network. That already is a tall order. And also note that the filesystem model is a very simple hierarchical model. So we should not see it as the ultimate way to model and interact with data.

I think I am still being rational, but in a good emotional way :)

- rami
Re: [vfs] JDBC FileSystem: planning an implementation
Please do not take my continued questioning as a sign of not liking the vfs - jdbc provider. I just like to question because I still don't completely understand.

E.g. if it is not possible to set up an ftp/ssh server but you already have a working database connection (e.g. in webshop environments), it might be nice if VFS could use this database as a file store, e.g. for pictures too.

If I have a webapp getting its data from a database, and that same database also has the pictures in it, then how is it easier to get the picture data through the vfs layer and the other data directly through jdbc?

Or maybe in a distributed environment: multiple clients working with the same filesystem - again without setting up a file server.

So jdbc is seen here more as a transport protocol than as a vendor-independent api for retrieving data from an sql database. This is possible, but the transport protocol is a side effect of jdbc. But I accept it. Still, very often if someone opens the not-so-secure jdbc port(s) for you (for the DB2/AS400 jdbc driver one needs to open around 10 ports), he is quite willing to open the ssh port too.

Or what about file upload forms? We could use the database as storage and not run the risk that someone uploads and executes a trojan.

Hmm...

We could leave the reorganisation to the database administrator.

I'm sure he's going to be happy about it :)

It introduces no new dependency, so why NOT do it and see if others find other uses for it.

It just seems to expand vfs into the area of a universal access interface. Next thing you know we access ldap servers, mail servers over pop3 etc. Until now it has been about accessing filesystems in a heterogeneous and distributed environment. Maybe I am a little afraid of bloatware. zip, jar and other kinds of providers have still been somewhat filesystem oriented.

I'm interested in using it e.g. to store the thumbnails that are being used in one of my applications.

Is it somehow easier to store data through the vfs api than through the jdbc api?

Also for some versioning systems it may prove useful.

This I understand. Because a database is more flexible than a more or less predefined filesystem structure, a database enables someone to build a more versatile filesystem that can support a lot of metadata (attributes) for files. And we could think of the history info of a file as its metadata. But...

OK, I will stop spoiling the party now and wish this provider success and fun. I just could not keep my mouth shut :)

- rami
Re: [vfs] parsing uri
I glanced at the tests you have in place for uris and naming, and they seem extensive. I am not going to test this extensively. But the case that brought this issue up was the following: I called

    FileObject resolveFile(File baseFile, String name)

on FileSystemManager, where the baseFile had a path something like /foo/%bar and the name was just some ordinary fname. So just make sure you have this in your junit tests. Anyway, quickly trying the snapshot in my program brings no errors.

By the way, could you add the following to VFS:

    public static void close() {
        try {
            // Closes FileSystemManager instance
            final Method closeMethod = instance.getClass().getMethod("close", null);
            closeMethod.invoke(instance, null);
        } catch (Exception e) {
            e.printStackTrace();
            // Ignore; don't close
        }
        instance = null;
    }

DefaultFileSystemManager already has the close method. And this would be the counterpart of the init method. (I have some gc problems and this alleviates them a bit.)

About the testing environment: what exactly is needed to run the tests? A quick guess would be

- ftp server
- sftp (ssh) server
- samba server
- tomcat for http and webdav

Why is uml or vmware needed? The way I would test is to just have those servers running on my machine. Of course, if something crashes, vmware can bring security, but if none of the services run as root the setup should be secure enough. I have never run the tests but I could try doing it. It would be easy for me to set up those services on my gentoo. Of course, if you are planning on testing stuff on many platforms and different server implementations then vmware would be needed, but isn't that overkill? I mean, how many different ftp servers are there? And where do you draw the line? I am sure that you are not going to test all ftp servers running on OS/400 using EBCDIC encoding :?)

If you could give a quick tutorial (that could then be added to the docs) about how to run the tests, I could give it a shot. And I even have an XP on a separate machine for smb testing. And maybe there could be some kind of a profile where you tell which services you have on for testing and where they can be found?

And now for some random thoughts. It seems to me that currently the providers could be categorized into 4 categories:

- local filesystem
- network protocol based providers - ftp, sftp, smb, webdav, http
- layered filesystems - tar, jar, bzip2, compressed, gzip, zip
- filesystems based on concepts from the java environment - temp, url, res

I really have no deep understanding of this, so please enlighten me where I am wrong.

- temp seems to have a special place because almost nothing is implemented under the temp package. So the implementation must be somewhere higher. I assume that the implementation uses java's temporary file concept from the java.io.File API ???
- resources have a special place for a java program and earn their place because of that.
- url provider bothers me because it kind of duplicates vfs. Basically it says that you can access any url, but the reality is that you can access only urls for which there exists a provider inside sun's jdk. The set of these providers is not part of the api and thus undocumented and subject to change any day. And do we find all those providers in other jdks? And it DUPLICATES the effort of vfs (http, ftp, jar ...)

And then one question about layered filesystems: can you layer them as much as you like? smb - zip - jar etc.

Time to go to sleep.

- rami
Re: [vfs] JDBC FileSystem: planning an implementation
Hi Mario, Wouter,

Could you still clarify what the purpose is of making a relational database work as a filesystem? This is just out of curiosity (and maybe good for the docs).

I had the idea that someone might want to attach more metadata to files through attributes. Another reason that came to my mind was to have transactions to prevent corruption when many users use the same file (but personally I think that this kind of integrity control is rather weak, since the database is ignorant of the structure of the file's contents).

Anyway, in the first email Wouter mentioned that he needs it for a ranking and reporting system for a climbing contest. What is missing in a regular filesystem? Or why not just build your app on a relational database using JDBC?

- rami
[vfs] JDBC FileSystem: planning an implementation
-storing files into databases, including some metadata (the easy part)

I would see the mapping from the relational model to a filesystem like this:

- relation (table) = filesystem (defines a set of files that share the same set of attributes)
- tuple (row) = file that has the content value and attribute values
- attribute = one of the attributes has to be chosen as the content, and the rest of the attributes are file attributes

Problems:

- You have to store in vfs the decision of which attribute is the content attribute. Messy. It will get lost if it is not in the database.
- Someone might want to see one database as one filesystem. Then files in that same filesystem would have completely different attributes, and content stored in a different attribute too.
- No table should contain duplicates, and every table should have a key attribute that uniquely identifies the tuple (row, file). Unfortunately sql (and current databases) don't enforce this.

If you want to see one database as one filesystem then the path would be schema/tableName/uniqueId. If one table is one filesystem then the path would be only uniqueId.

I can see the benefit of this project in the possibility of adding attributes freely to a filesystem. But it is going to be a bit messy because of the heterogeneous nature of database products, sql dialects, and jdbc driver implementations. Also, the relational model is stronger than the simple hierarchical filesystem model, so that needs to be kept in mind.

- rami
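The two path layouts proposed in this message (schema/tableName/uniqueId when a whole database is one filesystem, a bare uniqueId when one table is the filesystem) could be told apart roughly as follows. This is a sketch only: RowPath and its fields are hypothetical names, and nothing here touches JDBC or the VFS API.

```java
// Hypothetical value type for the path-to-row mapping described above.
final class RowPath {
    final String schema;    // null when one table is the whole filesystem
    final String table;     // null when one table is the whole filesystem
    final String uniqueId;  // key value that identifies the row (the "file")

    private RowPath(String schema, String table, String uniqueId) {
        this.schema = schema;
        this.table = table;
        this.uniqueId = uniqueId;
    }

    static RowPath parse(String path) {
        // Tolerate a single leading slash, then split into raw segments.
        String trimmed = path.startsWith("/") ? path.substring(1) : path;
        String[] parts = trimmed.split("/", -1);
        for (String part : parts) {
            if (part.isEmpty()) {
                throw new IllegalArgumentException("Empty segment in: " + path);
            }
        }
        if (parts.length == 1) {
            return new RowPath(null, null, parts[0]);
        }
        if (parts.length == 3) {
            return new RowPath(parts[0], parts[1], parts[2]);
        }
        throw new IllegalArgumentException(
                "Expected uniqueId or schema/tableName/uniqueId: " + path);
    }
}
```

The sketch deliberately leaves open the problem raised in the message: which column holds the content still has to be recorded somewhere.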
Re: [vfs] parsing uri
The URI spec dudes talk about a canonical form of the URI. This is left for the scheme to define. Now, if vfs is in control of the uris that come in and go out, then it would be possible to canonicalize the URI when it enters the core areas of vfs that are not provider (scheme) specific. The cache, I believe, is in that core area.

Let's say someone points to a file with the URI webdav:/anydir/%74est%0d.txt This is canonicalized into webdav:/anydir/test%0D.txt So when someone next points to the uri webdav:/anydir/tes%74%0D.txt he will get the cached file.

Note: canonicalization could be provider specific, so that different schemes could escape different sets of characters.

What do you think?

Hello!

Now that it is possible to safely pass uris, we could have a look at how we should encode. I will try to figure out how local-file, ftp, http, webdav, smb, sftp handle filenames with special characters.

During my tests I found some side effects which need some thought:

1) The cache

The cache uses the filename as key. Now if I try to resolve a file named webdav:/anydir/test%0d.txt then webdav will return a file named webdav:/anydir/test\r.txt (\r = the unencoded %0d). As you can see, the two filenames are different, and thus it will create two different entries in the cache (which is not acceptable). If I ask webdav to return the escaped form of the name it will return webdav:/anydir/test%0D.txt (notice the uppercase D) - again a different name.

However, what if one is funny and tries to resolve webdav:/anydir/%74est%0d.txt In this case the filename from the file provider is different - regardless of whether I get the normal or the escaped form.

So my conclusion is to always use a decoded form of the filename for the cache key - knowing that in the very, very rare cases where the decoding is not symmetric I might have a problem with the cache.

2) German umlauts

... and any other non-ascii character. I can't use the encoded form of the filename from the filesystem provider, as I would then have to know the encoding (ISO, UTF-8). Currently the filesystem libraries are responsible for the correct decoding - and I don't want to enter a charset war - so again, it's best to use the decoded filename.

Result: VFS should not introduce its own encoding. Only the % (and ! for the layered filesystem) needs some addressing, to allow the case where one needs to pass a special url down to the filesystem.

Comments?

--- Mario
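The canonicalization idea from this exchange can be sketched as a small string transformation, so that all three spellings of the webdav example collapse to one cache key. This is a sketch under assumptions, not the VFS cache code: the class name UriCanonicalizer is hypothetical, and it only unescapes ASCII letters and digits, since characters such as '.', '/' or '!' may carry meaning and are safer left escaped. Everything else keeps its escape, with the hex digits uppercased.

```java
import java.util.Locale;

// Hypothetical helper: normalizes %XX escapes so equivalent URIs compare equal.
final class UriCanonicalizer {
    static String canonicalize(String uri) {
        StringBuilder out = new StringBuilder(uri.length());
        int i = 0;
        while (i < uri.length()) {
            char c = uri.charAt(i);
            boolean escape = c == '%' && i + 2 < uri.length()
                    && isHex(uri.charAt(i + 1)) && isHex(uri.charAt(i + 2));
            if (!escape) {
                out.append(c);  // ordinary character or malformed escape: copy as is
                i++;
                continue;
            }
            String hex = uri.substring(i + 1, i + 3);
            int value = Integer.parseInt(hex, 16);
            if (isAsciiLetterOrDigit(value)) {
                out.append((char) value);  // %74 -> t
            } else {
                out.append('%').append(hex.toUpperCase(Locale.ROOT));  // %0d -> %0D
            }
            i += 3;
        }
        return out.toString();
    }

    private static boolean isHex(char c) {
        return (c >= '0' && c <= '9') || (c >= 'a' && c <= 'f') || (c >= 'A' && c <= 'F');
    }

    private static boolean isAsciiLetterOrDigit(int v) {
        return (v >= '0' && v <= '9') || (v >= 'a' && v <= 'z') || (v >= 'A' && v <= 'Z');
    }
}
```

With this, webdav:/anydir/%74est%0d.txt and webdav:/anydir/tes%74%0D.txt both become webdav:/anydir/test%0D.txt, which is the form the message proposes as the canonical one. A decoded cache key, as Mario suggests, would be a different design with the same goal.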
Re: [vfs] parsing uri
Again quoting the RFC:

For original character sequences that contain non-ASCII characters, however, the situation is more difficult. Internet protocols that transmit octet sequences intended to represent character sequences are expected to provide some way of identifying the charset used, if there might be more than one [RFC2277]. However, there is currently no provision within the generic URI syntax to accomplish this identification. An individual URI scheme may require a single charset, define a default charset, or provide a way to indicate the charset used. It is expected that a systematic treatment of character encoding within URI will be developed as a future modification of this specification.

I guess the http scheme sticks to US-ASCII for now. But maybe with escapes you could access, on some web servers, pages like http://aku.suomi.fi/k%E4%E4k.html = http://aku.suomi.fi/kääk.html To be honest I don't know. Also I don't know if the systematic treatment has already happened or when it will happen. So it is up to us to decide how we deal with charsets.

Since vfs is written in java, it would make sense to first turn the character sequence into 16-bit unicode (UTF-16?) and then encode every character above US-ASCII (7 bit) or ISO-LATIN-1 (8 bit). But this would not make the visual representation of the URI very nice. According to the URI spec one should be able to read a URI on the radio :-) If you are in Japan, every character would be encoded and very difficult for the announcer to read. But if you don't encode, then that URI would look to a westerner like a sequence of those boxes that represent characters for which there is no font.

Let's get practical. Someone wrote the following uri in an ant build file (and some ant task uses vfs): webdav:/höh/kääk.ini Ant, when reading the string, knows that it is encoded in iso-latin-1. But the string in the jvm is in unicode. Ant gives this string (uri) to vfs, which encodes all characters above us-ascii, so it is now webdav:/h%F6h/k%E4%E4k.ini

Now the webdav provider makes an http request, let's say to tomcat. The question arises: can tomcat (or the webdav protocol spec) handle unicode characters in resource names? I don't know. But maybe the webdav provider implementor knows. So if webdav names only handle us-ascii, then the provider can say right away, when it is asked to canonicalize the uri, that this is not a proper webdav uri. Or maybe this is not specified, and some webdav servers could handle the uri and some could not. Maybe the webdav provider could then ask the server what it supports. But maybe there is no single standardized way to ask this. At this point a sane person starts to give up and thinks: Whatever! Just pass the string and let the user handle errors.

But let's say that webdav can handle iso-latin-1 and the request is sent to the server. The server's filesystem is encoded in some other coding (EBCDIC?) that maps ö and ä to different numbers. So in order to do the mapping, the webdav server would need to know what character encoding vfs uses (UTF-16). But since this is not specified (at least in the rfc I am quoting), it would probably unescape using its own encoding and request a wrong resource from its filesystem.

This state of affairs makes me wonder whether the standard makers really want to make standards or just pretend to. The answer is of course that industry wants to make standards up to a point, because confusion and protectionism make the IT business thrive.

That being said, I think one pragmatic approach could be to treat uri characters as being from the unicode character set. When transported they would be in US-ASCII, with everything above us-ascii escaped. So to answer your question: ü = %FC

But all this is just assuming and making things up. I guess the decision is in your hands since you write the code.

- rami
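Why the charset has to be agreed on can be seen by escaping the same string under two charsets. The following is a minimal sketch, with a hypothetical class name NonAsciiEscaper; it escapes only characters above US-ASCII, as the message proposes, and leaves every ASCII character (including '%') untouched, so it is not a complete URI encoder.

```java
import java.nio.charset.Charset;

// Hypothetical helper: percent-escapes non-ASCII characters using a given charset.
final class NonAsciiEscaper {
    static String escapeNonAscii(String text, Charset charset) {
        StringBuilder out = new StringBuilder(text.length());
        int i = 0;
        while (i < text.length()) {
            int codePoint = text.codePointAt(i);
            int length = Character.charCount(codePoint);
            if (codePoint < 0x80) {
                out.append((char) codePoint);
            } else {
                // The byte sequence, and therefore the escape, depends on the charset.
                // Characters the charset cannot represent come out as '?' (%3F).
                byte[] bytes = text.substring(i, i + length).getBytes(charset);
                for (byte b : bytes) {
                    out.append(String.format("%%%02X", b & 0xFF));
                }
            }
            i += length;
        }
        return out.toString();
    }
}
```

Under ISO-8859-1 the ant example becomes webdav:/h%F6h/k%E4%E4k.ini and ü becomes %FC; under UTF-8 the same string becomes webdav:/h%C3%B6h/k%C3%A4%C3%A4k.ini and ü becomes %C3%BC. A receiver that assumes the other charset decodes a different name, which is the mismatch the message worries about.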
Re: [vfs] parsing uri
I wrote the previous email too quickly, so there are many errors in the details. So please read it without paying too much attention to the details.

I guess those uris with non-us characters always get sent in some encoding. It would work nicely if it could be us-ascii. But with encoding, the interpretation problem is just lifted one level up, since we don't know how to negotiate the encoding. So I guess a minimal encoding policy is better, because it works just as well and is of course much less hassle.

- rami
Re: [vfs] parsing uri
The File-URI codec on unix encodes \foo\bar --> %5Cfoo%5Cbar This is to be interpreted as a file or dir named \foo\bar If you send this uri to a jvm on windows, then with new File(new URI(uriStr)) you get something which is interpreted as a file or dir bar under dir foo, which is under root. So it seems that on unix %5C is not interpreted as having a special meaning, but on windows it is. The other alternative on windows would be to throw an exception, because a file with the given path can't be created. So it is thought that it makes life easier if %5C is interpreted as the path separator on windows.

The same question applies to . (dot = current dir), double dot (.. = parent dir) and any other characters that we might want to assign some special meaning to (e.g. ~ tilde). When do we interpret a special character to have its special meaning, and how do we escape away that special meaning?

Well, the answer is simple and in line with what you think is right. The %xx notation ESCAPES the character and NEGATES the possible special meaning it might have. So therefore I think it would be more correct if %5Cfoo%5Cbar on windows threw an exception. And your intuition is correct.

But note: if I have a path ../xtc then the corresponding uri should be ../xtc, because in this case we want the dots to have their special meaning. But what if the % character had a special meaning (let's imagine it points to the parent of the parent if one exists, or else to root)? Then the path %/xtc should be the uri %/xtc, BUT this is not possible, because % has a special meaning in a URI as the escape character.

All the other excluded characters MUST be encoded because of the URI spec. The reasons being e.g. that a uri could be printed on paper, and newline characters would be hard to read if they were not escaped. So let's recap the excluded character list:

ctrl-chars | space | < | > | # | % | "

None of these have any special meaning in any filesystem. Thus we are saved. The rest of the encodings are because of the scheme-specific rules and serve the purpose of escaping the scheme-specific meaning of the character.

Therefore the uri corresponding to the path @foo/%bar/+xtc should be @foo/%25bar/+xtc

Do these thoughts clarify ? :-)

- rami

Hello!

Sounds like a long night today :-)

Hard work - it might take some time until I can commit the new naming stuff. The whole procedure of parsing a uri needs to be refactored; currently I am fighting against the layered stuff, e.g. tar:tar:file:/dir/first.tar!/second.tar!/entry

And I have already implemented some incompatibilities between the old and the new VFS naming:

Current: file = getManager().resolveFile("%2e"); resolves to the current directory
New: resolves to a file or directory NAMED .

Current: file = getManager().resolveFile("dir%2fchild"); resolves to a file child in directory dir
New: resolves to a file or directory named dir/child

Current: file = getManager().resolveFile("dir%5cchild"); resolves to a file child in directory dir
New: resolves to a file or directory named dir\child

I leave it up to the filesystem whether such a file or directory can be created.

The above examples are those from the unit test, so the old behaviour was wanted. But I think the new one is the right one. I think it is very unlikely that those constructs can be found in the wild, but if one used VFS that way it IS broken.

Any comments?

--- Mario
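The rule argued for in this message, that %xx negates a character's special meaning, follows from doing things in the order RFC 2396 prescribes: split the path into components first, decide on the raw segment whether it is navigation, and only then unescape. Below is a minimal sketch of that order. The class PathSegments and its nested Segment type are hypothetical, each escape is assumed to decode to a single Latin-1 character, and this is not the VFS name parser.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: split on raw '/', then unescape each segment separately.
final class PathSegments {
    static final class Segment {
        final String name;         // decoded name of this path element
        final boolean navigation;  // true only for an unescaped "." or ".."

        Segment(String name, boolean navigation) {
            this.name = name;
            this.navigation = navigation;
        }
    }

    static List<Segment> parse(String escapedPath) {
        List<Segment> segments = new ArrayList<>();
        for (String raw : escapedPath.split("/", -1)) {
            // Special meaning is decided on the raw text, before unescaping.
            boolean navigation = raw.equals(".") || raw.equals("..");
            segments.add(new Segment(unescape(raw), navigation));
        }
        return segments;
    }

    static String unescape(String raw) {
        StringBuilder out = new StringBuilder(raw.length());
        int i = 0;
        while (i < raw.length()) {
            char c = raw.charAt(i);
            if (c != '%') {
                out.append(c);
                i++;
                continue;
            }
            if (i + 2 >= raw.length()) {
                throw new IllegalArgumentException("Truncated escape in: " + raw);
            }
            int hi = Character.digit(raw.charAt(i + 1), 16);
            int lo = Character.digit(raw.charAt(i + 2), 16);
            if (hi < 0 || lo < 0) {
                throw new IllegalArgumentException("Bad escape in: " + raw);
            }
            out.append((char) (hi * 16 + lo));
            i += 3;
        }
        return out.toString();
    }
}
```

With this order, dir%2fchild yields one segment named dir/child, while dir/child yields two; %2e yields a segment named . that is not navigation, while a bare . is. Whether a file with such a name can exist is then up to the filesystem, as Mario says.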
Re: [vfs] parsing uri
[EMAIL PROTECTED] wrote:

I'm unsure that the URI specs intend to distinguish a string from its encoded form for the purposes of naming. I believe they are to be interpreted equivalently, and that the encoding exists only to permit uncorrupted transmission of forbidden characters.

Quote from RFC 2396, Uniform Resource Identifiers (URI): Generic Syntax

2.4.2. When to Escape and Unescape

A URI is always in an "escaped" form, since escaping or unescaping a completed URI might change its semantics. Normally, the only time escape encodings can safely be made is when the URI is being created from its component parts; each component may have its own set of characters that are reserved, so only the mechanism responsible for generating or interpreting that component can determine whether or not escaping a character will change its semantics. Likewise, a URI must be separated into its components before the escaped characters within those components can be safely decoded.

In some cases, data that could be represented by an unreserved character may appear escaped; for example, some of the unreserved "mark" characters are automatically escaped by some systems. If the given URI scheme defines a canonicalization algorithm, then unreserved characters may be unescaped according to that algorithm. For example, "%7e" is sometimes used instead of "~" in an http URL path, but the two are equivalent for an http URL.

Because the percent "%" character always has the reserved purpose of being the escape indicator, it must be escaped as "%25" in order to be used as data within a URI. Implementers should be careful not to escape or unescape the same string more than once, since unescaping an already unescaped string might lead to misinterpreting a percent data character as another escaped character, or vice versa in the case of escaping an already escaped string.

Important passage: each component may have its own set of characters that are reserved, so only the mechanism responsible for generating or interpreting that component can determine whether or not escaping a character will change its semantics

At this point the RFC indirectly says that only % MUST always be encoded. But later it excludes other characters from ever existing in a URI, for reasons of readability when the uri is e.g. printed. Think if you see somewhere URI: foo Is this URI foo or foo ? The same applies to URI: foo

That being said, I think that the absolute minimum is only % But what the practical minimum is, is left in the air.

- rami

You have found something interesting to encoded URIs if a difference exists, but yours is a lot of work and I'd double-check the assumption before proceeding further. Laziness is one of the three virtues. :-)

Cheers, --binkley
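The RFC's warning about escaping or unescaping the same string more than once can be reproduced with java.net.URI. The sketch below assumes a wrapper class, EscapeOnceDemo, which is hypothetical; the URI calls themselves are standard. The multi-argument URI constructor always quotes '%', and getPath() decodes exactly one level of escaping.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical wrapper around java.net.URI to show one level of escaping at a time.
final class EscapeOnceDemo {
    // Escapes a decoded path; a literal '%' in the data becomes "%25".
    static String escapePath(String path) {
        try {
            return new URI(null, null, path, null).getRawPath();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Decodes one level of escaping from an already escaped path.
    static String unescapePath(String rawPath) {
        try {
            return new URI(rawPath).getPath();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }
}
```

For a file literally named %41.txt in /dir, escapePath gives /dir/%2541.txt and one unescapePath restores /dir/%41.txt. Unescaping a second time turns the data percent into an escape and yields /dir/A.txt, which names a different file.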
Re: [vfs] parsing uri
btw. you stirred up a hornets' nest - using '%' as a valid filename character turns out to be a problem in all archive-like filesystem providers (tar, zip, ...). Also, FileObject.getName().getURI() didn't encode the path correctly, i.e. one can't use its result to resolve the file again. I have to investigate this in more detail.

Already a year ago, when I was fiddling with classloaders, I found out that URL encoding (e.g. in URLClassLoader) is completely flawed in Java. There are bug reports about this at java.sun.com, but the official answer seems to be that the way URL encoding is done now is too central to be changed, since big software has been written that assumes URLs are encoded wrongly. Therefore vfs should encode the file path to a URL itself. It's not hard and it is the only way.

This brings up an interesting topic: the set of characters that are allowed to appear in a file name. As we know, the set of prohibited characters differs between operating systems. Since vfs is a cross-platform file system it should define its own set of prohibited characters, maybe the union of the characters prohibited on win/unix/mac. But that is impossible, since vfs will find files on unix whose names contain characters that are prohibited on, say, windows. Maybe a FileSystemProvider, when instantiated, has to be able to tell which characters are allowed. Of course vfs can also stay completely neutral on the issue and let the os / network protocol report that something is wrong when an illegal filename is used. Nevertheless it would be excellent to document these kinds of issues as part of the vfs project. Then it would also be easier to say for sure which characters need to be encoded for a URL. Also, I think decodeURI and encodeURI should be symmetrical.

Maybe we don't need to know anything about filenames. We only need to know about URIs. What is the set of characters that need to be encoded in a URI? Well, let's see RFC 2396:

    reserved = ";" | "/" | "?" | ":" | "@" | "&" | "=" | "+" | "$" | ","

These are reserved characters because they have a special meaning in a URI: they work as delimiters between the different components, and the scheme finally decides whether they are delimiters or not (I think). They should be escaped, but note:

    2.4.2. When to Escape and Unescape
    A URI is always in an escaped form, since escaping or unescaping a completed URI might change its semantics. Normally, the only time escape encodings can safely be made is when the URI is being created from its component parts; each component may have its own set of characters that are reserved, so only the mechanism responsible for generating or interpreting that component can determine whether or not escaping a character will change its semantics. Likewise, a URI must be separated into its components before the escaped characters within those components can be safely decoded.

So when I have a path like /foo/%bar I should encode % but not /. Looking at the reserved character set in the case of the file: scheme, I think none of them should be escaped.

    2.4.3. Excluded US-ASCII Characters
    control = US-ASCII coded characters 00-1F and 7F hexadecimal
    space   = US-ASCII coded character 20 hexadecimal
    delims  = "<" | ">" | "#" | "%" | <">
    The angle-bracket "<" and ">" and double-quote (") characters are excluded because they are often used as the delimiters around URI in text documents and protocol fields. The character "#" is excluded because it is used to delimit a URI from a fragment identifier in URI references (Section 4). The percent character "%" is excluded because it is used for the encoding of escaped characters.

I think these should always be encoded in a URI. There are also the unwise characters:

    Other characters are excluded because gateways and other transport agents are known to sometimes modify such characters, or they are used as delimiters.
    unwise = "{" | "}" | "|" | "\" | "^" | "[" | "]" | "`"

But I don't think these should be encoded. So all in all, for the file URI scheme I think the characters to encode are:

    control = US-ASCII coded characters 00-1F and 7F hexadecimal
    space   = US-ASCII coded character 20 hexadecimal
    delims  = "<" | ">" | "#" | "%" | <">

On my Linux I can create the directory /#%/; I just need to write mkdir \\#%\ Also, it has happened to me that a program created a file name that contained newlines and some other non-printable characters. Copying such a folder to some other os would (probably) result in an exception.

If I could I would assign you 12 points (the maximum) for catching this problem ;-)

Why can't you ?-)

- rami
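The proposal above (escape only control characters, space and the RFC 2396 delims in a file: path) can be sketched as follows. This is only an illustration of that rule, not VFS code: the class and method names are hypothetical, the code is written against a current Java rather than the JDK 1.3/1.4 discussed in this thread, and leaving non-ASCII characters untouched is an assumption the message does not address.

```java
// Hypothetical helper, not part of VFS: escapes only the characters the
// message proposes to escape for the file: scheme.
final class FileUriPathEncoder {
    // RFC 2396 section 2.4.3 "delims": < > # % "
    private static final String DELIMS = "<>#%\"";

    static String encode(String path) {
        StringBuilder out = new StringBuilder(path.length() + 8);
        for (int i = 0; i < path.length(); i++) {
            char c = path.charAt(i);
            boolean control = c <= 0x1F || c == 0x7F;
            if (control || c == ' ' || DELIMS.indexOf(c) >= 0) {
                // Two upper-case hex digits, e.g. '%' becomes %25
                out.append('%').append(String.format("%02X", (int) c));
            } else {
                // Reserved, unwise and non-ASCII characters are left as-is
                // (the last one is an assumption of this sketch).
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(encode("/foo/%bar"));
        System.out.println(encode("/dir with space/a#b"));
    }
}
```

Because '/' is never touched, the encoder can be applied to a whole path; a matching decoder would have to be its exact inverse for encodeURI and decodeURI to stay symmetrical.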
Re: [vfs] parsing uri
Rami 12 points. I'm honored. - Rami
Re: [vfs] parsing uri
What does "it does not work" mean? That is, what is an example failure case?

Good question. Because it does work :) All I can say in my defense is that my library management is a mess! Therefore I decided to write the simplest possible class for testing how file.toURI().toString() encodes. It encodes all excluded characters (space, %, #, ...). Of the reserved characters it encodes (on my Linux) only ? (question mark). Of the unwise characters ({}|\\^[]`) it encodes all. But maybe it is not necessary to know how it encodes, because the inverse operation can be done too.

    new File(new URI( (new File($%[EMAIL PROTECTED]|\\^[]`$)).toURI().toString() )).getPath()

Returns $%[EMAIL PROTECTED]|\\^[]`$, which is correct.

Once again, all this confusion arose because my library management is in a state of flux and I have had bad experiences with this issue in the past. I also remembered the bug about this encoding issue, but this really seems to work. My java -version returns 1.4.2_06-b03. This might not work on 1.3, but I am not sure.

Like I said before, URI encoding is scheme specific, so it should be done separately for different providers. And it seems that for local files the URI and File classes could work as the codec. Thanks binkley!

- rami
Re: [vfs] parsing uri
Slight correction:

    new File(new URI( (new File($%[EMAIL PROTECTED]|\\^[]`$)).toURI().toString() )).getPath()
    Returns $%[EMAIL PROTECTED]|\\^[]`$

The return value is $%[EMAIL PROTECTED]|\^[]`$ (only one backslash).

- rami
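The round trip described in these two messages can be checked with a few lines of standard Java. The sketch below uses only java.io.File and java.net.URI; the class and method names are made up for illustration, and the sample names deliberately avoid backslashes so that the check does not depend on the platform's path separator.

```java
import java.io.File;
import java.net.URI;

// Hypothetical test helper: File -> URI string -> File, as in the message.
final class FileUriRoundTrip {
    static String roundTrip(String name) {
        File original = new File(name).getAbsoluteFile();
        String encoded = original.toURI().toString();   // excluded characters get escaped here
        return new File(URI.create(encoded)).getPath(); // and are decoded again here
    }

    // True if the absolute path survives the trip unchanged.
    static boolean survives(String name) {
        return new File(name).getAbsolutePath().equals(roundTrip(name));
    }

    public static void main(String[] args) {
        System.out.println(survives("a b#c%d"));
        System.out.println(survives("x{y}^z[1]`"));
    }
}
```

The files do not need to exist; only the name handling of File and URI is exercised.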
[vfs] parsing uri
In DefaultLocalFileProvider there is a method findLocalFile. Its idea is to convert a File object into a FileObject.

    public FileObject findLocalFile(final File file) throws FileSystemException
    {
        // TODO - tidy this up, should build file object straight from the file
        return findFile(null, "file:" + file.getAbsolutePath(), null);
    }

It calls findFile, which is in AbstractOriginatingFileProvider. The signature of the method is

    findFile(final FileObject baseFile, final String uri, final FileSystemOptions fileSystemOptions) throws FileSystemException

Notice the name of the second argument: 'uri'.

Here's the problem: let's say I have a file whose absolute path is /foo/%bar. Its uri should be file:/foo/%25bar, but now it is just file:/foo/%bar, which is not a correct uri. This leads to an exception later, when the system tries to decode the uri and complains: Invalid URI escape sequence %ba

So the method should be

    public FileObject findLocalFile(final File file) throws FileSystemException
    {
        // TODO - tidy this up, should build file object straight from the file
        return findFile(null, "file:" + ENCODE_URI_SOMEHOW(file.getAbsolutePath()), null);
    }

The same remark applies to

    public FileObject findLocalFile(final String name) throws FileSystemException
    {
        // TODO - tidy this up, no need to turn the name into an absolute URI,
        // and then straight back again
        return findFile(null, "file:" + name, null);
    }

- Rami Ojares

Ps. I hope this remark is still valid, since I haven't updated the sources for a long time.
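One possible stand-in for ENCODE_URI_SOMEHOW is the multi-argument java.net.URI constructor, which quotes illegal characters and always quotes '%'. The sketch below is not the fix that went into VFS; the class and method names are hypothetical and it assumes a Unix-style absolute path starting with '/'.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical helper illustrating the difference between concatenating
// "file:" + path and building the URI from its components.
final class LocalFileUris {
    // Component-wise construction: the path is escaped as needed.
    static String toFileUri(String absolutePath) {
        try {
            return new URI("file", null, absolutePath, null).toString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    // Parses a URI string and returns its decoded path.
    static String toPath(String fileUri) {
        return URI.create(fileUri).getPath();
    }

    // Does plain concatenation give back the original path after decoding?
    static boolean naiveSurvives(String absolutePath) {
        try {
            return absolutePath.equals(toPath("file:" + absolutePath));
        } catch (IllegalArgumentException e) {
            return false; // not even parseable as a URI
        }
    }

    public static void main(String[] args) {
        String path = "/foo/%bar";
        System.out.println(toFileUri(path));         // file:/foo/%25bar
        System.out.println(toPath(toFileUri(path))); // /foo/%bar
        System.out.println(naiveSurvives(path));     // false
    }
}
```

Escaping while the URI is assembled from its parts is also what RFC 2396 section 2.4.2 recommends, since only at that point is it known which characters are data and which are delimiters.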
Re: [vfs] parsing uri
Here is my proposal, using the idea from binkley:

    /**
     * Finds a local file, from its local name.
     */
    public FileObject findLocalFile(final String name) throws FileSystemException
    {
        // TODO - tidy this up, no need to turn the name into an absolute URI,
        // and then straight back again
        return findFile(null, (new File(name)).toURI().toString(), null);
    }

    /**
     * Finds a local file.
     */
    public FileObject findLocalFile(final File file) throws FileSystemException
    {
        // TODO - tidy this up, should build file object straight from the file
        return findFile(null, file.getAbsoluteFile().toURI().toString(), null);
    }

I tried it and it worked.

- rami
Re: [vfs] parsing uri
file.toURI().toString() is not the way to go. The reason is simple: it does not work. I don't know why. So I think we should use ParseUtil.encode(..), which does work, and decide which characters to treat as special. I did this and it works (the last time I said this I was wrong, because a jar did not get updated ..). But now I'm at home, so I will submit it tomorrow. Anyway, there is nothing to it, so Mario can probably make the fix right away. The list of special characters still needs to be addressed, though. I think at least {'#', ' '}

- rami
Re: [VFS] FileSystem close
Hi, I asked about this a little while ago - when you resolve a file for an SFTP url, the SFTP file system maintains the session until the file manager is closed - the solution you proposed (see below) works well, but it will not work if I try to use it in a multi-threaded environment. What are the general thoughts about this?

One sftp filesystem keeps at most one idle connection open. So it is like a connection pool whose max size is one. Now you have multiple threads accessing the same sftp server, but you would not like to keep an idle connection open, right? One possible direction would be to develop the pooling mechanism (maybe use commons-pool?) and give the sftp provider an option pooling=true|false. Would this solve your problem?

- rami
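To make the "pool whose max size is one" picture concrete, here is a minimal sketch of a pool that bounds the number of idle connections. It is not how the sftp provider is implemented and it does not use commons-pool; the class name, the maxIdle parameter and the factory/closer callbacks are all assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Consumer;
import java.util.function.Supplier;

// Hypothetical pool: keeps at most maxIdle connections around between uses.
final class IdlePool<T> {
    private final Deque<T> idle = new ArrayDeque<>();
    private final int maxIdle;
    private final Supplier<T> factory;
    private final Consumer<T> closer;

    IdlePool(int maxIdle, Supplier<T> factory, Consumer<T> closer) {
        this.maxIdle = maxIdle;
        this.factory = factory;
        this.closer = closer;
    }

    // Reuse an idle connection if there is one, otherwise open a new one.
    synchronized T borrow() {
        T connection = idle.pollFirst();
        return connection != null ? connection : factory.get();
    }

    // Keep the connection if there is room, otherwise close it right away.
    void release(T connection) {
        boolean kept;
        synchronized (this) {
            kept = idle.size() < maxIdle;
            if (kept) {
                idle.addFirst(connection);
            }
        }
        if (!kept) {
            closer.accept(connection); // closed outside the lock
        }
    }

    // Demo: two threads' worth of connections released into the pool.
    static int closedAfterTwoReleases(int maxIdle) {
        int[] closed = {0};
        IdlePool<Object> pool = new IdlePool<>(maxIdle, Object::new, c -> closed[0]++);
        Object a = pool.borrow();
        Object b = pool.borrow();
        pool.release(a);
        pool.release(b);
        return closed[0];
    }

    public static void main(String[] args) {
        System.out.println(closedAfterTwoReleases(1)); // current behaviour: one stays idle
        System.out.println(closedAfterTwoReleases(0)); // no idle connection is kept
    }
}
```

With maxIdle set to 1 this mirrors the behaviour described above; a pooling=true|false option would then roughly correspond to choosing a larger or a zero idle limit.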
Re: [vfs] FileSystemManager construction rethought
Hi Mario,

Here I have listed my current understanding of VFS in connection with the configuration issues. Could you please correct any mistakes you find.

- Manager contains FileProviders
- Manager supports all schemes for which it has a configured provider
- FileProviders create a FileSystem the first time a file is resolved against them
- FileProviders create a new FileSystem every time a new FileSystemOptions is encountered
- FileSystem is in a way active, meaning that it holds a live connection over some protocol if that is needed to access the files on the filesystem
- User can create layered filesystems on top of originating filesystems
- User can create virtual filesystems and add junctions to them

So if these points are correct, the configuration data structure contains the following:

- Manager specific configuration, independent of providers (Q: What are these?)
- Provider specific default configuration (the manager instantiates and configures the providers it supports)
- Ad hoc configuration when resolveFile is called

And then my understanding of the current configuration mechanism:

- StandardFileSystemManager configures the manager and providers from xml
- The resolveFile method takes FileSystemOptions, which results in the creation of a FileSystem if these options were never given before
- Every invocation of resolveFile can potentially result in the creation of a FileSystem
- FileSystemConfigBuilders provide type safety for FileSystemOptions
- A FileProvider must provide a FileSystemConfigBuilder that is used to create FileSystemOptions suitable for that provider's filesystems

== FileSystemOptions is the only configuration data structure in place at the moment, and it is the duty of the FileSystemManager and FileProvider implementations to provide for their own configuration.

Let's consider configuring a logger for the FileSystemManager. setLogger() is a method in DefaultFileSystemManager. It in turn uses the setLogger method of the VfsComponent interface, which is implemented by FileProviders and FileSystems (among others). So from the point of view of an ant task that wants to set a logger on the manager it uses, there should be a setLogger() in the FileSystemManager interface. Because if the task can choose the manager implementation, it would have to use introspection to find setLogger(), and it would have no guarantee that such a method exists in the implementation. This might not be seen as a problem. But let's take it further. Consider this ant fragment:

    <vfs-manager refid="mymanager" class="MyManager" configuration="path/to/conf"/>
    <vfs-copy todir="sftp://host/usr/local/var">
        <vfs-manager ref="mymanager"/>
    </vfs-copy>

The configuration file would already contain pointers to the knownHosts and privateKey files, so one would not have to give them inside the uri. But if one did give them, for example sftp://id=/path/to/.private.key;knownHosts=some/[EMAIL PROTECTED]/usr/local/var then that would just override the provider's defaults set from the configuration file.

Now the configuration (file) can be passed as path, File, FileObject, DOM or whatever. But it would be nice if the FileSystemManager interface dictated at least some kind of configuration framework, so that the ant tasks would not have to mention any implementations.

- Rami Ojares
Re: [vfs] FileSystemManager construction rethought
Now the configuration (file) can be passed as path, File, FileObject, DOM or whatever.

Here the word "Now" does not mean "now the implementation is ...". It means "in general it is possible ...".

- rami
Re: [vfs][all][poll]regular expression library or jdk1.4 as minimum requirement
Daniel wrote:

The Java regex syntax is almost a superset of Perl, which is why I don't see the impact of using a Perl engine for JDK 1.3 and java.util.regex for J2SE 1.4 as being major. The expression Rami gave was straight Perl 5.005. jakarta-oro's Perl5Compiler/Perl5Matcher implements zero-width look-ahead assertions from Perl 5.003 but does not implement the zero-width look-behind assertions from 5.005 and future versions (if you don't ask for it ...). This can be added. The other difference is that in Perl \Q and \E are not part of the regex syntax. They are part of Perl string handling, so we didn't implement them in Perl5Compiler (instead quotemeta() is provided), but support them in the Perl5Util convenience class. This can be moved into Perl5Compiler if desired. There has to be a user driver for these small things to happen.

Very true. It is also obvious that Java has followed in the footsteps of Perl, which has a much longer history with regexes. The reason they are not compatible is the lack of standardisation on the Perl side. Since the Java folks have always put much effort into internationalization, I think Java regexes have made an extra effort with the handling of Unicode. If regexes were standardized, then Perl would deserve the biggest say in that committee. However, for that standard I feel that all aspects of the language should be encoded inside the language rather than outside it (like embedded sql, or quotemeta() in regexes). Otherwise the language will never be defined exactly but will have loose boundaries.

In general, most regular expressions you see in the wild can be simplified and don't require unusual constructs. For example, why write \\Q**\\E when \\*\\* will do (you would usually want to use \Q and \E for longer sequences or for dynamically generated strings you want to escape; but quotemeta works equally well)?

I am using quoting with dynamic input, so I need the feature. Now I have been told that I need to support the JAVA, PERL5 and POSIX syntaxes. So in the case of Java I have to use \\Q and \\E. In the case of PERL5 I have to use quotemeta(). And in the case of POSIX I have no clue!

Why use a negative look-behind assertion in ((?!^)|[^/]) when [^/] will suffice (the negative look-behind assertion is redundant because if there's a character present that's not a slash, then it's not the start of the input)?

Thanks for the tip! I am an occasional regex user :=)

Of course, you can't always simplify your expressions and I think Rami's point is that you shouldn't be bothered with the finer points and stuff should just work.

Thank you for understanding my intention so well!

I think the answer is that as long as you stick to Perl5 syntax (which most people using java.util.regex are unknowingly doing), you'll rarely run into differences; but that oro doesn't implement most of the stuff added after Perl 5.003 for lack of demand (there's not that much stuff).

(And from above) There has to be a user driver for these small things to happen.

I think there is a user driver: users want to read one well-written piece of documentation about regexes and then use them worry-free. Don't you think?

- rami
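For the java.util.regex side of this discussion, the two tips can be sketched as below: the simplified [^/] form for spotting an illegal "**", and literal quoting of dynamic input. The class and method names are hypothetical, the rule "two stars are legal only between slashes or the pattern boundaries" is my reading of the intent stated in this thread, and Pattern.quote only exists from Java 5 on (on JDK 1.4 one would wrap the input in \Q and \E by hand).

```java
import java.util.regex.Pattern;

// Hypothetical checks for file patterns such as "src/**/a.txt".
final class DoubleStarCheck {
    // "**" is illegal if a non-slash character touches it on either side.
    private static final Pattern ILLEGAL =
            Pattern.compile("[^/]\\*\\*|\\*\\*[^/]");

    static boolean hasIllegalDoubleStar(String filePattern) {
        return ILLEGAL.matcher(filePattern).find();
    }

    // Dynamic input is quoted so that it is matched literally.
    static boolean containsLiteral(String text, String dynamicInput) {
        return Pattern.compile(Pattern.quote(dynamicInput)).matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(hasIllegalDoubleStar("src/**/a.txt")); // false
        System.out.println(hasIllegalDoubleStar("src/a**"));      // true
        System.out.println(containsLiteral("x**y", "**"));        // true
    }
}
```

Both expressions stay inside the subset shared by Perl5 and java.util.regex except for the quoting step, which is exactly the part each dialect handles differently.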
Re: [vfs] FileSystemManager construction rethought
But i have to say i don't like adding all that configuration stuff to the uri - it might be a handy place, but adding e.g. path elements like those in your example might be a real pain. It might be really hard to construct a correct uri.

I agree. It was just an example.

Now there could be a class, e.g. GeneralFileSystemConfigBuilder, with

    (<vfs-options name="" value="">)
    setConfigValue(String scheme, String name, String value)

    (<vfs-options name=""> <value></value>)
    setConfigValue(String scheme, String name, String[] values)

    (<vfs-options name="" className="">)
    setConfigClass(String scheme, String name, String className)

which tries to locate the correct *FileSystemConfigBuilder, tries to convert the given value/class to the expected parameter type and calls the method. All this can be done using reflection, and a configuration mistake is shown quickly. I will implement this class if we reach a consensus.

Nice. Very nice.

Now there is the problem of how to connect the options to a given uri.

Ant can always call resolveFile(String name, FileSystemOptions fileSystemOptions) to resolve files if the task or datatype has been provided with some options.

    <vfs-copy todir="[EMAIL PROTECTED]://host/usr/local/var">
        <vfs-manager ref="mymanager"/>
    </vfs-copy>

Then the ant task should strip the @options1 and look up a table to find the FileSystemOptions for resolveFile().

Maybe not. :)

To do the same with the FileSystemManager - as you have said - we have to create a separate data structure first. I will have a look at it, but for now we can stick with the xml file.

The addition that I was thinking of would be minor. Keep the XML configuration the way it is, but add configure(File/FileObject/InputStream) to the FileSystemManager interface. And document the syntax of the xml file that is used to configure the manager and the providers.

Might this find your acceptance?

Full acceptance with satisfaction guaranteed.

- rami
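The reflection idea behind the proposed GeneralFileSystemConfigBuilder can be sketched without any VFS classes. Everything below is hypothetical: SftpBuilder merely stands in for a provider-specific *FileSystemConfigBuilder, and the real builders take a FileSystemOptions argument that this sketch leaves out.

```java
import java.lang.reflect.Method;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: look up a builder by scheme, find the setter by name,
// convert the string value to the setter's parameter type and call it.
final class GeneralConfigSketch {
    // Stand-in for a provider-specific, type-safe builder.
    public static final class SftpBuilder {
        String knownHosts;
        int timeout;
        public void setKnownHosts(String path) { this.knownHosts = path; }
        public void setTimeout(int millis) { this.timeout = millis; }
    }

    private final Map<String, Object> builders = new HashMap<>();

    void register(String scheme, Object builder) {
        builders.put(scheme, builder);
    }

    void setConfigValue(String scheme, String name, String value) {
        Object builder = builders.get(scheme);
        if (builder == null) {
            throw new IllegalArgumentException("No builder for scheme " + scheme);
        }
        String setter = "set" + Character.toUpperCase(name.charAt(0)) + name.substring(1);
        for (Method m : builder.getClass().getMethods()) {
            if (m.getName().equals(setter) && m.getParameterCount() == 1) {
                Object converted = convert(value, m.getParameterTypes()[0]);
                try {
                    m.invoke(builder, converted);
                    return;
                } catch (ReflectiveOperationException e) {
                    throw new IllegalArgumentException("Cannot set " + scheme + "." + name, e);
                }
            }
        }
        // A misspelled option fails here, at configuration time.
        throw new IllegalArgumentException("Unknown option " + scheme + "." + name);
    }

    private static Object convert(String value, Class<?> type) {
        if (type == String.class) return value;
        if (type == int.class || type == Integer.class) return Integer.valueOf(value);
        if (type == boolean.class || type == Boolean.class) return Boolean.valueOf(value);
        throw new IllegalArgumentException("Unsupported parameter type " + type.getName());
    }

    public static void main(String[] args) {
        GeneralConfigSketch config = new GeneralConfigSketch();
        SftpBuilder sftp = new SftpBuilder();
        config.register("sftp", sftp);
        config.setConfigValue("sftp", "timeout", "5000");
        System.out.println(sftp.timeout);
    }
}
```

The point of the design is that the generic entry point stays string-based (convenient for xml and ant), while the type checks still happen in the provider's own builder.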
Re: [vfs] FileSystemManager construction rethought
I tried to replace the parsing of the xml file with commons-digester/beanutils but failed when it came to adding those dependencies to vfs. Other people vetoed it (too large jars).

That is a shame. But shouldn't we first come up with a configuration structure (data structure and configuration policy) for vfs, independent of where the actual data for this structure comes from? Since it takes a lot of data to configure a FileSystemManager, it is obvious that we need an xml file to hold this data. And the current xml structure is a good start.

Now the idea behind this construct was to have a typesafe configuration framework.

The configuration container could IMHO be not type safe. Because once you call configure() right after instantiation, the filesystem manager passes the provider specific configuration on to the providers, and all the providers configure themselves too. Any missing or erroneous stuff would be reported at that point. To me that would be a very natural point for letting the user know that something is wrong in his configuration file. The configuration data structure could also mention the providers and their data types, making it type safe. Anyway, right now it seems overly complex to me when the issue is quite simple. In the simplest case we would just need a naming convention and a meaning for the different configuration elements.

Example: the sftp provider accepts the following configuration elements

- sftp.known-hosts: a path (String) that must point to a valid known-hosts file
- sftp.private-key: a path (String) that must point to a valid private key file in XXX encoding
...

The same configuration data structure class could be used for overriding in the case of resolveFile().

One question that arose is why someone would want to write a different FileSystemManager implementation. Could the DefaultFileSystemManager be made so configurable that this would not be needed?

One reason could be not to use the xml file for configuration, but write a custom FileSystemManager which does the whole configuration in code.

Is this really reason enough for the added complexity?

- rami
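The "naming convention plus override" idea from this message could look roughly like the sketch below: provider defaults keyed as scheme.option, with per-call overrides winning when a file is resolved. The class is hypothetical and deliberately untyped; validation of the values (for example that sftp.known-hosts points to a real file) would be left to the provider when it configures itself.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical untyped configuration container using "scheme.option" keys.
final class LayeredConfig {
    private final Map<String, String> defaults = new HashMap<>();

    LayeredConfig set(String key, String value) {
        defaults.put(key, value);
        return this;
    }

    // Options for one provider, prefix stripped; overrides win over defaults.
    Map<String, String> forProvider(String scheme, Map<String, String> overrides) {
        String prefix = scheme + ".";
        Map<String, String> merged = new TreeMap<>();
        copyWithPrefix(defaults, prefix, merged);
        copyWithPrefix(overrides, prefix, merged); // applied last, so they replace defaults
        return merged;
    }

    private static void copyWithPrefix(Map<String, String> from, String prefix,
                                       Map<String, String> to) {
        for (Map.Entry<String, String> e : from.entrySet()) {
            if (e.getKey().startsWith(prefix)) {
                to.put(e.getKey().substring(prefix.length()), e.getValue());
            }
        }
    }

    public static void main(String[] args) {
        LayeredConfig config = new LayeredConfig()
                .set("sftp.known-hosts", "/etc/ssh/known_hosts")
                .set("sftp.private-key", "/home/me/.ssh/id_dsa");
        Map<String, String> adHoc = new HashMap<>();
        adHoc.put("sftp.known-hosts", "/tmp/known_hosts");
        System.out.println(config.forProvider("sftp", adHoc));
    }
}
```

The paths in main are placeholder values, not defaults of any real provider.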
Re: [vfs] FileSystemManager construction rethought
Here are some small changes that I came across when making the changes compilable.

I changed the configure method to init, because there was already a private configure(String) method in DefaultFileSystemManager, but the existing init method was without arguments. I also saw that there is quite a lot of configuration code in place (the GlobalConfiguration class and the configure method mentioned above, which parses the configuration from XML). But I did not see the overall idea, so you'll have to explain it to me, Mario. It all still seems somehow coupled with the DefaultFileSystemManager.

One question that arose is why someone would want to write a different FileSystemManager implementation. Could the DefaultFileSystemManager be made so configurable that this would not be needed?

Anyways, here is a compilable listing of the code I sent in the previous email.

- rami

* VfsTask.java *

    package org.apache.commons.vfs.ant;

    import org.apache.commons.vfs.*;
    import org.apache.tools.ant.*;

    /**
     * Super class of Vfs Ant tasks that takes care of FileSystemManager handling.
     */
    public class VfsTask extends Task {

        private FileSystemManager manager;
        private AntLogger logger;

        protected FileSystemManager getManager() {
            if (manager == null) setManager(null);
            return manager;
        }

        protected void setManager(FileSystemConfiguration conf) {
            if (conf == null) {
                conf = new FileSystemConfiguration(
                    "org.apache.commons.vfs.impl.StandardFileSystemManager"
                );
            }
            // put AntLogger always as logger
            conf.setParameter("logger", getLogger());
            this.manager = AntHelper.getManager(this, conf);
        }

        protected AntLogger getLogger() {
            if (logger == null) logger = new AntLogger(this);
            return logger;
        }

        /**
         * Resolves a URI to a file, relative to the project's base directory.
         *
         * @param uri The URI to resolve.
         */
        protected FileObject resolveFile(final String uri) {
            try {
                return getManager().resolveFile(getProject().getBaseDir(), uri);
            }
            catch (FileSystemException fse) {
                throw new BuildException(fse);
            }
        }
    }

* VfsDataType.java *

    package org.apache.commons.vfs.ant;

    import org.apache.commons.vfs.*;
    import org.apache.tools.ant.types.*;
    import org.apache.tools.ant.BuildException;

    /**
     * Super class of Vfs Ant data types that takes care of FileSystemManager
     * handling.
     */
    public class VfsDataType extends DataType {

        private FileSystemManager manager;
        private AntLogger logger;

        protected FileSystemManager getManager() {
            if (manager == null) setManager(null);
            return manager;
        }

        protected void setManager(FileSystemConfiguration conf) {
            if (conf == null) {
                conf = new FileSystemConfiguration(
                    "org.apache.commons.vfs.impl.StandardFileSystemManager"
                );
            }
            // put AntLogger always as logger
            conf.setParameter("logger", getLogger());
            this.manager = AntHelper.getManager(this, conf);
        }

        protected AntLogger getLogger() {
            if (logger == null) logger = new AntLogger(this);
            return logger;
        }

        /**
         * Resolves a URI to a file, relative to the project's base directory.
         *
         * @param uri The URI to resolve.
         */
        protected FileObject resolveFile(final String uri) {
            try {
                return getManager().resolveFile(getProject().getBaseDir(), uri);
            }
            catch (FileSystemException fse) {
                throw new BuildException(fse);
            }
        }
    }

* AntHelper.java *

    package org.apache.commons.vfs.ant;

    import org.apache.commons.vfs.*;
    import org.apache.tools.ant.*;
    import java.util.*;

    /**
     * Holds a map of FileSystemKey -> FileSystemManager.
     * When a Task or DataType asks for a FileSystemManager, AntHelper looks it up
     * in the map. If it does not exist, it is created, added to the map and a
     * BuildListener is added to the project so that when the project finishes it
     * closes the FileSystemManager and removes it from the map.
     */
    public class AntHelper {

        private static Map managers = new HashMap();

        public static FileSystemManager getManager(ProjectComponent projectComponent,
                                                   FileSystemConfiguration conf) {
            FileSystemKey key = new FileSystemKey(projectComponent.getProject(), conf);
            FileSystemManager manager = (FileSystemManager) managers.get(key);
            if (manager == null) {
                try {
                    manager = VFS.createManager(conf);
                }
                catch (FileSystemException fse) {
                    throw new BuildException(fse);
                }
                projectComponent.getProject().addBuildListener(new CloseListener(key));
[vfs] VfsTask
I have problems with VfsTask. First of all, I need to instantiate AntLogger in ant tasks, so AntLogger needs to be made public. But that is not enough, since I also need to instantiate it in DataTypes (FileSet in particular). resolveFile also needs to be accessible from the FileSet DataType. Therefore I would probably need a VfsDataType that would be very similar to VfsTask.

Which led me to wondering about FileSystemManager. Currently ant tasks instantiate the manager when they invoke the resolveFile method, and when the build completes the manager is closed. When ant tasks call components (like toolbox components) that are not related to ant, those have to open their own manager. So I was wondering: why not just use the VFS class to get the manager in ant tasks too? And make AntLogger a public and independent class like this:

    import org.apache.commons.logging.Log;
    import org.apache.tools.ant.Project;
    import org.apache.tools.ant.ProjectComponent;

    /**
     * A commons-logging wrapper for Ant logging.
     */
    public class AntLogger implements Log {

        private ProjectComponent pc;

        public AntLogger(ProjectComponent pc) {
            this.pc = pc;
        }

        public void debug(final Object o) { pc.log(String.valueOf(o), Project.MSG_DEBUG); }
        public void debug(Object o, Throwable throwable) { debug(o); }

        public void error(Object o) { pc.log(String.valueOf(o), Project.MSG_ERR); }
        public void error(Object o, Throwable throwable) { error(o); }

        public void fatal(Object o) { pc.log(String.valueOf(o), Project.MSG_ERR); }
        public void fatal(Object o, Throwable throwable) { fatal(o); }

        public void info(Object o) { pc.log(String.valueOf(o), Project.MSG_INFO); }
        public void info(Object o, Throwable throwable) { info(o); }

        // Trace messages are dropped.
        public void trace(Object o) { }
        public void trace(Object o, Throwable throwable) { }

        public void warn(Object o) { pc.log(String.valueOf(o), Project.MSG_WARN); }
        public void warn(Object o, Throwable throwable) { warn(o); }

        public boolean isDebugEnabled() { return true; }
        public boolean isErrorEnabled() { return true; }
        public boolean isFatalEnabled() { return true; }
        public boolean isInfoEnabled() { return true; }
        public boolean isTraceEnabled() { return false; }
        public boolean isWarnEnabled() { return true; }
    }

Then I can instantiate AntLogger from DataTypes as well. And I can resolve files from DataTypes by getting a manager.

- rami
Re: [vfs] VfsTask
What happens if a FileSystemManager is not closed and the JVM exits? In other words, what does the close method do? And is it a problem if there are many FileSystemManagers? This can happen when managers are instantiated outside of the VFS class. Can there be any problems if an Ant task uses one manager instance and some other class used by that task uses another manager instance?

- rami
Re: [vfs] VfsTask
Which of your toolbox components need the filesystemmanager? I looked into them, but they only use the passed fileobjects.

Yeah, that was more like a "what if" question :)

I like the idea of the current ant task implementation; the tasks instantiate their own manager in the context of the current ant project.

ok

I am not sure if we should use the static way (VFS.getManager()), as this might have bad side effects when using ant embedded.

ok

If the user uses VFS.getManager() too and we close this instance after ant is done, we close the user's manager (and all open files) too (it is the same manager instance). Also, we cannot simply use setLogger() on this instance; there might already be another logger set by the user on this instance (in the embedded case).

Yep. However, I need to get an AntLogger in FileSet and I need to resolve files there. So there are 2 options:

- duplicating VfsTask as VfsDataType, or
- separating AntLogger as shown and changing the resolveFile method to resolveFile(Project p, String uri)

Might it be possible to implement a special ant task - say

    <vfs-manager id="vfs" class="StandardFileSystemManager" />

This allows us to add some configuration options later (e.g. proxy settings, ssh settings) using this task, and to refer to this instance within the other tasks and maybe from within a DataType (FileSet). Maybe a construct like

    <vfs-manager id="vfs" class="global" />

could be possible, which would use VFS.getManager(). And when using ant embedded, the developer could inject the manager instance. Not sure if all this could work, as i haven't written an ant task till now. Do you think this could be an option?

I don't fully understand, but I guess you want a way to configure the manager in ant for different tasks. This could be a DataType, e.g.

    <vfs-copy toDir="path">
        <vfs-manager proxy="foo" replicator="bar"/>
        <vfs-fileset dir="path">
            <include name="*.b?t"/>
        </vfs-fileset>
    </vfs-copy>

And then a task could support the datatype and use a file manager with the given configuration.

- rami
Re: [vfs] VfsTask
    <vfs-manager id="default" class="StandardFileSystemManager">
        <vfs-option but this is to specify and implement later ../>
    </vfs-manager>
    <vfs-manager id="special" class="MyHyperVFSFileSystemManager" />

    <vfs-copy manager="default" toDir="path">
        <vfs-fileset dir="path">
            <include name="*.b?t"/>
        </vfs-fileset>
    </vfs-copy>

    <vfs-copy manager="special" toDir="path">
        <vfs-fileset dir="pathXX">
            <include name="*.b?t"/>
        </vfs-fileset>
    </vfs-copy>

Yes, looks very good. Here vfs-manager is a data type that registers a reference to itself with the project:

    project.addReference(String name, Object value);

Then, when vfs-copy sees the manager attribute, it asks the project for the object that is registered under the given refid and expects it to be a vfs-manager datatype. The same task can also accept that type nested inside, in which case the type definition is local to the task.

- rami
RE: [vfs][all][poll]regular expression library or jdk1.4 as minimum requirement
Again, you use a common syntax, which is translated to the appropriate syntax for the implementation. If some specific feature of the common syntax is not supported in the implementation, then your RE will fail to translate, and should throw some sort of exception. Doesn't seem too difficult to deal with IMHO.

What is the common syntax? Could someone point me to the documentation? And what is this talk about translating? I thought there are only separate syntaxes (= dialects) and corresponding interpreters (= implementations).

- rami
Re: [vfs][all][poll]regular expression library or jdk1.4 as minimum requirement
Since the whole point of the VFS discussion appears to be to support users who aren't using J2SE 1.4, all you have to do is use the syntax subset shared by Perl5 and java.util.regex, which is rather rich and useful. Anyway, that's my take given my understanding of what's being discussed.

Ok. I am using \Q \E for quoting segments that should not be treated as regexes. Then I use the construct

    ((?!^)|[^/])\\Q**\\E((?!$)|[^/])

when looking for illegal patterns. This means you can only have 2 stars in a row if they are delimited by

- start of pattern OR slash, AND
- end of pattern OR slash

Do these things work in all the different regex flavors? And if not, how can I do it otherwise?

- rami
Re: [vfs][all][poll]regular expression library or jdk1.4 as minimum requirement
Hi,

The problem with having a generic interface for different regex implementations is that the syntax and semantics of the regexes differ. I want to know EXACTLY what my regexes match and which constructs/syntax I can use. The implementations are not only implementations; they also define the form and meaning of the regexes they accept. Even though most of the constructs are the same, there are differences. Many packages let themselves be configured to understand different dialects, which proves my point.

Let's call the set of allowed regexes the regex language. The best candidate for the regex language is, in my humble opinion, the one defined in jdk 1.4. What would be needed is a separate package that implements the jdk 1.4 regex language and can be used together with older jdk's. Could ORO do this? In any case, one can never avoid a dependency on the regex language one uses.

[X] Don't bother and use jdk1.4 as the minimum requirement OR write a separate package that works exactly like jdk 1.4 on earlier jdk's

- Rami Ojares
Do attachments go through the mailing lists?
Hi, Trying to send some contributions with zip/jar attachments but they don't seem to go through. Thus I'm testing. - rami
Re: Do attachments go through the mailing lists?
Hi, Trying to send some contributions with zip/jar attachments but they don't seem to go through. Thus I'm testing.

Answering myself here: they don't. So how can I send a posting with a jar/zip? I have heard that these kinds of things should be done through bugzilla, but I did not see any upload possibility there. So how could I do it? I am sending an addition of 35 files for evaluation, and it would be convenient to have it in one nice small package.

- rami
[vfs] FileObject.delete
FileObject.delete() calls FileObject.delete(Selectors.SELECT_SELF), so all the juice is in delete(FileSelector). There we find this section:

SNIP
    // If the file is a folder, make sure all its children have been deleted
    if (file.type == FileType.FOLDER && file.getChildren().length != 0)
    {
        // TODO - fail??
        // Skip
        continue;
    }
SNAP

So the call to file.getChildren().length != 0 is already made during delete for every folder. A call to the provider is made only after ensuring that the folder does not have any children. The best way would be to move the children check into the deleteSelf() method, where the check for IMAGINARY is done. Then delete(FileSelector) could return the number of files it deleted, and delete() could return whatever it gets from delete(FileSelector): 1 if the file was deleted, otherwise 0.

-- Rami

I will have a look at it, but i think i will implement it that way.

Since I don't know the implementations of the providers (yet), I can't say what would be the most efficient strategy, but one idea could be that delete returns a boolean that tells whether something was deleted or not.
Re: [vfs] FileObject.delete
This will be done AFTER the children are removed, just to check if the folder is really empty before deleting it. I think the intention was to check whether any other process created a file within the folder structure, and thus to avoid an error message during recursive delete.

Not really. Think if you have the structure

dir1
    dir1.1
        a.txt
    dir1.2
        b.txt

Then calling [FileObject:dir1].delete() would not delete its children, because that is not the intention of this invocation. Further, if you had a FilePattern selector **/*/b.txt, then dir1.1 would not be empty when trying to delete it.

Then delete(FileSelector) could return the number of files it deleted.

I already implemented a simple boolean, if you think this is not enough, please tell me! I have had no time to check in my changes as I have to write a test case too, but you can expect it soon.

I think it is common (for example in JDBC) to return a so-called delete count. I don't need this change; it is more of a suggestion for improvement. And I don't think there is any hurry :) I can also send you a patch if that would help you. But maybe it is the writing of the test case that seems cumbersome. I tried to run the test cases but it did not seem to work. To me it seemed like build.xml was not running any of the tests. And when I changed it I got some errors and was discouraged from writing test cases. So what is needed is some kind of instructions on how to write and successfully run JUnit tests. -- Rami
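To make the delete-count idea concrete without depending on VFS internals, here is a rough sketch that rebuilds the example tree with plain java.nio.file and counts what a glob-selected delete removes. It assumes dir1.1 and dir1.2 are siblings under dir1; the class GlobDeleteCount and its methods are invented for this illustration, and java.nio globs are only an analogy to a VFS FileSelector.

```java
import java.io.IOException;
import java.nio.file.DirectoryNotEmptyException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.PathMatcher;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper; uses java.nio.file as a stand-in for VFS.
final class GlobDeleteCount {

    /** Deletes every path under root that matches the glob; returns how many were removed. */
    static int deleteMatching(Path root, String glob) throws IOException {
        PathMatcher matcher = root.getFileSystem().getPathMatcher("glob:" + glob);
        List<Path> candidates;
        try (Stream<Path> walk = Files.walk(root)) {
            // Reverse order puts children before their parent folders.
            candidates = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
        }
        int deleted = 0;
        for (Path p : candidates) {
            if (!matcher.matches(p)) {
                continue;
            }
            try {
                if (Files.deleteIfExists(p)) {
                    deleted++;
                }
            } catch (DirectoryNotEmptyException e) {
                // A selected folder that still has children is skipped, not counted.
            }
        }
        return deleted;
    }

    /** Builds dir1/dir1.1/a.txt and dir1/dir1.2/b.txt in a fresh temp directory. */
    static Path buildTree() throws IOException {
        Path root = Files.createTempDirectory("vfs-sketch");
        Path dir1 = root.resolve("dir1");
        Files.createDirectories(dir1.resolve("dir1.1"));
        Files.createDirectories(dir1.resolve("dir1.2"));
        Files.createFile(dir1.resolve("dir1.1").resolve("a.txt"));
        Files.createFile(dir1.resolve("dir1.2").resolve("b.txt"));
        return root;
    }

    public static void main(String[] args) throws IOException {
        Path root = buildTree();
        System.out.println("deleted: " + deleteMatching(root, "**/b.txt"));
    }
}
```

A count distinguishes "the selector matched nothing" from "the selector matched but the folders were not empty" only partly; a caller who needs that distinction would still have to inspect the tree afterwards.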
[vfs] Patch: small documentation add
Hi, I have been playing around with vfs and it seems really nice! Here is a small addition to the javadoc:

===
retrieving revision 1.25
diff -B -b -u -r1.25 FileObject.java
@@ -189,12 +189,12 @@
     FileObject[] findFiles(FileSelector selector) throws FileSystemException;

     /**
-     * Deletes this file. Does nothing if this file does not exist. Does
-     * not delete any descendents of this file, use {@link #delete(FileSelector)}
-     * for that.
+     * Deletes this file. Does nothing if this file does not exist or if it is
+     * a folder that has children. Does not delete any descendents of this
+     * file, use {@link #delete(FileSelector)} for that.
      *
-     * @throws FileSystemException If this file is a non-empty folder, or if this file is read-only,
-     * or on error deleteing this file.
+     * @throws FileSystemException If this file is a non-empty folder, or if
+     * this file is read-only, or on error deleteing this file.
      */
     void delete() throws FileSystemException;

I don't know if this is intended, but at least that's how it works on the local filesystem. - Rami Ojares
Re: [vfs] FileObject.delete
If I delete a FileObject I don't get any information about whether that FileObject was actually deleted or not. If the file is a FOLDER with children, or IMAGINARY, nothing is deleted, but there is no way to know that. Actually you can query the file type and know for sure in the case of IMAGINARY, but in the case of FOLDER the only way is to call fileObject.getChildren().length == 0. And I guess this can be an expensive operation if you just want to know whether the delete did something or not. Since I don't know the implementations of the providers (yet) I can't say what would be the most efficient strategy, but one idea could be that delete would return a boolean that tells whether something was deleted or not. - Rami Ojares
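For illustration, the caller-side workaround described above might be sketched like this. The FileLike interface and MemFile class are hypothetical stand-ins, not VFS types; they only mimic a delete() that returns nothing and silently skips IMAGINARY files and non-empty folders.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins; not part of VFS.
final class DeleteProbeSketch {
    enum Type { FILE, FOLDER, IMAGINARY }

    /** Only the operations this sketch needs. */
    interface FileLike {
        Type getType();
        List<FileLike> getChildren();
        void delete(); // returns nothing, like the delete() under discussion
    }

    /** Workaround: predict whether delete() will do anything, then call it. */
    static boolean deleteAndReport(FileLike file) {
        if (file.getType() == Type.IMAGINARY) {
            return false;
        }
        // Listing children may be costly on a remote provider.
        if (file.getType() == Type.FOLDER && !file.getChildren().isEmpty()) {
            return false;
        }
        file.delete();
        return true;
    }

    /** Tiny in-memory implementation, only for trying the helper out. */
    static final class MemFile implements FileLike {
        private Type type;
        private final List<FileLike> children = new ArrayList<>();

        MemFile(Type type) {
            this.type = type;
        }

        MemFile add(FileLike child) {
            children.add(child);
            return this;
        }

        public Type getType() {
            return type;
        }

        public List<FileLike> getChildren() {
            return children;
        }

        public void delete() {
            // Silently does nothing for imaginary files and non-empty folders.
            if (type == Type.IMAGINARY) {
                return;
            }
            if (type == Type.FOLDER && !children.isEmpty()) {
                return;
            }
            type = Type.IMAGINARY;
        }
    }

    public static void main(String[] args) {
        MemFile folder = new MemFile(Type.FOLDER).add(new MemFile(Type.FILE));
        System.out.println("non-empty folder deleted? " + deleteAndReport(folder));
    }
}
```

Besides the extra listing, the check and the delete are two separate steps, so another process could change the folder in between; a return value from delete itself would avoid both problems.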