Another RFC document of interest is http://www.rfc-editor.org/rfc/rfc2640.txt. At first I thought this document might be heading down the path I was looking at, but alas, it approaches i18n only from the point of view of the character sets used; it has nothing to do with date formats. In any case, its recommendations have not been widely adopted.
-----Original Message-----
From: Steve Cohen
Sent: Sun 3/9/2003 2:18 PM
To: Jakarta Commons Developers List; Jakarta Commons Developers List
Cc:
Subject: RE: [NET] Here's an Ant bug that we should look into fixing

I had thought I might hear some replies to this. The silence has been deafening.

I have been thinking about the issue, though, in particular where commons-net.ftp might have to go in order to really implement the ambitious spec laid out for it by clients such as ant, which have chosen to use it. Of particular note here is the "depends" (or synonym "newer") attribute of the ant <ftp> task. This runs aground on the issue of parsing the date.

In the first place, there are the issues of general listing format (unix, NT, VMS, etc.). In the second place, though, within these categories are issues of date format. This devolves into a thicket of locale-type issues: Does month come before date? In which language are the names of the months coded? To solve this, the scope of parser definition needs to be significantly expanded.

Things might be better if there was any mechanism within the FTP specification for the server to expose its format to a client. No such mechanism exists, however. In fact RFC959, the FTP spec is intentionally vague on this point:

"Since the information on a file may vary widely from system to system, this information may be hard to use automatically in a program, but may be quite useful to a human user."
http://www.ietf.org/rfc/rfc959.txt

In other words, FTP was never meant to be used in such an automated fashion. Nonetheless, with the specification of parameters easily passed in by something like an ant task, it might be possible to define a parser sufficiently to perform this task. These parameters include:

1) os type of FTP server (unix, NT, OS2, VMS, etc.)
2) date format - to define ordering of date components - "MMM dd" or "dd MMM", etc. as in simple date format
3) locale - to define actual abbreviations of the months.
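To make the three parameters above a little more concrete, here is a minimal sketch of how they might be carried around and turned into a date parser. The class ListingDateConfig and its methods are hypothetical names invented for illustration; nothing like them is claimed to exist in commons-net. Only java.text.SimpleDateFormat, java.text.DateFormatSymbols and java.util.Locale are real JDK classes.

```java
import java.text.DateFormatSymbols;
import java.text.ParsePosition;
import java.text.SimpleDateFormat;
import java.util.Arrays;
import java.util.Date;
import java.util.Locale;

// Hypothetical holder for the three parameters listed above.
// These names are illustrative only and are not part of commons-net.
final class ListingDateConfig {
    final String osType;      // 1) e.g. "unix", "NT", "VMS"
    final String datePattern; // 2) SimpleDateFormat pattern, e.g. "MMM dd HH:mm"
    final Locale locale;      // 3) supplies the month names and abbreviations

    ListingDateConfig(String osType, String datePattern, Locale locale) {
        this.osType = osType;
        this.datePattern = datePattern;
        this.locale = locale;
    }

    // Builds a locale-specific formatter from parameters 2 and 3.
    SimpleDateFormat newDateFormat() {
        return new SimpleDateFormat(datePattern, locale);
    }

    // Parses the date field of one listing entry.
    // Returns null when the field does not match the pattern and locale.
    Date parseDate(String dateField) {
        ParsePosition pos = new ParsePosition(0);
        Date parsed = newDateFormat().parse(dateField, pos);
        // Require the whole field to be consumed, not just a prefix.
        boolean complete = parsed != null && pos.getIndex() == dateField.length();
        return complete ? parsed : null;
    }

    // Month abbreviations the JDK knows for this locale. A real server
    // may well emit something different, so these need checking.
    String[] shortMonths() {
        String[] months = new DateFormatSymbols(locale).getShortMonths();
        return Arrays.copyOf(months, 12); // drop the trailing empty entry
    }

    public static void main(String[] args) {
        ListingDateConfig english =
                new ListingDateConfig("unix", "MMM dd HH:mm", Locale.ENGLISH);
        System.out.println(english.parseDate("Mar 09 14:18"));

        ListingDateConfig french =
                new ListingDateConfig("unix", "dd MMM HH:mm", Locale.FRENCH);
        System.out.println(Arrays.toString(french.shortMonths()));
    }
}
```

Two caveats on this sketch: a pattern without a year field leaves the year at SimpleDateFormat's default of 1970, so a caller would still have to supply the year; and the abbreviations printed by shortMonths() depend on the locale data shipped with the JDK in use.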
From 2 and 3 it is possible to build a Locale-specific SimpleDateFormat capable of parsing dates on a particular system. This object contains the names and abbreviations of the months.

This immediately raises the question of how to divvy up the parsing duties between the regular expression and the SimpleDateFormat. It seems as if the format string must be used to construct the date part of the regex with its components in the correct order. Then the SimpleDateFormat would be used to actually parse the date. All "optimizations" such as assuming a constant character width of 3 for month abbreviations are out the window here - they work for many languages, but not for all. French, for example, uses periods and varying lengths.

A cautionary note: one would have to inspect actual ftp sites to determine whether they actually use the abbreviations specified in Java Locales.

Comments? Is this a Pandora's box that we don't want to open?

-----Original Message-----
From: Steve Cohen
Sent: Wed 3/5/2003 1:53 PM
To: Jakarta Commons Developers List
Cc:
Subject: [NET] Here's an Ant bug that we should look into fixing

The <ftp> task of ant doesn't work right because we don't parse non-English date formats.
http://nagoya.apache.org/bugzilla/show_bug.cgi?id=14333

-----------------------------------------------
Steve Cohen
Sr. Software Engineer
Sportvision Inc.
[EMAIL PROTECTED]
http://www.sportvision.com

Please note: As a result of the merger of Ignite Sports and Sportvision, my email address has changed to [EMAIL PROTECTED]