Another RFC document of interest is http://www.rfc-editor.org/rfc/rfc2640.txt.
I at first thought that this document might be heading down the paths I was looking 
at, but alas, it approaches i18n only from the point of view of the character sets 
used; it has nothing to do with date formats.  In any case, its recommendations have 
not been widely adopted.


-----Original Message-----
From:   Steve Cohen
Sent:   Sun 3/9/2003 2:18 PM
To:     Jakarta Commons Developers List; Jakarta Commons Developers List
Cc:     
Subject:        RE: [NET] Here's an Ant bug that we should look into fixing

I had thought I might hear some replies to this.  The silence has been deafening.  

I have been thinking about the issue, though, in particular where commons-net.ftp 
might have to go in order to really implement the ambitious spec laid out for it by 
clients such as ant, which have chosen to use it.

Of particular note here is the "depends" (or synonym "newer") attribute of the ant 
<ftp> task.  This runs aground on the issue of parsing the date.  In the first place, 
there are the issues of general listing format (unix, NT, VMS, etc.).  In the second 
place, though, within these categories are issues of date format.  This devolves into 
a thicket of locale-type issues:

Does the month come before the day?  
In which language are the names of the months coded?

To solve this, the scope of parser definition needs to be significantly expanded.

Things might be better if there were any mechanism within the FTP specification for the 
server to expose its format to a client.  No such mechanism exists, however.  In fact, 
RFC 959, the FTP spec, is intentionally vague on this point: 

"Since the information on a file may vary widely from system
to system, this information may be hard to use automatically
in a program, but may be quite useful to a human user."

http://www.ietf.org/rfc/rfc959.txt

In other words, FTP was never meant to be used in such an automated fashion.

Nonetheless, with the specification of parameters easily passed in by something like 
an ant task, it might be possible to define a parser sufficiently to perform this 
task.  These parameters include:

1) OS type of FTP server (unix, NT, OS2, VMS, etc.)
2) date format - to define ordering of date components - 
"MMM dd" or "dd MMM", etc. as in simple date format
3) locale - to define actual abbreviations of the months.

From 2 and 3 it is possible to build a Locale-specific SimpleDateFormat
capable of parsing dates on a particular system.  This object contains the names and 
abbreviations of the months.
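
To make the idea concrete, here is a minimal sketch of building such an object 
from parameters 2 and 3.  The class name ServerDateFormats and the helpers 
forServer and monthAbbreviations are hypothetical names made up for this 
illustration; they are not part of commons-net or ant.  Only standard java.text 
and java.util calls are used.

```java
import java.text.DateFormatSymbols;
import java.text.ParsePosition;
import java.text.SimpleDateFormat;
import java.util.Arrays;
import java.util.Date;
import java.util.Locale;

// Hypothetical helper class, for illustration only.
final class ServerDateFormats {

    // Combines parameter 2 (the date format pattern) with parameter 3 (the locale).
    static SimpleDateFormat forServer(String pattern, Locale locale) {
        SimpleDateFormat fmt = new SimpleDateFormat(pattern, locale);
        // Reject out-of-range fields (e.g. day 32) instead of rolling them over.
        fmt.setLenient(false);
        return fmt;
    }

    // The month abbreviations the JDK associates with a locale.
    // getShortMonths() returns 13 slots; the last is unused for Gregorian calendars.
    static String[] monthAbbreviations(Locale locale) {
        return Arrays.copyOf(new DateFormatSymbols(locale).getShortMonths(), 12);
    }

    public static void main(String[] args) {
        SimpleDateFormat english = forServer("MMM dd yyyy", Locale.ENGLISH);
        // parse(String, ParsePosition) returns null on failure rather than throwing.
        Date parsed = english.parse("Mar 09 2003", new ParsePosition(0));
        System.out.println(parsed);

        // The exact strings printed here depend on the JDK's locale data.
        System.out.println(Arrays.toString(monthAbbreviations(Locale.FRENCH)));
    }
}
```

Whether the abbreviations reported by the JDK match what a given FTP server 
actually emits is a separate question that this sketch does not answer.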

This immediately raises the question of how to divvy up the parsing duties between the 
regular expression and the SimpleDateFormat.   It seems as if the format string must 
be used to construct the part of the regex in the correct order.  Then the 
SimpleDateFormat would be used to actually parse the date.  All "optimizations" such 
as assuming a constant character width of 3 for month abbreviations are out the window 
here - they work for many languages, but not for all.  French, for example, uses 
periods and varying lengths.
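
One possible division of labor, sketched below under several assumptions: the 
format string is translated token by token into a regex fragment that locates 
the date within a listing line, and the SimpleDateFormat then does the actual 
parsing.  The class ListingDateParser and its methods are hypothetical names, 
not existing commons-net API.  The sketch handles only the pattern letters M, 
d, y, H and m, does not support quoted literals, and takes the first match in 
the line, so a file name that happens to look like a date would confuse it.  
Month names are taken from the locale and quoted, so periods and varying 
lengths are not a problem.

```java
import java.text.DateFormatSymbols;
import java.text.ParsePosition;
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.Date;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: the regex finds the date, SimpleDateFormat parses it.
final class ListingDateParser {
    private final Pattern dateRegex;
    private final SimpleDateFormat format;

    ListingDateParser(String datePattern, Locale locale) {
        // Collapse whitespace runs so the pattern lines up with the normalized match.
        String normalized = datePattern.trim().replaceAll("\\s+", " ");
        this.dateRegex = Pattern.compile(toRegex(normalized, locale),
                Pattern.CASE_INSENSITIVE | Pattern.UNICODE_CASE);
        this.format = new SimpleDateFormat(normalized, locale);
        this.format.setLenient(false);
    }

    // Translates each run of identical pattern characters into a regex fragment.
    static String toRegex(String datePattern, Locale locale) {
        DateFormatSymbols symbols = new DateFormatSymbols(locale);
        StringBuilder regex = new StringBuilder();
        int i = 0;
        while (i < datePattern.length()) {
            char c = datePattern.charAt(i);
            int run = 1;
            while (i + run < datePattern.length() && datePattern.charAt(i + run) == c) {
                run++;
            }
            if (c == 'M' && run >= 3) {
                // Textual month: no fixed width is assumed.
                regex.append(alternation(run >= 4 ? symbols.getMonths() : symbols.getShortMonths()));
            } else if (c == 'M' || c == 'd' || c == 'H') {
                regex.append("\\d{1,2}");
            } else if (c == 'm') {
                regex.append("\\d{2}");
            } else if (c == 'y') {
                regex.append(run == 2 ? "\\d{2}" : "\\d{4}");
            } else if (Character.isWhitespace(c)) {
                regex.append("\\s+"); // listings pad columns with extra spaces
            } else if (Character.isLetter(c) || c == '\'') {
                throw new IllegalArgumentException("Unsupported pattern character: " + c);
            } else {
                regex.append(Pattern.quote(datePattern.substring(i, i + run)));
            }
            i += run;
        }
        return regex.toString();
    }

    // Builds (?:name1|name2|...) with the longest names first, each quoted literally.
    private static String alternation(String[] names) {
        List<String> kept = new ArrayList<>();
        for (String name : names) {
            if (name != null && !name.isEmpty()) {
                kept.add(name);
            }
        }
        kept.sort((a, b) -> Integer.compare(b.length(), a.length()));
        StringBuilder sb = new StringBuilder("(?:");
        for (int k = 0; k < kept.size(); k++) {
            if (k > 0) {
                sb.append('|');
            }
            sb.append(Pattern.quote(kept.get(k)));
        }
        return sb.append(')').toString();
    }

    // Returns the parsed date, or null if no date is found or it does not parse.
    Date parse(String listingLine) {
        Matcher m = dateRegex.matcher(listingLine);
        if (!m.find()) {
            return null;
        }
        String text = m.group().replaceAll("\\s+", " ");
        ParsePosition pos = new ParsePosition(0);
        Date d = format.parse(text, pos);
        return (d != null && pos.getIndex() == text.length()) ? d : null;
    }
}
```

A made-up unix-style line such as 
"-rw-r--r--   1 steve  users   1024 Mar  9 2003 build.xml" could then be 
handled with new ListingDateParser("MMM d yyyy", Locale.ENGLISH).parse(line).  
The time zone used is the JVM default, which is one more thing the server does 
not tell us.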

A cautionary note: one would have to inspect actual ftp sites to determine whether 
they actually use the abbreviations specified in java Locales.

Comments?  Is this a Pandora's box that we don't want to open?






-----Original Message-----
From:   Steve Cohen
Sent:   Wed 3/5/2003 1:53 PM
To:     Jakarta Commons Developers List
Cc:     
Subject:        [NET] Here's an Ant bug that we should look into fixing

The <ftp> task of ant doesn't work right because we don't parse
non-English date formats.

http://nagoya.apache.org/bugzilla/show_bug.cgi?id=14333
-----------------------------------------------
Steve Cohen
Sr. Software Engineer
Sportvision Inc.
[EMAIL PROTECTED]
http://www.sportvision.com

Please note: As a result of the merger of 
Ignite Sports and Sportvision, my email address 
has changed to [EMAIL PROTECTED]
