Craig A. Berry wrote:
At 9:15 AM -0500 8/29/07, John E. Malmberg wrote:

I have been working on trying to get the CPANPLUS tests to pass on VMS.

Right now 04_cpanplus-Module is failing for at least the following reasons:

rmtree() is broken on VMS as it cannot handle a file named '.;'.
This was noticed because a bug somewhere else is creating that file.

Is the problem specifically with rmtree(), or is the problem with
File::Find or readdir() not reporting it properly?

It is a problem with rmtree. If readdir returns the file ".", rmtree appends it to the UNIX format of the current working directory, and then recurses into itself to try to delete the contents. It will recurse until some resource runs out.
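
As a rough illustration of the recursion described above, here is a minimal sketch of a recursive delete that guards against it. This is not the File::Path code; the helper name remove_tree_sketch is made up for the example. File::Spec->no_upwards is the existing call that drops "." and ".." from a readdir() list.

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempdir);

# Hypothetical helper, not File::Path's implementation: delete a directory
# tree without ever descending into the "." or ".." entries.
sub remove_tree_sketch {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "Cannot open $dir: $!";
    # Without no_upwards(), a "." entry would be joined to $dir and the
    # function would recurse into the same directory until a resource ran out.
    my @entries = File::Spec->no_upwards(readdir $dh);
    closedir $dh;

    for my $entry (@entries) {
        my $path = File::Spec->catfile($dir, $entry);
        if (-d $path && !-l $path) {
            remove_tree_sketch($path);      # real subdirectory: recurse
        } else {
            unlink $path or die "Cannot unlink $path: $!";
        }
    }
    rmdir $dir or die "Cannot rmdir $dir: $!";
    return 1;
}

# Usage: build a small tree in a temporary directory, then remove it.
my $root = tempdir(CLEANUP => 1);
my $top  = File::Spec->catdir($root, 'top');
my $sub  = File::Spec->catdir($top, 'sub');
mkdir $top or die "mkdir $top: $!";
mkdir $sub or die "mkdir $sub: $!";
open(my $fh, '>', File::Spec->catfile($sub, 'file.txt')) or die "open: $!";
close $fh;

remove_tree_sketch($top);
print((-e $top) ? "still there\n" : "removed\n");
```

The sketch only shows the guard. It does not address the underlying point that the paths should be handled in VMS format on VMS.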

I have a band-aid for it. The real fix is that it needs to be working with VMS format directory and filenames on VMS. That is beyond what I am trying to do for this pass.

The next set of failures are a result of several problems:

The File::Spec routines are basically broken on VMS when given UNIX
pathnames as input.

One of the problems with them is that several of them are
unexpectedly returning VMS format files, even though the documentation
implies that they should not be. Unfortunately fixing that is a problem
as it appears MakeMaker and other modules are dependent on this behavior,
usually in a VMS specific section.

Well, "unexpectedly" may need a bit of qualification here.  On the
face of it, returning a valid native spec should always be kosher,

The general expected rule is that the output format should be UNIX if the input is UNIX, and VMS if the input is VMS.

This is also stated somewhere in the pod. I would have to search to find it.

However this does not cover the case where input specifications are either ambiguous, or some are in UNIX and some are in VMS.
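
To make the rule and its gaps concrete, here is a small sketch. It assumes a deliberately crude delimiter-based heuristic; the function names guess_syntax and output_syntax are made up for the example and are not part of File::Spec. ODS-5 names with escaped characters would need more care than this.

```perl
use strict;
use warnings;

# Hypothetical heuristic, not part of File::Spec: guess the syntax of a
# file specification from the delimiters it contains.
sub guess_syntax {
    my ($spec) = @_;
    my $has_unix = $spec =~ m{/};            # "/" separates UNIX path parts
    my $has_vms  = $spec =~ m{[\[\]<>:;]};   # device, directory, version marks
    return 'mixed'     if $has_unix && $has_vms;
    return 'unix'      if $has_unix;
    return 'vms'       if $has_vms;
    return 'ambiguous';                      # e.g. "foo.txt" is valid in both
}

# Apply the rule: the output syntax follows the input syntax when the
# inputs agree, and is left undecided otherwise.
sub output_syntax {
    my @kinds = map { guess_syntax($_) } @_;
    my %seen  = map { $_ => 1 } grep { $_ ne 'ambiguous' } @kinds;
    return 'unix' if keys(%seen) == 1 && $seen{unix};
    return 'vms'  if keys(%seen) == 1 && $seen{vms};
    return 'undecided';    # all ambiguous, or UNIX and VMS inputs together
}

print output_syntax('/usr/lib', 'perl5'), "\n";           # unix
print output_syntax('SYS$DISK:[foo]', 'bar.txt'), "\n";   # vms
print output_syntax('foo', 'bar'), "\n";                  # undecided
print output_syntax('[.foo]', 'bar/baz'), "\n";           # undecided
```

The last two calls are the cases the rule does not cover: all-ambiguous input and input that mixes the two syntaxes.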

and, as you note, some things -- probably a lot of things -- depend
on that.  I don't know what documentation you are referring to, but I
can assure you that the expected behavior of File::Spec->catdir('foo',
'bar') is "foo/bar" on UNIX, "foo\bar" on Windows, and "[.foo.bar]"
on VMS.  What would be unexpected would be if you got some foreign
format.
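
The UNIX and Windows results quoted above can be checked on any platform by calling the platform classes directly. This sketch sticks to File::Spec::Unix and File::Spec::Win32; the VMS class is left out because it may not load away from VMS.

```perl
use strict;
use warnings;
use File::Spec::Unix;
use File::Spec::Win32;

# Call each platform class explicitly instead of letting File::Spec
# pick one based on $^O.
my $unix  = File::Spec::Unix->catdir('foo', 'bar');
my $win32 = File::Spec::Win32->catdir('foo', 'bar');

print "Unix:  $unix\n";    # foo/bar
print "Win32: $win32\n";   # foo\bar
```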

And there are a number of perl scripts that depend on UNIX input resulting in UNIX output.

Leaving it as is though is also a problem because unless the code
calling the File::Spec routines is special cased on VMS it will
typically break the perl script. So while this is fine for modules
that have previously been debugged for VMS, it still is basically
broken behavior.

Again, it just doesn't pass the common sense test that returning a
valid native spec is "basically broken behavior."

If someone is specifically calling the OS-specific method, then they could expect to get the OS-specific result, following the general rules usually used by the routines on that OS. And with the VMS C library, the general rule is also to return results in the same syntax as the input. Having the VMS-specific Perl routines behave differently would be unexpected.

If someone is calling the generic method with a UNIX format name, then it is likely to break if the filename is converted to a native format.

From what I have seen, in all but a few cases, the automatic conversion of UNIX input to native has just resulted in a lot more VMS-specific code, typically to convert the output back to UNIX.

It was also noticed on a previous thread about Archive::Extract
that on Microsoft Windows, one of the File::Spec routines was
also returning an incorrect result by unexpectedly translating one
of the returned components into a native path when given a UNIX input.
They are also returning incorrect results on VMS, which results in
another set of special case conditions, the very thing that they are
supposed to be preventing.

It is not easy to pick apart and reassemble filespecs in a completely
cross-platform way.  A lot of scripts and modules don't even try.
Some try very hard and still don't get it quite right, often because
the authors don't have the resources to test everywhere.  Some of the
special casing may not be needed anymore, but fixing it involves
rewriting stable -- and often difficult -- code without breaking
anything on multiple platforms.  Sometimes special casing works
around bugs that have been fixed, but the module in question supports
older versions of Perl or other modules.

Again, this was unambiguously a UNIX pathname, and it is blead that is returning the unexpected result. It returns some of the components in UNIX format and some in the common Windows format. Logically, it should return everything in one format or the other.

Having said all that, it is possible to do pretty sophisticated
manipulations using File::Spec without ever knowing or caring what
platform you are on or what format the pieces are in.

Yes, for many cases it just works.

The other problem is that on VMS the File::Spec routines are simply
returning wrong results when given a number of UNIX format pathnames.
Module Extract is producing a sample of this when untarring the file for
test 54 of the above.

I don't doubt there are unhandled cases.  If you are talking about
escaped characters (such as when you have multiple dots), then yes,
the File::Spec::VMS routines need some work.

No, this is on plain vanilla UNIX filenames that do not contain any special characters.

One issue is that the VMS specific File::Spec routines should never
be calling vmsify() or unixify().

Well, they have to or a lot of things will break.  Of course if you
can eliminate unnecessary conversions without changing the final
result, so much the better.

Yes, a few things will break. However everything that I have found so far that will break will also not work with ODS-5 file specifications if those applications are not fixed. And if they are fixed to handle ODS-5 specifications properly, they will no longer be dependent on the automatic conversions, even when used on ODS-2 volumes.

One example is that anything that is sending a filename to a DCL shell command or MMS/MAKE must make sure that it is a VMS format pathname that is less than 255 characters. vmsify()/vmspath() does not do that conversion, and probably should not. RMSEXPAND however will do that conversion, because that is what RMS does automatically now.

My plan is that if the DECC$EFS_CHARSET feature is not enabled, the existing behavior will be preserved unless a case is found where it is clearly broken.

If DECC$EFS_CHARSET is enabled, the File::Spec routines will only use vmsify/unixify to resolve cases where mixed input of VMS format and UNIX format is given. They will use DECC$FILENAME_UNIX_REPORT to decide if those ambiguous cases should result in VMS or UNIX file output.

This preserves the existing behavior, yet allows moving forward.
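
The plan above can be sketched as a small decision function. This is only a planning sketch under stated assumptions, not code from File::Spec::VMS: the name conversion_plan, the hash keys, and the return labels are all made up for the example, and the two feature settings are passed in as plain booleans rather than detected.

```perl
use strict;
use warnings;

# Hypothetical planning sketch, not code from File::Spec::VMS.
# $feature holds booleans for the two CRTL features; how they would be
# detected is deliberately left out.
sub conversion_plan {
    my ($feature, $input_kind) = @_;   # $input_kind: 'unix', 'vms' or 'mixed'

    # DECC$EFS_CHARSET disabled: keep the existing behavior untouched.
    return 'existing-behavior' unless $feature->{efs_charset};

    # Enabled and the input is unambiguous: keep its syntax, no conversion.
    return $input_kind if $input_kind eq 'unix' || $input_kind eq 'vms';

    # Mixed or ambiguous input: DECC$FILENAME_UNIX_REPORT picks the syntax.
    return $feature->{unix_report} ? 'unix' : 'vms';
}

my %off    = (efs_charset => 0, unix_report => 0);
my %efs    = (efs_charset => 1, unix_report => 0);
my %report = (efs_charset => 1, unix_report => 1);

print conversion_plan(\%off,    'unix'),  "\n";   # existing-behavior
print conversion_plan(\%efs,    'unix'),  "\n";   # unix
print conversion_plan(\%efs,    'mixed'), "\n";   # vms
print conversion_plan(\%report, 'mixed'), "\n";   # unix
```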

The documentation would then state that applications should not be written to depend on the old behavior, so that they will also work on ODS-5 or when run under GNV.

I have a side build of Perl 5.8.7 that does this now, so I have a roadmap of what needs to be done.

The unresolved issue is how to have some of the dual life modules detect what the DECC or VMS features are set to, because they need to work with older versions of Perl.

These routines are not reversible. Even if they make some cases of
parsing more convenient, there are a number of VMS path specifications
that cannot make the round trip intact.

This is better now than it was for some time in the pre-5.9.5 time
frame.  I fixed some bugs where escaped characters became double and
triple escaped after multiple round trips.  If there are remaining
cases, let's fix them.  If there are cases where it's theoretically
impossible to make the round trip, then let's document them.  If
there are places where we can eliminate the round trip without
changing the final result, that's fine by me.

I have VMS.C mostly fixed in this regard now, but still have a lot more to do, especially with the UTF-8/VTF-7. Perl may end up being the only VMS application that knows how to translate a file specification between the two modes.

I thought that before tampering with the routines any more though, I should get blead passing all the tests, otherwise I might not detect if I accidentally break something.

VMS has a mode (and has had it for a long time) where it will cause
programs written in C to only see UNIX file specifications. Currently
many portions of Perl are failing in this mode, most often because
File::Spec is unexpectedly converting the file specification back to VMS.


If you're talking about the C run-time setting
DECC$FILENAME_UNIX_ONLY, then I don't think File::Spec::VMS is what
File::Spec should invoke when that is in effect.  When you flip a
switch on VMS so it can't understand VMS filespecs, then it doesn't
make a lot of sense to try to manipulate filespecs in a VMS-friendly
way.  I'm not sure that $^O should even return 'VMS' in this case.

DECC$FILENAME_UNIX_REPORT is the one that I am referring to. With that one, VMS filespecs are still legal, but file specifications are returned by the API in UNIX format by default, or if the input is ambiguous.

It is a mode that I will be trying to get to work. In that mode, since VMS file specifications are still legal input, the VMS specific methods are still needed to determine what they do next.

Also realpath() needs to be handled specially on VMS for performance reasons. But I need to do some testing there, as my preliminary testing with it is showing that it is not working as expected, but I do not have a sample good enough for someone to file a bug report.


DECC$FILENAME_UNIX_ONLY indicates that the CRTL will only see UNIX file specifications on input. I think it will be very difficult to get Perl to work with that setting.

Of the new POSIX compliant modes, I do not expect to see any program written in C that uses the UNIX-only or the VMS-only mode. Until RMS supports logical name translations, I do not expect to see any applications actually use any of the modes.

The next issue that comes up is that there is a path specification
in the tar archive for the test that has multiple dots in it.
Currently the File::Spec routines can not deal with that, as
they are using an obsolete parsing algorithm.

See:

http://search.cpan.org/~clane/VMS-FileUtils_0.014/safename/safename.pm

I will look at that. The other thing that might be nice is if Perl had a way of supporting Pathworks ODS-2 name encoding, both the V5 and V6 algorithms.

When you catch up with the thread, you will see that I have a different solution, which matches with the default behavior of Perl on VMS to replace "." in directory names with "_".
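
Roughly, the idea of replacing "." in directory names with "_" looks like the following on a UNIX format path. This is a hypothetical illustration only: the function name underscore_dir_dots is made up, and the real conversion in Perl's VMS support code handles many more cases.

```perl
use strict;
use warnings;

# Hypothetical illustration only, not the conversion Perl performs on VMS:
# replace dots in the directory components of a UNIX format path and leave
# the final (file) component alone.
sub underscore_dir_dots {
    my ($unix_path) = @_;
    return $unix_path unless $unix_path =~ m{/};   # no directory part at all

    my @parts = split m{/}, $unix_path, -1;
    my $file  = pop @parts;                        # last component keeps its dots
    for my $dir (@parts) {
        next if $dir eq '.' || $dir eq '..';       # relative markers stay as-is
        $dir =~ tr/./_/;                           # "." becomes "_"
    }
    return join '/', @parts, $file;
}

print underscore_dir_dots('pkg-1.0.2/lib/Foo.pm'), "\n";   # pkg-1_0_2/lib/Foo.pm
print underscore_dir_dots('../a.b/c.txt'), "\n";           # ../a_b/c.txt
```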

What is probably needed is a VFS layer, in which the routines that access filenames can be overridden with new methods. That would be an interesting cross-platform project.

-John
[EMAIL PROTECTED]
Personal Opinion Only
