[Fwd: Re: [Module::Build] ANNOUNCE: 0.2601 -> CPAN]

Please forgive my lack of proper etiquette, but I won't have access to a Windows machine for a few days, and I want to try to resolve this issue.

For the Module::Build project we need to be able to parse strings the way the command shell does. I'm not sure of all the details of how the shell interprets commands, but I think the algorithm below is close. Can anyone verify this and maybe help provide a routine that implements it? I'll make sure you're credited for the contribution. =)

Thanks,
Randy.


-------- Original Message --------
Subject: Re: [Module::Build] ANNOUNCE: 0.2601 -> CPAN
Date: Sun, 07 Nov 2004 14:17:14 -0500
From: Randy W. Sims <[EMAIL PROTECTED]>
Reply-To: [EMAIL PROTECTED]
To: Ken Williams <[EMAIL PROTECTED]>
CC: [EMAIL PROTECTED], [EMAIL PROTECTED], Perl - Module-Build <[EMAIL PROTECTED]>
References: <[EMAIL PROTECTED]> <[EMAIL PROTECTED]> <[EMAIL PROTECTED]> <[EMAIL PROTECTED]>


Ken Williams wrote:

On Nov 7, 2004, at 11:32 AM, Randy W. Sims wrote:
If necessary we can either try to emulate it by experimenting, or maybe we can dig some code out of ReactOS (a Windows NT clone) and FreeDOS (a DOS clone) and translate it to Perl.


Okay - I'm happy with just a subset of it for now. Let's do what we can with 2-3 lines of regexes and such in Windows->split_like_shell(), and wait until we get some legitimate complaints. I think we've already fixed the most common problems.

Actually, I think I just stumbled across the algorithm. After sending my
previous message, I was looking at the results from the brief
experiments I posted [see sig], and I think I found an algorithm that works for those cases. But I'm not by a Windows box to test at the moment.


Scanning the string char by char:

Quote mode is alternately entered and exited upon finding a d-quote
  unless the d-quote is preceded by a b-slash or
  unless the d-quote is preceded by a d-quote that terminates quote-mode;
    in both cases the d-quote is a literal part of the argument.

Given the previously used example: "foo"\"bar" "foo\"bar"

"  discard and enter quote-mode
f  shift
o  shift
o  shift
"  discard and exit quote-mode
\  literal unless next char is d-quote
"   escaped d-quote, discard b-slash, shift d-quote
b  shift
a  shift
r  shift
"  discard and enter quote-mode
   quoted space
"  discard and exit quote-mode
f  shift
o  shift
o  shift
\  literal unless next char is d-quote
"   escaped d-quote, discard b-slash, shift d-quote
b  shift
a  shift
r  shift
"  discard and enter quote-mode

Correctly produces the result:
foo"bar foo"bar
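
The scan above can be sketched in Perl roughly as follows. This is only a rough sketch of the rules as stated here, not tested on Windows. The name split_like_cmd_sketch is hypothetical (it is not the Module::Build routine), and two behaviours are assumptions that the description does not spell out: unquoted whitespace ends an argument, and a b-slash escapes nothing but a d-quote.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper, not part of Module::Build: a sketch of the
# char-by-char scan described above.
sub split_like_cmd_sketch {
    my ($string) = @_;
    my @chars = split //, $string;
    my @args;
    my $arg;               # undef means "no argument in progress"
    my $in_quote    = 0;   # currently in quote-mode?
    my $just_closed = 0;   # previous char was a d-quote that ended quote-mode

    for (my $i = 0; $i < @chars; $i++) {
        my $ch = $chars[$i];

        if ($ch eq '\\' and $i < $#chars and $chars[$i + 1] eq '"') {
            # escaped d-quote: discard the b-slash, keep the d-quote
            $arg .= '"';
            $i++;
            $just_closed = 0;
        }
        elsif ($ch eq '"') {
            if ($just_closed) {
                # d-quote right after a closing d-quote: literal
                $arg .= '"';
                $just_closed = 0;
            }
            elsif ($in_quote) {
                # discard and exit quote-mode
                $in_quote    = 0;
                $just_closed = 1;
            }
            else {
                # discard and enter quote-mode
                $in_quote = 1;
                $arg = '' unless defined $arg;   # so "" yields an empty arg
            }
        }
        elsif ($ch =~ /\s/ and not $in_quote) {
            # ASSUMPTION: unquoted whitespace ends the current argument
            push @args, $arg if defined $arg;
            undef $arg;
            $just_closed = 0;
        }
        else {
            # ordinary char (including a lone b-slash): keep it
            $arg .= $ch;
            $just_closed = 0;
        }
    }
    push @args, $arg if defined $arg;
    return @args;
}

# The traced example; prints <foo"bar foo"bar>
print "<$_>\n" for split_like_cmd_sketch(q{"foo"\"bar" "foo\"bar"});
```

Tracking whether the previous character closed quote-mode is what lets a doubled d-quote come out as a literal; the open questions below still apply to this sketch.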

Open questions:
Does b-slash escape any other characters? spaces?
Do all versions follow this basic algorithm?
NT has circumflex as an additional escape char, we probably don't care.

Anybody by a Windoze box to try this out? If not, I'll try in a few days
when I get a chance.

Randy.

__END__

--

#!/usr/bin/perl
use strict;
use warnings;

print "<$_>\n" for @ARGV;
__END__

and I got the same results. I'm not quite sure how that line is being interpreted. What I do know about the cmd.exe shell is that:

1. circumflex is the escape character
2. you can get a quote inside a quoted string by repeating it x3, i.e. "foo"""bar" => foo"bar && "foo""""""bar" => foo""bar


experimenting (w/ echoarg.pl):

The inner quotes quote the space, making it one arg:

echoarg "foo"\"bar foo\"bar"
=> <foo"bar>
=> <foo"bar>


The outer quotes have the same effect:

echoarg foo"\"bar" "foo\"bar
=> <foo"bar>
=> <foo"bar>


Removing outer and inner produces a single arg:

echoarg foo"\"bar foo\"bar
=> <foo"bar foo"bar>


No backslashes:

echoarg "foo""bar" "foo"bar"
=> <foo"bar foobar>


Effects of inner quotes

w/ outer quotes         w/o outer quotes
=====================   =====================
echoarg "foo"bar"       echoarg foo"bar
=> <foobar>             => <foobar>

echoarg "foo""bar"      echoarg foo""bar
=> <foo"bar>            => <foobar>

echoarg "foo"""bar"     echoarg foo"""bar
=> <foo"bar>            => <foo"bar>

echoarg "foo""""bar"    echoarg foo""""bar
=> <foo"bar>            => <foo"bar>

echoarg "foo"""""bar"   echoarg foo"""""bar
=> <foo""bar>           => <foo"bar>

echoarg "foo""""""bar"  echoarg foo""""""bar
=> <foo""bar>           => <foo""bar>


Ick, Ick, Ick. There are also differences between cmd.exe and command.com, not to mention differences in releases and differences that depend on extensions being enabled. Ick.


