[REBOL] hostname parse rule Re:(4)

2000-02-15 Thread peoyli

 Hi /PeO
...

These were the rules I was using...

let-char: charset[#"a" - #"z" #"A" - #"Z"]
let-digit-char: charset[#"a" - #"z" #"A" - #"Z" #"0" - #"9"]
let-digit-hyph-char: charset[#"a" - #"z" #"A" - #"Z" #"0" - #"9" #"-"]

 The problem is combining them to a working ruleset...
 
 The problem is how you formulate your specification. You say:
 ; may have any number of letters, digits and
 ; hyphens in between, but this sequence must
 ; end with a letter or digit if it exists
 
 This is too general. You need to redefine your rules so that you precisely
 state that a hyphen - if it occurs - must be followed by a letter or digit.
 I.e., I believe you have a let-digit-char rule, your let-digit-hyph-char
 rule then must be:
 
A hyphen need not be followed by a letter or a digit unless it is the
second-to-last character. According to the RFC, "a--a" should be a
valid host name (part)...

BNF rule was.. (not retyped from the RFC, but what I want to implement as a
parse rule)..

let-char [*[let-digit-hyph-char] let-digit-char]

with "[" and "]" marking optional parts as in BNF

That is, valid host names begin with a letter and end with a letter or a
digit (which could be the same character as the first letter, since
one-character host names are allowed); in between there can be any number
of letters, digits and hyphens in any order.

 let-digit-hyph-char: [ "-" let-digit-char]
 
That rule will require every hyphen to be followed by a letter or a digit..

 This rule only evaluates to true if a hyphen is immediately followed by a
 character. It permits that a hyphen appear, provided that the hyphen is
 followed by an obligatory character and thereby excludes the possibility
 of a trailing hyphen.
 
 Hope this helps
 
 
 ;- Elan  [: - )]
 
 



[REBOL] hostname parse rule Re:

2000-02-15 Thread KGD03011


Hi there,

This might be a good chance to plug my regular expression emulator,
search-text.r :)

http://www.rebol.org/utility/search-text.r

As far as I understand it, this should be the direct translation of the
BNF rule for hostnames as given by /PeO :

   let-char [*[let-digit-hyph-char] let-digit-char]

should be:

 l: to-bits #!a-z; TO-BITS is included in search-text.r
== make bitset! #{ ; ! means upper and lower case
FE07FE07
}
 ld: to-bits #!a-z0-9
== make bitset! #{
FF03FE07FE07
}
 ldh: to-bits #!a-z0-9\- ; \ escapes the hyphen
== make bitset! #{
0020FF03FE07FE07
}
 name-rule: [l 0 1 [any ldh ld]]
== [l 0 1 [any ldh ld]]

Unfortunately this almost never returns true when it should:

 parse/all "abc-ef" name-rule
== false
 parse/all "abcef" name-rule
== false
 parse/all "a" name-rule
== true


This is very easy to do with SEARCH from search-text.r, and you don't
even have to prepare the bitsets beforehand:

 search "abcdef" [head #!a-z  maybe [any #!a-z0-9\- #!a-z] tail]
== [1 6 "abcdef"]
 search "abc-def" [head #!a-z  maybe [any #!a-z0-9\- #!a-z] tail]
== [1 7 "abc-def"]
 search "abc--def" [head #!a-z  maybe [any #!a-z0-9\- #!a-z] tail]
== [1 8 "abc--def"]
 search "abcdef-" [head #!a-z  maybe [any #!a-z0-9\- #!a-z] tail]
== none
 search "0abcdef" [head #!a-z  maybe [any #!a-z0-9\- #!a-z] tail]
== none



SEARCH emulates the backtracking behavior of regular expressions.
PARSE on the other hand will match to the end of the line with
LET-DIGIT-HYPH-CHAR, leaving nothing for the last LET-DIGIT-CHAR to
match.
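
The difference is easy to reproduce in isolation. A minimal sketch, assuming
REBOL 2 (for PARSE/ALL) and a plain CHARSET in place of TO-BITS; the name
ALNUM is only for illustration:

```rebol
alnum: charset [#"a" - #"z" #"A" - #"Z" #"0" - #"9"]

; ANY is greedy and never gives characters back, so the
; trailing ALNUM finds nothing left to match.
parse/all "abc" [any alnum alnum]   ; == false

; Saying the same thing in a way that needs no backtracking works.
parse/all "abc" [some alnum]        ; == true
```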


Actually, I have to admit this isn't so difficult with PARSE either.
You just have to look for sequences of any number of optional hyphens
followed by one or more alphanumerics:

 name-rule: [l any [any "-" some ld ]]
== [l any [any "-" some ld]]
 parse/all "abcdef" name-rule
== true
 parse/all "abc-def" name-rule
== true
 parse/all "abc--def" name-rule
== true
 parse/all "abcdef-" name-rule
== false
 parse/all "0abcdef" name-rule
== false
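
For completeness, this name rule can be dropped into the HNAME-RULE from the
start of the thread. A rough sketch, assuming REBOL 2 and plain CHARSET
definitions for L and LD instead of TO-BITS; the host names are made up for
illustration:

```rebol
l:  charset [#"a" - #"z" #"A" - #"Z"]
ld: charset [#"a" - #"z" #"A" - #"Z" #"0" - #"9"]

; one label: a letter, then groups of optional hyphens
; that each end in at least one letter or digit
name-rule: [l any [any "-" some ld]]

; labels joined by dots; SOME demands at least one dot,
; exactly as in the rule originally posted
hname-rule: [name-rule some ["." name-rule]]

parse/all "www.rebol.com" hname-rule    ; == true
parse/all "www.re--bol.com" hname-rule  ; == true
parse/all "www.rebol-.com" hname-rule   ; == false
parse/all "www" hname-rule              ; == false (no dot)
```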


See you,
Eric



[REBOL] hostname parse rule Re:(2)

2000-02-13 Thread peoyli

  name-rule: [let-char [none | [[some let-digit-hyph-char] let-digit-char]]
 to end]
 
 Try:
 let-char any [let-digit-hyph-char]
 instead.
 
The problem will be that it would then accept a "-" as the last character
in a part of the hostname...

Valid names are those starting with a letter, followed by an optional part
consisting of any number of letters, digits and hyphens, which must end
with a letter or digit...

The problem with my original rule was that the
[some let-digit-hyph-char]

part matched all the way up to the end of the string that I tried to
parse.. (some should be "any", but that didn't help)

I came up with this rule that seems to be working:

name-rule: [ let-char [[any [[let-digit-hyph-char] let-digit-char]] | none]]

 parse "a" name-rule
== true
 parse "1" name-rule
== false
 parse "a-" name-rule
== false
 parse "a-1" name-rule
== true
 parse "aaa-1" name-rule
== true

/PeO



[REBOL] hostname parse rule Re:(2)

2000-02-13 Thread peoyli

 Hello [EMAIL PROTECTED]!
 
 On 13-Feb-00, you wrote:
 
 
  p hname-rule: [name-rule some ["." name-rule]]
  p name-rule: [let-char [none | [[some let-digit-hyph-char] let-digit-char]]
  p to end]
 
  p The problem is that it returns "true" too often...
 
 This is caused by the "to end" above. It should work if you remove
 it.

Removing the "to end" broke the rule even more...

 name-rule: [let-char [none | [[some let-digit-hyph-char] let-digit-char]]]
 parse "a" name-rule   
== true
OK: a single-letter hostname part is ok

 parse "a1" name-rule
== false
ERR! two chars, of which the first is a letter and the last is a
letter or digit, should be ok

 parse "a1a" name-rule
== false
ERR! should be ok

replacing "some" with "any" above caused the same error..

And.. the rule I came up with that I posted earlier...

name-rule: [ let-char [[any [[let-digit-hyph-char] let-digit-char]] | none]]

didn't work with a two character name...

If I break down the rules...

; name must start with a letter
let-char

; may have any number of letters, digits and
; hyphens in between, but this sequence must
; end with a letter or digit if it exists
0 1 [any let-digit-hyph-char let-digit-char]

The problem is combining them to a working ruleset...

parse "a" name-rule should return "true"
parse "a1" name-rule should return "true"
parse "aa" name-rule should return "true"
parse "a-" name-rule should return "false"
parse "a-1" name-rule should return "true"

/PeO



[REBOL] hostname parse rule Re:(3)

2000-02-13 Thread icimjs

Hi /PeO

you wrote:
; name must start with a letter
let-char

; may have any number of letters, digits and
; hyphens in between, but this sequence must
; end with a letter or digit if it exists
0 1 [any let-digit-hyph-char let-digit-char]

The problem is combining them to a working ruleset...

The problem is how you formulate your specification. You say:
; may have any number of letters, digits and
; hyphens in between, but this sequence must
; end with a letter or digit if it exists

This is too general. You need to redefine your rules so that you precisely
state that a hyphen - if it occurs - must be followed by a letter or digit.
I.e., I believe you have a let-digit-char rule, your let-digit-hyph-char
rule then must be:

let-digit-hyph-char: [ "-" let-digit-char]

This rule only evaluates to true if a hyphen is immediately followed by a
character. It permits that a hyphen appear, provided that the hyphen is
followed by an obligatory character and thereby excludes the possibility
of a trailing hyphen.
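
One way this suggestion might be combined into a complete rule is sketched
below, assuming REBOL 2. The name HYPH-THEN-CHAR is hypothetical and stands
in for the redefined let-digit-hyph-char, and the ANY loop around the two
alternatives is an assumption about how the pieces fit together:

```rebol
let-char: charset [#"a" - #"z" #"A" - #"Z"]
let-digit-char: charset [#"a" - #"z" #"A" - #"Z" #"0" - #"9"]

; a hyphen only counts when a letter or digit follows it
hyph-then-char: ["-" let-digit-char]

name-rule: [let-char any [let-digit-char | hyph-then-char]]

parse/all "a" name-rule      ; == true
parse/all "a1" name-rule     ; == true
parse/all "a-1" name-rule    ; == true
parse/all "a-" name-rule     ; == false
parse/all "a--a" name-rule   ; == false
```

Note that this form also rejects doubled hyphens such as "a--a", because
every single hyphen must be followed directly by a letter or digit.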

Hope this helps


;- Elan  [: - )]



[REBOL] hostname parse rule Re:

2000-02-12 Thread icimjs

Hi [EMAIL PROTECTED]

you wrote:
let-char: charset[#"a" - #"z" #"A" - #"Z"]
let-digit-char: charset[#"a" - #"z" #"A" - #"Z" #"0" - #"9"]
let-digit-hyph-char: charset[#"a" - #"z" #"A" - #"Z" #"0" - #"9" #"-"]
hname-rule: [name-rule some ["." name-rule]]
name-rule: [let-char [none | [[some let-digit-hyph-char] let-digit-char]]
to end]
 parse "asdfg-" name-rule
== true
ERR! last character is not one of the allowed ones

 parse "a%$#" name-rule  
== true
ERR! none of the characters following the "a" are allowed

What is wrong with the rule generating the too-many true's ?

Your name-rule requires a let-char, then it permits none, and then goes to
end. In both ERR! cases let-char is fulfilled, then the none alternative
of the following block is fulfilled, and then to end is performed. No
wonder you end up with true in both cases. Compare to:

 parse "abcdefg" [none to end]
== true



;- Elan  [: - )]



[REBOL] hostname parse rule Re:

2000-02-12 Thread Al . Bri

 name-rule: [let-char [none | [[some let-digit-hyph-char] let-digit-char]]
to end]

Try:
let-char any [let-digit-hyph-char]
instead.

Andrew Martin
ICQ: 26227169
http://members.xoom.com/AndrewMartin/
--