Re: Help with grammar

2020-05-23 Thread David Santiago
Thank you all for your replies.

I was able to fix it, and I now understand grammars better :-)

Regards,
David Santiago

Patrick R. Michaud wrote on Thursday,
21/05/2020 at 21:05:
>
> On Thu, May 21, 2020 at 08:40:08PM +, David Santiago wrote:
> > Can someone explain to me why my grammar isn't working? Unfortunately I
> > can't figure it out :-(
> >
> > |  headers
> > |  |  header
> > |  |  * MATCH "Proxy-Connection"
> > |  |  header-value
> > |  |  * MATCH "keep-alive\n"
> > |  |  crlf
> > |  |  * FAIL
> > |  * FAIL
> > * FAIL
> > Nil
>
> Notice how <header-value> is capturing the newline in "keep-alive\n"?  That 
> means there's no newline left for the <.crlf> subrule that follows, and thus 
> the match fails.
>
> Try changing "rule header-value" to be a "token" instead.  That will prevent 
> it from consuming any whitespace immediately following the + sequence. 
>  When I tried your script with header-value defined as a token, it got a lot 
> farther into the match:
>
>   $ rakudo test.raku
>   TOP
>   |  request-line
>   |  |  method
>   |  |  * MATCH "CONNECT"
>   |  |  request-uri
>   |  |  * MATCH "ssl.gstatic.com:443"
>   |  |  http-version
>   |  |  * MATCH "HTTP/1.1"
>   |  |  crlf
>   |  |  * MATCH "\n"
>   |  * MATCH "CONNECT ssl.gstatic.com:443 HTTP/1.1\n"
>   |  headers
>   |  |  header
>   |  |  * MATCH "Proxy-Connection"
>   |  |  header-value
>   |  |  * MATCH "keep-alive"
>   |  |  crlf
>   |  |  * MATCH "\n"
>   |  * MATCH "Proxy-Connection: keep-alive\n"
>   * MATCH "CONNECT ssl.gstatic.com:443 HTTP/1.1\nProxy-Connection: keep-"
>   Nil
>
>
> Personally, I would likely define <header-value> to be something more like
>
> token header-value { \N+ }
>
> which gets any sequence of non-newline characters, since some of the headers 
> coming afterwards contain spaces and characters which aren't part of .
>
> Pm


Re: Help with grammar

2020-05-21 Thread Patrick R. Michaud
On Thu, May 21, 2020 at 08:40:08PM +, David Santiago wrote:
> Can someone explain to me why my grammar isn't working? Unfortunately I
> can't figure it out :-(
> 
> |  headers
> |  |  header
> |  |  * MATCH "Proxy-Connection"
> |  |  header-value
> |  |  * MATCH "keep-alive\n"
> |  |  crlf
> |  |  * FAIL
> |  * FAIL
> * FAIL
> Nil

Notice how <header-value> is capturing the newline in "keep-alive\n"?  That 
means there's no newline left for the <.crlf> subrule that follows, and thus the 
match fails.

Try changing "rule header-value" to be a "token" instead.  That will prevent it 
from consuming any whitespace immediately following the + sequence.  
When I tried your script with header-value defined as a token, it got a lot 
farther into the match:

  $ rakudo test.raku
  TOP
  |  request-line
  |  |  method
  |  |  * MATCH "CONNECT"
  |  |  request-uri
  |  |  * MATCH "ssl.gstatic.com:443"
  |  |  http-version
  |  |  * MATCH "HTTP/1.1"
  |  |  crlf
  |  |  * MATCH "\n"
  |  * MATCH "CONNECT ssl.gstatic.com:443 HTTP/1.1\n"
  |  headers
  |  |  header
  |  |  * MATCH "Proxy-Connection"
  |  |  header-value
  |  |  * MATCH "keep-alive"
  |  |  crlf
  |  |  * MATCH "\n"
  |  * MATCH "Proxy-Connection: keep-alive\n"
  * MATCH "CONNECT ssl.gstatic.com:443 HTTP/1.1\nProxy-Connection: keep-"
  Nil


Personally, I would likely define <header-value> to be something more like

token header-value { \N+ }

which gets any sequence of non-newline characters, since some of the headers 
coming afterwards contain spaces and characters which aren't part of .
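
As a minimal sketch of the ``\N+`` suggestion (the grammar name ``ValueDemo``
and the sample header value are hypothetical, chosen only for illustration),
the token takes everything up to the line terminator but leaves the terminator
itself for the next part of the pattern:

```raku
# Hypothetical mini-grammar: only header-value comes from the thread;
# the name ValueDemo and the sample input are assumptions.
grammar ValueDemo {
    token TOP          { <header-value> \n }
    token header-value { \N+ }   # any run of non-newline characters
}

# A value containing spaces, parentheses and a semicolon.
my $m = ValueDemo.parse("Mozilla/5.0 (X11; Linux x86_64)\n");
say $m<header-value>.Str;   # Mozilla/5.0 (X11; Linux x86_64)
```

Because ``\N`` never matches a newline, the value cannot run into the next
header line.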

Pm


Re: Help with grammar

2020-05-21 Thread Gianni Ceccarelli
On 2020-05-21 David Santiago wrote:
> Can someone explain to me why my grammar isn't working? Unfortunately I
> can't figure it out :-(

Mixing ``rule``, ``token``, and ``regex`` apparently at random doesn't
make for a good grammar…

The text at
https://docs.raku.org/language/grammar_tutorial#The_technical_overview
is a bit confusing.

This https://docs.raku.org/language/regexes#Sigspace is more precise:
a ``rule`` inserts a ``<.ws>`` wherever there's whitespace in the
source code, so your::

   rule header-value { + }

is equivalent to::

  token header-value { + <.ws> }

which, as you saw in the trace, eats up the newline.
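
The difference can be reproduced in isolation. The following is only a sketch:
the grammar names ``WithRule`` and ``WithToken`` are hypothetical, and the
character class ``<[\w\-]>`` is an assumption, since the original class did not
survive in the archived message.

```raku
# Two hypothetical grammars that differ only in how header-value is declared.
# The character class <[\w\-]> is an assumption, not the poster's original.
grammar WithRule {
    token TOP          { <header-value> <.crlf> }
    rule  header-value { <[\w\-]>+ }   # trailing space becomes <.ws>
    token crlf         { \n }
}

grammar WithToken {
    token TOP          { <header-value> <.crlf> }
    token header-value { <[\w\-]>+ }   # whitespace in the source is ignored
    token crlf         { \n }
}

# The rule's implicit <.ws> consumes "\n", so <.crlf> has nothing left to match.
say so WithRule.parse("keep-alive\n");    # False
say so WithToken.parse("keep-alive\n");   # True
```

Both ``rule`` and ``token`` are ratcheting, so once the implicit ``<.ws>`` has
taken the newline it is not given back.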

Short version: the only ``rule`` declarations should be ``TOP``,
``request-line``, and ``headers``; all the others should be ``token``
declarations.

Extending the grammar to recognise more than one header is left as an
exercise.
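
One possible shape for such a grammar, offered as a sketch rather than a
reconstruction of the attached script: the grammar name ``RequestSketch``, the
bodies of ``method``, ``request-uri``, ``http-version`` and ``header``, and the
``User-Agent`` line are all assumptions. Only the rule names and the first two
input lines come from this thread.

```raku
# Sketch only: token bodies are guesses, not the original script.
grammar RequestSketch {
    # Rules: whitespace in the pattern is significant (becomes <.ws>).
    rule  TOP          { <request-line> <headers>+ }
    # No space before <.crlf>, so no <.ws> can eat the newline first.
    rule  request-line { <method> <request-uri> <http-version><.crlf> }
    rule  headers      { <header> ':' <header-value><.crlf> }

    # Tokens: no implicit whitespace matching.
    token method       { <[A..Z]>+ }
    token request-uri  { \S+ }
    token http-version { 'HTTP/' \d+ '.' \d+ }
    token header       { <[\w\-]>+ }
    token header-value { \N+ }
    token crlf         { \n }
}

my $request = "CONNECT ssl.gstatic.com:443 HTTP/1.1\n"
            ~ "Proxy-Connection: keep-alive\n"
            ~ "User-Agent: Mozilla/5.0 (X11; Linux x86_64)\n";  # hypothetical line

my $m = RequestSketch.parse($request);

# <headers>+ captures a list, one Match per header line.
for $m<headers>.list -> $h {
    say $h<header> ~ ': ' ~ $h<header-value>;
}
```

Here ``headers`` matches a single header line and the ``+`` quantifier in
``TOP`` repeats it, which keeps the rule names used in this thread.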

-- 
Dakkar - 
GPG public key fingerprint = A071 E618 DD2C 5901 9574
 6FE2 40EA 9883 7519 3F88
key id = 0x75193F88


Help with grammar

2020-05-21 Thread David Santiago
Hi!

Can someone explain to me why my grammar isn't working? Unfortunately I
can't figure it out :-(

Full script attached (42 lines). The newlines in the script are
always just "\n".

The output:

TOP
|  request-line
|  |  method
|  |  * MATCH "CONNECT"
|  |  request-uri
|  |  * MATCH "ssl.gstatic.com:443"
|  |  http-version
|  |  * MATCH "HTTP/1.1"
|  |  crlf
|  |  * MATCH "\n"
|  * MATCH "CONNECT ssl.gstatic.com:443 HTTP/1.1\n"
|  headers
|  |  header
|  |  * MATCH "Proxy-Connection"
|  |  header-value
|  |  * MATCH "keep-alive\n"
|  |  crlf
|  |  * FAIL
|  * FAIL
* FAIL
Nil


It matches the newline after the request line but not the one after the header.


Best regards,
David Santiago


test.raku
Description: Binary data