On Monday 13 August 2007 21:59, you wrote:
> Author: nextgens
> Date: 2007-08-13 20:59:10 +0000 (Mon, 13 Aug 2007)
> New Revision: 14664
> 
> Removed:
>    trunk/freenet/src/freenet/clients/http/filter/CSSTokenizerFilter.java
> Modified:
>    trunk/freenet/src/freenet/clients/http/filter/CSSTokenizerFilter.jflex
> Log:
> Fix last commit, unbreak the CSS filter (hopefully).
> 
> REVIEW IT!
> 
> Modified: trunk/freenet/src/freenet/clients/http/filter/CSSTokenizerFilter.jflex
> ===================================================================
> --- trunk/freenet/src/freenet/clients/http/filter/CSSTokenizerFilter.jflex    2007-08-13 20:23:07 UTC (rev 14663)
> +++ trunk/freenet/src/freenet/clients/http/filter/CSSTokenizerFilter.jflex    2007-08-13 20:59:10 UTC (rev 14664)
> @@ -237,9 +237,9 @@
>  STRING2=\'(\\{NL}|\"|(\\\')|{NONASCII}|{ESCAPE}|[^\'])*\'
>  
>  IDENT={NMSTART}{NMCHAR}*
> -UNOFFICIAL_IDENT="-[^0-9]"{IDENT}
> +UNOFFICIAL_IDENT="-"{IDENT}

Ok, this is just reverting it to the old version.

>  NAME={NMCHAR}+
> -NUM="-"([0-9]+|[0-9]*"."[0-9]+)
> +NUM=[-]([0-9]+|[0-9]*"."[0-9]+)

Need more brackets, no?

(([0-9]+)|([0-9]*"."[0-9]+)) ?

Also I don't get the [-].
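
For what it's worth, here is a quick sanity check of the NUM variants. It is only a sketch: it uses java.util.regex rather than JFlex (so the quoted "." becomes \. and the results are an analogy, not proof of what the generated lexer does), and the class and field names are made up for the example. As far as I can tell, [-] is just a character class containing a literal minus, and alternation already binds more loosely than concatenation, so the extra brackets are for readability only.

```java
import java.util.regex.Pattern;

public class NumPatternDemo {
    // Java-regex analogues of the NUM macro (hypothetical names, not from the .jflex file).
    // Variant as committed in r14664: minus written as a one-character class.
    public static final Pattern AS_COMMITTED = Pattern.compile("[-]([0-9]+|[0-9]*\\.[0-9]+)");
    // Variant from r14663: minus written as a plain literal.
    public static final Pattern LITERAL_MINUS = Pattern.compile("-([0-9]+|[0-9]*\\.[0-9]+)");
    // Variant with each alternative bracketed explicitly.
    public static final Pattern EXTRA_BRACKETS = Pattern.compile("-(([0-9]+)|([0-9]*\\.[0-9]+))");

    // True if the whole string matches the pattern.
    public static boolean full(Pattern p, String s) {
        return p.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = {"-5", "-.5", "-1.5", "5", "-", "-1."};
        for (String s : samples) {
            // One column per variant; the three columns agree for every sample.
            System.out.println(s + " -> "
                    + full(AS_COMMITTED, s) + " "
                    + full(LITERAL_MINUS, s) + " "
                    + full(EXTRA_BRACKETS, s));
        }
    }
}
```

Note that all three variants require the leading minus, so a plain "5" is not matched by NUM in any of them.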

>  STRING={STRING1}|{STRING2}
>  
>  // Not used any more. Was used in url(). Keep for now. Matches up to the end of a bracket.
> @@ -391,14 +391,6 @@
>       w.write(s);
>       if(debug) log("Matched ident: "+s);
>  }
> -{UNOFFICIAL_IDENT} {
> -     if(debug) log("Deleted unofficial ident: "+yytext());
> -     w.write("/* " + l10n("deletedUnofficialIdent") + " */");
> -}
> -{UNOFFICIAL_IDENT}{W}":"{W}{REALURL} {
> -     if(debug) log("Deleted unofficial ident with url: "+yytext());
> -     w.write("/* " + l10n("deletedUnofficialIdentWithURL") + " */");
> -}
>  "@page" {
>       String s = yytext();
>       w.write(s);
> @@ -445,7 +437,6 @@
>       w.write(s);
>       if(debug) log("Matched number: "+s);
>  }
> -
>  {MEDIUMS}{W}*";" {
>       if(postBadImportFlag) {
>               // Ignore
> @@ -458,7 +449,6 @@
>               if(debug) log("Matched and passing on mediums list: "+s);
>       }
>  }
> -
>  "@charset"{W}*{STRING}{W}*";" {
>       String s = yytext();
>       detectedCharset = s;
> @@ -511,6 +501,14 @@
>               // Ignore
>       }
>  }
> +{UNOFFICIAL_IDENT} {
> +     if(debug) log("Deleted unofficial ident: "+yytext());
> +     w.write("/* " + l10n("deletedUnofficialIdent") + " */");
> +}
> +{UNOFFICIAL_IDENT}{W}":"{W}{REALURL} {
> +     if(debug) log("Deleted unofficial ident with url: "+yytext());
> +     w.write("/* " + l10n("deletedUnofficialIdentWithURL") + " */");
> +}

Moving the unofficial ident matching down seems sensible, although *it has 
almost no effect*: jflex always takes the longest match, and rule order only 
matters as a tie-break between matches of the same length. The last rule is 
the fallback. Lexical states look quite interesting; they may be a way to do 
more sophisticated parsing rather than simple sequential lexing, which really 
doesn't work that well for CSS.
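
To illustrate the point about rule order, here is a toy longest-match lexer. It is a sketch under stated assumptions: the macros are simplified stand-ins for the real IDENT and UNOFFICIAL_IDENT definitions, the class and method names are invented, and it uses java.util.regex instead of generated JFlex code. It picks the longest match at each position and lets the earlier rule win a tie, which is my understanding of what jflex does.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LongestMatchDemo {
    // Simplified stand-ins for the macros in the .jflex file (assumptions, not the real ones).
    public static final String IDENT = "[a-zA-Z_][a-zA-Z0-9_-]*";
    public static final String UNOFFICIAL_IDENT = "-" + IDENT;
    public static final String ANY = "(?s).";

    // Builds an ordered rule table from name/regex pairs.
    public static Map<String, String> rules(String... pairs) {
        Map<String, String> m = new LinkedHashMap<>();
        for (int i = 0; i < pairs.length; i += 2) m.put(pairs[i], pairs[i + 1]);
        return m;
    }

    // Tokenizes input: longest match wins, and the earlier rule wins a tie.
    public static List<String> lex(Map<String, String> rules, String input) {
        List<String> out = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            String bestName = null;
            int bestLen = 0;
            for (Map.Entry<String, String> rule : rules.entrySet()) {
                Matcher m = Pattern.compile(rule.getValue()).matcher(input);
                m.region(pos, input.length());
                // Strictly greater, so an earlier rule keeps an equal-length match.
                if (m.lookingAt() && m.end() - pos > bestLen) {
                    bestLen = m.end() - pos;
                    bestName = rule.getKey();
                }
            }
            if (bestName == null) throw new IllegalStateException("no rule matches at " + pos);
            out.add(bestName + ":" + input.substring(pos, pos + bestLen));
            pos += bestLen;
        }
        return out;
    }

    public static void main(String[] args) {
        String css = "-moz-opacity:x";
        // Unofficial ident rule above or below IDENT: same tokens either way.
        System.out.println(lex(rules("UNOFFICIAL_IDENT", UNOFFICIAL_IDENT, "IDENT", IDENT, "ANY", ANY), css));
        System.out.println(lex(rules("IDENT", IDENT, "UNOFFICIAL_IDENT", UNOFFICIAL_IDENT, "ANY", ANY), css));
        // Fallback rule first: it now wins the one-character tie against IDENT on "x".
        System.out.println(lex(rules("ANY", ANY, "IDENT", IDENT, "UNOFFICIAL_IDENT", UNOFFICIAL_IDENT), css));
    }
}
```

The first two orderings tokenize the input identically because the two ident rules never match the same text; only the third ordering changes the result, since the one-character fallback then ties with IDENT and comes first.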

>  // Default rule matches only one character
>  . {
>       String s = yytext();