> After some digging in various RFCs I have written a (complete)
> grammar (in BNF) for parsing URIs (I'll append the grammar at the end
> of this message).
While the complete URI grammar looks complex, a URI string typically
doesn't need to be fully parsed. You only need to fully parse the
components that are requested.

Note that the JDK 1.4 spec for the URI(String) constructor states that
its parsing is more relaxed than the BNF in RFC 2396. How relaxed it is
can only be determined by black box testing against the JDK 1.4
implementation. If I were doing this, my first step would be to build
some extensive Mauve test cases ...

> So the URI parser can be implemented in either native (c code) or
> java. Implementing it in java, will be quite hard and difficult to
> maintain and keep up with potential URI changes. On the other hand,
> if it is implemented in c, it will be *very* easy to implement and
> maintain as I'll use flex and maximum parsing speed will be
> achieved. Additionally, provided that the URI grammar is very simple,
> bison (yacc) is not needed. It would be easy to implement the URI
> parser in java if jlex is used (that's another option I'm
> considering).

I'd recommend hand building a pure Java parser. That way, the Classpath
build process doesn't depend on an external parser or lexer generator,
and the source code will be easier to understand. A hand-built parser
for a grammar as simple as this should be easy to implement and
maintain, especially considering Sun's documented deviations from the
RFC grammar, and possible undocumented deviations. Finally, the chance
that the RFC URI syntax will change radically is pretty small, IMO.

-- Steve
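
To illustrate the hand-built approach, here is a minimal sketch of a
splitter that only does work when a component is requested. It follows
the generic five-way split (scheme, authority, path, query, fragment)
that RFC 2396 Appendix B describes with a regular expression, but does
it by hand with indexOf. The class name UriSketch and its accessors are
made up for this example; it is not the Classpath or JDK code, it does
no character validation, and it makes no attempt to reproduce Sun's
relaxations.

```java
/** Hypothetical sketch only: splits a URI reference lazily, by hand. */
final class UriSketch {
    private final String input;
    private boolean split;          // true once the components are cut out
    private String scheme, authority, path, query, fragment;

    UriSketch(String input) {
        if (input == null) throw new NullPointerException("input");
        this.input = input;         // nothing is parsed at construction time
    }

    /** Cuts the string into the five generic components, once. */
    private void split() {
        if (split) return;
        String rest = input;
        int hash = rest.indexOf('#');
        if (hash >= 0) {            // fragment: everything after the first '#'
            fragment = rest.substring(hash + 1);
            rest = rest.substring(0, hash);
        }
        int q = rest.indexOf('?');
        if (q >= 0) {               // query: everything after the first '?'
            query = rest.substring(q + 1);
            rest = rest.substring(0, q);
        }
        int colon = rest.indexOf(':');
        int slash = rest.indexOf('/');
        if (colon > 0 && (slash < 0 || colon < slash)) {
            scheme = rest.substring(0, colon);   // non-empty, before any '/'
            rest = rest.substring(colon + 1);
        }
        if (rest.startsWith("//")) {             // authority runs to the next '/'
            int end = rest.indexOf('/', 2);
            if (end < 0) end = rest.length();
            authority = rest.substring(2, end);
            rest = rest.substring(end);
        }
        path = rest;                // whatever is left; may be empty
        split = true;
    }

    String getScheme()    { split(); return scheme; }
    String getAuthority() { split(); return authority; }
    String getPath()      { split(); return path; }
    String getQuery()     { split(); return query; }
    String getFragment()  { split(); return fragment; }

    /** The authority is only looked into when a caller asks for the port. */
    int getPort() {
        String auth = getAuthority();
        if (auth == null) return -1;
        int colon = auth.lastIndexOf(':');
        if (colon < 0 || colon == auth.length() - 1) return -1;
        int port = 0;
        for (int i = colon + 1; i < auth.length(); i++) {
            char c = auth.charAt(i);
            if (c < '0' || c > '9') return -1;   // not a plain decimal port
            port = port * 10 + (c - '0');
            if (port > 65535) return -1;         // out of range; also avoids overflow
        }
        return port;
    }
}
```

Undefined components come back as null and an undefined port as -1.
Any Sun-specific leniency found through the Mauve tests would have to
be added to split() or the accessors by hand.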

