I can see that approach may work for certain restrictive fields, like their postal code example, but as you are finding, it's pretty unworkable in multi-language Unicode applications. I've always had to deal with input fields for notes, comments, descriptions, etc., where there are no restrictions and special HTML characters like '<' and '>' are allowed; with these you have no choice but to escape the output properly.
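(If it helps, here is roughly what escape-on-output looks like with commons-lang, which you already have on the classpath if you're using StringUtils. This is just a sketch; the class and method names are mine, not from any framework:)

    import org.apache.commons.lang.StringEscapeUtils;

    public class CommentRenderer {
        // Store the raw user input unmodified and escape at render time,
        // so '<' and '>' in a comment display as text instead of markup.
        public String renderComment(String rawComment) {
            return StringEscapeUtils.escapeHtml(rawComment);
        }
    }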
If you want to validate in Java you can use Character.isLetter(), Character.isDigit(), org.apache.commons.lang.StringUtils, etc., and these work for all the Unicode languages, but trying to do this with declarative validation using regular expressions... good luck. (A minimal sketch of the character-class approach follows the quoted message below.)

egetchell wrote:
>
> Greg,
>
> Thanks for the reply.
>
> The common approach for mitigating XSS is to provide a blacklist of XSS
> enabling characters; examples would include "<", ">", "%3f", etc. However,
> these filters are easily bypassed by clever encoding constructs, so the
> blacklist concept quickly fails and the site is open to attack.
>
> By inverting the solution and supplying only the allowed set of
> characters, the site remains secure no matter what clever encoding scheme
> someone dreams up.
>
> The OWASP group provides some pretty extensive documentation around this.
> Here is a direct link to some common validation strategies:
> http://www.owasp.org/index.php/Data_Validation#Data_Validation_Strategies
>
> Their document, as a whole, is a very interesting read.
>
>
> Greg Lindholm wrote:
>>
>> Sorry, I've never heard of whitelisting of allowable characters as being
>> a "normal" approach. <Remainder Removed>
>>
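P.S. Here is the minimal sketch of the character-class whitelist I mentioned above. The method name and the extra punctuation allowed are just assumptions for illustration; adjust the allowed set to your field:

    // Unicode-aware whitelist check using the JDK's character classes.
    // Accepts letters and digits from any script, plus a small example
    // set of punctuation; everything else is rejected.
    public static boolean isAllowed(String input) {
        if (input == null || input.length() == 0) {
            return false;
        }
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (!Character.isLetter(c) && !Character.isDigit(c)
                    && c != ' ' && c != '-' && c != '\'') {
                return false;
            }
        }
        return true;
    }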