Thanks a lot again. So I can use a regexp string in my Java class. For example, \d matches a digit and \w matches a word (alphanumeric) character. What is the equivalent for matching Unicode characters? I hope my question is understandable!
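To make the question concrete, here is a minimal sketch of the difference between an ASCII-only \w and a Unicode-aware one. It uses the JDK's own java.util.regex package rather than ORO, so it is an assumption about how the distinction looks in general, not a statement about how ORO classifies characters. The class name WordClassDemo and the helper matchesAll are made up for illustration, and Pattern.UNICODE_CHARACTER_CLASS requires Java 7 or later.

```java
import java.util.regex.Pattern;

public class WordClassDemo {
    // Default java.util.regex behaviour: \w is ASCII-only, i.e. [a-zA-Z_0-9].
    static final Pattern ASCII_WORD = Pattern.compile("\\w+");

    // With UNICODE_CHARACTER_CLASS, \w follows the Unicode notion of a word character.
    static final Pattern UNICODE_WORD =
            Pattern.compile("\\w+", Pattern.UNICODE_CHARACTER_CLASS);

    // \p{L} matches any Unicode letter (no digits, no underscore).
    static final Pattern LETTERS = Pattern.compile("\\p{L}+");

    // Hypothetical helper: true if the whole string matches the pattern.
    static boolean matchesAll(Pattern pattern, String input) {
        return pattern.matcher(input).matches();
    }

    public static void main(String[] args) {
        // Greek word written with escapes so the source file encoding does not matter.
        String greek = "\u03a9\u03bc\u03ad\u03b3\u03b1";

        System.out.println("ASCII \\w on abc_123:  " + matchesAll(ASCII_WORD, "abc_123"));
        System.out.println("ASCII \\w on Greek:    " + matchesAll(ASCII_WORD, greek));
        System.out.println("Unicode \\w on Greek:  " + matchesAll(UNICODE_WORD, greek));
        System.out.println("\\p{L} on Greek:       " + matchesAll(LETTERS, greek));
        System.out.println("\\p{L} on abc_123:     " + matchesAll(LETTERS, "abc_123"));
    }
}
```

Whether ORO's \w behaves like the first or the second pattern is exactly what is being asked here; the same test strings could be run through Perl5Matcher to find out.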
This is the code I have. To incorporate a regular expression I just pass the pattern as a parameter to this class. For example, for a regular expression that accepts only word characters I use RegexFormatter("\\w"). But when I say \w here, does it also cover Unicode characters, or only the ASCII alphanumeric characters?

// jakarta apache imports for the ORO regular expression library
import org.apache.oro.text.regex.Perl5Compiler;
import org.apache.oro.text.regex.Perl5Matcher;
import org.apache.oro.text.regex.Perl5Substitution;
import org.apache.oro.text.regex.Pattern;
import org.apache.oro.text.regex.Util;
import org.apache.oro.text.regex.MatchResult;
import org.apache.oro.text.regex.MalformedPatternException;

// framework imports

public class RegexFormatter implements Formatter {

    // Possible error codes
    public static final String MATCH_FAILURE_ERROR = "MatchFailure";

    /**
     * <p>Regular expression compiler. Only one is needed, as it is a stateless
     * factory class.</p>
     */
    protected static final Perl5Compiler compiler = new Perl5Compiler();

    /**
     * <p>Regular expression/pattern associated with this RegexFormatter for
     * validation purposes.</p>
     */
    protected Pattern regex;

    /**
     * <p>The regular expression pattern that is used for matching for formatting.
     * The first match group will replace $1 in the output substitution expression.</p>
     */
    protected Pattern outMatch;

    /**
     * <p>The substitution that will be used for output of the formatted input.</p>
     */
    protected Perl5Substitution outSubstitution;

    /**
     * <p>Defines whether the substitution will be global.</p>
     */
    protected boolean globalSubstitution;

    /**
     * <p>Creates a new RegexFormatter object with the specified regular
     * expression.</p>
     *
     * @param expression the regular expression used to validate
     *        Strings with this instance of RegexFormatter.
     * @throws InvalidConfigurationException thrown when a
     *         MalformedPatternException is caught due to an invalid
     *         regular expression String being supplied. Thrown as a
     *         RuntimeException due to the way Format objects are
     *         usually instantiated (statically).
     */
    public RegexFormatter(String expression) {
        try {
            // Compile the regular expression object
            this.regex = RegexFormatter.compiler.compile(expression);
        } catch (MalformedPatternException mpe) {
            Logger.error(Logger.PRODUCER_FORMAT, "RegexFormatter::<init>. Class "
                + "could not be instantiated due to malformed pattern supplied. "
                + "Pattern: " + expression);
            Logger.error(Logger.PRODUCER_FORMAT, "RegexFormatter::<init>. "
                + "MalformedPatternException is: " + mpe.toString());
            // Rethrow the exception after logging the errors.
            throw new InvalidConfigurationException(
                "RegexFormatter::<init> caught a MalformedPatternException. ", mpe);
        }
    }

    /**
     * <p>Creates a new RegexFormatter object with the specified regular
     * expression.</p>
     *
     * @param expression the regular expression used to validate
     *        Strings with this instance of RegexFormatter.
     * @param outMatch the regular expression used to match against
     *        during substitution. The first match group will replace $1 in
     *        the substitution, the second group will replace $2 and so on.
     * @param outSubst the output substitution expression. A '$1' in
     *        this string will be replaced by the first matched group for data.
     * @param globalSubst defines whether the substitution will
     *        be applied as many times as found or only the first time.
     * @throws InvalidConfigurationException thrown when a
     *         MalformedPatternException is caught due to an invalid
     *         regular expression String being supplied. Thrown as a
     *         RuntimeException due to the way Format objects are
     *         usually instantiated (statically).
     */
    public RegexFormatter(String expression, String outMatch, String outSubst, boolean globalSubst) {
        try {
            // Compile the regular expression object
            this.regex = RegexFormatter.compiler.compile(expression);
            // Compile the out match
            this.outMatch = RegexFormatter.compiler.compile(outMatch);
            // Create a substituter with the substitution string
            this.outSubstitution = new Perl5Substitution(outSubst);
            // Define if this substitution will be global
            this.globalSubstitution = globalSubst;
        } catch (MalformedPatternException mpe) {
            Logger.error(Logger.PRODUCER_FORMAT, "RegexFormatter::<init>. Class "
                + "could not be instantiated due to malformed pattern supplied. "
                + "Pattern: " + expression);
            Logger.error(Logger.PRODUCER_FORMAT, "RegexFormatter::<init>. "
                + "MalformedPatternException is: " + mpe.toString());
            // Rethrow the exception after logging the errors.
            throw new InvalidConfigurationException(
                "RegexFormatter::<init> caught a MalformedPatternException. ", mpe);
        }
    }

    /**
     * <p>Implementation of the abstract parseObject() method from
     * Formatter. Major method of this class. Applies a regular expression
     * to a String to determine if it matches. If a match is found, the
     * matching subsection of the string is returned. If a match is not found
     * then a ParsingException is thrown. See class level javadocs for
     * examples of usage.</p>
     *
     * @param text string to be matched
     * @return a String containing the matching section of
     *         the String passed in.
     * @throws ParsingException thrown when the String does not
     *         match the regular expression.
     */
    public Object parseObject(String text) throws ParsingException {
        Perl5Matcher matcher = new Perl5Matcher();
        if (!matcher.contains(text, this.regex)) {
            // If the string passed in does not match the regular expression
            // against which it is being validated, throw a ParsingException.
            // Since this can happen fairly frequently we don't want to do a lot
            // of expensive concatenation for the message.
            throw new ParsingException(MATCH_FAILURE_ERROR);
        }
        // Returns the section of the string which matched the regex.
        return matcher.getMatch().toString();
    }

    /**
     * <p>Implemented for compatibility with the Formatter class. Returns
     * the input without modification when no output pattern is configured.
     * Otherwise applies the output substitution, or returns null if the
     * input does not match the output pattern.</p>
     *
     * @param input an Object which should always be a String.
     * @return the formatted String, or null as described above.
     */
    public String format(Object input) {
        if (this.outMatch == null) {
            return input.toString();
        } else {
            Perl5Matcher matcher = new Perl5Matcher();
            if (matcher.contains(input.toString(), this.outMatch)) {
                String output = Util.substitute(matcher, this.outMatch,
                    this.outSubstitution, input.toString(),
                    this.globalSubstitution ? Util.SUBSTITUTE_ALL : 1);
                return output;
            }
        }
        return null;
    }
}

-----Original Message-----
From: Daniel F. Savarese [mailto:[EMAIL PROTECTED]]
Sent: Monday, January 07, 2002 2:53 AM
To: ORO Users List
Subject: Re: Doubt about ORO

In message <[EMAIL PROTECTED]>, Chan
>The ORO packages work well for ASCII Character set
>But my doubt does it work for UTF-8 also !!

As someone else mentioned, UTF-8 is a method of encoding Unicode as a series of bytes, so the question doesn't make a lot of sense given that Java characters are always a 16-bit representation of Unicode. I assume you mean "Do the ORO packages work with Java character values greater than 255?" The answer is yes for everything except for the .awk package, which only works with character values 0-255.

daniel
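The distinction Daniel draws between UTF-8 bytes and Java characters can be seen in a few lines of plain JDK code. This is only a sketch under stated assumptions: it does not use ORO at all, the class name EncodingDemo and the helper utf8Length are hypothetical, and StandardCharsets requires Java 7 or later.

```java
import java.nio.charset.StandardCharsets;

public class EncodingDemo {
    // Hypothetical helper: number of bytes the string occupies once encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // Greek capital omega, U+03A9, written as an escape to avoid source-encoding issues.
        String omega = "\u03a9";

        // In memory the string is one 16-bit Java char whose value is above 255.
        System.out.println("chars:      " + omega.length());         // 1
        System.out.println("char value: " + (int) omega.charAt(0));  // 937

        // UTF-8 only comes into play when the string is turned into bytes.
        System.out.println("UTF-8 bytes: " + utf8Length(omega));     // 2
    }
}
```

A regex library that operates on String or char[] input only ever sees the char values, so the relevant question is the one Daniel restates: whether the matcher handles char values greater than 255.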