I am afraid that this subject of the String escape method is becoming a
little off-topic, but I would like to add my idea and test results, since
I think they are relevant to the discussion.

IMHO, if speed, rather than readability, is what matters, all
processing would be faster if two static arrays of chars were used, with
the final String created from the result only if needed, 'a la' C.

Method invocation (with stack operations, class checks, etc.) will probably
cost more than the work done inside the method itself, so I would also try
to avoid calling methods (append, charAt, etc.) inside the escapeString
method.
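
As a small illustration of the point above, here is a minimal sketch that
copies the String into a char array with a single getChars() call and then
only indexes the array, instead of calling charAt() once per character.
The class and method names (CountSketch, countEscapable) are hypothetical
and are not part of the code discussed in this thread, and whether this is
actually faster depends on the JVM, so it should be measured.

```java
// Hypothetical helper, not from the original post: counts the characters
// that an escapeString method would have to prefix with a backslash.
class CountSketch {

    static int countEscapable(String str) {
        int len = str.length();
        char[] chars = new char[len];

        // One bulk copy instead of len separate charAt() calls.
        str.getChars(0, len, chars, 0);

        int count = 0;
        for (int i = 0; i < len; i++) {
            if (chars[i] == '\\' || chars[i] == '"') {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // The argument is: say "hi" \ bye  (two quotes, one backslash)
        System.out.println(countEscapable("say \"hi\" \\ bye")); // prints 3
    }
}
```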

I have implemented one method using these principles. The results of
calling it 1000000 (1x10^6) times on my humble computer were (in
seconds):


                        String with no \ or "      String with \ and "
My proposal (exper)              26                         42
The "other" (exper2)             34                        288


Here is the main() I used to test it:

public class exper
{
public static  String escapeString(String str)
{
//different implementations...
}


public static void main(String a[])
  {
        String result=null;

        for(int times=0;times<1000000;times++)
        {
                result = exper.escapeString(a[0]);      
        }

        System.out.println(result);

  }

}



This test has a serious drawback: the String to be escaped is always the
same. Still, in my proposal, as you will see, once the arrays have been
allocated for a big string they are never reallocated for shorter ones, so
I believe the test remains realistic and valid. For the "other" proposal
it makes no difference.
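
One way to reduce that drawback is to time the loop inside the program and
cycle through several different inputs. The following is only a sketch
under stated assumptions: the class name TimingSketch, the method
timeMillis and the sample inputs are hypothetical, and the escapeString
shown is a simple stand-in to be replaced by the implementation under
test. Timing with System.currentTimeMillis() also leaves out JVM startup,
which wrapping the run in `date` includes.

```java
// Hypothetical timing harness, not the one used for the numbers above.
class TimingSketch {

    // Keeps the last result reachable so the work is not simply discarded.
    static String sink;

    // Stand-in implementation; replace with the method being measured.
    static String escapeString(String str) {
        StringBuffer buf = new StringBuffer(str.length() * 2);
        for (int i = 0; i < str.length(); i++) {
            char c = str.charAt(i);
            if (c == '\\' || c == '"') {
                buf.append('\\');
            }
            buf.append(c);
        }
        return buf.toString();
    }

    // Escapes every input `rounds` times and returns the elapsed
    // wall-clock time in milliseconds.
    static long timeMillis(String[] inputs, int rounds) {
        long start = System.currentTimeMillis();
        for (int r = 0; r < rounds; r++) {
            for (int i = 0; i < inputs.length; i++) {
                sink = escapeString(inputs[i]);
            }
        }
        return System.currentTimeMillis() - start;
    }

    public static void main(String[] args) {
        // Inputs of different lengths, with and without escapable chars.
        String[] inputs = {
            "short",
            "This is a message that has no strings to escape.",
            "This is a message that has \\ some \" strings to escape.",
            ""
        };
        System.out.println(timeMillis(inputs, 250000) + " ms");
    }
}
```

Running the measurement a few times and discarding the first run would
also help, since the first pass includes class loading.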

Here is the output of the tests:

[smacedo@test133 tmp]$ date ; java exper "This is a message that has no strings to 
escape and should be long enough to test the classes." ; date
Thu Mar 16 16:05:42 GMT 2000
This is a message that has no strings to escape and should be long enough to test the 
classes.
Thu Mar 16 16:06:08 GMT 2000
[26 seconds]

[smacedo@test133 tmp]$ date ; java exper2 "This is a message that has no strings to 
escape and should be long enough to test the classes." ; date
Thu Mar 16 16:06:19 GMT 2000
This is a message that has no strings to escape and should be long enough to test the 
classes.
Thu Mar 16 16:06:53 GMT 2000
[34 seconds]


[smacedo@test133 tmp]$ date ; java exper "This is a message that has \\ some \" 
strings to escape and should be long enough to test the classes."; date
Thu Mar 16 16:07:45 GMT 2000
This is a message that has \\ some \" strings to escape and should be long enough to 
test the classes.
Thu Mar 16 16:08:27 GMT 2000
[42 seconds]


[smacedo@test133 tmp]$ date ; java exper2 "This is a message that has \\ some \" 
strings to escape and should be long enough to test the classes." ; date
Thu Mar 16 16:08:48 GMT 2000
This is a message that has \\ some \" strings to escape and should be long enough to 
test the classes.
Thu Mar 16 16:13:36 GMT 2000
[4 minutes and 48 seconds!!]



Finally, here is the code of my proposal for escapeString:


public class exper
{

private static char a1[];
private static char a2[];
private static int  n;
private static int  m;
private static int  len;
private static boolean ready = false;

public static  String escapeString(String str)
{
                        // n will be used to scan the string
        n=0;
                        // m will keep the number of added chars
        m=0;


                        // Check if str is null
        if(str==null) return str;


                        // Get its length and check if it is empty      
        len = str.length();
        if(len==0) return str;
        

                        // If the static arrays haven't been allocated yet
                        // or if we need a bigger one
        if( !ready || len>a1.length)
          {
           a1= new char[len];
           a2= new char[len*2];
           ready = true;
          }


                        // get str to array
                        // It would be great to be able to do
                        // System.arraycopy of the private var "value"
                        // without further checking...
        str.getChars(0,len,a1,0);


                        // do usual switch
        while(n<len)
        {

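                switch(a1[n])
                {
                case '\\':
                        // the escaping backslash goes at the current
                        // output position n+m; only one char is added
                        // per escape, so m is incremented once
                        a2[n+m]='\\';
                        a2[n+(++m)]='\\';
                        break;
                        
                case '"':
                        a2[n+m]='\\';
                        a2[n+(++m)]='"';
                        break;
                default:
                        a2[n+m]=a1[n];
                }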
        
        n++;
        }

                // avoid new String if no escapes were needed
        if(m==0)
                return str;
        else
                return new String(a2,0,len+m);  
}

...
}
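
One caveat of the static arrays is that they are shared by every caller,
so two threads calling escapeString at the same time could overwrite each
other's data. The sketch below is a variant that uses local arrays
instead. It is an assumption-laden illustration, not the code that was
timed above: the class name LocalBufferSketch is hypothetical, and the
variant allocates on every call, so it gives up the allocation saving and
its speed would have to be measured separately.

```java
// Hypothetical variant with per-call buffers instead of static ones.
class LocalBufferSketch {

    static String escapeString(String str) {
        if (str == null || str.length() == 0) {
            return str;
        }

        int len = str.length();
        char[] in = new char[len];
        str.getChars(0, len, in, 0);

        // Worst case: every char needs a backslash in front of it.
        char[] out = new char[len * 2];
        int o = 0; // next free position in out

        for (int i = 0; i < len; i++) {
            char c = in[i];
            if (c == '\\' || c == '"') {
                out[o++] = '\\';
            }
            out[o++] = c;
        }

        // Nothing was added, so hand back the original String.
        if (o == len) {
            return str;
        }
        return new String(out, 0, o);
    }

    public static void main(String[] args) {
        // The argument is: say "hi"
        System.out.println(escapeString("say \"hi\"")); // prints: say \"hi\"
    }
}
```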


Both implementations are attached.

I hope this contributes to the JDE effort.

Regards, Silvio


On Thu, 16 Mar 2000, Mark Gibson wrote:
> 
> This is exactly what I did in my original implementation, but the only
> real way of testing the efficiency would be to profile the method for
> each implementation. Has anyone tried this?
> 
> On Thu, 16 Mar 2000, Brad Giaccio wrote:
> 
> [snip]
> > But to avoid all this the code as presented calculates the number of 
> > escapes to be calculated
> > char [] array = new char[length + escapeCount];
> > int index;
> > for (int i = 0; i < length; i++) {
> > // do switch here and just do the insertion into the array here
> > }
> [snip]
> 
> 
> 

``````````` Silvio Emanuel Nunes Barbosa de Macedo (PhD Std) '''''''''''''
[EMAIL PROTECTED]                                         [EMAIL PROTECTED]
Intelligent and Interactive Systems                Telecom. and Multimedia
Imperial College, University of London                         INESC Porto 
Exhibition Road,                                       Pc da Republica, 93
London SW7 2AZ, England                            4050-497 Porto Portugal
Tel:+44 171 5946323                                    Tel:+351 22 2094220





public class exper2
{

  /**
   * Prefix \ escapes to all \ and " characters in a string so that
   * the quoted string can be printed rereadably. For efficiency,
   * if no such characters are found, the argument String itself
   * is returned.
   *
   * @param  str   String to be prefixed.
   * @return A String.
   *
   * @author David Hay
   * @author Mark Gibson
   * @author Steve Haflich
   * @author Charles Hart
   */
  public static String escapeString (String str) {

    int escCount = 0;
    int len = str.length();
    char c;

    // Count number of chars that need escaping.
    for (int i=0; i<len; i++) {
      c = str.charAt(i);
      if (c == '\\' || c == '\"')
        escCount++;
    }

    if (escCount > 0) {

      StringBuffer buf = new StringBuffer(str.length() + escCount);

      for ( int idx = 0; idx < str.length(); idx++ )  {
        char ch = str.charAt( idx );
        switch ( ch ) {
        case '"':  buf.append( "\\\"" ); break;
        case '\\': buf.append( "\\\\" ); break;
        default:   buf.append( ch );     break;
        }
      }

      return buf.toString();
    }
    else
      return str;
  }




public static void main(String a[])
{

        String result=null;

        for(int times=0;times<1000000;times++)
        {
                result=exper2.escapeString(a[0]);       
        }


        System.out.println(result);

}

}



