I think this should work. (I originally wrote it in C#, so could someone
please check that it compiles? I don't have a Java compiler here.)
private String discardEscapeChar(String input)
{
    char[] caSource = input.toCharArray();
    char[] caDest = new char[caSource.length];
    int j = 0;
    for (int i = 0; i < caSource.length; i++)
    {
        if (caSource[i] == '\\')
        {
            // Skip the escape character and copy the character after it;
            // a lone trailing backslash is dropped.
            if (caSource.length == ++i)
                break;
        }
        caDest[j++] = caSource[i];
    }
    return new String(caDest, 0, j);
}
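To make the compile check easier, here is a rough standalone version of the same loop with a small main(). The class name and the sample inputs are made up for illustration; only the method body comes from the code above.

```java
// Standalone sketch; DiscardEscapeCharCheck and the sample inputs are hypothetical.
class DiscardEscapeCharCheck {

    // Same logic as the method above: a backslash escapes the next character.
    static String discardEscapeChar(String input) {
        char[] caSource = input.toCharArray();
        char[] caDest = new char[caSource.length];
        int j = 0;
        for (int i = 0; i < caSource.length; i++) {
            if (caSource[i] == '\\') {
                // Skip the escape character; a lone trailing backslash is dropped.
                if (caSource.length == ++i)
                    break;
            }
            caDest[j++] = caSource[i];
        }
        return new String(caDest, 0, j);
    }

    public static void main(String[] args) {
        String[] queries = {
            "\\\\\\\\192.168.0.15\\\\public", // raw text: \\\\192.168.0.15\\public
            "a\\:b",                          // raw text: a\:b
            "trailing\\"                      // raw text: trailing\
        };
        for (String q : queries) {
            System.out.println(q + " -> " + discardEscapeChar(q));
        }
    }
}
```

For the first input this should print the plain UNC path \\192.168.0.15\public, i.e. each escaped pair of backslashes collapses to one.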
Regarding your unit test - I think it's wrong:
> assertEquals("\\\\\\\\192.168.0.15\\\\public",
> discardEscapeChar ("\\\\192.168.0.15\\\\public"));
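It should be: assertEquals("\\\\192.168.0.15\\public", discardEscapeChar
("\\\\\\\\192.168.0.15\\\\public"));
(The argument is the escaped form, with every backslash doubled, and the
expected value is the plain UNC path.)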
I would also suggest to add the following:
String s="\\\\some.host.name\\dir+:+-!():^[]\\{}~*?";
assertEquals(s,discardEscapeChar(escape(s)));
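The round trip can also be tried without Lucene on the classpath. In the sketch below, escape() is only a stand-in I made up for illustration (it is not Lucene's code, and the list of special characters is an assumption), and the class name is hypothetical too.

```java
// Round-trip sketch. escape() is a hypothetical stand-in, NOT Lucene's
// implementation; the SPECIAL list is an assumption for illustration.
class RoundTripCheck {

    // Characters that the stand-in escape() prefixes with a backslash.
    static final String SPECIAL = "\\+-!():^[]\"{}~*?";

    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    // A backslash escapes the next character; a lone trailing backslash is dropped.
    static String discardEscapeChar(String input) {
        char[] caSource = input.toCharArray();
        char[] caDest = new char[caSource.length];
        int j = 0;
        for (int i = 0; i < caSource.length; i++) {
            if (caSource[i] == '\\') {
                if (caSource.length == ++i)
                    break;
            }
            caDest[j++] = caSource[i];
        }
        return new String(caDest, 0, j);
    }

    public static void main(String[] args) {
        String s = "\\\\some.host.name\\dir+:+-!():^[]\\{}~*?";
        String roundTripped = discardEscapeChar(escape(s));
        System.out.println(s.equals(roundTripped)
                ? "round trip OK"
                : "round trip FAILED: " + roundTripped);
    }
}
```

Since discardEscapeChar() copies whatever follows a backslash, the round trip should hold no matter which characters escape() decides to prefix.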
Eyal
> -----Original Message-----
> From: Erik Hatcher [mailto:[EMAIL PROTECTED]
> Sent: Wednesday, July 20, 2005 22:38 PM
> To: [email protected]
> Subject: Re: QueryParser handling of backslash characters
>
>
> On Jul 19, 2005, at 11:19 AM, Jeff Davis wrote:
>
> > Hi,
> >
> > I'm seeing some strange behavior in the way the QueryParser handles
> > consecutive backslash characters. I know that backslash is the escape
> > character in Lucene, and so I would expect "\\\\" to match fields that
> > have two consecutive backslashes, but this does not seem to be the
> > case.
> >
> > The fields I'm searching are UNC paths, e.g. "\\192.168.0.15\public".
> > The only way I can get my query to find the record containing that
> > value is to type "FieldName:\\\192.168.0.15\\public" (three slashes).
> > Why is the third backslash character not treated as an escape? Is it
> > just that any backslash that is preceded by a backslash is interpreted
> > as a literal backslash character, regardless of whether the "escape"
> > backslash was itself escaped?
> >
> > I can code around this, but it seems inconsistent with the way that
> > escape characters usually work. Is this a bug, or is it intentional,
> > or am I missing something?
>
> I've waited until I had a chance to experiment with this
> before replying. I say that this is a bug. There is a
> private method in QueryParser called discardEscapeChar (shown
> below). I copied it to a JUnit test case and gave it this assert:
>
> assertEquals("\\\\\\\\192.168.0.15\\\\public",
> discardEscapeChar ("\\\\192.168.0.15\\\\public"));
>
> This test fails with:
>
> Expected:\\\\192.168.0.15\\public
> Actual :\192.168.0.15\public
>
> Which is wrong in my opinion. (though my head hurts thinking
> about metaescaping backslashes in Java code to make this a
> proper test)
>
> The bug is isolated to the discardEscapeChar() method where
> it eats too many backslashes. Could you have a shot at
> tweaking that method to do the right thing and submit a patch?
>
> private String discardEscapeChar(String input) {
>     char[] caSource = input.toCharArray();
>     char[] caDest = new char[caSource.length];
>     int j = 0;
>     for (int i = 0; i < caSource.length; i++) {
>         if ((caSource[i] != '\\') || (i > 0 && caSource[i-1] == '\\')) {
>             caDest[j++]=caSource[i];
>         }
>     }
>     return new String(caDest, 0, j);
> }
>
> Erik
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]