Re: Cannot parse ":", using HTTP-URL as id

Ahmet Arslan Wed, 12 Sep 2012 09:59:29 -0700

Hello,

The term query parser is your friend in this case. With it, you don't need to 
escape anything.


SolrQuery query = new SolrQuery();

query.setQuery("{!term f=id}bar_http://bar.com/?doc=452");
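
To make the difference concrete, here is a minimal sketch that only builds the two query strings, so it runs on a plain JDK without SolrJ. The class and method names (TermQueryExample, termQuery, escapeForClassicParser) are made up for illustration, and the list of special characters is an assumption loosely modelled on the classic query syntax. In a real client you would more likely call SolrJ's ClientUtils.escapeQueryChars than roll your own.

```java
// Sketch only: hypothetical helper names, no SolrJ dependency.
final class TermQueryExample {

    // Characters assumed to be special in the classic query syntax.
    private static final String SPECIAL = "\\+-!():^[]\"{}~*?|&;/";

    /** Builds a term-parser query: everything after '}' is one raw term. */
    static String termQuery(String field, String value) {
        return "{!term f=" + field + "}" + value;
    }

    /** Backslash-escapes special characters and whitespace for the classic parser. */
    static String escapeForClassicParser(String value) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            if (SPECIAL.indexOf(c) >= 0 || Character.isWhitespace(c)) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String id = "domain_http://www.domain.com/?p=12345";
        // prints {!term f=id}domain_http://www.domain.com/?p=12345
        System.out.println(termQuery("id", id));
        // prints id:domain_http\:\/\/www.domain.com\/\?p=12345
        System.out.println("id:" + escapeForClassicParser(id));
    }
}
```

Either string can be passed to query.setQuery(...). Note that escaping belongs in the query string only: the value given to document.addField("id", ...) should stay the raw URL, otherwise the backslashes end up in the stored id. Escaping just the ":" is most likely not enough here because "/" and "?" also have a meaning for the classic parser, which would explain the second exception.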

--- On Wed, 9/12/12, sy...@web.de <sy...@web.de> wrote:

> From: sy...@web.de <sy...@web.de>
> Subject: Cannot parse ":", using HTTP-URL as id
> To: solr-user@lucene.apache.org
> Date: Wednesday, September 12, 2012, 7:40 PM
> Hi,
> 
> I defined a field "id" in my schema.xml and use it as an
> <uniqueKey>:
>   <field name="id" type="string" indexed="true"
> stored="true" required="true" />
>   <uniqueKey>id</uniqueKey>
> 
> I want to store URLs with a prefix in this field to be sure
> that every id is unique among websites. For example:
>   domain_http://www.domain.com/?p=12345
>   foo_http://foo.com
>   bar_http://bar.com/?doc=452
> I wrote a Java app, which uses Solrj to communicate with a
> running Solr instance. Solr (or Solrj, not sure about this)
> complains that it can't parse ":":
>   Exception in thread "main"
> org.apache.solr.common.SolrException:
>  
> org.apache.lucene.queryparser.classic.ParseException:
>   Cannot parse 'id:domain_http://www.domain.com/?p=12345': Encountered " ":" 
> ":
> "" at line 1, column 14.
> 
> How should I handle characters like ":" to solve this
> problem?
> 
> I already tried to escape the ":" like this:
>   String id = "domain_http://www.domain.com/?p=12345".replaceAll(":",
> "\\\\:");
>   ...
>   document.addField("id", id);
>   ...
> But then Solr (or Solrj) complains again:
>   Exception in thread "main"
> org.apache.solr.common.SolrException:
>  
> org.apache.lucene.queryparser.classic.ParseException:
>   Cannot parse
> 'id:domain_http\://www.domain.com/?p=12345': Lexical error
> at line 1, column 42.  Encountered: <EOF> after :
> "/?p=12345"
> I use 4 backslashes (\\\\) for double-escape. The first
> escape is for Java itself, the second is for Solr to handle
> it (I guess).
> 
> So what is the correct or usual way to deal with special
> characters like ":" in Solr (or Solrj)? I don't know if Solr
> or Solrj is the problem, but I guess it is Solrj?
>
