Perhaps this is clearer:

Assuming you have a schema where:

  <field name="collection_id" type="integer" indexed="true" stored="false"
         required="true" omitTermFreqAndPositions="true"/>

Then:

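  import static org.junit.Assert.assertEquals;

  import java.io.IOException;
  import javax.xml.parsers.ParserConfigurationException;
  import org.apache.lucene.index.Term;
  import org.apache.lucene.search.TermQuery;
  import org.apache.solr.core.SolrConfig;
  import org.apache.solr.schema.IndexSchema;
  import org.xml.sax.SAXException;

  void testSamplePrint()
          throws IOException, SAXException, ParserConfigurationException {

      SolrConfig config = new SolrConfig("solrconfig.xml");
      IndexSchema schema = new IndexSchema(config, "schema.xml", null);

      // aTerm holds the readable text; bTerm holds the same value after
      // the schema's field type has converted it to its indexed form.
      TermQuery aTerm = new TermQuery(new Term("TestString", "123456"));
      TermQuery bTerm = new TermQuery(new Term("TestString",
              schema.getField("collection_id").getType()
                    .readableToIndexed("123456")));

      System.out.printf("%s\n", aTerm.toString());
      System.out.printf("%s\n", bTerm.toString());

      // Fails: bTerm.toString() prints the encoded bytes, not "123456".
      assertEquals(aTerm.toString(), bTerm.toString());
  }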

The test output is: 

java.lang.AssertionError: 
Expected :TestString:123456
Actual   :TestString:`

I believe this is because the Term does not know that it contains an encoded 
integer, and so its toString cannot decode it.  If the TermQuery knew the field 
type, it could decode the value.  But without looking up each field in the 
schema, I don't know how to get toString to print a readable value.
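For the logs, the stopgap described in the original message quoted further 
down (replacing every non-printable character in the query string with '_') 
can be sketched roughly as follows. This is only an illustration: the class 
and method names are hypothetical, "printable" is assumed to mean ASCII 0x20 
through 0x7E, and the control characters in the sample input are made up 
rather than real encoded terms. It does not recover the original number.

```java
class LogSafeQuery {

    // Replace every character outside printable ASCII (0x20..0x7E) with '_'
    // so the string is safe to write to a log and view with tail.
    // Assumption: legitimate non-ASCII text is not expected in these strings.
    static String sanitize(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            out.append(c >= 0x20 && c <= 0x7E ? c : '_');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Made-up stand-in for an encoded term; not real Solr output.
        String raw = "collection_id:`" + (char) 8 + (char) 0 + "z"
                + (char) 1 + "[^0.027";
        System.out.println(sanitize(raw));
    }
}
```

The output is stable and printable, but every control character collapses to 
the same '_', so distinct terms can end up looking identical in the log.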


-----Original Message-----
From: Erick Erickson [mailto:erickerick...@gmail.com] 
Sent: Monday, March 18, 2013 7:55 PM
To: solr-user@lucene.apache.org
Subject: Re: Query.toString printing binary in the output...

If you simply attach &debug=all to your URL, you should see the query come back 
in your response, XML, JSON, whatever. If that also shows bizarre characters, 
then that will give you some idea whether it's in Solr or not.

But you haven't given us much info about how/where you call toString. You may 
be getting into trouble with character sets (although I'd find that quite odd, 
it's a possibility).

What I'm really finding confusing is that you're mentioning Term alongside
query.toString() (at least that's what I think you're saying), which has 
nothing at all to do with Terms; it's just the query string passed in. So I'm 
really puzzled as to what you're doing to get this kind of output. It almost 
looks like you're trying to print out the _results_ of a query, not the query.

So some clarification would be helpful...

Best
Erick


On Mon, Mar 18, 2013 at 12:01 PM, Andrew Lundgren
<lundg...@familysearch.org> wrote:

> I am sorry, I don't follow what you mean by debug=query.  Can you 
> elaborate on that a bit?
>
> Thanks!
>
> -----Original Message-----
> From: Erick Erickson [mailto:erickerick...@gmail.com]
> Sent: Sunday, March 17, 2013 8:09 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Query.toString printing binary in the output...
>
> Hmmm, without looking at the code, somehow when you specify 
> debug=query you get readable results, maybe that code would be a place to 
> start?
>
> And are you looking for the parsed output? Otherwise you could print 
> the original query.
>
> Not much help....
> Erick
>
>
> On Fri, Mar 15, 2013 at 3:24 PM, Andrew Lundgren
> <lundg...@familysearch.org> wrote:
>
> > We use the toString call on the query in our logs.  For some numeric 
> > types, the encoded form of the number is being printed instead of 
> > the readable form.
> >
> > This makes tail and some other tools very unhappy...
> >
> > Here is a partial example of a query.toString() that would have had 
> > binary in it.  As a short term work around I replaced all 
> > non-printable characters in the string with an '_'.
> >
> > (collection_id:`__z_[^0.027 collection_id:`__nB+^0.026
> > collection_id:`__Zl_^0.025 collection_id:`__i49^0.024
> > collection_id:`__Pq%^0.023 collection_id:`__VCS^0.022
> > collection_id:`__WbH^0.021 collection_id:`__Yu_^0.02
> > collection_id:`__UF&^0.019 collection_id:`__I2g^0.018
> > collection_id:`__PP_^0.016999999 collection_id:`__Ysv^0.015999999
> > collection_id:`__Oe_^0.014999999 collection_id:`__Ysw^0.013999999
> > collection_id:`__Wi_^0.012999998 collection_id:`__fLi^0.011999998
> > collection_id:`__XRk^0.010999998 collection_id:`__Uz[^0.009999998
> > collection_id:`__SE_^0.008999998 collection_id:`__Ysx^0.007999998
> > collection_id:`__Ysh^0.0069999974 collection_id:`__fLh^0.0059999973 
> > collection_id:`__f _^0.004999997 collection_id:`__`^C^0.003999997
> > collection_id:`__fKM^0.002999997 collection_id:`__Szo^0.001999997 
> > collection_id:`__f ]^9.99997E-4)
> >
> > But, as you can see, that is less than useful...
> >
> > I spent some time looking at the source and found that Term does not 
> > contain the type of the embedded data.  Any possible solutions to 
> > this short of walking the query and getting the type of each field 
> > from the schema and creating my own print function?
> >
> > Thanks!
> >
> > --
> > Andrew
> >
> >
> >
> >
> >  NOTICE: This email message is for the sole use of the intended
> > recipient(s) and may contain confidential and privileged information.
> > Any unauthorized review, use, disclosure or distribution is 
> > prohibited. If you are not the intended recipient, please contact 
> > the sender by reply email and destroy all copies of the original message.
> >
> >
>
>

