RE: Making fields unavailable for return to specific end points.

2013-04-18 Thread Andrew Lundgren
Hmm...  Just found this JIRA:  https://issues.apache.org/jira/browse/SOLR-3191

I think I have answered my question.

-----Original Message-----
From: Andrew Lundgren [mailto:lundg...@familysearch.org] 
Sent: Thursday, April 18, 2013 1:21 PM
To: solr-user@lucene.apache.org
Subject: Making fields unavailable for return to specific end points.

We have a few internal fields that we would like to restrict from being 
returned in result sets.

I have seen how fl is used to specify the fields that you do want returned; I am 
kind of looking for the opposite.  There are just a few fields that don't make 
sense to return to our clients.

Is there any functionality for a blocked-fl?

Thank you!

--
Andrew


 NOTICE: This email message is for the sole use of the intended recipient(s) 
and may contain confidential and privileged information. Any unauthorized 
review, use, disclosure or distribution is prohibited. If you are not the 
intended recipient, please contact the sender by reply email and destroy all 
copies of the original message.





Making fields unavailable for return to specific end points.

2013-04-18 Thread Andrew Lundgren
We have a few internal fields that we would like to restrict from being 
returned in result sets.

I have seen how fl is used to specify the fields that you do want returned; I am 
kind of looking for the opposite.  There are just a few fields that don't make 
sense to return to our clients.

Is there any functionality for a blocked-fl?
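
One way to picture a blocked-fl is as a post-filter applied to each returned document. The following is a minimal sketch under that assumption, not existing Solr functionality: plain maps stand in for result documents, and the class name BlockedFields, the method strip, and the field name internal_rank are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical helper: drops blocked field names from a result document.
// A plain Map stands in for a Solr document here.
class BlockedFields {

    // Returns a copy of doc without any field whose name is in blocked.
    static Map<String, Object> strip(Map<String, Object> doc, Set<String> blocked) {
        Map<String, Object> visible = new LinkedHashMap<>();
        for (Map.Entry<String, Object> entry : doc.entrySet()) {
            if (!blocked.contains(entry.getKey())) {
                visible.put(entry.getKey(), entry.getValue());
            }
        }
        return visible;
    }

    public static void main(String[] args) {
        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("id", "42");
        doc.put("surname", "Smith");
        doc.put("internal_rank", 7); // assumed internal-only field
        System.out.println(strip(doc, Set.of("internal_rank")));
    }
}
```

Filtering the document rather than the fl string means the blocked fields stay hidden even when a client asks for fl=*.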

Thank you!

--
Andrew





RE: Query.toString printing binary in the output...

2013-03-20 Thread Andrew Lundgren
I have not.  Just guessing, but that looks like code that walks a query and 
uses the schema to figure out what the types should be.

That looks like the call I should be using.  Any idea how much of a 
performance impact this has compared to just the Query.toString call (which 
admittedly doesn't always work)?

I haven't used the debug option either, but I don't think that is the right 
path because we are currently logging all of the queries, and that seems to be 
targeted more at a one off operation.  (Still helpful to know for those cases 
though.  Thank you.)


-----Original Message-----
From: Jack Krupansky [mailto:j...@basetechnology.com] 
Sent: Tuesday, March 19, 2013 5:20 PM
To: solr-user@lucene.apache.org
Subject: Re: Query.toString printing binary in the output...

Did you try QueryParsing.toString? As in:

logger.info("db retrieve time=" + (System.currentTimeMillis() - start)
    + ", query=" + QueryParsing.toString(rb.getQuery(), rb.req.getSchema())
    + ", indexIds=" + getIndexIds(rb));

-- Jack Krupansky

-----Original Message-----
From: Andrew Lundgren
Sent: Tuesday, March 19, 2013 11:52 AM
To: solr-user@lucene.apache.org
Subject: RE: Query.toString printing binary in the output...

Thank you for clarifying.

The logging line is this:

logger.info("db retrieve time=" + (System.currentTimeMillis() - start)
    + ", query=" + rb.getQuery().toString().replaceAll("\\p{Cntrl}", "_")
    + ", indexIds=" + getIndexIds(rb));

(The replaceAll call is used to clean out the binary.)

A complete log looks like this:  (I removed some values and inserted Zs.)

2013-03-19 01:36:58,648 INFO
[org.apache.solr.handler.component.DatabaseComponent] (http-8080-19) [] [] 
[] [] [] ip-10-212-91-229/10.212.91.229   db retrieve time=53, 
query=+(+(givenname:^1.8 | givenname_standard:^1.08 |
givenname:?^-3.6179998 | givenname:Z^0.1799) +(surname:^1.8 |
surname_standard:^1.08) +(birth_year:1855^0.495 | birth_year:1856^0.495 
| (-marriage_year:[1850 TO 1854]^1.0E-4 -death_year:[1850 TO
1854]^1.0E-4 -residence_year:[1850 TO 1854]^1.0E-4 -other_year:[1850 TO
1854]^1.0E-4 +est_birth_year_range:[180 TO 185]^-1.005)) 
+((+(birth_place:amherst,1929953 |
birth_place_ancestors:amherst,1929953^0.99 | birth_place:amherst,6279984 |
birth_place_ancestors:amherst,6279984^0.99 |
birth_place:novascotia,1927164^0.7 |
birth_place_ancestors:novascotia,1927164^0.69 |
birth_place:cumberland,1929953^0.7 |
birth_place_ancestors:cumberland,1929953^0.69 | birth_place:canada,-1^0.2)) 
| (+birth_place:?^-2.01 +((record_place:amherst,1929953^0.7 |
record_place_ancestors:amherst,1929953^0.6926 |
record_place:amherst,6279984^0.7 |
record_place_ancestors:amherst,6279984^0.6926 |
record_place:novascotia,1927164^0.4898 |
record_place_ancestors:novascotia,1927164^0.4828 |
record_place:cumberland,1929953^0.4898 |
record_place_ancestors:cumberland,1929953^0.4828 |
record_place:canada,-1^0.14) is_principal:T^0.01
(collection_id:`__z_[^0.027 collection_id:`__nB+^0.026
collection_id:`__Zl_^0.025 collection_id:`__i49^0.024
collection_id:`__Pq%^0.023 collection_id:`__VCS^0.022
collection_id:`__WbH^0.021 collection_id:`__Yu_^0.02
collection_id:`__UF&^0.019 collection_id:`__I2g^0.018
collection_id:`__PP_^0.01699 collection_id:`__Ysv^0.01599
collection_id:`__Oe_^0.01499 collection_id:`__Ysw^0.01399
collection_id:`__Wi_^0.01298 collection_id:`__fLi^0.01198
collection_id:`__XRk^0.01098 collection_id:`__Uz[^0.00998
collection_id:`__SE_^0.00898 collection_id:`__Ysx^0.00798
collection_id:`__Ysh^0.006974 collection_id:`__fLh^0.005973 
collection_id:`__f _^0.00497 collection_id:`__`^C^0.00397
collection_id:`__fKM^0.00297 collection_id:`__Szo^0.00197 
collection_id:`__f ]^9.7E-4) record_type:`_^0.11
record_country:Canada^0.1 record_subcountry:Canada,Nova Scotia^0.1, 
indexIds=5649621248770, 5649707485955, 5649774056450, 5650368372995, 
5650800358658, 40314148353, 17914147586, 77849158944, 77849158945, 77849158946, 
77849158947, 77849158948, 77849158949, 77849158950, 77849158951, 77849158952, 
77849158953, 77849158954, 77849158955, 77849158956


We have seen these types of issues (though the opposite) when querying with 
non-encoded ints.

When preparing the query we have to encode the collection IDs like this:

Query q = new TermQuery(new Term(SolrTag.COLLECTION_ID.getName(),
    type.readableToIndexed(Integer.toString(collectionId))));

So perhaps I am using the wrong term when I say encoded; maybe it should have 
been indexed?  But that seems to have other meanings and would potentially be 
more confusing.  These are the Terms printed above that remain in the 
non-readable format when toString is called. 
(Perhaps we should be using something other than readableToIndexed?)


Thanks!


-----Original Message-----
From:

RE: Query.toString printing binary in the output...

2013-03-19 Thread Andrew Lundgren
This is perhaps more clear:

Assuming you have a schema where:

  

Then:

  void testSamplePrint() throws IOException, SAXException,
      ParserConfigurationException {

    SolrConfig config = new SolrConfig("solrconfig.xml");
    IndexSchema schema = new IndexSchema(config, "schema.xml", null);

    TermQuery aTerm = new TermQuery(new Term("TestString", "123456"));
    TermQuery bTerm = new TermQuery(new Term("TestString",
        schema.getField("collection_id").getType().readableToIndexed("123456")));

    System.out.printf("%s\n", aTerm.toString());
    System.out.printf("%s\n", bTerm.toString());

    assertEquals(aTerm.toString(), bTerm.toString());
  }

The test output is: 

java.lang.AssertionError: 
Expected :TestString:123456
Actual   :TestString:`

I believe that this is because the Term does not know that it contains an 
encoded integer, and thus cannot parse it.  If the TermQuery knew the type, it 
could also decode it.  But w/o a query to the schema, I don't know how to get 
the toString to function correctly.
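
To illustrate the point, here is a minimal sketch of a type-aware print function. It assumes nothing about Solr's real encoding: encodeInt and decodeInt are a made-up stand-in for readableToIndexed and its inverse, and the map of decoders plays the role of the schema lookup. All names are hypothetical.

```java
import java.util.Map;
import java.util.function.Function;

// Sketch of a print function that consults per-field type knowledge.
class TypedTermPrinter {

    // Made-up encoding, NOT Solr's: packs an int into two 16-bit chars,
    // which are usually non-printable.
    static String encodeInt(int value) {
        return new String(new char[] {(char) (value >>> 16), (char) (value & 0xFFFF)});
    }

    // Inverse of encodeInt; only usable by code that knows the field holds an int.
    static String decodeInt(String indexed) {
        int value = (indexed.charAt(0) << 16) | indexed.charAt(1);
        return Integer.toString(value);
    }

    // Looks up the decoder for the field (the stand-in for a schema lookup).
    // Fields without a decoder are printed as-is.
    static String print(String field, String indexed,
                        Map<String, Function<String, String>> decoders) {
        Function<String, String> decoder =
                decoders.getOrDefault(field, Function.identity());
        return field + ":" + decoder.apply(indexed);
    }

    public static void main(String[] args) {
        Map<String, Function<String, String>> decoders =
                Map.of("collection_id", TypedTermPrinter::decodeInt);
        String indexed = encodeInt(123456);
        System.out.println(print("collection_id", indexed, decoders));
    }
}
```

The term text alone cannot tell the printer which decoder to apply; the field-to-type mapping has to come from outside, which is the gap described above.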


-----Original Message-----
From: Erick Erickson [mailto:erickerick...@gmail.com] 
Sent: Monday, March 18, 2013 7:55 PM
To: solr-user@lucene.apache.org
Subject: Re: Query.toString printing binary in the output...

If you simply attach &debug=all to your URL, you should see the query come back 
in your response, XML, JSON, whatever. If that also shows bizarre characters, 
then that will give you some idea whether it's in Solr or not.

But you haven't given us much info about how/where you call toString. You may 
be getting into trouble with character sets (although I'd find that quite odd, 
it's a possibility).

What I'm really finding confusing is that you're mentioning Term alongside
query.toString() (at least that's what I think you're saying), which has 
nothing at all to do with Terms, it's just the query string passed in. So I'm 
really puzzled as to what you're doing to get this kind of output, it almost 
looks like you're trying to print out the _results_ of a query, not the query.

So some clarification would be helpful...

Best
Erick


On Mon, Mar 18, 2013 at 12:01 PM, Andrew Lundgren  wrote:

> I am sorry, I don't follow what you mean by debug=query.  Can you 
> elaborate on that a bit?
>
> Thanks!
>
> -----Original Message-----
> From: Erick Erickson [mailto:erickerick...@gmail.com]
> Sent: Sunday, March 17, 2013 8:09 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Query.toString printing binary in the output...
>
> Hmmm, without looking at the code, somehow when you specify 
> debug=query you get readable results, maybe that code would be a place to 
> start?
>
> And are you looking for the parsed output? Otherwise you could print 
> original query.
>
> Not much help
> Erick
>
>
> On Fri, Mar 15, 2013 at 3:24 PM, Andrew Lundgren
> wrote:
>
> > We use the toString call on the query in our logs.  For some numeric 
> > types, the encoded form of the number is being printed instead of 
> > the readable form.
> >
> > This makes tail and some other tools very unhappy...
> >
> > Here is a partial example of a query.toString() that would have had 
> > binary in it.  As a short term work around I replaced all 
> > non-printable characters in the string with an '_'.
> >
> > (collection_id:`__z_[^0.027 collection_id:`__nB+^0.026
> > collection_id:`__Zl_^0.025 collection_id:`__i49^0.024
> > collection_id:`__Pq%^0.023 collection_id:`__VCS^0.022
> > collection_id:`__WbH^0.021 collection_id:`__Yu_^0.02
> > collection_id:`__UF&^0.019 collection_id:`__I2g^0.018
> > collection_id:`__PP_^0.01699 collection_id:`__Ysv^0.01599
> > collection_id:`__Oe_^0.01499 collection_id:`__Ysw^0.01399
> > collection_id:`__Wi_^0.01298 collection_id:`__fLi^0.01198
> > collection_id:`__XRk^0.01098 collection_id:`__Uz[^0.00998
> > collection_id:`__SE_^0.00898 collection_id:`__Ysx^0.00798
> > collection_id:`__Ysh^0.006974 collection_id:`__fLh^0.005973 
> > collection_id:`__f _^0.00497 collection_id:`__`^C^0.00397
> > collection_id:`__fKM^0.00297 collection_id:`__Szo^0.00197 
> > collection_id:`__f ]^9.7E-4)
> >
> > But, as you can see, that is less than useful...
> >
> > I spent some time looking at the source and found that Term does not 
> > contain the type of the embedded data.  Any possible solutions to 
> > this short of walking the query and getting the type of each field 
> > from the schema and creating my own print function?
> >
> > Thanks!
> >
> > --

RE: Query.toString printing binary in the output...

2013-03-19 Thread Andrew Lundgren
Thank you for clarifying.

The logging line is this:

logger.info("db retrieve time=" + (System.currentTimeMillis() - start)
    + ", query=" + rb.getQuery().toString().replaceAll("\\p{Cntrl}", "_")
    + ", indexIds=" + getIndexIds(rb));

(The replaceAll call is used to clean out the binary.)
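
For reference, here is a self-contained sketch of what that replaceAll call does to a string containing control characters. The class name LogSanitizer and the sample input are made up for illustration; only the regular expression comes from the logging line above.

```java
// Illustrative helper around the same regular expression as the logging line.
class LogSanitizer {

    // Replaces every control character (U+0000 to U+001F, plus U+007F) with '_'.
    static String sanitize(String raw) {
        return raw.replaceAll("\\p{Cntrl}", "_");
    }

    public static void main(String[] args) {
        // Sample term with two embedded control characters (made-up input).
        String raw = "collection_id:`\u0001\u0002z";
        System.out.println(sanitize(raw));
    }
}
```

The replacement is lossy: distinct control characters all collapse to the same underscore, so the original term value cannot be recovered from the log.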

A complete log looks like this:  (I removed some values and inserted Zs.)

2013-03-19 01:36:58,648 INFO  
[org.apache.solr.handler.component.DatabaseComponent] (http-8080-19) [] [] [] 
[] [] ip-10-212-91-229/10.212.91.229   db retrieve time=53, 
query=+(+(givenname:^1.8 | givenname_standard:^1.08 | 
givenname:?^-3.6179998 | givenname:Z^0.1799) +(surname:^1.8 | 
surname_standard:^1.08) +(birth_year:1855^0.495 | birth_year:1856^0.495 | 
(-marriage_year:[1850 TO 1854]^1.0E-4 -death_year:[1850 TO 1854]^1.0E-4 
-residence_year:[1850 TO 1854]^1.0E-4 -other_year:[1850 TO 1854]^1.0E-4 
+est_birth_year_range:[180 TO 185]^-1.005)) +((+(birth_place:amherst,1929953 | 
birth_place_ancestors:amherst,1929953^0.99 | birth_place:amherst,6279984 | 
birth_place_ancestors:amherst,6279984^0.99 | birth_place:novascotia,1927164^0.7 
| birth_place_ancestors:novascotia,1927164^0.69 | 
birth_place:cumberland,1929953^0.7 | 
birth_place_ancestors:cumberland,1929953^0.69 | birth_place:canada,-1^0.2)) | 
(+birth_place:?^-2.01 +((record_place:amherst,1929953^0.7 | 
record_place_ancestors:amherst,1929953^0.6926 | 
record_place:amherst,6279984^0.7 | 
record_place_ancestors:amherst,6279984^0.6926 | 
record_place:novascotia,1927164^0.4898 | 
record_place_ancestors:novascotia,1927164^0.4828 | 
record_place:cumberland,1929953^0.4898 | 
record_place_ancestors:cumberland,1929953^0.4828 | 
record_place:canada,-1^0.14) is_principal:T^0.01 
(collection_id:`__z_[^0.027 collection_id:`__nB+^0.026 
collection_id:`__Zl_^0.025 collection_id:`__i49^0.024 
collection_id:`__Pq%^0.023 collection_id:`__VCS^0.022 
collection_id:`__WbH^0.021 collection_id:`__Yu_^0.02 collection_id:`__UF&^0.019 
collection_id:`__I2g^0.018 collection_id:`__PP_^0.01699 
collection_id:`__Ysv^0.01599 collection_id:`__Oe_^0.01499 
collection_id:`__Ysw^0.01399 collection_id:`__Wi_^0.01298 
collection_id:`__fLi^0.01198 collection_id:`__XRk^0.01098 
collection_id:`__Uz[^0.00998 collection_id:`__SE_^0.00898 
collection_id:`__Ysx^0.00798 collection_id:`__Ysh^0.006974 
collection_id:`__fLh^0.005973 collection_id:`__f _^0.00497 
collection_id:`__`^C^0.00397 collection_id:`__fKM^0.00297 
collection_id:`__Szo^0.00197 collection_id:`__f ]^9.7E-4) 
record_type:`_^0.11 record_country:Canada^0.1 record_subcountry:Canada,Nova 
Scotia^0.1, indexIds=5649621248770, 5649707485955, 5649774056450, 
5650368372995, 5650800358658, 40314148353, 17914147586, 77849158944, 
77849158945, 77849158946, 77849158947, 77849158948, 77849158949, 77849158950, 
77849158951, 77849158952, 77849158953, 77849158954, 77849158955, 77849158956  


We have seen these types of issues (though the opposite) when querying with 
non-encoded ints.  

When preparing the query we have to encode the collection IDs like this:

Query q = new TermQuery(new Term(SolrTag.COLLECTION_ID.getName(), 
    type.readableToIndexed(Integer.toString(collectionId))));

So perhaps I am using the wrong term when I say encoded; maybe it should have 
been indexed?  But that seems to have other meanings and would potentially be 
more confusing.  These are the Terms printed above that remain in the 
non-readable format when toString is called.  (Perhaps we should be using 
something other than readableToIndexed?)


Thanks!


-----Original Message-----
From: Erick Erickson [mailto:erickerick...@gmail.com] 
Sent: Monday, March 18, 2013 7:55 PM
To: solr-user@lucene.apache.org
Subject: Re: Query.toString printing binary in the output...

If you simply attach &debug=all to your URL, you should see the query come back 
in your response, XML, JSON, whatever. If that also shows bizarre characters, 
then that will give you some idea whether it's in Solr or not.

But you haven't given us much info about how/where you call toString. You may 
be getting into trouble with character sets (although I'd find that quite odd, 
it's a possibility).

What I'm really finding confusing is that you're mentioning Term alongside
query.toString() (at least that's what I think you're saying), which has 
nothing at all to do with Terms, it's just the query string passed in. So I'm 
really puzzled as to what you're doing to get this kind of output, it almost 
looks like you're trying to print out the _results_ of a query, not the query.

So some clarification would be helpful...

Best
Erick


On Mon, Mar 18, 2013 at 12:01 PM, Andrew Lundgren  wrote:

> I am sorry, I don't follow what you mean by debug=query.  Can you 
> elabor

RE: Query.toString printing binary in the output...

2013-03-18 Thread Andrew Lundgren
I am sorry, I don't follow what you mean by debug=query.  Can you elaborate on 
that a bit?

Thanks!

-----Original Message-----
From: Erick Erickson [mailto:erickerick...@gmail.com] 
Sent: Sunday, March 17, 2013 8:09 AM
To: solr-user@lucene.apache.org
Subject: Re: Query.toString printing binary in the output...

Hmmm, without looking at the code, somehow when you specify debug=query you get 
readable results, maybe that code would be a place to start?

And are you looking for the parsed output? Otherwise you could print original 
query.

Not much help
Erick


On Fri, Mar 15, 2013 at 3:24 PM, Andrew Lundgren
wrote:

> We use the toString call on the query in our logs.  For some numeric 
> types, the encoded form of the number is being printed instead of the 
> readable form.
>
> This makes tail and some other tools very unhappy...
>
> Here is a partial example of a query.toString() that would have had 
> binary in it.  As a short term work around I replaced all 
> non-printable characters in the string with an '_'.
>
> (collection_id:`__z_[^0.027 collection_id:`__nB+^0.026
> collection_id:`__Zl_^0.025 collection_id:`__i49^0.024
> collection_id:`__Pq%^0.023 collection_id:`__VCS^0.022
> collection_id:`__WbH^0.021 collection_id:`__Yu_^0.02
> collection_id:`__UF&^0.019 collection_id:`__I2g^0.018
> collection_id:`__PP_^0.01699 collection_id:`__Ysv^0.01599
> collection_id:`__Oe_^0.01499 collection_id:`__Ysw^0.01399
> collection_id:`__Wi_^0.01298 collection_id:`__fLi^0.01198
> collection_id:`__XRk^0.01098 collection_id:`__Uz[^0.00998
> collection_id:`__SE_^0.00898 collection_id:`__Ysx^0.00798
> collection_id:`__Ysh^0.006974 collection_id:`__fLh^0.005973 
> collection_id:`__f _^0.00497 collection_id:`__`^C^0.00397
> collection_id:`__fKM^0.00297 collection_id:`__Szo^0.00197 
> collection_id:`__f ]^9.7E-4)
>
> But, as you can see, that is less than useful...
>
> I spent some time looking at the source and found that Term does not 
> contain the type of the embedded data.  Any possible solutions to this 
> short of walking the query and getting the type of each field from the 
> schema and creating my own print function?
>
> Thanks!
>
> --
> Andrew
>
>
>
>
>
>





Query.toString printing binary in the output...

2013-03-15 Thread Andrew Lundgren
We use the toString call on the query in our logs.  For some numeric types, the 
encoded form of the number is being printed instead of the readable form.

This makes tail and some other tools very unhappy...

Here is a partial example of a query.toString() that would have had binary in 
it.  As a short term work around I replaced all non-printable characters in the 
string with an '_'.

(collection_id:`__z_[^0.027 collection_id:`__nB+^0.026 
collection_id:`__Zl_^0.025 collection_id:`__i49^0.024 
collection_id:`__Pq%^0.023 collection_id:`__VCS^0.022 
collection_id:`__WbH^0.021 collection_id:`__Yu_^0.02 collection_id:`__UF&^0.019 
collection_id:`__I2g^0.018 collection_id:`__PP_^0.01699 
collection_id:`__Ysv^0.01599 collection_id:`__Oe_^0.01499 
collection_id:`__Ysw^0.01399 collection_id:`__Wi_^0.01298 
collection_id:`__fLi^0.01198 collection_id:`__XRk^0.01098 
collection_id:`__Uz[^0.00998 collection_id:`__SE_^0.00898 
collection_id:`__Ysx^0.00798 collection_id:`__Ysh^0.006974 
collection_id:`__fLh^0.005973 collection_id:`__f _^0.00497 
collection_id:`__`^C^0.00397 collection_id:`__fKM^0.00297 
collection_id:`__Szo^0.00197 collection_id:`__f ]^9.7E-4)

But, as you can see, that is less than useful...

I spent some time looking at the source and found that Term does not contain 
the type of the embedded data.  Any possible solutions to this short of walking 
the query and getting the type of each field from the schema and creating my 
own print function?

Thanks!

--
Andrew







RE: Best way to convert a field in a query to a fq?

2011-12-13 Thread Andrew Lundgren
Thanks for the confirmation!

> -----Original Message-----
> From: Otis Gospodnetic [mailto:otis_gospodne...@yahoo.com]
> Sent: Tuesday, December 13, 2011 1:02 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Best way to convert a field in a query to a fq?
> 
> Hi,
> 
> We've done similar query rewriting in a custom SearchComponent that
> runs before QueryComponent.
> 
> Otis
> 
> Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
> Lucene ecosystem search :: http://search-lucene.com/
> 
> 
> >
> > From: Andrew Lundgren 
> >To: "solr-user@lucene.apache.org" 
> >Sent: Tuesday, December 13, 2011 11:58 AM
> >Subject: Best way to convert a field in a query to a fq?
> >
> >I want to modify incoming queries such that a field is always
> transformed to a filter query.  For example, I want to convert a query
> field like q= ... part_page=3 ...  to a filter query like q= ...
> fq=partpage(3) .
> >
> >Is the right way to do this in a custom component, or is there
> someplace else where this should be handled?
> >
> >We have several clients and would like to protect the server from this
> field being queried on even if they make a mistake.
> >
> >Thank you.
> >
> >--
> >Andrew Lundgren
> >lundg...@familysearch.org
> >
> >
> >
> >
> >
> >
> >






Best way to convert a field in a query to a fq?

2011-12-13 Thread Andrew Lundgren
I want to modify incoming queries such that a field is always transformed to a 
filter query.  For example, I want to convert a query field like q= ... 
part_page=3 ...  to a filter query like q= ... fq=partpage(3) .

Is the right way to do this in a custom component, or is there someplace else 
where this should be handled?

We have several clients and would like to protect the server from this field 
being queried on even if they make a mistake.
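
As a rough sketch of the rewrite itself, independent of where it is hooked in: the code below pulls every clause on one field out of a query string and returns those clauses as a separate filter list. It assumes whitespace-separated field:value clauses with no phrases, parentheses, or +/- prefixes, and the names FieldToFilterRewriter, Rewritten, and rewrite are hypothetical. Wiring this into Solr is not shown.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical rewriter: moves clauses on one field from q into fq values.
class FieldToFilterRewriter {

    // Result pair: the remaining main query and the extracted filter queries.
    static final class Rewritten {
        final String q;
        final List<String> fqs;

        Rewritten(String q, List<String> fqs) {
            this.q = q;
            this.fqs = fqs;
        }
    }

    static Rewritten rewrite(String q, String field) {
        List<String> kept = new ArrayList<>();
        List<String> fqs = new ArrayList<>();
        for (String clause : q.trim().split("\\s+")) {
            if (clause.startsWith(field + ":")) {
                fqs.add(clause);          // becomes a filter query
            } else if (!clause.isEmpty()) {
                kept.add(clause);         // stays in the scored query
            }
        }
        // Assumption: fall back to match-all when nothing is left to score.
        String newQ = kept.isEmpty() ? "*:*" : String.join(" ", kept);
        return new Rewritten(newQ, fqs);
    }

    public static void main(String[] args) {
        Rewritten r = rewrite("surname:smith part_page:3", "part_page");
        System.out.println("q=" + r.q + " fq=" + r.fqs);
    }
}
```

Because the rewrite happens on the server side, a client that puts the field in q by mistake still ends up with it as a filter.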

Thank you.

--
Andrew Lundgren
lundg...@familysearch.org






Possible to configure the fq caching settings on the server?

2011-12-12 Thread Andrew Lundgren
Is it possible to configure Solr so that filter queries default to 
fq={!cache=false}?
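
If no server-side default turns out to be available, one workaround is to add the local parameter to each fq before it is executed. This is only a sketch of that idea: the class and method names are hypothetical, and it leaves alone any fq that already starts with local parameters.

```java
// Hypothetical helper: prefixes an fq with {!cache=false} unless it already
// carries local parameters of its own.
class FilterQueryDefaults {

    static final String NO_CACHE = "{!cache=false}";

    static String withCacheDisabled(String fq) {
        if (fq.startsWith("{!")) {
            return fq; // caller already chose local parameters; do not override
        }
        return NO_CACHE + fq;
    }

    public static void main(String[] args) {
        System.out.println(withCacheDisabled("part_page:3"));
    }
}
```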

--
Andrew Lundgren
lundg...@familysearch.org






filterQuery (fq=) vs q differences other than scoring.

2011-12-09 Thread Andrew Lundgren
I know that fq's are used to improve performance by reducing the data set that 
you score.

I have read the documentation that says that non-cached fq's are created in 
parallel to your query, but would like to know more about how that is done.

Does it do a match on all the FQ's, then AND the resulting doc sets and then 
once that is done score the query based on the resulting subset of documents?
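
The model described in the question can be written down as a small sketch. It only restates the question in code, using java.util.BitSet as a stand-in for a document set; it is not a claim about how Solr actually executes filters, and all names are hypothetical.

```java
import java.util.BitSet;
import java.util.List;

// Sketch of "match each fq, AND the doc sets, then score what is left".
class FilterIntersection {

    // ANDs the doc sets of all filters; the result is the subset left to score.
    static BitSet intersect(int maxDoc, List<BitSet> filterSets) {
        BitSet result = new BitSet(maxDoc);
        result.set(0, maxDoc); // start with every document id in [0, maxDoc)
        for (BitSet filter : filterSets) {
            result.and(filter);
        }
        return result;
    }

    public static void main(String[] args) {
        BitSet a = new BitSet(8);
        a.set(1); a.set(3); a.set(5);
        BitSet b = new BitSet(8);
        b.set(3); b.set(5); b.set(7);
        System.out.println(intersect(8, List.of(a, b)));
    }
}
```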


--
Andrew Lundgren
lundg...@familysearch.org






RE: Solr cache size information

2011-12-01 Thread Andrew Lundgren
> For Filter cache
> 
> size in memory = size in solrconfig.xml * WHAT (the size of an id) ???
> (I
> don't use facet.enum method)
>

As I understand it, size is the number of queries that will be cached.  In my 
limited experience, the memory consumed is data dependent.  If a huge number 
of documents match a FQ, the cached result will be very large; if only a 
single document matches, it will take much less memory.

I don't know if there is a way you can bound the cache by memory rather than 
results.  I think all of the solr caches behave this way, but I am not sure.






RE: Solr filterCache size settings...

2011-11-21 Thread Andrew Lundgren
Thank you for your reply.

One clarification, is the maxdocs the max docs in the set, or the matched docs 
from the set?

If there are 1000 docs and 19 of them match, is the maxdocs 1000, or 19?

--
Andrew

-----Original Message-----
From: Erick Erickson [mailto:erickerick...@gmail.com] 
Sent: Sunday, November 20, 2011 8:10 PM
To: solr-user@lucene.apache.org
Subject: Re: Solr filterCache size settings...

Each fq will create a bitmap that is bounded by (maxdocs / 8) bytes.

You can think of the entries in the filterCache as a map where the key is the
filter query you specify and the value is the aforementioned bitmap. The
number of entries specified in the config file is the number of entries
in that map. So the cache can take up roughly (assuming the size is 512)
512 * maxDocs / 8 bytes.
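
Plugging numbers into that formula, as a sketch: the index size of 10,000,000 documents is an assumed figure for illustration, and the class and method names are hypothetical. The result is the rough upper bound given by the formula, not a measurement.

```java
// Worked arithmetic for the filterCache bound: entries * maxDocs / 8 bytes.
class FilterCacheEstimate {

    // One bit per document, rounded up to whole bytes.
    static long bytesPerEntry(long maxDocs) {
        return (maxDocs + 7) / 8;
    }

    // Upper bound for a cache holding the given number of entries.
    static long maxBytes(int entries, long maxDocs) {
        return entries * bytesPerEntry(maxDocs);
    }

    public static void main(String[] args) {
        long maxDocs = 10_000_000L; // assumed index size
        int entries = 512;          // size from the config file
        System.out.println(maxBytes(entries, maxDocs) + " bytes");
    }
}
```

With these figures each entry is bounded by 1,250,000 bytes and the whole cache by 640,000,000 bytes, roughly 610 MiB.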

Best
Erick

On Fri, Nov 18, 2011 at 6:49 PM, Andrew Lundgren
 wrote:
> I am new to solr in general and trying to get a handle on the memory 
> requirements for caching.   Specifically I am looking at the filterCache 
> right now.  The documentation on size setting seems to indicate that it is 
> the number of values to be cached.  Did I read that correctly, or is it 
> really the amount of memory that will be set aside for the cache?
>
> How do you determine how much cache each fq will consume?
>
> Thank you!
>
> --
> Andrew Lundgren
> lundg...@familysearch.org
>
>
>
>
>


Solr filterCache size settings...

2011-11-18 Thread Andrew Lundgren
I am new to solr in general and trying to get a handle on the memory 
requirements for caching.   Specifically I am looking at the filterCache right 
now.  The documentation on size setting seems to indicate that it is the number 
of values to be cached.  Did I read that correctly, or is it really the amount 
of memory that will be set aside for the cache?

How do you determine how much cache each fq will consume?

Thank you!

--
Andrew Lundgren
lundg...@familysearch.org

