Making fields unavailable for return to specific end points.

2013-04-18 Thread Andrew Lundgren
We have a few internal fields that we would like to restrict from being 
returned in result sets.

I have seen how fl is used to specify the fields that you do want returned; I 
am kind of looking for the opposite.  There are just a few fields that don't 
make sense to return to our clients.

Is there any functionality for a blocked-fl?

Thank you!

--
Andrew


 NOTICE: This email message is for the sole use of the intended recipient(s) 
and may contain confidential and privileged information. Any unauthorized 
review, use, disclosure or distribution is prohibited. If you are not the 
intended recipient, please contact the sender by reply email and destroy all 
copies of the original message.



RE: Making fields unavailable for return to specific end points.

2013-04-18 Thread Andrew Lundgren
Hmm...  Just found this JIRA:  https://issues.apache.org/jira/browse/SOLR-3191

I think I have answered my question.




RE: Query.toString printing binary in the output...

2013-03-20 Thread Andrew Lundgren
I have not.  Just guessing, but that looks like code that walks a query and 
uses the schema to figure out what the types should be.

That looks like the call I should be using.  Any idea how much of a 
performance impact this has compared to just the Query.toString call (which 
admittedly doesn't always work)?

I haven't used the debug option either, but I don't think that is the right 
path because we are currently logging all of the queries, and that seems to be 
targeted more at a one off operation.  (Still helpful to know for those cases 
though.  Thank you.)


-Original Message-
From: Jack Krupansky [mailto:j...@basetechnology.com] 
Sent: Tuesday, March 19, 2013 5:20 PM
To: solr-user@lucene.apache.org
Subject: Re: Query.toString printing binary in the output...

Did you try QueryParsing.toString? As in:

logger.info("db retrieve time=" + (System.currentTimeMillis() - start) +
    ", query=" + QueryParsing.toString(rb.getQuery(), rb.req.getSchema()) +
    ", indexIds=" + getIndexIds(rb));

-- Jack Krupansky

RE: Query.toString printing binary in the output...

2013-03-19 Thread Andrew Lundgren
Thank you for clarifying.

The logging line is this:

logger.info("db retrieve time=" + (System.currentTimeMillis() - start) +
    ", query=" + rb.getQuery().toString().replaceAll("\\p{Cntrl}", "_") +
    ", indexIds=" + getIndexIds(rb));

(The replaceAll call is used to clean out the binary.)

A complete log line looks like this:  (I removed some values and inserted Zs.)

2013-03-19 01:36:58,648 INFO  
[org.apache.solr.handler.component.DatabaseComponent] (http-8080-19) [] [] [] 
[] [] ip-10-212-91-229/10.212.91.229   db retrieve time=53, 
query=+(+(givenname:^1.8 | givenname_standard:^1.08 | 
givenname:?^-3.6179998 | givenname:Z^0.1799) +(surname:^1.8 | 
surname_standard:^1.08) +(birth_year:1855^0.495 | birth_year:1856^0.495 | 
(-marriage_year:[1850 TO 1854]^1.0E-4 -death_year:[1850 TO 1854]^1.0E-4 
-residence_year:[1850 TO 1854]^1.0E-4 -other_year:[1850 TO 1854]^1.0E-4 
+est_birth_year_range:[180 TO 185]^-1.005)) +((+(birth_place:amherst,1929953 | 
birth_place_ancestors:amherst,1929953^0.99 | birth_place:amherst,6279984 | 
birth_place_ancestors:amherst,6279984^0.99 | birth_place:novascotia,1927164^0.7 
| birth_place_ancestors:novascotia,1927164^0.69 | 
birth_place:cumberland,1929953^0.7 | 
birth_place_ancestors:cumberland,1929953^0.69 | birth_place:canada,-1^0.2)) | 
(+birth_place:?^-2.01 +((record_place:amherst,1929953^0.7 | 
record_place_ancestors:amherst,1929953^0.6926 | 
record_place:amherst,6279984^0.7 | 
record_place_ancestors:amherst,6279984^0.6926 | 
record_place:novascotia,1927164^0.4898 | 
record_place_ancestors:novascotia,1927164^0.4828 | 
record_place:cumberland,1929953^0.4898 | 
record_place_ancestors:cumberland,1929953^0.4828 | 
record_place:canada,-1^0.14) is_principal:T^0.01 
(collection_id:`__z_[^0.027 collection_id:`__nB+^0.026 
collection_id:`__Zl_^0.025 collection_id:`__i49^0.024 
collection_id:`__Pq%^0.023 collection_id:`__VCS^0.022 
collection_id:`__WbH^0.021 collection_id:`__Yu_^0.02 collection_id:`__UF^0.019 
collection_id:`__I2g^0.018 collection_id:`__PP_^0.01699 
collection_id:`__Ysv^0.01599 collection_id:`__Oe_^0.01499 
collection_id:`__Ysw^0.01399 collection_id:`__Wi_^0.01298 
collection_id:`__fLi^0.01198 collection_id:`__XRk^0.01098 
collection_id:`__Uz[^0.00998 collection_id:`__SE_^0.00898 
collection_id:`__Ysx^0.00798 collection_id:`__Ysh^0.006974 
collection_id:`__fLh^0.005973 collection_id:`__f _^0.00497 
collection_id:`__`^C^0.00397 collection_id:`__fKM^0.00297 
collection_id:`__Szo^0.00197 collection_id:`__f ]^9.7E-4) 
record_type:`_^0.11 record_country:Canada^0.1 record_subcountry:Canada,Nova 
Scotia^0.1, indexIds=5649621248770, 5649707485955, 5649774056450, 
5650368372995, 5650800358658, 40314148353, 17914147586, 77849158944, 
77849158945, 77849158946, 77849158947, 77849158948, 77849158949, 77849158950, 
77849158951, 77849158952, 77849158953, 77849158954, 77849158955, 77849158956  


We have seen these types of issues (though the opposite) when querying with 
non-encoded ints.  

When preparing the query we have to encode the collection IDs like this:

Query q = new TermQuery(new Term(SolrTag.COLLECTION_ID.getName(), 
    type.readableToIndexed(Integer.toString(collectionId))));

So perhaps I am using the wrong term when I say "encoded"; maybe it should have 
been "indexed"?  But that seems to have other meanings and would potentially be 
more confusing.  These are the Terms printed above that remain in the 
non-readable format when toString is called.  (Perhaps we should be using 
something other than readableToIndexed?)


Thanks!


-Original Message-
From: Erick Erickson [mailto:erickerick...@gmail.com] 
Sent: Monday, March 18, 2013 7:55 PM
To: solr-user@lucene.apache.org
Subject: Re: Query.toString printing binary in the output...

If you simply attach debug=all to your URL, you should see the query come back 
in your response, XML, JSON, whatever. If that also shows bizarre characters, 
then that will give you some idea whether it's in Solr or not.

But you haven't given us much info about how/where you call toString. You may 
be getting into trouble with character sets (although I'd find that quite odd, 
it's a possibility).

What I'm really finding confusing is that you're mentioning Term alongside
query.toString() (at least that's what I think you're saying), which has 
nothing at all to do with Terms, it's just the query string passed in. So I'm 
really puzzled as to what you're doing to get this kind of output, it almost 
looks like you're trying to print out the _results_ of a query, not the query.

So some clarification would be helpful...

Best
Erick



RE: Query.toString printing binary in the output...

2013-03-19 Thread Andrew Lundgren
This is perhaps more clear:

Assuming you have a schema where:

  <field name="collection_id" type="integer" indexed="true" stored="false"
         required="true" omitTermFreqAndPositions="true"/>

Then:

  void testSamplePrint() throws IOException, SAXException,
      ParserConfigurationException {

    SolrConfig config = new SolrConfig("solrconfig.xml");
    IndexSchema schema = new IndexSchema(config, "schema.xml", null);

    TermQuery aTerm = new TermQuery(new Term("TestString", "123456"));
    TermQuery bTerm = new TermQuery(new Term("TestString",
        schema.getField("collection_id").getType().readableToIndexed("123456")));

    System.out.printf("%s\n", aTerm.toString());
    System.out.printf("%s\n", bTerm.toString());

    assertEquals(aTerm.toString(), bTerm.toString());
  }

The test output is: 

java.lang.AssertionError: 
Expected :TestString:123456
Actual   :TestString:`

I believe that this is because the Term does not know that it contains an 
encoded integer, and thus cannot parse it.  If the TermQuery knew the type, it 
could also decode it.  But w/o a query to the schema, I don't know how to get 
the toString to function correctly.
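
To illustrate why the indexed form is unreadable, here is a self-contained sketch with no Solr or Lucene dependency. The thread never says which class backs the "integer" type; the leading '`' (0x60) in the logged terms suggests a Trie-style int whose indexed form uses the prefix coding from Lucene's NumericUtils. The class PrefixCodedInt and its methods are hypothetical, written from memory of that scheme for shift 0 only, so treat the byte layout as an assumption rather than a reference.

```java
public class PrefixCodedInt {
    // Assumed marker byte for a shift-0 int term ('`' in ASCII).
    static final int SHIFT_START_INT = 0x60;

    /** Encodes val as one marker byte followed by five 7-bit groups. */
    public static byte[] encode(int val) {
        byte[] out = new byte[6];
        out[0] = (byte) SHIFT_START_INT;
        int sortableBits = val ^ 0x80000000; // flip the sign bit so terms sort numerically
        for (int i = 5; i >= 1; i--) {
            out[i] = (byte) (sortableBits & 0x7f); // lowest 7 bits go in the last byte
            sortableBits >>>= 7;
        }
        return out;
    }

    /** Reverses encode(): rebuilds the int from the 7-bit groups. */
    public static int decode(byte[] term) {
        if (term.length != 6 || term[0] != SHIFT_START_INT) {
            throw new NumberFormatException("not a shift-0 prefix-coded int");
        }
        int sortableBits = 0;
        for (int i = 1; i < term.length; i++) {
            sortableBits = (sortableBits << 7) | term[i];
        }
        return sortableBits ^ 0x80000000;
    }
}
```

Under this assumed layout every payload byte is below 0x80, so small values produce bytes in the control-character range, which is consistent with the underscores left behind by the replaceAll workaround. A printer that knows the field type could apply a decode step like this to each term before logging it.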



RE: Query.toString printing binary in the output...

2013-03-18 Thread Andrew Lundgren
I am sorry, I don't follow what you mean by debug=query.  Can you elaborate on 
that a bit?

Thanks!

-Original Message-
From: Erick Erickson [mailto:erickerick...@gmail.com] 
Sent: Sunday, March 17, 2013 8:09 AM
To: solr-user@lucene.apache.org
Subject: Re: Query.toString printing binary in the output...

Hmmm, without looking at the code, somehow when you specify debug=query you get 
readable results, maybe that code would be a place to start?

And are you looking for the parsed output? Otherwise you could print original 
query.

Not much help
Erick





Query.toString printing binary in the output...

2013-03-15 Thread Andrew Lundgren
We use the toString call on the query in our logs.  For some numeric types, the 
encoded form of the number is being printed instead of the readable form.

This makes tail and some other tools very unhappy...

Here is a partial example of a query.toString() that would have had binary in 
it.  As a short term work around I replaced all non-printable characters in the 
string with an '_'.

(collection_id:`__z_[^0.027 collection_id:`__nB+^0.026 
collection_id:`__Zl_^0.025 collection_id:`__i49^0.024 
collection_id:`__Pq%^0.023 collection_id:`__VCS^0.022 
collection_id:`__WbH^0.021 collection_id:`__Yu_^0.02 collection_id:`__UF^0.019 
collection_id:`__I2g^0.018 collection_id:`__PP_^0.01699 
collection_id:`__Ysv^0.01599 collection_id:`__Oe_^0.01499 
collection_id:`__Ysw^0.01399 collection_id:`__Wi_^0.01298 
collection_id:`__fLi^0.01198 collection_id:`__XRk^0.01098 
collection_id:`__Uz[^0.00998 collection_id:`__SE_^0.00898 
collection_id:`__Ysx^0.00798 collection_id:`__Ysh^0.006974 
collection_id:`__fLh^0.005973 collection_id:`__f _^0.00497 
collection_id:`__`^C^0.00397 collection_id:`__fKM^0.00297 
collection_id:`__Szo^0.00197 collection_id:`__f ]^9.7E-4)

But, as you can see, that is less than useful...

I spent some time looking at the source and found that Term does not contain 
the type of the embedded data.  Any possible solutions to this short of walking 
the query and getting the type of each field from the schema and creating my 
own print function?

Thanks!

--
Andrew







Best way to convert a field in a query to a fq?

2011-12-13 Thread Andrew Lundgren
I want to modify incoming queries such that a field is always transformed to a 
filter query.  For example, I want to convert a query field like q= ... 
part_page=3 ...  to a filter query like q= ... fq=partpage(3) .

Is the right way to do this in a custom component, or is there someplace else 
where this should be handled?

We have several clients and would like to protect the server from this field 
being queried on even if they make a mistake.

Thank you.

--
Andrew Lundgren
lundg...@familysearch.org






RE: Best way to convert a field in a query to a fq?

2011-12-13 Thread Andrew Lundgren
Thanks for the confirmation!

 -Original Message-
 From: Otis Gospodnetic [mailto:otis_gospodne...@yahoo.com]
 Sent: Tuesday, December 13, 2011 1:02 PM
 To: solr-user@lucene.apache.org
 Subject: Re: Best way to convert a field in a query to a fq?
 
 Hi,
 
 We've done similar query rewriting in a custom SearchComponent that
 runs before QueryComponent.
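
A minimal sketch of the kind of rewriting such a component might perform, written as plain Java with no Solr dependency. It assumes the field appears in q as a simple unquoted clause such as part_page:3 (the thread does not show the exact syntax). The class name FieldToFilter, the Rewritten holder and the *:* fallback for an emptied query are illustrative assumptions, not part of any Solr API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FieldToFilter {
    /** Result of the rewrite: the remaining main query plus the extracted filters. */
    public static final class Rewritten {
        public final String q;
        public final List<String> fq;
        Rewritten(String q, List<String> fq) { this.q = q; this.fq = fq; }
    }

    // Matches whitespace-delimited clauses such as part_page:3 or +part_page:3.
    // Quoted values and clauses nested in parentheses are not handled.
    private static final Pattern CLAUSE =
        Pattern.compile("(?<!\\S)\\+?part_page:(\\S+)");

    public static Rewritten rewrite(String q) {
        List<String> fq = new ArrayList<>();
        Matcher m = CLAUSE.matcher(q);
        StringBuffer rest = new StringBuffer();
        while (m.find()) {
            fq.add("part_page:" + m.group(1)); // move the clause to a filter
            m.appendReplacement(rest, "");     // and drop it from the main query
        }
        m.appendTail(rest);
        String remaining = rest.toString().trim().replaceAll("\\s+", " ");
        // If nothing is left, fall back to a match-all query.
        return new Rewritten(remaining.isEmpty() ? "*:*" : remaining, fq);
    }
}
```

In a real deployment this logic would live in the custom component and write the new q and fq values back into the request parameters before QueryComponent sees them. A proper query parser would be more robust than the regular expression used here.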
 
 Otis
 
 Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
 Lucene ecosystem search :: http://search-lucene.com/
 
 
 




Possible to configure the fq caching settings on the server?

2011-12-12 Thread Andrew Lundgren
Is it possible to configure Solr so that filter queries default to 
fq={!cache=false}?

--
Andrew Lundgren
lundg...@familysearch.org






filterQuery (fq=) vs q differences other than scoring.

2011-12-09 Thread Andrew Lundgren
I know that fq's are used to improve performance by reducing the data set that 
you score.

I have read the documentation that says that non-cached fq's are created in 
parallel to your query, but would like to know more about how that is done.

Does it do a match on all the FQ's, then AND the resulting doc sets and then 
once that is done score the query based on the resulting subset of documents?


--
Andrew Lundgren
lundg...@familysearch.org






RE: Solr cache size information

2011-12-01 Thread Andrew Lundgren
 For Filter cache
 
 size in memory = size in solrconfig.xml * WHAT (the size of an id) ???
 (I don't use facet.enum method)


As I understand it, size is the number of queries that will be cached.  In my 
limited experience, the memory consumed is data dependent.  If a huge number of 
documents match an fq, the cached result will be very large; if you get a 
single match, the cached result will take much less memory. 

I don't know if there is a way you can bound the cache by memory rather than 
results.  I think all of the solr caches behave this way, but I am not sure.






RE: Solr filterCache size settings...

2011-11-21 Thread Andrew Lundgren
Thank you for your reply.

One clarification, is the maxdocs the max docs in the set, or the matched docs 
from the set?

If there are 1000 docs and 19 of them match, is the maxdocs 1000, or 19?

--
Andrew

-Original Message-
From: Erick Erickson [mailto:erickerick...@gmail.com] 
Sent: Sunday, November 20, 2011 8:10 PM
To: solr-user@lucene.apache.org
Subject: Re: Solr filterCache size settings...

Each fq will create a bitmap that is bounded by (maxdocs / 8) bytes.

You can think of the entries in the filterCache as a map where the key is the
filter query you specify and the value is the aforementioned bitmap. The
number of entries specified in the config file is the number of entries
in that map. So the cache can take up roughly (assuming the size is 512)
512 * maxDocs / 8 bytes.
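
The arithmetic above can be sketched as a small helper. This is only a worst-case estimate under the one-bit-per-document model described here; the class and method names are hypothetical, and maxDoc is taken to mean the number of documents in the index rather than the number that match a given filter.

```java
public class FilterCacheEstimate {
    /** Upper bound in bytes for one cached filter: one bit per document in the index. */
    public static long bytesPerEntry(long maxDoc) {
        return (maxDoc + 7) / 8; // round up to whole bytes
    }

    /** Upper bound in bytes for a cache that holds `entries` filters. */
    public static long bytesForCache(int entries, long maxDoc) {
        return entries * bytesPerEntry(maxDoc);
    }
}
```

For a hypothetical index of 10,000,000 documents, bytesPerEntry gives 1,250,000 bytes and bytesForCache(512, 10000000) gives 640,000,000 bytes. For the 1000-document case asked about in this thread, the bound is 125 bytes per entry whether 19 or 1000 documents match.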

Best
Erick






Solr filterCache size settings...

2011-11-18 Thread Andrew Lundgren
I am new to solr in general and trying to get a handle on the memory 
requirements for caching.   Specifically I am looking at the filterCache right 
now.  The documentation on size setting seems to indicate that it is the number 
of values to be cached.  Did I read that correctly, or is it really the amount 
of memory that will be set aside for the cache?

How do you determine how much cache each fq will consume?

Thank you!

--
Andrew Lundgren
lundg...@familysearch.org

