Antw: Re: SolrJ 8.2: Too many Connection evictor threads

2020-02-11 Thread Andreas Kahl
Erick, 


Thanks, that's why we want to upgrade our clients to the same Solr(J) version 
as the server. But I am still fighting the uncontrolled creation of those 
Connection evictor threads in my Tomcat. 


Best Regards

Andreas


>>> Erick Erickson  11.02.20 15:06 >>>
Are you running a 5x SolrJ client against an 8x server? There’s no
guarantee at all that that would work (or vice-versa for that matter).

Most generally, SolrJ clients should be able to work with version X-1, but X-3
is unsupported.

Best,
Erick

> On Feb 11, 2020, at 6:36 AM, Andreas Kahl  wrote:
> 
> Hello everyone, 
> 
> we just updated our Solr from 5.4 to 8.2. The server runs fine,
> but in our client applications we are seeing issues with thousands of
> threads created with the name "Connection evictor". 
> Can you give a hint on how to limit these threads? 
> Would it be better to use HttpSolrClient or Http2SolrClient?
> Is another version of SolrJ advisable?
> 
> Thanks & Best Regards
> Andreas
> 




SolrJ 8.2: Too many Connection evictor threads

2020-02-11 Thread Andreas Kahl
Hello everyone, 

we just updated our Solr from 5.4 to 8.2. The server runs fine,
but in our client applications we are seeing issues with thousands of
threads created with the name "Connection evictor". 
Can you give a hint on how to limit these threads? 
Would it be better to use HttpSolrClient or Http2SolrClient?
Is another version of SolrJ advisable?

Thanks & Best Regards
Andreas
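[Editor's note] A frequent cause of this symptom is constructing a new HttpSolrClient per request: every client creates its own Apache HttpClient connection pool, and every pool starts one "Connection evictor" thread. A minimal sketch of the usual fix, sharing a single client across the webapp (URL and core name are placeholders; assumes SolrJ 8.x on the classpath):

```java
import java.io.IOException;
import org.apache.solr.client.solrj.impl.HttpSolrClient;

public class SolrClientHolder {
    // One client for the whole webapp; HttpSolrClient instances are
    // thread-safe, so there is no need to create one per request.
    private static final HttpSolrClient CLIENT =
            new HttpSolrClient.Builder("http://localhost:8983/solr/mycore").build();

    public static HttpSolrClient get() {
        return CLIENT;
    }

    // Call once on webapp shutdown (e.g. from a ServletContextListener)
    // so the connection pool and its evictor thread are released.
    public static void shutdown() throws IOException {
        CLIENT.close();
    }
}
```

If clients really must be created on the fly, closing each one with close() also stops its evictor thread.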



Use of facet.pivot possible when there are multiple paths per document

2017-04-03 Thread Andreas Kahl
Hello everyone,

we intend to index a set of documents with a monohierarchical
classification. For the classification we need hierarchical facets in
our UI. We would like to use Pivot facets because they are more flexible
than hierarchical facets; but we are wondering if it is possible to
index multiple hierarchical entries for a single document?
E.g.
doc {
  class1: 123/456/789
  class2: abc/def/ghi
}
-> index:
pivotLevel1 {123, abc}
pivotLevel2 {456, def}
pivotLevel3 {789, ghi}

In the results, the numbers should not get mixed up with the letters. Could
this be achieved with dynamic fields? Does the facet.pivot parameter
support wildcards as field names? Are there any other ideas we are not
considering at the moment?

Thanks & Best Regards
Andreas
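[Editor's note] One technique that keeps the two hierarchies from mixing is to index, per level, a token that carries its full ancestor path, and then drill down with facet.prefix on a single multiValued field rather than facet.pivot. A sketch of the client-side encoding (field and token layout are illustrative, not a Solr API):

```java
import java.util.ArrayList;
import java.util.List;

public class HierarchyTokens {
    // Encode a classification path as one token per level, each carrying
    // its full ancestor path, so entries from different hierarchies
    // (e.g. 123/456 vs abc/def) can never combine during facet drill-down.
    static List<String> encode(String path) {
        List<String> tokens = new ArrayList<>();
        String[] parts = path.split("/");
        StringBuilder prefix = new StringBuilder();
        for (int i = 0; i < parts.length; i++) {
            if (i > 0) prefix.append('/');
            prefix.append(parts[i]);
            tokens.add(i + "/" + prefix);
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(encode("123/456/789"));
        // [0/123, 1/123/456, 2/123/456/789]
    }
}
```

Faceting with facet.prefix=1/123/ then returns only the children of 123, regardless of what other hierarchies the same documents carry.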



Antw: RE: Access Solr via Apache's mod_proxy_balancer or mod_jk (AJP)

2016-07-06 Thread Andreas Kahl
Thanks, Shawn and Daniel for your feedback. We will consider that and see what 
fits best into our environment. 

Regards
Andreas




>>> "Davis, Daniel (NIH/NLM) [C]"  05.07.16 19:36 >>>
Because access to Solr is typically to an API, rather than to webapps having 
images and static files that can be served directly, I think you can use 
mod_proxy_http just as well as mod_jk.   I would suggest you not pursue trying 
to get AJP to work.

mod_proxy_balancer will work with mod_proxy_http, but you may also want to 
consider using varnish as a front-end cache rather than Apache httpd.   I’m not 
sure about that architecture myself, because varnish’s strength is in caching 
the data from the backend systems, and Solr’s data should primarily not be 
cached.   However, varnish is very commonly used for this sort of thing, and 
if you also have other things behind the balancer (such as WordPress or 
Drupal), then varnish becomes an even better way to go.

Hope this helps,

Dan Davis, Systems/Applications Architect (Contractor),
Office of Computer and Communications Systems,
National Library of Medicine, NIH



From: Andreas Kahl [mailto:k...@bsb-muenchen.de]
Sent: Monday, July 04, 2016 5:54 AM
To: solr-user@lucene.apache.org
Subject: Access Solr via Apache's mod_proxy_balancer or mod_jk (AJP)

Hello everyone,

we've set up two Solr servers (not SolrCloud) which shall be accessed via Apache 
webserver's load balancing (either mod_proxy_balancer or mod_jk).

1. Is it possible to configure Solr >5 to enable an AJP port, as was the 
case in earlier versions when running in Tomcat?

2. If AJP/mod_jk is not possible, how should I set up mod_proxy_balancer? At 
the moment I run into the error "All workers are in error state". This is my 
current Apache config:

BalancerMember http://server1:
BalancerMember http://server2:

ProxyPass /solrCluster balancer://solrCluster/solr
ProxyPassReverse /solrCluster balancer://solrCluster/solr

Accessing a single server with a non-balanced reverse proxy works perfectly, but 
somehow mod_proxy_balancer's health checks get negative responses from Solr. 
Any ideas what's going wrong? (I already tried putting /solr into the 
BalancerMembers to avoid the redirect from / to /solr.)

Thanks
Andreas




Access Solr via Apache's mod_proxy_balancer or mod_jk (AJP)

2016-07-04 Thread Andreas Kahl
Hello everyone, 

we've set up two Solr servers (not SolrCloud) which shall be accessed via
Apache webserver's load balancing (either mod_proxy_balancer or mod_jk).


1. Is it possible to configure Solr >5 to enable an AJP port, as was
the case in earlier versions when running in Tomcat? 

2. If AJP/mod_jk is not possible, how should I set up
mod_proxy_balancer? At the moment I run into the error "All workers are
in error state". This is my current Apache config: 

BalancerMember http://server1:
BalancerMember http://server2:

ProxyPass /solrCluster balancer://solrCluster/solr
ProxyPassReverse /solrCluster balancer://solrCluster/solr

Accessing a single server with a non-balanced reverse proxy works
perfectly, but somehow mod_proxy_balancer's health checks get
negative responses from Solr. Any ideas what's going wrong? (I already
tried putting /solr into the BalancerMembers to avoid the redirect from
/ to /solr.)

Thanks
Andreas
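[Editor's note] The archive has stripped the surrounding directives from the config above; a complete balancer definition usually wraps the members in a <Proxy> block. A sketch with assumed ports (the original ports were truncated by the archive; 8983 is Solr's default, adjust to your servers), requiring mod_proxy, mod_proxy_http and mod_proxy_balancer to be loaded:

```apache
<Proxy balancer://solrCluster>
    BalancerMember http://server1:8983/solr
    BalancerMember http://server2:8983/solr
</Proxy>

ProxyPass        /solrCluster balancer://solrCluster
ProxyPassReverse /solrCluster balancer://solrCluster
```

Note that when /solr is already part of each BalancerMember URL, the ProxyPass target should not repeat it; "All workers are in error state" often just means every member URL, as written, returned an error to the health check.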




Is there a way to see if a JOIN retrieved any results from the secondary index?

2015-08-09 Thread Andreas Kahl
Hello everyone,
 
we have two cores in our Solr index (Solr 5.1). The primary index contains 
metadata, the secondary contains full texts. We use JOINs to query the primary 
index and include results from the secondary.
Now we are trying to find a way to see in the results whether a result document 
has hits in the secondary full-text index (because then we need to do some 
follow-up queries to retrieve snippets).
Is this possible?
 
Thanks
Andreas


Re: Antw: Re: How to retrieve field contents as UTF-8 from Solr-Index with SolrJ

2012-10-19 Thread Andreas Kahl
Fetching the same records using a raw HTTP request works fine and the
characters are OK. I am actually considering fetching the data in Java
via raw HTTP requests + XSLTResponseWriter as a workaround, but I want to
try the 'native' way with SolrJ first. 

Andreas
 
>>> "Jack Krupansky"  18.10.2012 21:36 >>> 
Have you verified that the data was indexed properly (UTF-8 encoding)?
Try a 
raw HTTP request using the browser or curl and see how that field looks
in 
the resulting XML.

-- Jack Krupansky

-----Original Message-----
From: Andreas Kahl
Sent: Thursday, October 18, 2012 1:10 PM
To: j...@basetechnology.com ; solr-user@lucene.apache.org
Subject: Antw: Re: How to retrieve field contents as UTF-8 from
Solr-Index 
with SolrJ

Jack,

Thanks for the hint, but we have already set URIEncoding="UTF-8" on
all
our tomcats, too.

Regards
Andreas

>>> "Jack Krupansky"  18.10.12 17:11 >>>
It may be that your container does not have UTF-8 enabled. For
example,
with
Tomcat you need something like:



Make sure your "Connector" element has URIEncoding="UTF-8" (for
Tomcat.)

-- Jack Krupansky

-----Original Message-----
From: Andreas Kahl
Sent: Thursday, October 18, 2012 10:53 AM
To: solr-user@lucene.apache.org
Subject: How to retrieve field contents as UTF-8 from Solr-Index with
SolrJ

Hello everyone,

we are trying to implement a simple servlet querying a Solr 3.5 index
with SolrJ. The query we send is an identifier used to retrieve a
single record. From the result we extract one field to return. This
field contains an XML document with characters from several European
and Asian alphabets, so we need UTF-8.

Now we have the problem that the string returned by
marcXml = results.get(0).getFirstValue("marcxml").toString();
is not valid UTF-8, so the resulting XML document is not well-formed.

Here is what we do in Java:
<<
ModifiableSolrParams params = new ModifiableSolrParams();
params.set("q", query.toString());
params.set("fl", "marcxml");
params.set("rows", "1");
try {
    QueryResponse result = server.query(params, SolrRequest.METHOD.POST);
    SolrDocumentList results = result.getResults();
    if (!results.isEmpty()) {
        marcXml = results.get(0).getFirstValue("marcxml").toString();
    }
} catch (Exception ex) {
    Logger.getLogger(MarcServer.class.getName()).log(Level.SEVERE, null, ex);
}
>>

Charset.defaultCharset() is "UTF-8" on both the querying machine and
the Solr server. We also tried BinaryResponseParser as well as
XMLResponseParser when instantiating CommonsHttpSolrServer.

Does anyone have a solution to this? Is this related to
https://issues.apache.org/jira/browse/SOLR-2034 ? Is there
perhaps a workaround?

Regards
Andreas





Antw: Re: How to retrieve field contents as UTF-8 from Solr-Index with SolrJ

2012-10-18 Thread Andreas Kahl
Jack, 

Thanks for the hint, but we have already set URIEncoding="UTF-8" on all
our tomcats, too. 

Regards
Andreas

>>> "Jack Krupansky"  18.10.12 17:11 >>>
It may be that your container does not have UTF-8 enabled. For example,
with 
Tomcat you need something like:



Make sure your "Connector" element has URIEncoding="UTF-8" (for Tomcat.)
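[Editor's note: the Connector snippet itself was stripped by the list archive. For Tomcat's server.xml it would look roughly like this; port, timeout and redirectPort are placeholders:]

```xml
<Connector port="8080" protocol="HTTP/1.1"
           connectionTimeout="20000"
           redirectPort="8443"
           URIEncoding="UTF-8" />
```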

-- Jack Krupansky

-----Original Message-----
From: Andreas Kahl
Sent: Thursday, October 18, 2012 10:53 AM
To: solr-user@lucene.apache.org
Subject: How to retrieve field contents as UTF-8 from Solr-Index with
SolrJ

Hello everyone,

we are trying to implement a simple servlet querying a Solr 3.5 index
with SolrJ. The query we send is an identifier used to retrieve a
single record. From the result we extract one field to return. This
field contains an XML document with characters from several European and
Asian alphabets, so we need UTF-8.

Now we have the problem that the string returned by
marcXml = results.get(0).getFirstValue("marcxml").toString();
is not valid UTF-8, so the resulting XML document is not well-formed.

Here is what we do in Java:
<<
ModifiableSolrParams params = new ModifiableSolrParams();
params.set("q", query.toString());
params.set("fl", "marcxml");
params.set("rows", "1");
try {
    QueryResponse result = server.query(params, SolrRequest.METHOD.POST);
    SolrDocumentList results = result.getResults();
    if (!results.isEmpty()) {
        marcXml = results.get(0).getFirstValue("marcxml").toString();
    }
} catch (Exception ex) {
    Logger.getLogger(MarcServer.class.getName()).log(Level.SEVERE, null, ex);
}
>>

Charset.defaultCharset() is "UTF-8" on both the querying machine and
the Solr server. We also tried BinaryResponseParser as well as
XMLResponseParser when instantiating CommonsHttpSolrServer.

Does anyone have a solution to this? Is this related to
https://issues.apache.org/jira/browse/SOLR-2034 ? Is there
perhaps a workaround?

Regards
Andreas





How to retrieve field contents as UTF-8 from Solr-Index with SolrJ

2012-10-18 Thread Andreas Kahl
Hello everyone, 

we are trying to implement a simple servlet querying a Solr 3.5 index
with SolrJ. The query we send is an identifier used to retrieve a
single record. From the result we extract one field to return. This
field contains an XML document with characters from several European and
Asian alphabets, so we need UTF-8. 

Now we have the problem that the string returned by 
marcXml = results.get(0).getFirstValue("marcxml").toString();
is not valid UTF-8, so the resulting XML document is not well-formed. 

Here is what we do in Java: 
<<
ModifiableSolrParams params = new ModifiableSolrParams();
params.set("q", query.toString());
params.set("fl", "marcxml");
params.set("rows", "1");
try {
    QueryResponse result = server.query(params, SolrRequest.METHOD.POST);
    SolrDocumentList results = result.getResults();
    if (!results.isEmpty()) {
        marcXml = results.get(0).getFirstValue("marcxml").toString();
    }
} catch (Exception ex) {
    Logger.getLogger(MarcServer.class.getName()).log(Level.SEVERE, null, ex);
}
>>

Charset.defaultCharset() is "UTF-8" on both the querying machine and
the Solr server. We also tried BinaryResponseParser as well as
XMLResponseParser when instantiating CommonsHttpSolrServer. 

Does anyone have a solution to this? Is this related to
https://issues.apache.org/jira/browse/SOLR-2034 ? Is there
perhaps a workaround?

Regards
Andreas
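[Editor's note] Independent of SolrJ, one thing worth checking in this setup is that the servlet writes its response through a Writer with an explicit charset rather than relying on the platform default. A self-contained sketch of the lossless round-trip (the sample string is arbitrary):

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

public class Utf8WriteDemo {
    // Encode a string to bytes through a Writer with an explicit charset,
    // as a servlet should, instead of depending on the JVM default charset.
    static byte[] writeUtf8(String s) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer w = new OutputStreamWriter(out, StandardCharsets.UTF_8)) {
            w.write(s);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        String marcXml = "<subfield>M\u00FCller \u65E5\u672C\u8A9E</subfield>";
        // Decoding with the same charset round-trips losslessly.
        String decoded = new String(writeUtf8(marcXml), StandardCharsets.UTF_8);
        System.out.println(decoded.equals(marcXml)); // true
    }
}
```

In a servlet, the equivalent is calling response.setCharacterEncoding("UTF-8") before obtaining the writer; a String inside the JVM is not itself "UTF-8 or not", the encoding only matters at the byte boundary.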





Re: Looking for Best Practices: Analyzers vs. UpdateRequestProcessors?

2009-11-27 Thread Andreas Kahl

On 26.11.09 11:07, Shalin Shekhar Mangar wrote:

On Wed, Nov 25, 2009 at 9:52 PM, Andreas Kahl  wrote:

   

Hello,

are there any general criteria for when to use Analyzers to implement an
indexing function and when it is better to use UpdateRequestProcessors?

The main difference I found in the documentation was that
UpdateRequestProcessors are able to manipulate several fields at once
(create, read, update, delete), while Analyzers operate on the contents of a
single field at once.


 

Analyzers can only change indexed content. If a field is marked as "stored",
then it is stored and retrieved un-modified. If you want to modify the
"stored" part as well, then only an UpdateRequestProcessor can do that. In
other words, the field's value after applying UpdateRequestProcessors is fed
into analyzers (for indexed field) and stored verbatim (for stored fields).

   

Thank you very much for your answer. That cleared things up for me.

Andreas
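[Editor's note] The division of labor Shalin describes can be sketched with plain-Java stand-ins; neither class below is a real Solr API, the names are purely illustrative:

```java
import java.util.Arrays;
import java.util.List;

public class StoredVsIndexedDemo {
    // Stand-in for an UpdateRequestProcessor: rewrites the field value
    // before Solr stores or indexes it, so both sides see the change.
    static String processorStep(String raw) {
        return raw.trim();
    }

    // Stand-in for an Analyzer: produces index tokens only; the stored
    // value is never touched by analysis.
    static List<String> analyzerStep(String fieldValue) {
        return Arrays.asList(fieldValue.toLowerCase().split("\\s+"));
    }

    public static void main(String[] args) {
        String raw = "  Hello World  ";
        String afterProcessor = processorStep(raw); // fed to both paths
        String stored = afterProcessor;             // stored verbatim
        List<String> indexed = analyzerStep(afterProcessor); // index tokens
        System.out.println(stored);   // Hello World
        System.out.println(indexed);  // [hello, world]
    }
}
```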


Looking for Best Practices: Analyzers vs. UpdateRequestProcessors?

2009-11-25 Thread Andreas Kahl
Hello, 

are there any general criteria for when to use Analyzers to implement an indexing 
function and when it is better to use UpdateRequestProcessors? 

The main difference I found in the documentation was that 
UpdateRequestProcessors are able to manipulate several fields at once (create, 
read, update, delete), while Analyzers operate on the contents of a single 
field at once. 

Is that correct so far? Are there further experiences that help decide which type 
of module to use when implementing indexing features? Are there differences in 
processing performance? Is one of the two APIs easier to learn or debug, etc.?

If you have any Best Practices with that I would be very interested to hear 
about those. 

Andreas

P.S. My experience with search engines is mainly with FAST, where one uses 
Stages in a Pipeline regardless of which feature is being implemented. 


Re: Normalizing multiple Chars with MappingCharFilter possible?

2009-11-24 Thread Andreas Kahl


On 24.11.09 12:30, Koji Sekiguchi wrote:
> Andreas Kahl wrote:
>> Hello everyone,
>>
>> is it possible to normalize Strings like '`e' (2 chars) => 'e' (in
>> contrast to 'é' (1 char) => 'e') with
>> org.apache.lucene.analysis.MappingCharFilter?
>>
>> I am asking this because I am considering indexing some multilingual
>> and multi-alphabetic data with Solr which uses such strings as
>> substitutes for 'real' Unicode characters.
>> Thanks for your advice.
>> Andreas
>>
>>
>>   
> Yes. It should work.
> MappingCharFilter supports:
>
> * char-to-char
> * string-to-char
> * char-to-string
> * string-to-string
>
> without misalignment of original offsets (i.e. highlighter works
> correctly with MappingCharFilters).
>
> Koji
>
Thanks Koji. That was all I needed to know.

Andreas
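[Editor's note] For reference, the string-to-char case from the question would be expressed in the mapping file roughly like this (the file name is assumed):

```text
# mapping.txt: normalize both the two-character spelling and the
# precomposed character to a plain 'e'
"`e" => "e"
"\u00E9" => "e"
```

The file is then referenced from the field type's analyzer chain via <charFilter class="solr.MappingCharFilterFactory" mapping="mapping.txt"/>.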





Normalizing multiple Chars with MappingCharFilter possible?

2009-11-24 Thread Andreas Kahl
Hello everyone,

is it possible to normalize strings like '`e' (2 chars) => 'e' (in contrast to 
'é' (1 char) => 'e') with org.apache.lucene.analysis.MappingCharFilter?

I am asking this because I am considering indexing some multilingual and 
multi-alphabetic data with Solr which uses such strings as substitutes for 
'real' Unicode characters. 

Thanks for your advice. 

Andreas