Re: SOLR 4.0 + ReversedWildcardFilterFactory + DefaultSolrHighlighter + multibyte chars => crash?

2012-10-29 Thread Tomas Zerolo
On Mon, Oct 29, 2012 at 08:55:27AM -0700, Ahmet Arslan wrote: > Hi Tomas, > > I think this is same case Marian reported before. > > https://issues.apache.org/jira/browse/SOLR-3193 > https://issues.apache.org/jira/browse/SOLR-3901 Thanks, Ahmet. Yes, by the descriptions they look very similar. I'

SOLR 4.0 + ReversedWildcardFilterFactory + DefaultSolrHighlighter + multibyte chars => crash?

2012-10-29 Thread Tomas Zerolo
Hi, SOLR gurus, we're experiencing a crash with SOLR 4.0 whenever the results contain multibyte characters (more precisely: German umlauts, UTF-8 encoded). The crashes only occur when using ReversedWildcardFilterFactory (which is necessary in 4.0 to be able to have wildcards at the beginning of th

Re: SOLR 4.0 / Jetty Security Set Up

2012-09-07 Thread Tomas Zerolo
On Fri, Sep 07, 2012 at 08:50:58AM +0200, Paul Libbrecht wrote: > Erick, > > I think that should be described differently... > You need to set-up protected access for some paths. > /update is one of them. > And you could make this protected at the jetty level or using Apache proxies > and rewrite

Re: AW: Indexing wildcard patterns

2012-08-13 Thread Tomas Zerolo
On Fri, Aug 10, 2012 at 12:38:46PM -0400, Jack Krupansky wrote: > "Doc1 has the pattern "AB%CD%" associated with it (somehow?!)." > > You need to clarify what you mean by that. I'm not the OP, but I think (s)he means the patterns are in the database and the string to match is given in the query.

Re: Lexical analysis tools for German language data

2012-04-12 Thread Tomas Zerolo
On Thu, Apr 12, 2012 at 03:46:56PM +, Michael Ludwig wrote: > > Von: Walter Underwood > > > German noun decompounding is a little more complicated than it might > > seem. > > > > There can be transformations or inflections, like the "s" in > > "Weihnachtsbaum" (Weihnachten/Baum). > > I remembe

Re: Solr as an part of api to unburden databases

2012-02-15 Thread Tomas Zerolo
On Wed, Feb 15, 2012 at 11:48:14AM +0100, Ramo Karahasan wrote: > Hi, > > > > does anyone of the mailing list users use solr as an API to avoid database > queries? [...] Like in a... cache? Why not use a cache then? (memcached, for example, but there are more). Regards -- tomás

Re: how to avoid OOM while merge index

2012-01-09 Thread Tomas Zerolo
On Mon, Jan 09, 2012 at 01:29:39PM +0800, James wrote: > I am build the solr index on the hadoop, and at reduce step I run the task > that merge the indexes, each part of index is about 1G, I have 10 indexes to > merge them together, I always get the java heap memory exhausted, the heap > size i

Re: Poor performance on distributed search

2011-12-20 Thread Tomas Zerolo
On Mon, Dec 19, 2011 at 01:32:22PM -0800, ku3ia wrote: > >>Uhm, either I misunderstand your question or you're doing > >>a lot of extra work for nothing > > >>The whole point of sharding is exactly to collect the top N docs > >>from each shard and merge them into a single result [...] > >>

Re: Don't snowball depending on terms

2011-11-29 Thread Tomas Zerolo
On Tue, Nov 29, 2011 at 01:53:44PM -0500, François Schiettecatte wrote: > It won't and depending on how your analyzer is set up the terms are most > likely stemmed at index time. > > You could create a separate field for unstemmed terms though, or use a less > aggressive stemmer such as EnglishM

Re: Filtering results based on a set of values for a field

2011-08-18 Thread Tomas Zerolo
On Thu, Aug 18, 2011 at 02:32:48PM -0400, Erick Erickson wrote: > Hmmm, I'm still not getting it... > > You have one or more lists. These lists change once a month or so. Are > you trying > to include or exclude the documents in these lists? In our specific case to include *only* the documents ha

Re: Filtering results based on a set of values for a field

2011-08-18 Thread Tomas Zerolo
On Thu, Aug 18, 2011 at 08:36:08AM -0400, Erick Erickson wrote: > How does this list of authors get selected? The reason I'm asking is > I'm wondering > if you can "define the problem away". In other words, I'm wondering if this > is an XY problem (http://people.apache.org/~hossman/#xyproblem). :-

Re: Filtering results based on a set of values for a field

2011-08-17 Thread Tomas Zerolo
On Tue, Aug 16, 2011 at 07:56:51AM +, tomas.zer...@axelspringer.de wrote: > Hello, Solrs > > we are trying to filter out documents written by (one or more of) the authors > from > a mediumish list (~2K). The document set itself is in the millions. [...] Sorry. Forgot to say that we are usin

Re: Faceted Search Patent Lawsuit - Please Read

2011-08-17 Thread Tomas Zerolo
On Tue, Aug 16, 2011 at 03:58:29PM -0400, Grant Ingersoll wrote: > I know you mean well and are probably wondering what to do next [...] Still, a short heads-up like Johnson's would seem OK? After all, this is of concern to us all. Regards -- tomás