Re: optimize boosting parameters

2020-12-07 Thread Radu Gheorghe
oosting parameters based on the requirements. > > For example, I am using 'if', 'map' and 'termfreq' functions in the bf > parameters. > > Is there a more efficient or simple function that can be used instead? Or > craft the 'formula' in a more efficient way? > > On 7/12/2020 10:05 p

Re: doc for REQUESTSTATUS

2020-12-07 Thread Radu Gheorghe
Hi Elisabeth, This is the doc for REQUESTSTATUS; apparently only the request ID is indeed supported: https://lucene.apache.org/solr/guide/8_6/coreadmin-api.html#coreadmin-requeststatus Best regards, Radu -- Sematext Cloud - Full Stack Observability - https://sematext.com Solr and Elasticsearch

Re: optimize boosting parameters

2020-12-07 Thread Radu Gheorghe
Hi Derek, It’s hard to tell whether your boosts can be made better without knowing your data and what users expect of it. Which is a problem in itself. I would suggest gathering judgements, like if a user queries for X, what doc IDs do you expect to get back? Once you have enough of these

Re: Proximity Search with phrases

2020-12-03 Thread Radu Gheorghe
Hi Mark, I don’t really get your use-case. Maybe you can provide another example? In either case, maybe the surround query parser would help? https://lucene.apache.org/solr/guide/8_4/other-parsers.html#surround-query-parser Or span queries in general via the XML query parser?

Re: Shard Lock

2020-12-03 Thread Radu Gheorghe
Wild shot here: two Solr instances started on the same data directory? Best regards, Radu -- Sematext Cloud - Full Stack Observability - https://sematext.com Solr and Elasticsearch Consulting, Training and Production Support > On 1 Dec 2020, at 06:25, sambasivarao giddaluri > wrote: > > when

Re: Facet to part of search results

2020-12-03 Thread Radu Gheorghe
> On 3 Dec 2020, at 20:18, Shawn Heisey wrote: > > On 12/3/2020 9:55 AM, Jae Joo wrote: >> Is there any way to apply facet to the partial search result? >> For ex, we have 10m return by "dog" and like to apply facet to first 10K. >> Possible? > > The point of facets is to provide accurate

Re: facet.method=smart

2020-12-03 Thread Radu Gheorghe
Hi Jae, No, it’s not smarter than explicitly defining, for example enum for a low-cardinality field. Think of “smart” as a default path, and explicit definitions as some “hints”. You can see that default path in this function:

What do you usually look for in Solr logs?

2020-11-26 Thread Radu Gheorghe
Hello Solr users, I've recently added a Solr logs integration to our logging SaaS and I wanted to ask what would be useful that I may have missed. First, there are some regexes to parse Solr Logs here:

Re: Is metrics api enabled by default in solr 8.2

2020-10-14 Thread Radu Gheorghe
Hi, Yes, the API works by default on 8.2: https://lucene.apache.org/solr/guide/8_2/metrics-reporting.html I don’t know of a way to disable it, but the configuration is described in the page above (i.e. how to configure different reporters). Best regards, Radu -- Sematext Cloud - Full Stack
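As a rough illustration of querying the metrics API mentioned above, here is a minimal Python sketch that only builds the request URL. It assumes the standard /solr/admin/metrics endpoint with its group and prefix parameters; the host, port and the helper name metrics_url are made up for the example.

```python
from urllib.parse import urlencode


def metrics_url(base_url, group=None, prefix=None):
    """Build a Metrics API URL (hypothetical helper, nothing is sent)."""
    params = {"wt": "json"}
    if group:
        # e.g. "jvm", "core", "node", "jetty"
        params["group"] = group
    if prefix:
        # only return metrics whose name starts with this prefix
        params["prefix"] = prefix
    return f"{base_url}/solr/admin/metrics?{urlencode(params)}"


# Assumed local Solr on the default port
print(metrics_url("http://localhost:8983", group="jvm"))
```

The resulting URL can be fetched with curl or any HTTP client.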

Re: how to config split authentication methods -- BasicAuth for WebUI, & none (or SSL client) for client connections?

2020-10-14 Thread Radu Gheorghe
Hello, If you enable authentication, this will work on your HTTP port. Solr won’t differentiate based on whether the request comes from the Web UI or Dovecot. I guess the workaround could be to put the web UI behind a proxy like NGINX and have authentication there? But if anyone can have direct

Re: Solr Document Update issues

2020-10-14 Thread Radu Gheorghe
Hi, I wouldn’t commit on every update. The general practice is to use autoCommit and autoSoftCommit, so this work is done in the background depending on how quickly you want data persisted and available for search:
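The original message is cut off at this point, so the following is not the author's example. It is a minimal Python sketch of one way to set these intervals, assuming the Config API's set-property command and its updateHandler.autoCommit.maxTime and updateHandler.autoSoftCommit.maxTime properties. The collection name, host and millisecond values are placeholders, and the same settings can instead be edited directly in solrconfig.xml.

```python
import json
import urllib.request


def build_set_property_request(base_url, collection, prop, value):
    """Build (but do not send) a Config API request setting one property.

    Hypothetical helper: one property per request keeps the payload simple.
    """
    payload = {"set-property": {prop: value}}
    return urllib.request.Request(
        url=f"{base_url}/solr/{collection}/config",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder values: hard commit every 15s (persistence),
# soft commit every 1s (visibility for search).
settings = {
    "updateHandler.autoCommit.maxTime": 15000,
    "updateHandler.autoSoftCommit.maxTime": 1000,
}
for prop, value in settings.items():
    req = build_set_property_request("http://localhost:8983", "test", prop, value)
    print(req.get_method(), req.full_url, req.data.decode("utf-8"))
    # To actually apply it: urllib.request.urlopen(req)
```

With something like this in place, the indexing client sends updates without an explicit commit and lets Solr commit on its own schedule.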

Re: Question regarding replica leader

2020-07-19 Thread Radu Gheorghe
Hi Vishal, I think that’s true, yes. The cluster has a leader (overseer), but this particular shard doesn’t seem to have a leader (yet). Logs should give you some pointers about why this happens (it may be, for example, that each replica is waiting for the other to become a leader, because

Re: Log4J Logging to Http

2020-06-18 Thread Radu Gheorghe
Hi Florian, I don’t know the answer to your specific question, but I would like to suggest a different approach. Excuse me in advance, I usually hate suggesting different approaches. The reason I suggest a different approach is that logging via HTTP can block a thread e.g. until

Re: How to determine why solr stops running?

2020-06-08 Thread Radu Gheorghe
check the last logs after it crashed. Best regards, Radu https://sematext.com > On 8 Jun 2020, at 16:28, Ryan W wrote: > > "If Solr auto-restarts" > > It doesn't auto-restart. Is there some auto-restart functionality? I'm > not aware of that. > > On Mon, Ju

Re: Getting to grips with auto-scaling

2020-06-08 Thread Radu Gheorghe
Hi Tom, To your last two questions, I'd like to vent an alternative design: have dedicated "hot" and "warm" nodes. That is, 2020+lists will go to the hot tier, and 2019, 2018,2017+lists go to the warm tier. Then you can scale the hot tier based on your query load. For the warm tier, I assume

Re: How to determine why solr stops running?

2020-06-08 Thread Radu Gheorghe
Hi Ryan, If Solr auto-restarts, I suppose it's systemd doing that. When it restarts the Solr service, systemd should log this (maybe something like: journalctl --no-pager | grep -i solr). Then you can go in your Solr logs and check what happened right before that time. Also, check system logs

Re: Shingles behavior

2020-05-21 Thread Radu Gheorghe
Turns out, it’s down to setting enableGraphQueries=false in the field definition. I completely missed that :( > On 21 May 2020, at 07:49, Radu Gheorghe wrote: > > Hi Alex, long time no see :) > > I tried with sow, and that basically invalidates query-time shingles (it only

Re: Shingles behavior

2020-05-20 Thread Radu Gheorghe
g on both passes > rather than just indexing one. But at least it is something to try and > is one of the difference areas between Solr and ES. > > Regards, >Alex. > > On Tue, 19 May 2020 at 05:59, Radu Gheorghe > wrote: > > > > Hello Solr users, > > >

Shingles behavior

2020-05-19 Thread Radu Gheorghe
Hello Solr users, I’m quite puzzled about how shingles work. The way tokens are analysed looks fine to me, but the query seems too restrictive. Here’s the sample use-case. I have three documents: mona lisa smile mona lisa mona I have a shingle filter set up like this (both index- and

Re: Which Solr metrics do you find important?

2020-04-29 Thread Radu Gheorghe
that stuff. > > > > I wrote this when I was using datadog to supplement what they offered: > > https://github.com/msporleder/dd-solrcloud/blob/master/solrcloud.py > > (sorry for crappy python) and it got me most of the monitoring I > > needed for my particular situation. >

Re: Which Solr metrics do you find important?

2020-04-28 Thread Radu Gheorghe
> > > > On Tue, Apr 28, 2020 at 7:57 AM Radu Gheorghe > wrote: > > > > Hi fellow Solr users, > > > > I'm looking into improving our Solr monitoring > > <https://sematext.com/docs/integration/solr/> and I was curious on which > > metrics you co

Which Solr metrics do you find important?

2020-04-28 Thread Radu Gheorghe
Hi fellow Solr users, I'm looking into improving our Solr monitoring and I was curious about which metrics you consider relevant. From what we currently have, I'm only really missing fieldCache. Which we collect, but not show in the UI yet (unless you

Re: Filtered join in Solr?

2020-02-05 Thread Radu Gheorghe
01-01T00:00:00Z"], > "watched_movies":["2"], > "_version_":1657646162827542528, > "movies":{"numFound":1,"start":0,"docs":[ > { > "id":"2", >

Filtered join in Solr?

2020-02-04 Thread Radu Gheorghe
Hello Solr users, How would you design a filtered join scenario? Say I have a bunch of movies (excuse any inaccuracies, this is an imagined scenario): curl -XPOST -H 'Content-Type: application/json' 'localhost:8983/solr/test/update?commitWithin=1000' --data-binary ' [{ "id": "1", "title":

solr-diagnostics: utility for collecting info from the Solr installation

2020-01-16 Thread Radu Gheorghe
Hello Solr users :) We just published a small tool that collects diagnostics information: configs, logs, metrics API output, etc as well as system info (dmesg, netstat, top...). I thought others might find it interesting, so here's a short blog post that describes it:

Re: Partial results from streaming expressions (i.e. making them "stream")

2018-01-17 Thread Radu Gheorghe
> http://joelsolr.blogspot.com/ > > On Wed, Jan 17, 2018 at 8:54 AM, Radu Gheorghe <radu.gheor...@sematext.com> > wrote: > >> Hello, >> >> I have some updates on this, but it's still not very clear for me how >> to move forward. >> >> T

Re: Partial results from streaming expressions (i.e. making them "stream")

2018-01-17 Thread Radu Gheorghe
earch Analytics Solr & Elasticsearch Support * http://sematext.com/ On Mon, Jan 15, 2018 at 10:58 AM, Radu Gheorghe <radu.gheor...@sematext.com> wrote: > Hello fellow solr-users! > > Currently, if I do an HTTP request to receive some data via streaming > expressions, like:

Partial results from streaming expressions (i.e. making them "stream")

2018-01-15 Thread Radu Gheorghe
Hello fellow solr-users! Currently, if I do an HTTP request to receive some data via streaming expressions, like: curl --data-urlencode 'expr=search(test, q="foo_s:*", fl="foo_s", sort="foo_s

autoAddReplicas doesn't respect replicationFactor?

2017-10-03 Thread Radu Gheorghe
Hello, I'm trying to figure out if this is an issue or I'm doing something wrong. Basically, with Solr 6.6.1 running on HDFS (Hadoop 2.7.4), I see that if I create a one-shard collection with replicationFactor=1 (on one node), then add a second node, it creates a new replica on this new node. I

Re: Multiple rollups/facets in one streaming aggregation?

2016-08-17 Thread Radu Gheorghe
les from each worker and then merge the metrics and then emit the >> merged metrics in an EOF Tuple. >> >> If you think this meets your needs, feel free to create a jira and >> begin a patch and I can help get it committed. >> >> >> Joel Bernstein

Multiple rollups/facets in one streaming aggregation?

2016-08-16 Thread Radu Gheorghe
Hello Solr users :) Right now it seems that if I want to rollup on two different fields with streaming expressions, I would need to do two separate requests. This is too slow for our use-case, when we need to do joins before sorting and rolling up (because we'd have to re-do the joins). Since in