Re: High cpu and gc time when performing optimization.

2016-07-12 Thread Otis Gospodnetic
Heap: start small and increase as necessary. Leave as much RAM as possible for the FS cache; don't give it to the JVM until it starts crying. SPM for Solr will help you see when Solr and JVM are starting to hurt. Otis > On Jul 12, 2016, at 11:45, Jason wrote: > > I'm using optimize

Re: Indexing logs in Solr

2016-06-05 Thread Otis Gospodnetic
You can ship SOLR logs to Logsene or any other log management service and not worry too much about their storage/size. Otis > On Jun 5, 2016, at 02:08, Anil wrote: > > Hi , > > i would like to index logs using to enable search on it in our application. > > The problem

Re: Best way to track cumulative GC pauses in Solr

2015-11-13 Thread Otis Gospodnetic
Hi Tom, SPM for SOLR should be helpful here. See http://sematext.com/spm Otis > On Nov 13, 2015, at 10:00, Tom Evans wrote: > > Hi all > > We have some issues with our Solr servers spending too much time > paused doing GC. From turning on gc debug, and extracting

Re: Best strategy for logging security

2015-06-02 Thread Otis Gospodnetic
Logstash is open-source and free. At some point Sematext contributed Solr connector/output to Logstash. Here are some numbers about Logstash (and rsyslog, which is also an option, though it doesn't have Solr output):

Re: Solr Performance with Ram size variation

2015-04-17 Thread Otis Gospodnetic
Hi, Because you went over 31-32 GB heap you lost the benefit of compressed pointers and even though you gave the JVM more memory the GC may have had to work harder. This is a relatively well educated guess, which you can confirm if you run tests and look at GC counts, times, JVM heap memory pool
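A minimal sketch of the heap-size check implied above. It assumes HotSpot with the default 8-byte object alignment, where compressed pointers stop working at roughly 32 GiB; the function names and the exact cutoff are illustrative, and the real limit on a given JVM can be slightly lower. Checking the JVM itself (for example `java -Xmx31g -XX:+PrintFlagsFinal -version` and looking for `UseCompressedOops`) is the more reliable test.

```python
# Assumption: HotSpot with default 8-byte object alignment, where compressed
# object pointers can address just under 32 GiB of heap.
COMPRESSED_OOPS_LIMIT_BYTES = 32 * 1024**3


def gib(n: int) -> int:
    """Convert a size in GiB to bytes."""
    return n * 1024**3


def likely_uses_compressed_oops(max_heap_bytes: int) -> bool:
    """Return True if a heap of this size probably keeps compressed pointers."""
    return max_heap_bytes < COMPRESSED_OOPS_LIMIT_BYTES


if __name__ == "__main__":
    for size in (8, 30, 40):
        print(size, "GiB ->", likely_uses_compressed_oops(gib(size)))
```

Under this assumption a 30 GiB heap stays below the cutoff while a 40 GiB heap does not, which is why the larger heap can end up doing more GC work.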

Re: Measuring QPS

2015-04-06 Thread Otis Gospodnetic
Hi Daniel, See SPM http://sematext.com/spm/, which will give you QPS and a bunch of other Solr, JVM, and OS metrics, along with alerting, anomaly detection, and not-yet-announced transaction tracing https://sematext.atlassian.net/wiki/display/PUBSPM/Transactions+Tracing. It has percentiles Wunder

Re: Best way to monitor Solr regarding crashes

2015-03-28 Thread Otis Gospodnetic
Hi Michael , SPM - http://sematext.com/spm will help. It can monitor all SOLR and JVM metrics and alert you when their values cross thresholds or become abnormal. In your case I'd first look at the JVM metrics - memory pools and their utilization. Heartbeat alert will notify you when your

Re: Solr Monitoring - Stored Stats?

2015-03-26 Thread Otis Gospodnetic
Matt, SPM will give you all that out of the box with alerts, anomaly detection etc. See http://sematext.com/spm Otis On Mar 25, 2015, at 11:26, Matt Kuiper matt.kui...@issinc.com wrote: Hello, I am familiar with the JMX points that Solr exposes to allow for monitoring of statistics

Re: How To Remove an Alert

2015-03-23 Thread Otis Gospodnetic
Hi, I think this may have been for Sematext SPM http://sematext.com/spm/ for Solr monitoring and Jack got our help a few hours ago. Otis -- Monitoring * Alerting * Anomaly Detection * Centralized Log Management Solr Elasticsearch Support * http://sematext.com/ On Mon, Mar 23, 2015 at 7:46

Re: backport Heliosearch features to Solr

2015-03-01 Thread Otis Gospodnetic
Hi Yonik, Now that you joined Cloudera, why not everything? Otis -- Monitoring * Alerting * Anomaly Detection * Centralized Log Management Solr Elasticsearch Support * http://sematext.com/ On Sun, Mar 1, 2015 at 4:50 PM, Yonik Seeley ysee...@gmail.com wrote: As many of you know, I've been

Re: how to debug solr performance degradation

2015-02-25 Thread Otis Gospodnetic
Lots of suggestions here already. +1 for those JVM params from Boogie and for looking at JMX. Rebecca, try SPM http://sematext.com/spm (will look at JMX for you, among other things), it may save you time figuring out JVM/heap/memory/performance issues. If you can't tell what's slow via SPM, we

Re: Confirm Solr index corruption

2015-02-18 Thread Otis Gospodnetic
Hi, It sounds like Solr simply could not index some docs. The index is not corrupt, it's just that indexing was failing while disk was full. You'll need to re-send/re-add/re-index the missing docs (or simply all of them if you don't know which ones are missing). Otis -- Monitoring * Alerting *

Re: 43sec commit duration - blocked by index merge events?

2015-02-13 Thread Otis Gospodnetic
gilinac...@gmail.com wrote: Thanks Otis, can you confirm that a commit call will wait for merges to complete before returning? On Thu, Feb 12, 2015 at 8:46 PM, Otis Gospodnetic otis.gospodne...@gmail.com wrote: If you are using Solr and SPM for Solr, you can check a report that shows

Re: How to make SolrCloud more elastic

2015-02-13 Thread Otis Gospodnetic
, Matt -Original Message- From: Otis Gospodnetic [mailto:otis.gospodne...@gmail.com] Sent: Wednesday, February 11, 2015 9:13 PM To: solr-user@lucene.apache.org Subject: Re: How to make SolrCloud more elastic Hi Matt, You could create extra shards up front, but if your queries

Re: Solr scoring confusion

2015-02-13 Thread Otis Gospodnetic
Hi Scott, Try optimizing after reindexing and this should go away. It has to do with updated/deleted docs participating in score computation. Otis On Feb 13, 2015, at 18:29, Scott Johnson sjohn...@dag.com wrote: We are getting inconsistent scoring results in Solr. It works about 95% of the

Re: Multy-tenancy and quarantee of service per application (tenant)

2015-02-12 Thread Otis Gospodnetic
Not really, not 100%, if tenants share the same hardware and there is no isolation through things like containers (in which case they don't share the same SolrCloud cluster, really). Otis -- Monitoring * Alerting * Anomaly Detection * Centralized Log Management Solr Elasticsearch Support *

Re: 43sec commit duration - blocked by index merge events?

2015-02-12 Thread Otis Gospodnetic
If you are using Solr and SPM for Solr, you can check a report that shows the # of files in an index and the report that shows you the max docs-num docs delta. If you see the # of files drop during a commit, that's a merge. If you see a big delta change, that's probably a merge, too. You could
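The heuristic above (a drop in the number of index files, or a large change in the maxDoc minus numDocs delta, suggests a merge) can be sketched roughly as follows. The sample structure, field names, and thresholds are hypothetical and would need tuning against real monitoring data.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class IndexSample:
    """One monitoring sample; the field names are hypothetical."""
    file_count: int  # number of files in the index directory
    max_doc: int     # maxDoc, which still counts deleted docs
    num_docs: int    # numDocs, live docs only


def likely_merge_points(samples: List[IndexSample],
                        min_file_drop: int = 1,
                        min_delta_drop: int = 1000) -> List[int]:
    """Return indices of samples that probably follow a segment merge."""
    hits = []
    for i in range(1, len(samples)):
        prev, cur = samples[i - 1], samples[i]
        file_drop = prev.file_count - cur.file_count
        # maxDoc - numDocs is the number of deleted docs not yet purged.
        delta_drop = (prev.max_doc - prev.num_docs) - (cur.max_doc - cur.num_docs)
        if file_drop >= min_file_drop or delta_drop >= min_delta_drop:
            hits.append(i)
    return hits
```

Lining the returned indices up with commit timestamps gives a first indication of whether slow commits coincide with merges.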

Re: Solrcloud performance issues

2015-02-12 Thread Otis Gospodnetic
Hi, Did you say you have 150 servers in this cluster? And 10 shards for just 90M docs? If so, 150 hosts sounds like too many for all the other numbers I see here. I'd love to see some metrics here. e.g. what happens with disk IO around those commits? How about GC time/size info? Are JVM

Re: Solr 4.10.x on Oracle Java 1.8.x ?

2015-02-11 Thread Otis Gospodnetic
Bok Jakov, We've been running Solr with Java 8 for several months without issues. Otis -- Monitoring * Alerting * Anomaly Detection * Centralized Log Management Solr Elasticsearch Support * http://sematext.com/ On Tue, Feb 10, 2015 at 3:03 PM, Jakov Sosic jso...@gmail.com wrote: Hi guys,

Re: Multi words query

2015-02-11 Thread Otis Gospodnetic
Hi, Can you share details about how exactly you are querying Solr? Otis -- Monitoring * Alerting * Anomaly Detection * Centralized Log Management Solr Elasticsearch Support * http://sematext.com/ On Wed, Feb 11, 2015 at 5:21 AM, melb melaggo...@gmail.com wrote: Hi, I have a solr collection

Re: How to make SolrCloud more elastic

2015-02-11 Thread Otis Gospodnetic
Hi Matt, You could create extra shards up front, but if your queries are fanned out to all of them, you can run into situations where there are too many concurrent queries per node causing lots of context switching and ultimately being less efficient than if you had fewer shards. So while this
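A back-of-the-envelope sketch of the fan-out effect described above, assuming every query hits every shard and shards are spread evenly over nodes. The function name and the numbers are illustrative only.

```python
import math


def subrequests_per_node(concurrent_queries: int, shards: int, nodes: int) -> int:
    """Approximate concurrent shard-level requests each node has to serve.

    Assumes each query fans out to all shards and shards are evenly
    distributed across nodes.
    """
    return math.ceil(concurrent_queries * shards / nodes)


if __name__ == "__main__":
    # Same 4 nodes and 20 concurrent queries, different shard counts.
    print(subrequests_per_node(20, 4, 4))
    print(subrequests_per_node(20, 16, 4))
```

With the node count fixed, quadrupling the shard count quadruples the shard-level requests each node handles at once, which is where the extra context switching comes from.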

Re: Solrcloud (to HDFS) poor indexing performance

2015-02-10 Thread Otis Gospodnetic
Hi Tim, Although I doubt Kafka is the problem, I'd look at that first and eliminate that. What about those Flume agents? How are they behaving in terms of CPU/GC, and such? You have 18 Solr nodes. What happens if you increase the number of Flume sinks? Are you seeing anything specific that

Re: Garbage Collection tuning - G1 is now a good option

2015-01-07 Thread Otis Gospodnetic
Not sure about AggressiveOpts, but G1 has been working for us nicely. We've successfully used it with HBase, Hadoop, Elasticsearch, and other custom Java apps (all still Java 7, but Java 8 should be even better). Not sure if we are using it on our Solr instances. e.g. see

Re: .htaccess / password

2015-01-06 Thread Otis Gospodnetic
Hi Craig, If you want to protect Solr, put it behind something like Apache / Nginx / HAProxy and put .htaccess at that level, in front of Solr. Or try something like http://blog.jelastic.com/2013/06/17/secure-access-to-your-jetty-web-application/ Otis -- Monitoring * Alerting * Anomaly Detection

Re: Solr on HDFS in a Hadoop cluster

2015-01-06 Thread Otis Gospodnetic
Hi Charles, See http://search-lucene.com/?q=solr+hdfs and https://cwiki.apache.org/confluence/display/solr/Running+Solr+on+HDFS Otis -- Monitoring * Alerting * Anomaly Detection * Centralized Log Management Solr Elasticsearch Support * http://sematext.com/ On Tue, Jan 6, 2015 at 11:02 AM,

Re: Solr on HDFS in a Hadoop cluster

2015-01-06 Thread Otis Gospodnetic
Oh, and https://issues.apache.org/jira/browse/SOLR-6743 Otis -- Monitoring * Alerting * Anomaly Detection * Centralized Log Management Solr Elasticsearch Support * http://sematext.com/ On Tue, Jan 6, 2015 at 12:52 PM, Otis Gospodnetic otis.gospodne...@gmail.com wrote: Hi Charles, See http

Re: SolrCloud multi-datacenter failover?

2015-01-05 Thread Otis Gospodnetic
Hi, Check http://search-lucene.com/?q=%22Cross+Data+Center+Replicaton%22 - http://issues.apache.org/jira/browse/SOLR-6273 Otis -- Monitoring * Alerting * Anomaly Detection * Centralized Log Management Solr Elasticsearch Support * http://sematext.com/ On Fri, Jan 2, 2015 at 4:52 PM, jaime

Re: questions about default operator within solr query string

2015-01-05 Thread Otis Gospodnetic
Hi Chun, Something like: +slug:variety +slug:entertainment headline:entertainment should work. But you may also want to use filter queries for slug filtering: http://search-lucene.com/?q=fqfc_project=Solr
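A small sketch of building that query string and URL-encoding it for the `q` parameter. The helper names are hypothetical; the point worth noting is that `+` must be percent-encoded, since an unencoded `+` in a URL is decoded as a space and the clause silently stops being mandatory.

```python
from urllib.parse import urlencode


def required(field: str, term: str) -> str:
    """Build a mandatory clause such as +slug:variety."""
    return f"+{field}:{term}"


def optional(field: str, term: str) -> str:
    """Build an optional clause that only influences scoring."""
    return f"{field}:{term}"


# Both slug terms must match; the headline term is optional.
q = " ".join([
    required("slug", "variety"),
    required("slug", "entertainment"),
    optional("headline", "entertainment"),
])

# urlencode turns "+" into %2B and spaces into "+".
query_string = urlencode({"q": q})

if __name__ == "__main__":
    print(q)
    print(query_string)
```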

Re: Solr performance issues

2014-12-26 Thread Otis Gospodnetic
Likely lots of disk + network IO, yes. Put SPM for Solr on your nodes to double check. Otis On Dec 26, 2014, at 09:17, Mahmoud Almokadem prog.mahm...@gmail.com wrote: Dears, We've installed a cluster of one collection of 350M documents on 3 r3.2xlarge (60GB RAM) Amazon servers. The

# of daily/weekly/monthly Solr downloads?

2014-12-09 Thread Otis Gospodnetic
Hi, Does anyone know the number of daily/weekly/monthly Solr downloads? Thanks, Otis -- Monitoring * Alerting * Anomaly Detection * Centralized Log Management Solr Elasticsearch Support * http://sematext.com/

Re: Standardized index metrics (Was: Constantly high disk read access (40-60M/s))

2014-12-01 Thread Otis Gospodnetic
Hi, On Sat, Nov 29, 2014 at 2:27 PM, Michael Sokolov msoko...@safaribooksonline.com wrote: On 11/29/14 1:30 PM, Toke Eskildsen wrote: Michael Sokolov [msoko...@safaribooksonline.com] wrote: I wonder if there's any value in providing this metric (total index size - stored field size - term

Re: Constantly high disk read access (40-60M/s)

2014-12-01 Thread Otis Gospodnetic
Po-Yu, To add to what others have said: * Your query cache is clearly not serving its purpose, so you are just wasting your heap on it. Consider disabling it. * That's a pretty big index. Do your queries really always have to go against the whole index? Are there multiple tenants in this index

Re: Replicate a collection to a 2nd SolrCloud

2014-11-25 Thread Otis Gospodnetic
Hi, I think you are looking for this: http://search-lucene.com/?q=Cross+Data+Center+Replicationfc_project=Solr == https://issues.apache.org/jira/browse/SOLR-6273 Otis -- Monitoring * Alerting * Anomaly Detection * Centralized Log Management Solr Elasticsearch Support * http://sematext.com/ On

Re: Does any solr version use lucene concurrent flush

2014-11-25 Thread Otis Gospodnetic
Yes. Otis -- Monitoring * Alerting * Anomaly Detection * Centralized Log Management Solr Elasticsearch Support * http://sematext.com/ On Tue, Nov 25, 2014 at 4:37 PM, Aaron Beach aaron.be...@sendgrid.com wrote: -- Aaron Beach Senior Data Scientist w: +1-303-625-7043 SendGrid -- Email

Re: New Meetup in London - Lucene/Solr User Group

2014-11-18 Thread Otis Gospodnetic
Would LOVE to see the results (assuming you can ensure the same fruit(s?) are being compared) Otis -- Monitoring * Alerting * Anomaly Detection * Centralized Log Management Solr Elasticsearch Support * http://sematext.com/ On Tue, Nov 18, 2014 at 11:55 AM, Alexandre Rafalovitch

Re: Search for partial name in Solr 4.x

2014-11-09 Thread Otis Gospodnetic
Hi, You may be looking for wildcard queries or ngrams. Otis On Nov 9, 2014, at 3:26 PM, PeriS peri.subrahma...@htcinc.com wrote: I was wondering if there is a way to search on partial names? Ex; Field is a string and stores values like titles of a book; When searching part of the

Re: recovery process - node with stale data elected leader

2014-11-07 Thread Otis Gospodnetic
Hi, Not a direct answer to your question, sorry, but since 4.6.0 is relatively old and there have been a ton of changes around leader election, syncing, replication, etc., I'd first jump to the latest Solr and then see if this is still a problem. Otis -- Monitoring * Alerting * Anomaly Detection

Re: Migrating cloud to another set of machines

2014-10-30 Thread Otis Gospodnetic
, 2014 at 1:16 PM, Jakov Sosic jso...@gmail.com wrote: On 10/30/2014 04:47 AM, Otis Gospodnetic wrote: Hi/Bok Jakov, 2) sounds good to me. It means no down-time. 1) means stoppage. If stoppage is not OK, but falling behind with indexing new content is OK, you could: * add a new cluster

Re: Migrating cloud to another set of machines

2014-10-29 Thread Otis Gospodnetic
Hi/Bok Jakov, 2) sounds good to me. It means no down-time. 1) means stoppage. If stoppage is not OK, but falling behind with indexing new content is OK, you could: * add a new cluster * start reading from old index and indexing into the new index * stop old cluster when done * index new

Re: Heavy Multi-threaded indexing and SolrCloud 4.10.1 replicas out of synch.

2014-10-27 Thread Otis Gospodnetic
Hi, You may simply be overwhelming your cluster-nodes. Have you checked various metrics to see if that is the case? Otis -- Monitoring * Alerting * Anomaly Detection * Centralized Log Management Solr Elasticsearch Support * http://sematext.com/ On Oct 26, 2014, at 9:59 PM, S.L

Re: about Solr log file

2014-10-22 Thread Otis Gospodnetic
Hi Chunki, Having logs on the local disk is not a problem. You can use tools like rsyslog or Logstash or Flume or fluentd and ship your logs wherever you want - your own centralized logging system or Splunk or Logsene for example. This will make it easier to debug/troubleshoot, too - no need to

Re: Shared Directory for two Solr Clouds(Writer and Reader)

2014-10-20 Thread Otis Gospodnetic
Hi Jae, Sounds a bit complicated and messy to me, but maybe I'm missing something. What are you trying to accomplish with this approach? Which problems do you have that are making you look for a non-straightforward setup? Otis -- Monitoring * Alerting * Anomaly Detection * Centralized Log

Re: Retrieving and updating large set of documents on Solr 4.7.2

2014-08-18 Thread Otis Gospodnetic
Hi, Not sure if you've seen https://issues.apache.org/jira/browse/SOLR-5244 ? It's not in Solr 4.7.2, but may be a good excuse to update Solr. Otis -- Solr Performance Monitoring * Log Analytics * Search Analytics Solr Elasticsearch Support * http://sematext.com/ On Mon, Aug 18, 2014 at 4:09

Re: Anybody uses Solr JMX?

2014-08-07 Thread Otis Gospodnetic
week and probably this one too. Is there not a risk that reading certain JMX properties actually hogs the process? (or is it by design that MBeans are supposed to be read without any lock effect?). thanks for the hint. paul On 6 mai 2014, at 04:43, Otis Gospodnetic otis.gospodne

Re: Solr vs ElasticSearch

2014-08-01 Thread Otis Gospodnetic
are already using SOLR but there has been a push to check elasticsearch. All the benchmarks I have seen are at least few years old. On Fri, Aug 1, 2014 at 4:59 AM, Otis Gospodnetic otis.gospodne...@gmail.com wrote: Not super fresh, but more recent than the 2 links you sent: http

Re: Auto suggest with adding accents

2014-08-01 Thread Otis Gospodnetic
ignoreCase=true/ filter class=solr.LowerCaseFilterFactory / filter class=solr.StandardFilterFactory/ /analyzer /fieldType thanks best regards, Anass BENJELLOUN 2014-07-31 18:41 GMT+02:00 Otis Gospodnetic-5 [via Lucene] ml-node+s472066n4150410...@n3.nabble.com: You

Re: Solr Memory question

2014-08-01 Thread Otis Gospodnetic
Which version of Solr? Otis -- Performance Monitoring * Log Analytics * Search Analytics Solr Elasticsearch Support * http://sematext.com/ On Fri, Aug 1, 2014 at 11:17 PM, Ethan eh198...@gmail.com wrote: Our SolrCloud setup : 3 Nodes with Zookeeper, 2 running SolrCloud. Current dataset

Re: Auto suggest with adding accents

2014-07-31 Thread Otis Gospodnetic
You need to do the opposite. Make sure accents are NOT removed at index and query time. Otis -- Performance Monitoring * Log Analytics * Search Analytics Solr Elasticsearch Support * http://sematext.com/ On Thu, Jul 31, 2014 at 5:49 PM, benjelloun anass@gmail.com wrote: hi, q=gene it

Re: Solr is working very slow after certain time

2014-07-31 Thread Otis Gospodnetic
Can we look at your disk IO and CPU? SPM http://sematext.com/spm/ can help. Isn't UseCompressedOops a typo? And deprecated? In general, may want to simplify your JVM params unless you are really sure they are helping. Otis -- Performance Monitoring * Log Analytics * Search Analytics Solr

Re: Solr vs ElasticSearch

2014-07-31 Thread Otis Gospodnetic
Not super fresh, but more recent than the 2 links you sent: http://blog.sematext.com/2012/08/23/solr-vs-elasticsearch-part-1-overview/ Otis -- Performance Monitoring * Log Analytics * Search Analytics Solr Elasticsearch Support * http://sematext.com/ On Thu, Jul 31, 2014 at 10:33 PM, Salman

Re: Solr enterprise tech support in Brazil

2014-07-09 Thread Otis Gospodnetic
Hello, Sematext would be happy to help. Please see signature. Otis -- Performance Monitoring * Log Analytics * Search Analytics Solr Elasticsearch Support * http://sematext.com/ On Jul 9, 2014, at 4:15 PM, Jefferson Olyntho Neto (STI) jefferson.olyn...@unimedbh.com.br wrote: Dear all,

JOB: Solr / Elasticsearch engineer @ Sematext

2014-07-08 Thread Otis Gospodnetic
Hi, I think most people on this list have heard of Sematext http://sematext.com/, so I'll skip the company info, and just jump to the meat, which involves a lot of fun work with Solr and/or Elasticsearch: We have an opening for an engineer who knows either Elasticsearch or Solr or both and wants

Re: Tomcat or Jetty to use with solr in production ?

2014-06-30 Thread Otis Gospodnetic
Hi Gurunath, In 90% of our engagements with various Solr customers we see Jetty, which we also recommend and use ourselves for Solr + our own services and products. Otis -- Performance Monitoring * Log Analytics * Search Analytics Solr Elasticsearch Support * http://sematext.com/ On Mon, Jun

Re: solr4 optimization

2014-06-09 Thread Otis Gospodnetic
Hi, I don't remember last time I ran optimize. Sure, yes, things will work faster if you optimize an index and reduce the number of segments, but if you are regularly writing to that index and performance is OK, leave it to Lucene segment merges to purge deletes. Otis -- Performance Monitoring

Re: Cache response time

2014-06-04 Thread Otis Gospodnetic
Hi Jeremy, Nothing in Solr tracks that time. Caches are pluggable. If you really want this info you could write your own cache that is just a proxy for the real cache and then you can time it. But why do you need this info? Do you suspect that is slow? Otis -- Performance Monitoring * Log
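The proxy idea above can be illustrated with a short sketch. Solr's caches are Java classes, so this Python version only shows the pattern; `TimingCacheProxy` and `DictCache` are hypothetical names, not Solr APIs.

```python
import time
from typing import Any, Dict, Hashable, Optional


class DictCache:
    """Stand-in for the real cache; hypothetical, for illustration only."""

    def __init__(self) -> None:
        self._data: Dict[Hashable, Any] = {}

    def get(self, key: Hashable) -> Optional[Any]:
        return self._data.get(key)

    def put(self, key: Hashable, value: Any) -> None:
        self._data[key] = value


class TimingCacheProxy:
    """Delegates to a real cache and records the time spent in lookups."""

    def __init__(self, real_cache: DictCache) -> None:
        self._real = real_cache
        self.lookups = 0
        self.total_seconds = 0.0

    def get(self, key: Hashable) -> Optional[Any]:
        start = time.perf_counter()
        try:
            return self._real.get(key)
        finally:
            # Count the lookup and its duration whether it hit or missed.
            self.lookups += 1
            self.total_seconds += time.perf_counter() - start

    def put(self, key: Hashable, value: Any) -> None:
        self._real.put(key, value)
```

Dividing `total_seconds` by `lookups` gives an average lookup time, which is usually enough to tell whether the cache itself is a plausible bottleneck.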

Re: Strange behaviour when tuning the caches

2014-06-03 Thread Otis Gospodnetic
help. -Original Message- From: Otis Gospodnetic [mailto:otis.gospodne...@gmail.com] Sent: June-03-14 12:17 AM To: solr-user@lucene.apache.org Subject: Re: Strange behaviour when tuning the caches Hi Jean-Sebastien, One thing you didn't mention is whether as you

Re: Strange behaviour when tuning the caches

2014-06-02 Thread Otis Gospodnetic
Hi Jean-Sebastien, One thing you didn't mention is whether as you are increasing (I assume) cache sizes you actually see performance improve? If not, then maybe there is no value increasing cache sizes. I assume you changed only one cache at a time? Were you able to get any one of them to the

Re: Uneven shard heap usage

2014-05-31 Thread Otis Gospodnetic
Hi Joe, Are you/how are you sure all 3 shards are roughly the same size? Can you share what you run/see that shows you that? Are you sure queries are evenly distributed? Something like SPM http://sematext.com/spm/ should give you insight into that. How big are your caches? Otis --

Re: Offline Indexes Update to Shard

2014-05-29 Thread Otis Gospodnetic
Hi, On Wed, May 28, 2014 at 4:25 AM, Vineet Mishra clearmido...@gmail.com wrote: Hi All, Has anyone tried with building Offline indexes with EmbeddedSolrServer and posting it to Shards. What do you mean by posting it to shards? How is that different than copying them manually to the right

Re: Solr High GC issue

2014-05-29 Thread Otis Gospodnetic
Hi Bihan, That's a lot of parameters and without trying one can't really give you very specific and good advice. If I had to suggest something quickly I'd say: * go back to the basics - remove most of those params and stick with the basic ones. Look at GC and tune slowly by changing/adding

Re: Contribute QParserPlugin

2014-05-28 Thread Otis Gospodnetic
Hi, I think the question is not really how to do it - that's clear - http://wiki.apache.org/solr/HowToContribute The question is really about whether something like this would be of interest to Solr community, whether it is likely it would be accepted into Solr core or contrib, or whether,

Re: Percolator feature

2014-05-28 Thread Otis Gospodnetic
Yes - Luwak. Stay tuned for more. :) Otis -- Performance Monitoring * Log Analytics * Search Analytics Solr Elasticsearch Support * http://sematext.com/ On Wed, May 28, 2014 at 4:44 PM, Jorge Luis Betancourt Gonzalez jlbetanco...@uci.cu wrote: Is there some work around in Solr ecosystem to

Re: Contribute QParserPlugin

2014-05-28 Thread Otis Gospodnetic
On Thu, May 29, 2014 at 2:50 AM, Otis Gospodnetic otis.gospodne...@gmail.com wrote: Hi, I think the question is not really how to do it - that's clear - http://wiki.apache.org/solr/HowToContribute The question is really about whether something like this would be of interest to Solr

Re: Physical Files v. Reported Index Size

2014-05-15 Thread Otis Gospodnetic
Darrell, Look at the top index.x directory in your second image. Looks like that's your index, the same one you see in the Solr UI. Otis -- Performance Monitoring * Log Analytics * Search Analytics Solr Elasticsearch Support * http://sematext.com/ On Tue, May 6, 2014 at 11:34 PM,

Re: SolrCloud - Highly Reliable / Scalable Resources?

2014-05-13 Thread Otis Gospodnetic
Hi, Re: we have suffered several issues which always seem quite problematic to resolve. Try grabbing the latest version if you can. We identified a number of issues in older SolrCloud versions when working on large client setups with thousands of cores, but a lot of those issues have been

Re: Anybody uses Solr JMX?

2014-05-05 Thread Otis Gospodnetic
Alexandre, you could use something like http://blog.sematext.com/2012/09/25/new-tool-jmxc-jmx-console/ to quickly dump everything out of JMX and see if there is anything there Solr Admin UI doesn't expose. I think you'll find there is more in JMX than Solr Admin UI shows. Otis -- Performance

Re: How to get a list of currently executing queries?

2014-04-28 Thread Otis Gospodnetic
No, though one could write a custom SearchComponent, I imagine. Not terribly useful for most situations where queries typically run for only a few milliseconds, but Otis -- Performance Monitoring * Log Analytics * Search Analytics Solr Elasticsearch Support * http://sematext.com/ On Thu,

Re: TB scale

2014-04-25 Thread Otis Gospodnetic
Hi Ed, Unfortunately, there is no good *general* advice, so you'd need to provide a lot more detail to get useful help. Otis -- Performance Monitoring * Log Analytics * Search Analytics Solr Elasticsearch Support * http://sematext.com/ On Fri, Apr 25, 2014 at 3:48 PM, Ed Smiley

Re: SolrCloud load balancing during heavy indexing

2014-04-25 Thread Otis Gospodnetic
Hi, On Fri, Apr 25, 2014 at 12:54 PM, zzT zis@gmail.com wrote: Erick Erickson wrote Back up, you're misunderstanding the update process. A leader node distributes the update to every replica. So _all_ your nodes in a slice are indexing when _any_ of them index. So the idea of sending

Re: Search for a mask that matches the requested string

2014-04-25 Thread Otis Gospodnetic
Luwak is not based on a fork of Lucene; or rather, the fork you are seeing is there only because the Luwak authors needed highlighting. If you don't need highlighting you can probably modify Luwak a bit to use regular Lucene. The Lucene fork you are seeing there will also, eventually, be

Re: Application of different stemmers / stopword lists within a single field

2014-04-25 Thread Otis Gospodnetic
Hi Tim, Step one is probably to detect language boundaries. You know your data. If they happen on paragraph breaks, your job will be easier. If they don't, a bit harder, but not impossible at all. I'm sure there is a ton of research on this topic out there, but the obvious approach would

Re: What contributes to disk IO?

2014-03-26 Thread Otis Gospodnetic
Lucene segment merges cause both reads and writes. If you look at SPM, you'll see the number of index files and the number of segments, which will give you an idea what's going on at that level. Otis -- Performance Monitoring * Log Analytics * Search Analytics Solr Elasticsearch Support *

Re: w/10 ? [was: Partial Counts in SOLR]

2014-03-24 Thread Otis Gospodnetic
I think SQP is getting axed, no? Otis -- Performance Monitoring * Log Analytics * Search Analytics Solr Elasticsearch Support * http://sematext.com/ On Mon, Mar 24, 2014 at 3:45 PM, T. Kuro Kurosaka k...@healthline.com wrote: On 3/19/14 5:13 PM, Otis Gospodnetic wrote: Hi, Guessing it's

Re: Limit on # of collections -SolrCloud

2014-03-20 Thread Otis Gospodnetic
Hours sounds too long indeed. We recently had a client with several thousand collections, but restart wasn't taking hours... Otis Solr ElasticSearch Support http://sematext.com/ On Mar 20, 2014 5:49 PM, Erick Erickson erickerick...@gmail.com wrote: How many total replicas are we talking here?

Re: w/10 ? [was: Partial Counts in SOLR]

2014-03-19 Thread Otis Gospodnetic
Hi, Guessing it's surround query parser's support for within backed by span queries. Otis Solr ElasticSearch Support http://sematext.com/ On Mar 19, 2014 4:44 PM, T. Kuro Kurosaka k...@healthline.com wrote: In the thread Partial Counts in SOLR, Salman gave us this sample query: ((stock or

Re: Excessive Heap Usage from docValues?

2014-03-19 Thread Otis Gospodnetic
Hi, Which type of doc values? See Wiki or reference guide for a list of types. Otis Solr ElasticSearch Support http://sematext.com/ On Mar 19, 2014 5:02 PM, tradergene nos...@krevets.com wrote: Hello All, I'm hoping to get your assistance in debugging what seems like a memory issue. I

Re: Indexing large documents

2014-03-18 Thread Otis Gospodnetic
Hi, I think you probably want to split giant documents because you / your users probably want to be able to find smaller sections of those big docs that are best matches to their queries. Imagine querying War and Peace. Almost any regular word you query for will produce a match. Yes, you may

Re: Help me understand these newrelic graphs

2014-03-14 Thread Otis Gospodnetic
? On Thu, Mar 13, 2014 at 8:46 PM, Otis Gospodnetic otis.gospodne...@gmail.com wrote: It really depends, hard to give a definitive instruction without more pieces of info. e.g. if your CPUs are all maxed out and you already have a high number of concurrent queries then sharding

Re: Help me understand these newrelic graphs

2014-03-13 Thread Otis Gospodnetic
Hi, I think NR has support for breaking by handler, no? Just checked - no. Only webapp controller, but that doesn't apply to Solr. SPM should be more helpful when it comes to monitoring Solr - you can filter by host, handler, collection/core, etc. -- you can see the demo -

Re: Solr supports log-based recovery?

2014-03-13 Thread Otis Gospodnetic
Skimmed this, but yes, docs are durable thanks to transaction log that can replay on start. Otis Solr ElasticSearch Support http://sematext.com/ On Mar 13, 2014 8:25 PM, shushuai zhu ss...@yahoo.com wrote: Hi, I noticed the following post indicating that Solr could recover not-committed

Re: Help me understand these newrelic graphs

2014-03-13 Thread Otis Gospodnetic
by any means but our queries can get a bit complex with a bit of faceting. Do you still think it makes sense to shard? How easy would this be to get working? On Thu, Mar 13, 2014 at 4:02 PM, Otis Gospodnetic otis.gospodne...@gmail.com wrote: Hi, I think NR has support for breaking

Disabling lookups into disabled caches?

2014-03-11 Thread Otis Gospodnetic
Hi, Is there a way to disable cache *lookups* into caches that are disabled? Check this for example: https://apps.sematext.com/spm-reports/s/Z04bfIvGyH This is a Document cache that was enabled, and then got disabled. But the lookups are still happening, which is pointless if the cache is

Re: Disabling lookups into disabled caches?

2014-03-11 Thread Otis Gospodnetic
Heisey wrote: On 3/11/2014 8:07 PM, Otis Gospodnetic wrote: Is there a way to disable cache *lookups* into caches that are disabled? Check this for example: https://apps.sematext.com/spm-reports/s/Z04bfIvGyH This is a Document cache that was enabled, and then got disabled

Re: Mixing lucene scoring and other scoring

2014-03-06 Thread Otis Gospodnetic
Hi Benson, http://lucene.apache.org/core/4_7_0/expressions/org/apache/lucene/expressions/Expression.html https://issues.apache.org/jira/browse/SOLR-5707 That? Otis -- Performance Monitoring * Log Analytics * Search Analytics Solr Elasticsearch Support * http://sematext.com/ On Thu, Mar 6,

Re: Solr Filter Cache Size

2014-03-06 Thread Otis Gospodnetic
What Erick said. That's a giant Filter Cache. Have a look at these Solr metrics and note the Filter Cache in the middle: http://www.flickr.com/photos/otis/8409088080/ Note how small the cache is and how high the hit rate is. Those are stats for http://search-lucene.com/ and

Re: Indexing huge data

2014-03-05 Thread Otis Gospodnetic
Hi, 6M is really not huge these days. 6B is big, though also still not huge any more. What seems to be the bottleneck? Solr or DB or network or something else? Otis -- Performance Monitoring * Log Analytics * Search Analytics Solr Elasticsearch Support * http://sematext.com/ On Wed, Mar 5,

Re: Indexing huge data

2014-03-05 Thread Otis Gospodnetic
in such cases. Thanks. On 3/5/14, 11:47 AM, Otis Gospodnetic wrote: Hi, 6M is really not huge these days. 6B is big, though also still not huge any more. What seems to be the bottleneck? Solr or DB or network or something else? Otis -- Performance Monitoring * Log Analytics * Search

Re: Indexing huge data

2014-03-05 Thread Otis Gospodnetic
to expect in such cases. Thanks. On 3/5/14, 11:47 AM, Otis Gospodnetic wrote: Hi, 6M is really not huge these days. 6B is big, though also still not huge any more. What seems to be the bottleneck? Solr or DB or network or something else? Otis -- Performance Monitoring * Log

Re: Scalability Limit of SolrCloud

2014-02-27 Thread Otis Gospodnetic
It depends on hardware, your latency requirements and such. We've helped customers with several billion documents, so big numbers alone are not a problem. Otis Solr ElasticSearch Support http://sematext.com/ On Feb 27, 2014 6:47 AM, Vineet Mishra clearmido...@gmail.com wrote: Hi All What is

Re: SolrCloud Startup

2014-02-24 Thread Otis Gospodnetic
Hi, Slow startup: could it be that your transaction logs are being replayed? Are they very big? Do you see lots of disk reading during those 20-30 minutes? Shawn was referring to http://wiki.apache.org/solr/SolrPerformanceProblems Otis -- Performance Monitoring * Log Analytics * Search
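One quick way to answer the "are they very big?" question is to total the size of each transaction log directory on disk. Below is a minimal Python sketch, assuming the logs sit in directories named tlog under each core's data directory (the usual layout, but worth verifying on your install). The function name tlog_sizes and the path you pass to it are hypothetical.

```python
import os


def tlog_sizes(solr_home):
    """Map each directory named 'tlog' under solr_home to its total size in bytes."""
    sizes = {}
    for root, _dirs, files in os.walk(solr_home):
        # Assumption: transaction logs live in directories literally named "tlog".
        if os.path.basename(root) == "tlog":
            sizes[root] = sum(
                os.path.getsize(os.path.join(root, name)) for name in files
            )
    return sizes


if __name__ == "__main__":
    import sys

    # Usage: python tlog_sizes.py /path/to/solr/home
    if len(sys.argv) > 1:
        for path, size in sorted(tlog_sizes(sys.argv[1]).items()):
            print(f"{size / (1024 * 1024):10.1f} MB  {path}")
```

Very large totals would be consistent with a long replay at startup, though heavy disk reads during those 20-30 minutes remain the more direct evidence.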

JOB @ Sematext: Professional Services Lead = Head

2014-02-18 Thread Otis Gospodnetic
Hello, We have what I think is a great opening at Sematext. Ideal candidate would be in New York, but that's not an absolute must. More info below + on http://sematext.com/about/jobs.html in job-ad-speak, but I'd be happy to describe what we are looking for, what we do, and what types of

Re: Solr server requirements for 100+ million documents

2014-02-11 Thread Otis Gospodnetic
, the 3 servers you mean here are 2 for shards/nodes and 1 for Zookeeper. Is that correct? Thanks, Susheel -Original Message- From: Otis Gospodnetic [mailto:otis.gospodne...@gmail.com] Sent: Friday, January 24, 2014 5:21 PM To: solr-user@lucene.apache.org Subject: Re: Solr server

Re: need help in understating solr cloud stats data

2014-02-04 Thread Otis Gospodnetic
solution. - Mark http://about.me/markrmiller On Feb 3, 2014, at 11:10 PM, Otis Gospodnetic otis.gospodne...@gmail.com wrote: Hi, Oh, I just saw Greg's email on dev@ about this. IMHO aggregating in the search engine is not the way to go. Leave that to external tools, which are likely

Re: Special NGRAMish requirement

2014-02-03 Thread Otis Gospodnetic
Hi, Can you provide an example, Alexander? Otis Solr ElasticSearch Support http://sematext.com/ On Feb 3, 2014 5:28 AM, Lochschmied, Alexander alexander.lochschm...@vishay.com wrote: Hi, we need to use something very similar to EdgeNGram (minGramSize=1 maxGramSize=50 side=front). The
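For readers unfamiliar with the filter named in the question: with side=front, EdgeNGram emits the prefixes of each token that are between minGramSize and maxGramSize characters long. Here is a rough Python illustration of that behaviour. It is not Solr's implementation, and edge_ngrams is a made-up name used only for this sketch.

```python
def edge_ngrams(token, min_gram=1, max_gram=50):
    """Return the front-anchored n-grams (prefixes) of token, shortest first."""
    # Never ask for a prefix longer than the token itself.
    upper = min(max_gram, len(token))
    return [token[:n] for n in range(min_gram, upper + 1)]


print(edge_ngrams("solr"))  # ['s', 'so', 'sol', 'solr']
```

A concrete input and desired output in this form is the kind of example that makes an "NGRAMish" requirement easy to discuss.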

Re: how to write an efficient query with a subquery to restrict the search space?

2014-02-03 Thread Otis Gospodnetic
Hi, Sounds like a possible document and query routing use case. Otis Solr ElasticSearch Support http://sematext.com/ On Jan 31, 2014 7:11 AM, svante karlsson s...@csi.se wrote: It seems to be faster to first restrict the search space and then do the scoring compared to just use the full

Re: Adding DocValues in an existing field

2014-02-03 Thread Otis Gospodnetic
Hi, You can change the field definition and then reindex. Otis Solr ElasticSearch Support http://sematext.com/ On Jan 30, 2014 1:12 PM, yriveiro yago.rive...@gmail.com wrote: Hi, Can I add to an existing field the docvalue feature without wipe the actual? The modification on the schema

Re: need help in understating solr cloud stats data

2014-02-03 Thread Otis Gospodnetic
Hi, Oh, I just saw Greg's email on dev@ about this. IMHO aggregating in the search engine is not the way to go. Leave that to external tools, which are likely to be more flexible when it comes to this. For example, our SPM for Solr can do all kinds of aggregations and filtering by a number of

Re: Duplicate Facet.FIelds cause same results, should dedupe?

2014-02-03 Thread Otis Gospodnetic
Hi, Don't know if this is an old or new problem, but it does feel like a bug to me. Otis -- Performance Monitoring * Log Analytics * Search Analytics Solr Elasticsearch Support * http://sematext.com/ On Mon, Feb 3, 2014 at 10:48 AM, William Bell billnb...@gmail.com wrote: If we add :

Re: SOLR USING 100% percent CPU and not responding after a while

2014-01-28 Thread Otis Gospodnetic
Hi, Show us more graphs. Is the GC working hard? Any of the JVM mem pools at or near 100%? SPM for Solr is your friend for long term monitoring/alerting/trends, jconsole and visualvm for a quick look. Otis -- Performance Monitoring * Log Analytics * Search Analytics Solr Elasticsearch

Re: Solr Related Search Suggestions

2014-01-27 Thread Otis Gospodnetic
Hi, I don't know of anything like that in OSS, but we have it here: http://sematext.com/products/related-searches/index.html Is that the functionality you are looking for? Otis -- Performance Monitoring * Log Analytics * Search Analytics Solr Elasticsearch Support * http://sematext.com/ On
