Re: Exception while adding data in multiple threads

2019-07-19 Thread Shawn Heisey
On 7/18/2019 9:50 AM, Ashish Athavale wrote: I am getting below exception while adding data into solr. I am adding data concurrently in 20 threads, 100 documents in a batch per thread. Each documents contains 40 fields and all are indexed. This issue occurs only when I add in multi threads.

Re: Does commitWithin override autoSoftCommit?

2019-07-18 Thread Shawn Heisey
On 7/18/2019 9:37 AM, Benjamin Mellish wrote: I have a solrconfig.xml file as follows: 2000 2 false ${solr.ulog.dir:} But I also submit records with a 'commitWithin' of 10 seconds. It seems that my documents are not

Re: Correct order of mappinCharFilter, Tokenizer and GermanStemFilter

2019-07-18 Thread Shawn Heisey
On 7/18/2019 3:01 AM, Doris Peter wrote: So, the mappingCharFilter seems to be executed at first, no matter which position it has in the configuration? CharFilters are always executed first. Then one Tokenizer, then Filters. This will always be the case, even if you order the config so
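A minimal sketch of that ordering rule, in plain Java. It is a toy model only: the `Stage` and `Component` types and the component names are hypothetical and are not Lucene or Solr classes. It only shows that the effective order depends on the kind of component, not on where it sits in the configuration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Toy model only: these types are hypothetical, not Lucene or Solr classes.
final class AnalysisOrderSketch {
    // Declared in the order the analysis chain applies them.
    enum Stage { CHAR_FILTER, TOKENIZER, TOKEN_FILTER }

    static final class Component {
        final String name;
        final Stage stage;

        Component(String name, Stage stage) {
            this.name = name;
            this.stage = stage;
        }
    }

    // Returns component names in effective execution order. List.sort is
    // stable, so components of the same stage keep their configured order.
    static List<String> effectiveOrder(List<Component> configured) {
        List<Component> sorted = new ArrayList<>(configured);
        sorted.sort(Comparator.comparing((Component c) -> c.stage));
        List<String> names = new ArrayList<>();
        for (Component c : sorted) {
            names.add(c.name);
        }
        return names;
    }

    public static void main(String[] args) {
        // The char filter is deliberately listed last in the "configuration".
        List<Component> configured = Arrays.asList(
            new Component("standardTokenizer", Stage.TOKENIZER),
            new Component("germanStemFilter", Stage.TOKEN_FILTER),
            new Component("mappingCharFilter", Stage.CHAR_FILTER));
        // prints [mappingCharFilter, standardTokenizer, germanStemFilter]
        System.out.println(effectiveOrder(configured));
    }
}
```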

Re: Hardware requirements to host Apache Solr application

2019-07-15 Thread Shawn Heisey
On 7/15/2019 6:37 AM, Kaushal Shriyan wrote: Are there any recommended hardware requirements to host Apache Solr application? For example is it t2.large in AWS or similar configuration in GCP and Microsoft Azure. Thanks in advance and i look forward to hearing from you. This question is

Re: Solr Wiki Page Down

2019-07-14 Thread Shawn Heisey
On 7/14/2019 2:33 AM, Prince Manohar wrote: Apache Solr's wiki page is down. I tried accessing the following pages, but it's showing 404 Not Found. https://wiki.apache.org/solr https://wiki.apache.org/solr/IRCChannels Apache took down the MoinMoin wiki system, which is what

Re: Spark-Solr connector

2019-07-12 Thread Shawn Heisey
On 7/11/2019 8:50 PM, Dwane Hall wrote: I’ve just started looking at the excellent spark-solr project (thanks Tim Potter, Kiran Chitturi, Kevin Risden and Jason Gerlowski for their efforts with this project it looks really neat!!). I’m only at the initial stages of my exploration but I’m

Re: Solr 6.6.0 - DIH - Multiple entities - Multiple DBs

2019-07-11 Thread Shawn Heisey
On 7/11/2019 9:04 AM, Joseph_Tucker wrote: Looks like I've managed to get some semblance of this working. The indexes are much faster, but the RAM usage by SolrJ is quite high. Is it normal to see around 6GB of RAM usage? (My test is indexing 250,000 records with the 50 child entities)

Re: Solr Sudden I/O spike

2019-07-11 Thread Shawn Heisey
On 6/14/2019 5:53 AM, Sripra deep wrote: Any help would be appreciated, I am using solr 7.1.0, Suddenly we got a high I/O even with a very low request rate and the core went down. Did anybody experience the same or root cause of this. Below are the log error msg that we got from solr.log

Re: 8.0 upgrade issue

2019-07-10 Thread Shawn Heisey
On 6/19/2019 7:15 PM, Scott Yeadon wrote: I’m running Solr on Ubuntu 18.04 (32-bit) using OpenJDK 10.0.2. Up until now I have had no problem with Solr (started running it since 4.x), however after upgrading from 7.x to 8.x I am getting serious memory issues. I have a small repository of

Re: QTime

2019-07-10 Thread Shawn Heisey
On 7/10/2019 3:17 PM, Lucky Sharma wrote: I am seeing one very weird behaviour of QTime of SOLR. Scenario is : When I am hitting the Solr Cloud Instance, situated at a DC with my local machine while load test I was seeing 400ms Qtime response and 1sec Http Response time. How much data was in

Re: Dynamic field schema

2019-07-10 Thread Shawn Heisey
On 7/10/2019 6:52 AM, ericstein wrote: the documents are in both cores. the official title field data exists in both. However, it only gives me the official_title_s field in the second core when I query. When I look at the schema in the admin it only shows the official_title_s field The schema

Re: Sorting based on Date and Name combination in Solr 6.2

2019-07-10 Thread Shawn Heisey
On 7/10/2019 7:22 AM, Santosh Kumar S wrote: When we try sorting it, then the records are getting sorted based on Timestamp, but we need to sort only on date part keeping timestamp part aside. That is how sorting on a field that has a timestamp is going to work. It will always use the whole

Re: Dynamic field schema

2019-07-09 Thread Shawn Heisey
On 7/9/2019 5:42 PM, ericstein wrote: I am expecting both cores to have the following fields: official_title_s official_title_t However, the second core only recognizes: official_title_s It seems that the schema doesn't recognize my field the same in both cores. What do you mean by

Re: SolrQuery setting the boolean operator to use

2019-07-09 Thread Shawn Heisey
On 7/9/2019 4:52 PM, Steven White wrote: In this code sample that's part of my overall code, how do I tell Solr dynamically / programmatically to use AND or OR as the default operator? SolrQuery query = new SolrQuery(queryString); query.setRequestHandler("/select_test"); response =
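The reply is cut off in this listing, so what follows is only a hedged sketch and not necessarily what was recommended. Solr accepts a `q.op` request parameter that sets the default operator per request. The sketch builds the raw parameter string with the JDK alone so that it runs without SolrJ; the class and method names are hypothetical. With SolrJ, the same parameter can, as far as I know, be set on the query object with `query.set("q.op", "AND")`.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper: builds request parameters with an explicit default operator.
final class DefaultOperatorSketch {
    static String buildParams(String queryString, boolean requireAllTerms) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("q", queryString);
        // q.op chooses how clauses without an explicit operator are combined.
        params.put("q.op", requireAllTerms ? "AND" : "OR");

        StringJoiner joined = new StringJoiner("&");
        for (Map.Entry<String, String> e : params.entrySet()) {
            joined.add(URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8)
                + "=" + URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8));
        }
        return joined.toString();
    }

    public static void main(String[] args) {
        // prints q=solr+search&q.op=AND
        System.out.println(buildParams("solr search", true));
    }
}
```

Choosing the operator per request keeps the decision in the calling code, so one handler definition can serve both AND and OR callers.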

Re: SOLR running on Azure web app/services

2019-07-09 Thread Shawn Heisey
On 7/9/2019 10:22 AM, Victor Casas wrote: Hello, is Solr supported on Azure web app or Azure Service? If so, is there any white paper on how to get this done or recommendations/suggestions?? Solr is written entirely in java, and it is implemented as a webapp, which requires a servlet

Re: Swap space in Solr Getting full and unable to index all data in solr

2019-07-09 Thread Shawn Heisey
On 7/9/2019 5:59 AM, PandurangPailvan wrote: In my dot net application for indexing in solr using solrconnection.Post() to index. initially in solr UI swap space displays 32gb and once reach to 32 it grows up to 64 gb and then stops indexing with system.outofmemory exception. Can anybody help

Re: Solr 7 - core is lost after restart

2019-07-09 Thread Shawn Heisey
On 7/9/2019 6:07 AM, talvinder.matharu wrote: If solr restarts only search for cores within the Solr home, should the value 'instanceDir' be limited to relative directory path when creating them in via the admin interface, installing the instance into Solr home and therefore avoiding this

Re: Facet Query performance

2019-07-08 Thread Shawn Heisey
On 7/8/2019 12:00 PM, Midas A wrote: Number of Docs :50+ docs Index Size: 300 GB RAM: 256 GB JVM: 32 GB Half a million documents producing an index size of 300GB suggests *very* large documents. That typically produces an index with fields that have very high cardinality, due to text

Re: Facet Query performance

2019-07-08 Thread Shawn Heisey
On 7/8/2019 3:08 AM, Midas A wrote: I have enabled docvalues on facet field but query is still taking time. How i can improve the Query time . docValues="true" multiValued="true" termVectors="true" /> *Query: * There's very little information here -- only a single field definition and

Re: Solr 7 - core is lost after restart

2019-07-04 Thread Shawn Heisey
On 7/3/2019 4:07 AM, Talvinder Matharu wrote: I have created a new core via the admin console, also tried via the API, and notice that after a restart it fails to pick up the cores that were created. To overcome this, I have to recreate the core again using the admin console to allow me to

Re: tlog/commit questions

2019-07-04 Thread Shawn Heisey
On 7/3/2019 1:36 AM, Avi Steiner wrote: We had some cases with customers (Solr 5.3.1, one search node, one shard) with huge tlog files (more than 1 GB). With 30 seconds on the autoCommit, that should not be happening. When a hard commit fires, the current tlog is closed and a new one

Re: Sort on PointFieldType

2019-07-04 Thread Shawn Heisey
On 7/4/2019 9:14 AM, Prince Manohar wrote: I am using Solr version *6.4.2*. I got your answers. I have another question. Can we use a *Range query* on point field? I am trying to do something like *fq=abc.pqr_d:[ 1500 TO 2000 ]* Is it a valid filter? Yes. In fact, range queries are one of

Re: SolrCloud indexing triggers merges and timeouts

2019-07-03 Thread Shawn Heisey
On 7/2/2019 10:53 PM, Rahul Goswami wrote: Hi Shawn, Thank you for the detailed suggestions. Although, I would like to understand the maxMergeCount and maxThreadCount params better. The documentation

Re: Add dynamic field to existing index slow

2019-06-30 Thread Shawn Heisey
On 6/30/2019 2:08 PM, derrick cui wrote: Good point Erick, I will try it today, but I have already use cursorMark in my query for deep pagination. Also I noticed that my cpu usage is pretty high, 8 cores, usage is over 700%. I am not sure it will help if I use ssd disk That depends on

Re: qf in conjunction with boost

2019-06-29 Thread Shawn Heisey
On 6/27/2019 8:54 PM, Mark Sholund wrote: qf=title^5 description^5 _text_ And now I want to include additional boosting based on a popularity score include with some documents. I’ve done this as follows q={!boost b=map(popularity_d,0,0,1)} However now it seems that the score is the same

Re: Different number of replicas for different shards

2019-06-29 Thread Shawn Heisey
On 6/29/2019 12:23 AM, Nawab Zada Asad Iqbal wrote: is it possible to specify different number of replicas for different shards? i.e if I expect some shard to get more queries , i can add more replicas to that shard alone, instead of adding replicas for all the shards. On initial collection

Re: Large Filter Query

2019-06-26 Thread Shawn Heisey
On 6/26/2019 12:56 PM, Lucky Sharma wrote: @Shawn: Sorry I forgot to mention the corpus size: the corpus size is around 3 million docs, where we need to query for 1500 docs and run aggregations, sorting, search on them. Assuming the documents aren't HUGE, that sounds like something Solr

Re: Large Filter Query

2019-06-26 Thread Shawn Heisey
On 6/26/2019 12:31 PM, Lucky Sharma wrote: What we are doing is, we will be having a set of unique Ids of solr document at max 1500, we need to run faceting and sorting among them. there is no direct search involved. It's a head-on search since we already know the document unique keys

Re: SolrInputDocument setField method

2019-06-26 Thread Shawn Heisey
On 6/26/2019 9:52 AM, Vincenzo D'Amore wrote: I have a very basic question related to the SolrInputDocument behaviour. Looking at SolrInputDocument source code I found how the method setField works: public void setField(String name, Object value ) { SolrInputField field = new

Re: Solr filter query on text fields

2019-06-24 Thread Shawn Heisey
On 6/24/2019 5:37 PM, Wei wrote: I'm assuming that the asterisks here are for emphasis, that they are not actually present. This can be very confusing. It is far better to relay the precise information and not try to emphasize anything. For query q=*:*&fq=description:”ice cream”, the

Re: Is Solr can do that ?

2019-06-21 Thread Shawn Heisey
On 6/21/2019 10:32 AM, Matheo Software Info wrote: My question is very simple: I would like to know if Solr can process around 30 TB of data (PDF, text, Word, etc.)? What is the best way to index this much data? Several servers? Several shards? Something else? Sure, Solr can do that. Whether

Re: Delete with Solrj deleteByQuery - Boolean clauses

2019-06-19 Thread Shawn Heisey
On 6/19/2019 8:33 AM, rgummadi wrote: Still using Solr 4.6. Terms query parser does not exist in this version right? Correct. It was added in version 4.10.0. https://issues.apache.org/jira/browse/SOLR-6318 Thanks, Shawn

Re: CloudSolrClient :java.lang.ClassCastException: java.lang.String cannot be cast to java.util.Map . Related to "router" : "compositeId"

2019-06-19 Thread Shawn Heisey
On 6/19/2019 7:44 AM, Shawn Heisey wrote: The version of SolrJ that's included in Spring Boot 1.5.8 is 5.5.4 ... CloudSolrClient does not do well when the client version is significantly different than the server version.  Pairing a 5.5.4 client with a 4.10.3 server could be problematic

Re: CloudSolrClient :java.lang.ClassCastException: java.lang.String cannot be cast to java.util.Map . Related to "router" : "compositeId"

2019-06-19 Thread Shawn Heisey
On 6/19/2019 7:32 AM, Rushikesh Garadade wrote: I am using CloudSolrClient with Spring boot Solr 1.5.18.RELEASE and Solr Version is Solr 4.10.3. When using Spring's packaging of SolrJ, you need to talk to Spring about most problems you're having. They do more than simply include SolrJ and

Re: Solr 5.3 to 6.0

2019-06-18 Thread Shawn Heisey
On 6/18/2019 12:16 PM, ilango dhandapani wrote: Tried several attempts like delete collection/config, take index backup from 5.3, clear index and place them back after upgrade. All tried resulted in faceting not working with 5.3 and 6.0 data combined. Most likely what happened here is that the

Re: unix socket or D-Bus?

2019-06-17 Thread Shawn Heisey
On 6/16/2019 10:43 PM, Felipe Gasper wrote: Does Solr do its own authentication, or does Jetty do that? One of the benefits of UNIX sockets is that the socket exposes the peer’s credentials, so Solr/Jetty could implement logic that says, “ah, you’re root? Cool, you’re in.” As far as I know,

Re: Question regarding negated block join queries

2019-06-17 Thread Shawn Heisey
On 6/17/2019 4:46 AM, Bram Biesbrouck wrote: q={!parent which=-(parentUri:*)}*:* Pure negative queries do not work in Lucene. Sometimes, when you do a single-clause negative query, Solr is able to detect the problem and automatically make an adjustment so the query works. This happens
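A small sketch of the usual manual adjustment for a single negative clause: give it something to subtract from by prefixing the match-all query `*:*`. The helper name is hypothetical, and the check is deliberately naive (it assumes the input is one clause). Whether this adjustment is the right fix inside the `which` parameter of the block join query above is not established by the excerpt.

```java
// Hypothetical helper: makes a single pure-negative clause explicit.
final class NegativeQuerySketch {
    // Assumes the input is ONE clause. A leading '-' means "exclude these
    // documents", which needs a positive set to exclude them from.
    static String withMatchAll(String clause) {
        String trimmed = clause.trim();
        if (trimmed.startsWith("-")) {
            return "*:* " + trimmed;
        }
        return trimmed;
    }

    public static void main(String[] args) {
        // prints *:* -(parentUri:*)
        System.out.println(withMatchAll("-(parentUri:*)"));
        // prints title:solr
        System.out.println(withMatchAll("title:solr"));
    }
}
```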

Re: unix socket or D-Bus?

2019-06-15 Thread Shawn Heisey
On 6/15/2019 9:15 AM, Felipe Gasper wrote: Has it ever been proposed to have Solr listen on a UNIX socket or D-Bus rather than TCP? This would alleviate the need for local Solr integrations (e.g., Dovecot) to store “dummy” credentials, and it would tighten security by

Re: Facet on multicore search when one field exists only in one of cores

2019-06-14 Thread Shawn Heisey
On 6/14/2019 7:54 AM, Claudio R wrote: When I try this request to get facet of fields: fieldA, fieldB and fieldC on multicore search, I get error: http://localhost:8983/solr/core1/select?q=*:*&shards=localhost:8983/solr/core1,localhost:8983/solr/core2&fl=*,[shard]&facet=true&facet.field=fieldA&facet.field=fieldB&facet.field=fieldC Error from

Re: SOLR EofException help

2019-06-13 Thread Shawn Heisey
On 6/13/2019 7:30 AM, ennio wrote: The server for most part runs fine, but when I look at the logs I see from time to time the following error. org.eclipse.jetty.io.EofException: Closed Jetty's EofException is nearly always caused by a specific event: The client talking to Solr closed the

Re: Increased disk space usage 8.1.1 vs 7.7.1

2019-06-13 Thread Shawn Heisey
On 6/13/2019 4:19 AM, Markus Jelsma wrote: We are upgrading to Solr 8. One of our reindexed collections takes a GB more than the production uses which is on 7.7.1. Production also has deleted documents. This means Solr 8 somehow uses more disk space. I have checked both Solr and Lucene's

Re: SolrCloud indexing triggers merges and timeouts

2019-06-13 Thread Shawn Heisey
On 6/6/2019 9:00 AM, Rahul Goswami wrote: *OP Reply* : Total 48 GB per node... I couldn't see another software using a lot of memory. I am honestly not sure about the reason for change of directory factory to SimpleFSDirectoryFactory. But I was told that with mmap at one point we started to see

Re: Is it possible configure a single data-config.xml file for all the environments?

2019-06-13 Thread Shawn Heisey
On 6/12/2019 7:46 PM, Hugo Angel Rodriguez wrote: Thanks Shawn for your answers Regarding your question: " Are these environments on separate Solr instances, separate servers, or are they on the same Solr instance?" My answers is: These environments are on separate solr instances, separate

Re: Is it possible configure a single data-config.xml file for all the environments?

2019-06-12 Thread Shawn Heisey
On 6/12/2019 9:05 AM, Hugo Angel Rodriguez wrote: I need to configure a single data-config.xml file in solr for SAS AML 7.1. I have three environments: Development, quality and production, and you know the first lines in a data-config.xml file is for connection to a database (database name,

Re: Issue in solr result

2019-06-12 Thread Shawn Heisey
On 6/12/2019 12:22 AM, Nikhil Reddy wrote: I am setting up an elastic search engine using SOLR. I have a few columns where the datatype is VARCHAR(). The column in the result below (image uploaded) is an alphanumeric document ID. So when I use varchar as a column datatype and ingest the data into

Re: Sort date stored in text field?

2019-06-10 Thread Shawn Heisey
On 6/10/2019 3:26 PM, Dave Beckstrom wrote: I have a field called metatag.date that is field-type: org.apache.solr.schema.TextField. The field is being populated by NUTCH, which grabs the date from the html: I'm trying to sort by date (metatag.date desc) passed on the URL and it's not

Re: Query takes a long time Solr 6.1.0

2019-06-10 Thread Shawn Heisey
On 6/10/2019 3:24 AM, vishal patel wrote: We have 27 collections and each collection has many schema fields and in live too many search and index create requests come and most of the searching requests are sorting, faceting, grouping, and long query. So approx average 40GB heap are used so we

Re: Query takes a long time Solr 6.1.0

2019-06-07 Thread Shawn Heisey
On 6/6/2019 5:45 AM, vishal patel wrote: One server(256GB RAM) has two below Solr instance and other application also 1) shards1 (80GB heap ,790GB Storage, 449GB Indexed data) 2) replica of shard2 (80GB heap, 895GB Storage, 337GB Indexed data) The second server(256GB RAM and 1 TB storage) has

Re: Custom cache for Solr Cloud mode

2019-06-07 Thread Shawn Heisey
On 6/7/2019 8:49 AM, Erick Erickson wrote: Yes. ZooKeeper has a “blob store”. See the Blob Store API in the ref guide. Minor nit. You will be creating a jar file, and configuring your collection to be able to find the new jar file. Then you _upload_ both to ZooKeeper and reload your

Re: Urgent help on solr optimisation issue !!

2019-06-07 Thread Shawn Heisey
On 6/6/2019 11:27 PM, jena wrote: Because of heavy indexing & deletion, we optimise solr instance everyday, because of that our solr cloud getting unstable , every solr instance go on recovery mode & our search is getting affected & very slow because of that. Optimisation takes around 1hr

Re: strange behavior

2019-06-06 Thread Shawn Heisey
On 6/6/2019 12:46 PM, Wendy2 wrote: Why "AND" didn't work anymore? I use Solr 7.3.1 and edismax parser. Could someone explain to me why the following query doesn't work any more? What could be the cause? Thanks! q=audit_author.name:Burley,%20S.K.%20AND%20entity.type:polymer It worked

Re: Unexpected behaviour when Solr 6 Admin UI pages are cached and server is Solr 8?

2019-06-05 Thread Shawn Heisey
On 6/5/2019 2:40 PM, Gus Heck wrote: Experiences that force the user to think about the browser cache are sub-par :). Anything that changes the URL will interrupt caching so just adding a query parameter &_v=8.1.1 (or whatever) to every request would probably do the trick, there's no need to

Re: SolrCloud indexing triggers merges and timeouts

2019-06-05 Thread Shawn Heisey
On 6/5/2019 9:39 AM, Rahul Goswami wrote: I have a solrcloud setup on Windows server with below config: 3 nodes, 24 shards with replication factor 2 Each node hosts 16 cores. 16 CPU cores, or 16 Solr cores? The info may not be all that useful either way, but just in case, it should be

Re: Unexpected behaviour when Solr 6 Admin UI pages are cached and server is Solr 8?

2019-06-05 Thread Shawn Heisey
On 6/5/2019 11:10 AM, Colvin Cowie wrote: Upon opening the Admin UI I got some nasty behaviour, which appears to be a result of some the Solr 6 Admin UI pages being cached. In general I would consider this a bug, and a good reason to raise an issue in Jira. The admin UI should tell the

Re: Solr Migration to The AWS Cloud

2019-06-05 Thread Shawn Heisey
On 6/5/2019 11:18 AM, Joe Lerner wrote: Our application is migrating from on-premise to AWS. We are currently on Solr Cloud 7.3.0. We are interested in exploring ways to do this with minimal, down-time, as in, maybe one hour. One strategy would be to set up a new empty Solr Cloud instance in

Re: query parsed in different ways in two identical solr instances

2019-06-05 Thread Shawn Heisey
On 6/5/2019 8:41 AM, Danilo Tomasoni wrote: Hello, I have two solr instances with exactly the same configuration. The only difference that i know is that the first (the working one, is solr 7.3.0, while the one that's not working is solr 7.3.1) If I execute the same query (with debugQuery=on)

Re: Query takes a long time Solr 6.1.0

2019-06-05 Thread Shawn Heisey
On 6/5/2019 7:08 AM, vishal patel wrote: I have attached RAR file but not attached properly. Again attached txt file. For 2 shards and 2 replicas, we have 2 servers and each has 256 GB ram and 1 TB storage. One shard and another shard replica in one server. You got lucky. Even text files

Re: Query takes a long time Solr 6.1.0

2019-06-05 Thread Shawn Heisey
On 6/5/2019 5:35 AM, vishal patel wrote: We have 2 shards and 2 replicas in Live also have multiple collections. We are performing heavy search and update. There is no information here about how many servers are serving those four shard replicas. -> I have attached some query which takes

Re: Solr Heap Usage

2019-06-03 Thread Shawn Heisey
On 6/2/2019 4:35 PM, John Davis wrote: If we assume there is no query load then effectively this boils down to most effective way for adding a large number of documents to the solr index. I've looked through SolrJ, DIH and others -- is the bottomline across all of them to "batch updates" and not

Re: where to see deleted document in Solr log

2019-06-03 Thread Shawn Heisey
On 6/3/2019 2:51 PM, Wendy2 wrote: Hi, I am using Solr 7.3.1 to index data via DIH. Solr admin panel indicated that 152160 documents got indexed, while 3944 documents were deleted. But DIH indicated that added/update: 662059 documents. Deleted 0 documents. I try to find the deleted documents,

Re: Using Solr as a Database?

2019-06-03 Thread Shawn Heisey
On 6/2/2019 7:28 AM, Ralph Soika wrote: This is not intended to contradict the other replies you've gotten, only supplement them. Now as far as I understand is solr a cluster enabled datastore which can be used to store also all the data form our document. The problem with relational

Re: Please help on pdate type during indexing

2019-06-03 Thread Shawn Heisey
On 6/2/2019 11:34 PM, derrick cui wrote: I spent whole day to indexing my data to solr(8.0), but there is one field which type is pdate always failed. error adding field 'UpdateDate'='org.apache.solr.common.SolrInputField:UpdateDate=2019-06-03T05:22:14.842Z' msg=Invalid Date in Date Math

Re: WG: SolrException: Can't determine a Sort Order with Solr 6.6

2019-06-03 Thread Shawn Heisey
On 6/3/2019 5:12 AM, Schwank, Désirée wrote: But how can it be protected. What can I do? What is to configure? Can you help me with an example. Place the Solr server in a network location such that only trusted systems and people can reach it. Sanitize all input in your application before

Re: Solr Heap Usage

2019-06-01 Thread Shawn Heisey
On 6/1/2019 12:27 AM, John Davis wrote: I've read a bunch of the wiki's on solr heap usage and wanted to confirm my understanding of what all does solr use the heap for: This is something that's not straightforward to answer. It would not be wrong to say that Solr uses the Java heap for

Re: Backup not working in SOlr 6.6

2019-05-31 Thread Shawn Heisey
On 5/31/2019 10:57 AM, Chuck Reynolds wrote: Hey guys I’m trying to do a backup of my Solr cloud cluster but it is never starting. When I execute the async backup command it returns quickly like I would expect with the following response 0111234 But the backup never starts. My reply is a

Re: Issue with max documents on single instance

2019-05-31 Thread Shawn Heisey
On 5/31/2019 12:40 PM, Erie Data Systems wrote: My question is this, can I implement a "clustered" environment on single server so I can take advantage of the segmented data? I have a TON (96gb) of RAM and plenty of SSD disk space available... Yes. One Solr instance can have many cores, and

Re: Two related Solr issues

2019-05-31 Thread Shawn Heisey
On 5/31/2019 7:50 AM, Mahesh Varma, Y. A. wrote: Hi Team, What happens to Sitecore - Solr query handling when a core is corrupted in Solr slave in a Master - slave setup? Our Sitecore site's solr search engine is a master-slave setup. One of the cores of the Slave is corrupted and is not

Re: Solr query with long query

2019-05-30 Thread Shawn Heisey
On 5/30/2019 4:13 PM, Venkateswarlu Bommineni wrote: Thank you guys for quick response. I was able to query solr by sending 1500 products using solrJ with http post method. But I had to change maxBooleanClauses to 4096 from default 1024. But I wanted to check with you guys that, will there be

Re: Solr query with long query

2019-05-30 Thread Shawn Heisey
On 5/30/2019 2:20 PM, Venkateswarlu Bommineni wrote: I have got a requirement to send many strings (~1500) in the filter query param to the solr. Can you please provide any suggestions/precautions we need to take care in this particular scenario. You'll probably want to send that as a POST,
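A hedged sketch of sending a long filter in a POST body instead of the URL, using only the JDK (Java 11 or later). The URL, core name and field name in `main` are hypothetical. Using the terms query parser (`{!terms f=...}`) for the filter is my assumption, not something the truncated reply confirms; per another thread in this listing, that parser was added in 4.10.0. The request is built but not sent.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.util.List;

// Hypothetical helper: form-encodes a large filter so it travels in the body.
final class LargeFilterPostSketch {
    // Builds q=*:* plus one fq that lists every value for the given field.
    static String formBody(String field, List<String> values) {
        String fq = "{!terms f=" + field + "}" + String.join(",", values);
        return "q=" + URLEncoder.encode("*:*", StandardCharsets.UTF_8)
            + "&fq=" + URLEncoder.encode(fq, StandardCharsets.UTF_8);
    }

    // Wraps the body in a POST request; nothing is sent here.
    static HttpRequest buildRequest(String selectUrl, String body) {
        return HttpRequest.newBuilder(URI.create(selectUrl))
            .header("Content-Type", "application/x-www-form-urlencoded")
            .POST(HttpRequest.BodyPublishers.ofString(body))
            .build();
    }

    public static void main(String[] args) {
        String body = formBody("sku", List.of("A1", "B2", "C3"));
        // Hypothetical endpoint; adjust host, port and core name.
        HttpRequest request =
            buildRequest("http://localhost:8983/solr/products/select", body);
        System.out.println(request.method() + " " + request.uri());
        System.out.println(body);
    }
}
```

Putting the parameters in the body avoids URL length limits in the servlet container and any proxies in between.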

Re: Newbie permissions problem running solr

2019-05-30 Thread Shawn Heisey
On 5/30/2019 1:04 PM, Bernard T. Higonnet wrote: I have installed solr from ports under FreeBSD 12.0 and I am trying to run solr as described in the Solr Quick Start tutorial. I keep getting permission errors: /usr/local/solr/example/cloud/node2/solr/../logs  could not be created. Exiting

Re: Solr 8.1.1, JMX and VisualVM

2019-05-30 Thread Shawn Heisey
On 5/30/2019 9:58 AM, Markus Jelsma wrote: Hello, It solves the problem! So, with this flag disabled, would that mean our Solr would have lower performance than with it? That flag ended up in the config because I was using it in my GC experiments, and my wiki page appears to have been used

Re: Very low filter cache hit ratio

2019-05-29 Thread Shawn Heisey
On 5/29/2019 7:33 AM, Saurabh Sharma wrote: Many filters are common among the queries. AFAIK, filter cache are created against filters and by that logic one should get good hit ratio for those cached filter conditions. I tried to create a cache of 100K size and that too was not producing good hit

Re: Very low filter cache hit ratio

2019-05-29 Thread Shawn Heisey
On 5/29/2019 6:57 AM, Saurabh Sharma wrote: What can be the possible reasons for low cache usage? How can I leverage cache feature for high traffic indexes? Your usage apparently does not use the exact same query (or filter query, in the case of filterCache) very often. In order to achieve

Re: SolrException: Can't determine a Sort Order with Solr 6.6

2019-05-28 Thread Shawn Heisey
On 5/28/2019 7:48 AM, Schwank, Désirée wrote: At the end of April we realized lots of errors, "SolrException: Can't determine a Sort Order (asc or desc) in sort spec 'score+desc,id+asc'" first appearance in logs about 2019-04-29, without apparent reason. The problem here is that you are
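The diagnosis is cut off in this listing, so the following is only an illustration of one way a sort spec can arrive with literal plus signs. In a URL query string a plus sign stands for a space. If a value that already contains `+` in place of spaces is URL-encoded again, each `+` becomes `%2B` and the server decodes it back to a real plus sign. The class and method names are hypothetical, and whether this is what happened in the thread is an assumption.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical demo of single versus repeated URL encoding of a sort value.
final class SortParamEncodingSketch {
    static String encodeOnce(String rawValue) {
        return URLEncoder.encode(rawValue, StandardCharsets.UTF_8);
    }

    // What a server recovers after URL-decoding the value once.
    static String seenByServer(String encodedValue) {
        return URLDecoder.decode(encodedValue, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Raw value with spaces, encoded once: the spaces survive.
        // prints score desc,id asc
        System.out.println(seenByServer(encodeOnce("score desc,id asc")));

        // Value that already uses '+' for spaces, then encoded again:
        // the plus signs come back as literal characters.
        // prints score+desc,id+asc
        System.out.println(seenByServer(encodeOnce("score+desc,id+asc")));
    }
}
```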

Re: Solr-8.1.0 uses much more memory

2019-05-27 Thread Shawn Heisey
On 5/27/2019 9:49 AM, Joe Doupnik wrote: A few more numbers to contemplate. An experiment here, adding 80 PDF and PPTX files into an empty index. Solr v8.0 regular settings, 1.7GB quiescent memory consumption, 1.9GB while indexing, 2.92 minutes to do the job. Solr v8.0, using GC_TUNE from

Re: Solr-8.1.0 uses much more memory

2019-05-26 Thread Shawn Heisey
On 5/26/2019 12:52 PM, Joe Doupnik wrote: I do queries while indexing, have done so for a long time, without difficulty nor memory usage spikes from dual use. The system has been designed to support that. Again, one may look at the numbers using "top" or similar. Try Solr v8.0 and

Re: Solr-8.1.0 uses much more memory

2019-05-26 Thread Shawn Heisey
On 5/25/2019 9:40 AM, Joe Doupnik wrote: Comparing memory consumption (real, not virtual) of quiescent Solr v8.0 and prior with Solr v8.1.0 reveals the older versions use about 1.6GB on my systems but v8.1.0 uses 4.5 to 5+GB. Systems used are SUSE Linux, with Oracle JDK v1.8 and openjdk

Re: very high query time on solr due to high CPU usage

2019-05-25 Thread Shawn Heisey
On 5/25/2019 6:43 AM, Saurabh Sharma wrote: Hi, Link to image https://ibb.co/8g6gXwr That screenshot is not sorted the way that was mentioned on the wiki page - by the RES memory column. I do see several other Java processes besides Solr. There might be other high-memory use processes

Re: very high query time on solr due to high CPU usage

2019-05-25 Thread Shawn Heisey
On 5/25/2019 5:11 AM, Saurabh Sharma wrote: I again faced the issue and restarting the leader worked for me this time. Please find attached the top command for further insights. First java process in screenshot is solr. Can it be a possibility that there are some issue with this particular

Re: G1GC with Solr7

2019-05-24 Thread Shawn Heisey
On 5/24/2019 9:23 AM, Shawn Heisey wrote: On 5/24/2019 9:02 AM, sajitmk wrote: I'm trying to use G1GC garbage collection with Apache Solr 7.7 I've set the variable GC_TUNE as follows GC_TUNE="-XX:+UseG1GC -XX:+UseStringDeduplication" in /etc/default/solr.in.sh However, the so

Re: G1GC with Solr7

2019-05-24 Thread Shawn Heisey
On 5/24/2019 9:02 AM, sajitmk wrote: I'm trying to use G1GC garbage collection with Apache Solr 7.7 I've set the variable GC_TUNE as follows GC_TUNE="-XX:+UseG1GC -XX:+UseStringDeduplication" in /etc/default/solr.in.sh However, the solr process still seems to be using CMS

Re: Restore is creating the cluster incorrectly

2019-05-24 Thread Shawn Heisey
On 5/24/2019 7:16 AM, Chuck Reynolds wrote: I have 4 instances of Solr running on 3 servers with a replication factor of 3. When I execute the command to do the restore to a new *TEST* cluster it create each master with the same IP address and port but all subsequent replicas are create

Re: Enabling SSL on SOLR breaks my SQL Server connection

2019-05-23 Thread Shawn Heisey
On 5/23/2019 9:56 AM, Paul wrote: Thanks for the reply Shawn. What I was asking is whether there is an option to exclude the comms to SQL from SOLR managed encryption as the JDBC driver manages the connection and SOLR is acting as the Client in this instance and is already using encrypted comms

Re: Enabling SSL on SOLR breaks my SQL Server connection

2019-05-23 Thread Shawn Heisey
On 5/23/2019 5:45 AM, Paul wrote: unable to find valid certification path to requested target This seems to be the root of your problem with the connection to SQL server. If I have all the context right, Java is saying it can't validate the certificate returned by the SQL server. This

Re: Is it possible to reconstruct non stored fields and tun those into stored fields

2019-05-22 Thread Shawn Heisey
On 5/22/2019 3:51 PM, Pushkar Raste wrote: Looks like giving Luke a shot is the answer. Can you point me to an example to extract the fields from inverted Index using Luke. Luke is a GUI application that can view the Lucene index in considerable detail. To use Luke directly, you'd have to

Re: CloudSolrClient (any version). Find the node your query has connected to.

2019-05-22 Thread Shawn Heisey
On 5/22/2019 10:47 AM, Russell Taylor wrote: I will add that we have set commits to be only called by the loading program. We have turned off soft and autoCommits in the solrconfig.xml. Don't turn off autoCommit. Regular hard commits, typically with openSearcher set to false so they don't

Re: SolrCloud (7.3) and Legacy replication slaves

2019-05-21 Thread Shawn Heisey
On 5/21/2019 8:48 AM, Michael Tracey wrote: Is it possible set up an existing SolrCloud cluster as the master for legacy replication to a slave server or two? It looks like another option is to use Uni-direction CDCR, but not sure what is the best option in this case. You're asking for

Re: Sort on docValue field is slow.

2019-05-20 Thread Shawn Heisey
On 5/20/2019 8:59 AM, Ashwin Ramesh wrote: Hi Shawn, Thanks for the prompt response. 1. date type def - 2. The field is brand new. I added it to schema.xml, uploaded to ZK & reloaded the collection. After that we started indexing the few thousand. Did we still need to do a full reindex to a

Re: Sort on docValue field is slow.

2019-05-20 Thread Shawn Heisey
On 5/20/2019 6:25 AM, Ashwin Ramesh wrote: Hoping to get advice on a specific issue - We have a collection of 50M documents. We recently added a featuredAt field defined as such - What is the fieldType definition for "date"? We cannot assume that you have left this the same as Solr's

Re: Solr8.0.0 Performance Test

2019-05-19 Thread Shawn Heisey
On 5/19/2019 12:20 AM, Kayak28 wrote: Hello, Apache Solr community members: I have a few questions about the load test of Solr8. - for Solr8, optimization command merge segment to 2, but not 1. Is that ok behavior? Since version 7.5, optimize with TieredMergePolicy (the default policy)

Re: minimize disc space requirement.

2019-05-18 Thread Shawn Heisey
On 5/18/2019 9:36 AM, tom_s wrote: I'm aware that the best practice is to have disk space on your Solr servers be 2 times the size of the index, but my goal is to minimize this overhead and have my index occupy more than 50% of disk space. In our index, documents have a TTL, so documents are deleted

Re: ERR_SSL_VERSION_OR_CIPHER_MISMATCH Solr 8.1.0

2019-05-17 Thread Shawn Heisey
On 5/16/2019 10:16 AM, Younge, Kent A - Norman, OK - Contractor wrote: I have upgraded one of our boxes to Solr 8.1.0 on RHEL 7.6 with Java 12.0.1. I also had a certificate up for renewal and I went through my regular process of creating the certificate and key. Now I get a

Re: Seeking advice on SolrCloud production architecture with CDCR

2019-05-14 Thread Shawn Heisey
On 5/14/2019 4:55 PM, Cody Burleson wrote: I’m worried, for example, about spreading the ZooKeeper cluster between the two data centers because of potential latency across the pond. Maybe we keep the ZK ensemble on one side of the pond only? I imagined, for instance, 2 ZK nodes on one server,

Re: Solr node goes into recovery mode

2019-05-13 Thread Shawn Heisey
On 5/13/2019 8:26 AM, Maulin Rathod wrote: Recently we are observing issue where solr node (any random node) automatically goes into recovery mode and stops responding. Do you KNOW that these Solr instances actually need a 60GB heap? That's a HUGE heap. When a full GC happens on a heap

Re: Solr query takes a too much time in Solr 6.1.0

2019-05-13 Thread Shawn Heisey
On 5/13/2019 2:51 AM, vishal patel wrote: Executing an identical query again will likely satisfy the query from Solr's caches. Solr won't need to talk to the actual index, and it will be REALLY fast. Even a massively complex query, if it is cached, will be fast. All caches are disabled in

Re: ParseException : Unable to index date in ISODate("2019-03-12T21:53:16.841Z") format in SOLR 5.4.1

2019-05-11 Thread Shawn Heisey
On 5/11/2019 2:06 PM, Abhijit Pawar wrote: ISODate("2019-03-12T21:53:16.841Z") saves the date in mongoDB as 2019-05-09 21:53:16.841Z which is passed to SOLR while indexing. It then throws below error: java.text.ParseException: Unparseable date: "Tue Mar 12 21:53:16 UTC 2019" If that
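
The ParseException quoted above shows the value arriving in the layout of java.util.Date's toString() rather than the ISO-8601 UTC form (for example 1995-12-31T23:59:59Z) that Solr date fields accept. As a minimal sketch, assuming the conversion is done on the client before indexing and that the value is always in UTC, the string could be normalized as follows. The function name to_solr_date is hypothetical, and this is not necessarily the fix suggested in the thread.

```python
from datetime import datetime, timezone

# Layout produced by java.util.Date.toString(), e.g. "Tue Mar 12 21:53:16 UTC 2019".
# %a and %b are locale-dependent; this sketch assumes the default C/English locale.
JAVA_DATE_LAYOUT = "%a %b %d %H:%M:%S %Z %Y"


def to_solr_date(text: str) -> str:
    """Convert a java.util.Date-style string to Solr's ISO-8601 UTC form.

    Assumption: the input is always expressed in UTC. Other zone names are
    rejected rather than guessed at, because strptime does not convert them.
    """
    if " UTC " not in text:
        raise ValueError(f"expected a UTC timestamp, got: {text!r}")
    # strptime returns a naive datetime here, so attach UTC explicitly.
    parsed = datetime.strptime(text, JAVA_DATE_LAYOUT).replace(tzinfo=timezone.utc)
    return parsed.strftime("%Y-%m-%dT%H:%M:%SZ")


print(to_solr_date("Tue Mar 12 21:53:16 UTC 2019"))
```

The toString() layout carries no milliseconds, so the .841 fraction from the original value cannot be recovered at this point; keeping it would require converting from the original timestamp instead.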

Re: very high query time on solr due to high CPU usage

2019-05-11 Thread Shawn Heisey
On 5/11/2019 12:49 PM, Saurabh Sharma wrote: Full collection is present on all 3 nodes. I have checked max docs on every node and they were around 1.5 million on each node with 0.9 million active records. How much disk space do all the indexes take? -> index size is around 2GB per node.

Re: very high query time on solr due to high CPU usage

2019-05-11 Thread Shawn Heisey
On 5/11/2019 8:05 AM, Saurabh Sharma wrote: I have been observing a very unique pattern in our solr resource usage. I am running a cluster with 3 nodes and RAM on each node is 12GB. We are doing hard commits every 1 minute and soft commits every 15 seconds. Under normal circumstances solr

Re: What determines which logging settings are available?

2019-05-10 Thread Shawn Heisey
On 5/10/2019 4:26 PM, Oakley, Craig (NIH/NLM/NCBI) [C] wrote: We are wanting to tweak the logging levels of our Solr 7.4 nodes to see what might be helpful to add to the solr.log for debugging purposes. In investigating what is available, however, I run /solr/admin/info/logging and I find that

Re: Solr query takes a too much time in Solr 6.1.0

2019-05-10 Thread Shawn Heisey
On 5/10/2019 7:32 AM, vishal patel wrote: We have 2 shards and 2 replicas in Live environment. we have multiple collections. Some times some query takes much time(QTime=52552). There are so many documents indexing and searching within milliseconds. There could be any number of causes of
