Re: OOM Error

2016-10-25 Thread Toke Eskildsen
On Tue, 2016-10-25 at 15:04 -0400, Susheel Kumar wrote: > Thanks, Toke. Analyzing GC logs helped to determine that it was a > sudden > death. > The peaks in last 20 mins... See http://tinypic.com/r/n2zonb/9 Peaks yes, but there is a pattern of 1) Stable memory use 2) Temporary doubling of

Re: Facet behavior

2016-10-25 Thread Bastien Latard | MDPI AG
Hi Guys, Could any of you tell me if I'm right? Thanks in advance. kr, Bast Forwarded Message Subject: Re: Facet behavior Date: Thu, 20 Oct 2016 14:45:23 +0200 From: Bastien Latard | MDPI AG To: solr-user@lucene.apache.org Hi Yonik,

Re: OOM Error

2016-10-25 Thread Shawn Heisey
On 10/25/2016 8:03 PM, Susheel Kumar wrote: > Agree, Pushkar. I had docValues for sorting / faceting fields from > the beginning (since I set up Solr 6.0). So good on that side. I am going to > analyze the queries to find any potential issue. Two questions I am > puzzling over: > > a) Should the

Re: OOM Error

2016-10-25 Thread Erick Erickson
Off the top of my head: a) Should the below JVM parameter be included for Prod to get heap dump Makes sense. It may produce quite a large dump file, but then this is an extraordinary situation so that's probably OK. b) Currently OOM script just kills the Solr instance. Shouldn't it be enhanced

Re: OOM Error

2016-10-25 Thread Susheel Kumar
Agree, Pushkar. I had docValues for sorting / faceting fields from the beginning (since I set up Solr 6.0). So good on that side. I am going to analyze the queries to find any potential issue. Two questions I am puzzling over: a) Should the below JVM parameter be included for Prod to get heap

Re: Does _version_ field in schema need to be indexed and/or stored?

2016-10-25 Thread Yonik Seeley
On Tue, Oct 25, 2016 at 6:41 PM, Brent wrote: > I know that in the sample config sets, the _version_ field is indexed and not > stored, like so: > > > > Is there any reason it needs to be indexed? It may depend on your solr version, but the starting configsets currently

Re: Does _version_ field in schema need to be indexed and/or stored?

2016-10-25 Thread Alexandre Rafalovitch
Did you try using optimistic concurrency or SolrCloud? It should NOT work if I understand what's going on correctly. And if you don't index and don't store (and don't docValue), you don't actually have that field active. That's how the dynamicField */false/false/false works to avoid unknown

Re: Combine Data from PDF + XML

2016-10-25 Thread Erick Erickson
First you need to define the problem: what do you mean by "combine"? Do the XML files contain, say, metadata about an associated PDF file? Or are these entirely orthogonal documents that you need to index into the same collection? Best, Erick On Tue, Oct 25, 2016 at 4:18 PM,

Re: OOM Error

2016-10-25 Thread Pushkar Raste
You should look into using docValues. docValues are stored off heap and hence you would be better off than just bumping up the heap. Don't enable docValues on existing fields unless you plan to reindex data from scratch. On Oct 25, 2016 3:04 PM, "Susheel Kumar" wrote: >
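As a rough illustration of the advice above, the following sketch builds a Schema API `add-field` command with `docValues` enabled. It assumes a managed schema, and the field name `category_facet` and type `string` are hypothetical placeholders, not taken from the thread. The resulting JSON would be POSTed to `/solr/<collection>/schema`; as the message notes, switching docValues on for a field that already holds data still requires a full reindex.

```python
import json


def add_docvalues_field_command(name, field_type):
    """Build a Schema API 'add-field' command for a field with docValues enabled.

    The field name and type passed in are illustrative assumptions.
    """
    return {
        "add-field": {
            "name": name,
            "type": field_type,
            "indexed": True,
            "stored": False,
            "docValues": True,  # column-oriented structure kept outside the Java heap
        }
    }


if __name__ == "__main__":
    # Hypothetical faceting field; print the body that would be POSTed.
    print(json.dumps(add_docvalues_field_command("category_facet", "string"), indent=2))
```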

Re: Solr 5.3.1 - Synonym is not working as expected

2016-10-25 Thread Ahmet Arslan
Hi, If your index is pure Chinese, I would do the expansion on query time only. Simply replace English query term with Chinese translations. Ahmet On Tuesday, October 25, 2016 12:30 PM, soundarya wrote: We are using Solr 5.3.1 version as our search engine. This
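The query-time replacement suggested above could look roughly like the sketch below. It is only an illustration: the translation table, the `expand_query` helper, and the whitespace tokenization are all assumptions, and a real setup might instead rely on a query-time synonym filter.

```python
def expand_query(query, translations):
    """Replace each English term that has known Chinese translations.

    `translations` maps a lowercase English term to a list of Chinese terms;
    it is a hypothetical lookup table supplied by the application.
    Terms without an entry are passed through unchanged.
    """
    expanded = []
    for term in query.split():
        chinese = translations.get(term.lower())
        if not chinese:
            expanded.append(term)
        elif len(chinese) == 1:
            expanded.append(chinese[0])
        else:
            # Several candidate translations: match any of them.
            expanded.append("(" + " OR ".join(chinese) + ")")
    return " ".join(expanded)


if __name__ == "__main__":
    table = {"computer": ["电脑", "计算机"], "news": ["新闻"]}  # illustrative only
    print(expand_query("computer news", table))
```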

Combine Data from PDF + XML

2016-10-25 Thread tesm...@gmail.com
Hi, I am new to Apache Solr and am developing a search project. The source data comes from two sources: 1) XML files 2) PDF files. I need to combine these two sources for search, but couldn't find an example of combining them. Any help is appreciated. Regards,

Re: Graph Traversal Question

2016-10-25 Thread Joel Bernstein
Because the edges are unique on the subject->object there isn't currently a way to capture the relationship. Aggregations can be rolled up on numeric fields and as Yonik mentioned you can track the ancestor. It would be fairly easy to track the relationship by adding a relationship array that

Does _version_ field in schema need to be indexed and/or stored?

2016-10-25 Thread Brent
I know that in the sample config sets, the _version_ field is indexed and not stored, like so: Is there any reason it needs to be indexed? I'm able to create collections and use them with it not indexed, but I wonder if it negatively impacts performance. -- View this message in context:

Re: Graph Traversal Question

2016-10-25 Thread Yonik Seeley
You can get the nodes that you came from by adding trackTraversal=true. A cut'n'paste example from my Lucene/Solr Revolution slides: curl $URL -d 'expr=gatherNodes(reviews, search(reviews, q="user_s:Yonik AND rating_i:5", fl="book_s,user_s,rating_i",sort="user_s asc"),
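Since the preview above is cut off, here is a small sketch of how such an expression might be assembled with `trackTraversal` switched on. The inner `search(...)` is copied from the message; the `walk` and `gather` values are assumptions chosen for illustration and are not the remainder of the original example.

```python
def gather_nodes_expr(collection, inner_expr, walk, gather, track_traversal=True):
    """Assemble a gatherNodes streaming expression as a string.

    `walk` and `gather` are supplied by the caller; the values used in the
    demo below are hypothetical, not recovered from the truncated message.
    """
    parts = [collection, inner_expr, f'walk="{walk}"', f'gather="{gather}"']
    if track_traversal:
        # Ask Solr to report the nodes each gathered node was reached from.
        parts.append('trackTraversal="true"')
    return "gatherNodes(" + ", ".join(parts) + ")"


if __name__ == "__main__":
    inner = ('search(reviews, q="user_s:Yonik AND rating_i:5", '
             'fl="book_s,user_s,rating_i", sort="user_s asc")')
    # walk/gather below are assumed values for the sake of the sketch.
    print(gather_nodes_expr("reviews", inner, "book_s->book_s", "user_s"))
```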

Graph Traversal Question

2016-10-25 Thread Grant Ingersoll
Hi, I'm playing around with the new Graph Traversal/GatherNodes capabilities in Solr 6. I've been indexing Yago facts ( http://www.mpi-inf.mpg.de/departments/databases-and-information-systems/research/yago-naga/yago/downloads/) which give me triples of something like subject-relationship-object

Re: Related Search

2016-10-25 Thread Grant Ingersoll
Hi Rick, I typically do this stuff just by searching a different collection that I create offline by analyzing query logs and then indexing them and searching. On Mon, Oct 24, 2016 at 8:32 PM Rick Leir wrote: > Hi all, > > There is an issue 'Create a Related Search

Re: OOM Error

2016-10-25 Thread Susheel Kumar
Thanks, Toke. Analyzing GC logs helped to determine that it was a sudden death. The peaks in last 20 mins... See http://tinypic.com/r/n2zonb/9 Will look into the queries more closely and also adjust the cache sizing. Thanks, Susheel On Tue, Oct 25, 2016 at 3:37 AM, Toke Eskildsen

Re: OOM Error

2016-10-25 Thread William Bell
I would also caution that 8GB is cutting it close for a Java 8 JVM with SOLR. We use 12GB and have had issues with 8GB. But your mileage may vary. On Tue, Oct 25, 2016 at 1:37 AM, Toke Eskildsen wrote: > On Mon, 2016-10-24 at 18:27 -0400, Susheel Kumar wrote: > > I am

Re: CachedSqlEntityProcessor with delta-import

2016-10-25 Thread Erick Erickson
Why not use delete by id rather than query? It'll be more efficient. Probably not a big deal though. On Tue, Oct 25, 2016 at 1:47 AM, Aniket Khare wrote: > Hi Sowmya, > > In my case I have implemented the data indexing suggested by James and for > deleting the records

Re: how is calculated score in group queries?

2016-10-25 Thread Erick Erickson
bq: I'm still using Solr 4.8.1, any chance this thing is changed/fixed in the early releases? What do you want it changed to? IIUC this is the intended behavior. As for the rest, I'll defer to others. Best, Erick On Tue, Oct 25, 2016 at 2:19 AM, Vincenzo D'Amore wrote: >

Re: Solr Cloud A/B Deployment Issue

2016-10-25 Thread jimtronic
Also, if we issue a delete by query where the query is "_version_:0", it also creates a transaction log and then has no trouble transferring leadership between old and new nodes. Still, it seems like when we ADDREPLICA, some sort of transaction log should be started. Jim -- View this message

Re: Solr Cloud A/B Deployment Issue

2016-10-25 Thread jimtronic
Interestingly, if I simply add one document to the full cluster after all 6 nodes are active, this entire problem goes away. This appears to be because a transaction log entry is created which in turn prevents the new nodes from going into full replication recovery upon leader change. Adding a

RE: Solr 6.0 Highlighting Not Working

2016-10-25 Thread Teague James
Hi - Thanks for the reply, I'll give that a try.   -Original Message- From: jimtronic [mailto:jimtro...@gmail.com] Sent: Monday, October 24, 2016 3:56 PM To: solr-user@lucene.apache.org Subject: Re: Solr 6.0 Highlighting Not Working Perhaps you need to wrap your inner "" and "" tags in

Solr 5.3.1 - Synonym is not working as expected

2016-10-25 Thread soundarya
We are using Solr 5.3.1 version as our search engine. This setup is provided by the Bitnami cloud and the amazon AMI is ami-50a47e23. We have a website which has content in Chinese. We use Nutch crawler to crawl the entire website and index it to the Solr collection. We have configured few

how is calculated score in group queries?

2016-10-25 Thread Vincenzo D'Amore
Hi all, I have a couple of questions about grouping, I hope you can help me. I'm trying to understand how the group score is calculated in group queries. So, I did my homework and it seems that the group score is taken from the score of the first document for each group found, i.e.: 3653

Re: CachedSqlEntityProcessor with delta-import

2016-10-25 Thread Aniket Khare
Hi Sowmya, In my case I have implemented the data indexing suggested by James, and for deleting the records I have created my own data indexing job which calls the delete API periodically, passing the list of unique IDs.
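A periodic job like the one described might batch the IDs into JSON update bodies along these lines. This is a minimal sketch, not the poster's implementation: the helper name and the batch size are assumptions, and each body would be POSTed to the collection's `/update` handler with a JSON content type.

```python
import json


def delete_by_id_batches(ids, batch_size=500):
    """Yield JSON update bodies, each deleting up to `batch_size` documents by ID.

    The batch size is an arbitrary illustrative default.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(ids), batch_size):
        # Solr's JSON update format accepts a list of unique IDs under "delete".
        yield json.dumps({"delete": ids[start:start + batch_size]})


if __name__ == "__main__":
    for body in delete_by_id_batches(["doc-1", "doc-2", "doc-3"], batch_size=2):
        print(body)
```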

Re: OOM Error

2016-10-25 Thread Toke Eskildsen
On Mon, 2016-10-24 at 18:27 -0400, Susheel Kumar wrote: > I am seeing OOM script killed solr (solr 6.0.0) on couple of our VM's > today. So far our solr cluster has been running fine but suddenly > today many of the VM's Solr instance got killed. As you have the GC-logs, you should be able to