Re: SolrCloud breaks and does not recover

2015-11-03 Thread Björn Häuser
Hi! Thank you for your super fast answer. I can provide more data; the question is which data :-) These are the config parameters Solr runs with: https://gist.github.com/bjoernhaeuser/24e7080b9ff2a8785740 (taken from the admin UI). These are the log files:

Re: Many files /dataImport in same project

2015-11-03 Thread fabigol
So I can run a script with a fixed time for each file. That is one solution, but is it a good one, and is it the only one? In my solrConfig.xml file I declare 6 data-import files. Can I group them in the same data-import?

Re: Many files /dataImport in same project

2015-11-03 Thread fabigol
Hi, My problem is that I inherited a 3-year-old project without knowing whether it worked. I got past the stage of understanding it and managed to make it work. At the beginning I wondered why there were 6 files. I'm now in a new stage, which is optimization. I have 6 data-import files and am currently forced to

[CONF] Apache Solr Reference Guide > Result Grouping

2015-11-03 Thread vishal raut
Hello, This is regarding the question I asked on the Solr Confluence (I have copied the conversation at the end of this mail). I have indexed various videos in Solr which I have in my database. I want to search for those video titles, but there can be duplicate video titles as well (If the video is

Re: language plugin

2015-11-03 Thread Alexandre Rafalovitch
I wonder what would happen if the DistributedUpdateProcessorFactory is manually added into the chain and the LangDetect definition is moved AFTER it. As per https://wiki.apache.org/solr/UpdateRequestProcessor#Distributed_Updates This would mean that the detection code would be executed on each

Re: language plugin

2015-11-03 Thread Upayavira
Looking at the code, this is not going to work without modifications to Solr (or at least a custom component). The atomic update code is closely embedded into the Solr DistributedUpdateProcessor, which expands the atomic update into a full document and then posts it to the shards. You need to do

Re: [CONF] Apache Solr Reference Guide > Result Grouping

2015-11-03 Thread Toke Eskildsen
On Tue, 2015-11-03 at 14:53 +0530, vishal raut wrote: > I have indexed various videos in solr which I have in my database. I want > to search for those video titles, but there can be duplicate video titles > as well (If the video is same but source is different, this will have > separate entry in
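
Since the reply above is cut off, here is only a rough sketch of what a Result Grouping request for this kind of duplicate-title case might look like. The parameters `group`, `group.field` and `group.limit` are standard Solr grouping parameters; the field name `title_exact` and the query are assumptions made up for illustration, and the grouping field is assumed to be single-valued.

```python
from urllib.parse import urlencode


def grouped_query_string(user_query, group_field, docs_per_group=1):
    """Build a Solr query string that collapses hits sharing the same field value."""
    params = {
        "q": user_query,
        "group": "true",                # turn Result Grouping on
        "group.field": group_field,     # hypothetical field holding the exact title
        "group.limit": docs_per_group,  # how many documents to return per group
    }
    return urlencode(params)


# "title_exact" is an assumed field name, not one taken from the thread.
print(grouped_query_string("title:matrix", "title_exact"))
```

The resulting string would be appended to a `/select` URL; with `group.limit` set to 1, each distinct title comes back once regardless of how many sources carry it.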

OpenNLP plugin or similar NER software for Solr

2015-11-03 Thread liviuchristian
Hi everyone, I need to install a plugin to extract Location (Country/State/City) from free text documents - any professional advice?!? Does OpenNLP really do the job? Is it English only? US only? Or does it cover worldwide place names? Could someone help me with this job - installation,

Re: SolrCloud breaks and does not recover

2015-11-03 Thread Erick Erickson
The GC logs don't really show anything interesting, there would be 15+ second GC pauses. The Zookeeper log isn't actually very interesting. As far as OOM errors, I was thinking of _solr_ logs. As to why the cluster doesn't self-heal, a couple of things: 1> Once you hit an OOM, all bets are off.

Re: Kate Winslet vs Winslet Kate

2015-11-03 Thread scott chu
Hello solr-user, With respect to querying, Dismax makes Solr query syntax quite like Google's: you type simple keywords, you can boost them, and you can use +/- just like in Google. This gives users a lot of convenience and requires less Boolean knowledge to build the intended query string. Normal Lucene

Re: Many files /dataImport in same project

2015-11-03 Thread Gora Mohanty
On 2 November 2015 at 22:38, Alexandre Rafalovitch wrote: > On 2 November 2015 at 11:30, Gora Mohanty wrote: >> As per my last >> follow-up, there is currently no way to have DIH automatically pick up >> different data-config files without manually editing

Re: SolrCloud breaks and does not recover

2015-11-03 Thread Björn Häuser
Hi, thank you for your answer. 1> No OOM hit, the log does not contain any hint of that. Also Solr wasn't restarted automatically. But the GC log has some pauses which are longer than 15 seconds. 2> So, if we need to recover a system we need to stop ingesting data into it? 3> The JVMs

Re: SolrCloud breaks and does not recover

2015-11-03 Thread Rallavagu
Another item to look into is increasing the ZooKeeper timeout in Solr's solr.xml. This would help with timeouts caused by long GC pauses. On 11/3/15 9:12 AM, Björn Häuser wrote: Hi, thank you for your answer. 1> No OOM hit, the log does not contain any hint of that. Also solr wasn't

Re: how to change uniqueKey?

2015-11-03 Thread Mikhail Khludnev
Hello Oleksandr, It seems there is no way: ManagedIndexSchema doesn't even refer to IndexSchema.uniqueKeyField. You can only choose the proper config set (one with the right uniqueKeyField) when creating the collection. On Tue, Nov 3, 2015 at 5:24 PM, Oleksandr Yermolenko wrote: > Hello, All, > >

Re: Kate Winslet vs Winslet Kate

2015-11-03 Thread Yangrui Guo
Tried, but still didn't get the correct result. I guess the reason is that I use block join with the documents. My current solution is to use a name tagger to extract persons and then put a name field restriction before it. This will not work in all situations, though. Thanks for the reply. On Tuesday,

Re: Compiling SolrJ for Java 6

2015-11-03 Thread Erick Erickson
You're on your own if you try to do this. Solr 4.10 requires Java 7. I don't believe Solr will even compile under 1.6. You may get lucky and get SolrJ to compile, but whether it works or not is chancy at best. Best, Erick On Tue, Nov 3, 2015 at 2:13 PM, O. Olson wrote: > Hi,

Compiling SolrJ for Java 6

2015-11-03 Thread O. Olson
Hi, I'm looking to compile SolrJ for Solr 4.10.3 to run on Java 6. (Due to choices beyond my control, we are on this older version of SolrJ and Java 6.) I'm looking for any pointers on how I could do it. I tried downloading the source from SVN (for Solr 4.10.3, not the latest version).

Re: Kate Winslet vs Winslet Kate

2015-11-03 Thread Imtiaz Shakil Siddique
I think the edismax query parser fits your needs perfectly. You can make edismax search for the query words in multiple fields using the "qf" parameter, and you can also set the priority of those searched fields. Edismax also auto-generates phrase queries for specified fields (look into the "pf" and
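
A minimal sketch of the request this suggestion points at, assuming the standard edismax parameters `defType`, `qf` and `pf`. The field names (`first_name`, `last_name`, `full_name`), the boosts, the host and the collection name `people` are all placeholders invented for the example, not values from the thread.

```python
from urllib.parse import urlencode


def edismax_query_string(user_query, qf, pf):
    """Build a Solr query string that uses the edismax parser.

    qf and pf map field names to boost factors.
    """
    def boosted(fields):
        # Solr expects "field^boost" entries separated by spaces.
        return " ".join(f"{name}^{boost}" for name, boost in fields.items())

    params = {
        "defType": "edismax",
        "q": user_query,
        "qf": boosted(qf),  # fields searched for the individual query words
        "pf": boosted(pf),  # fields where the whole input is also tried as a phrase
    }
    return urlencode(params)


# Hypothetical fields and boosts; host and collection are placeholders too.
query_string = edismax_query_string(
    "Winslet Kate",
    qf={"first_name": 1, "last_name": 2},
    pf={"full_name": 5},
)
print("http://solr.example.com:8983/solr/people/select?" + query_string)
```

Because `qf` matches each word independently, word order in the input should not matter for recall; `pf` only adds a score bonus when the words appear together as typed.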

Re: Compiling SolrJ for Java 6

2015-11-03 Thread Shawn Heisey
On 11/3/2015 3:33 PM, Erick Erickson wrote: > You're on your own if you try to do this. Solr 4.10 requires Java7. I > don't believe Solr will even compile under 1.6. > > You may get lucky and get SolrJ to compile, but whether it works or > not is chancy at best. > > Best, > Erick > > On Tue, Nov

Re: Compiling SolrJ for Java 6

2015-11-03 Thread O. Olson
Thank you Erick. I'm sorry I did not clarify this in my original message. I'm compiling Solr (or SolrJ) under Java 7. I'm aware that it requires Java 7 to compile, and that's why I have not changed the "java.source" value in the common-build.xml file. SolrJ compiles fine. My problem is that I

Re: DIH Caching with Delta Import

2015-11-03 Thread Todd Long
Erick Erickson wrote > Have you considered using SolrJ instead of DIH? I've seen > situations where that can make a difference for things like > caching small tables at the start of a run, see: > > searchhub.org/2012/02/14/indexing-with-solrj/ Nice write-up. I think we're going to move to that

Re: Compiling SolrJ for Java 6

2015-11-03 Thread O. Olson
Damn. I always thought cross compilation of Java worked (i.e. compile in one version with the target of a previous version). I guess it worked in my code because I did not use any of the new features. Thank you very much Shawn. No, I'm not running SolrCloud, but I wanted to use the new features

Re: Compiling SolrJ for Java 6

2015-11-03 Thread Upayavira
I think it was around 4.7 that the Java7 requirement was introduced. You may find trying 4.6 will get you what you are needing. I'd expect the artifacts in the Maven repo should be compiled with Java6 from that point backwards. Upayavira On Tue, Nov 3, 2015, at 10:33 PM, Erick Erickson wrote: >

Apache Solr SpellChecker Integration with the default select request handler

2015-11-03 Thread Shruthi BN
Hi Team, I want to integrate the spellcheck handler with the default select handler. Please guide me on how I can achieve this. I tried something like: explicit 10 text productname default on true 5 true

Re: Many files /dataImport in same project

2015-11-03 Thread Erick Erickson
A possibility: Define 6 different request handlers, something like: /home/username/data-config-1.xml /home/username/data-config-2.xml And fire off 6 separate commands, one to each end point. WARNING! I have not tried this personally, so it might be an
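
The "fire off 6 separate commands" step could be sketched roughly as below. The handler names `/dataimport1` to `/dataimport6`, the core name `mycore` and the host are assumptions; they would have to match whatever request handlers are actually declared in solrconfig.xml. `command=full-import` and `clean` are standard DIH parameters. The sketch only builds the URLs and does not send anything.

```python
from urllib.parse import urlencode

# Placeholder base URL and handler names, not taken from the thread.
CORE_URL = "http://localhost:8983/solr/mycore"
HANDLERS = [f"/dataimport{i}" for i in range(1, 7)]


def full_import_urls(core_url, handlers, clean=False):
    """Return one DIH full-import URL per request handler."""
    # clean=false keeps one import from deleting documents loaded by another;
    # DIH defaults to clean=true for a full-import.
    params = urlencode({"command": "full-import", "clean": str(clean).lower()})
    return [f"{core_url}{handler}?{params}" for handler in handlers]


urls = full_import_urls(CORE_URL, HANDLERS)
for url in urls:
    print(url)
```

Each URL can then be requested with curl or `urllib.request.urlopen`. Whether six imports can safely run at the same time against one core is not something this sketch verifies.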

Re: Many files /dataImport in same project

2015-11-03 Thread Gora Mohanty
On 3 November 2015 at 21:25, Alexandre Rafalovitch wrote: > On 3 November 2015 at 10:38, Gora Mohanty wrote: >>> I missed previous discussions, but the DIH config file is given in a >>> query parameter. So, if there is a bunch of them on a file system, one

Re: Many files /dataImport in same project

2015-11-03 Thread Alexandre Rafalovitch
On 3 November 2015 at 10:38, Gora Mohanty wrote: >> I missed previous discussions, but the DIH config file is given in a >> query parameter. So, if there is a bunch of them on a file system, one >> could probably do >> find . -name "*.dihconf" | xargs curl . > > Sorry, I

how to change uniqueKey?

2015-11-03 Thread Oleksandr Yermolenko
Hello, All, I can't find the way to change uniqueKey in "managed-schema" environment!!! my steps: 1. Solr 5.3.1 /opt/solr/bin/solr start /opt/solr/bin/solr create -c my_mail 2. added a few fields: curl -X POST -H 'Content-type:application/json' --data-binary '{

Re: language plugin

2015-11-03 Thread Upayavira
Actually, you are right. It would be executed on every node if you put LangDetect after a deliberately inserted DistributedUpdateProcessorFactory entry. Not optimal, but would work. Upayavira On Tue, Nov 3, 2015, at 12:26 PM, Alexandre Rafalovitch wrote: > I wonder what would happen if the

Re: Queries for many terms

2015-11-03 Thread Alan Woodward
TermsQuery works by pulling the postings lists for each term and OR-ing them together to create a bitset, which is very memory-efficient but means that you don't know at doc collection time which term has actually matched. For your case you probably want to create a SpanOrQuery, and then
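
To make the trade-off described above concrete, here is a toy illustration in plain Python. It is not Lucene's TermsQuery or SpanOrQuery code; the tiny inverted index and all function names are made up. The first function ORs postings lists into a bitset and so loses track of which term matched, while the second keeps that information at the cost of more bookkeeping per document.

```python
# Toy inverted index: term -> sorted list of matching document ids.
postings = {
    "apple": [0, 2, 5],
    "banana": [2, 3],
    "cherry": [5, 7],
}


def or_into_bitset(index, terms):
    """OR the postings of all terms into one integer used as a bitset."""
    bits = 0
    for term in terms:
        for doc_id in index.get(term, []):
            bits |= 1 << doc_id
    return bits


def docs_in_bitset(bits):
    """List the document ids whose bit is set."""
    return [doc_id for doc_id in range(bits.bit_length()) if (bits >> doc_id) & 1]


def matches_with_terms(index, terms):
    """Alternative that records, per document, the terms that matched it."""
    matched = {}
    for term in terms:
        for doc_id in index.get(term, []):
            matched.setdefault(doc_id, []).append(term)
    return matched


query_terms = ["apple", "banana", "cherry"]
bits = or_into_bitset(postings, query_terms)
print(docs_in_bitset(bits))                       # matching docs, terms unknown
print(matches_with_terms(postings, query_terms))  # matching docs with their terms
```

The bitset needs one bit per document no matter how many terms are queried, which is why it is compact; the per-document term lists grow with the number of matches.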