Re: Solr facet search improvements

2015-01-28 Thread Jack Krupansky
It would probably be better to do entity extraction and normalization of job titles as a front-end process before ingesting the data into Solr, but you could also do it as a custom or script update processor. The latter can be easily coded in JavaScript to run within Solr. Your first step in any

Re: Running multiple full-import commands via curl in a script

2015-01-28 Thread Mikhail Khludnev
Literally, a queue can be done by submitting as is (async) and polling the command status. However, given https://github.com/apache/lucene-solr/blob/trunk/solr/contrib/dataimporthandler/src/java/org/apache/solr/handler/dataimport/DataImportHandler.java#L200 you can try adding synchronous=true... that
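The "submit async, then poll the command status" idea can be sketched roughly as follows. This is only an illustration: `wait_until_idle` and the `fetch_status` callable are hypothetical names, and the sketch assumes the import handler reports a status string of "busy" while an import is running and something else (such as "idle") once it has finished.

```python
import time


def wait_until_idle(fetch_status, poll_seconds=1.0, max_polls=600):
    """Poll until the import no longer reports "busy".

    fetch_status is a caller-supplied (hypothetical) function that returns
    the current status string, e.g. by requesting command=status.
    Returns True once the status is not "busy", False if max_polls is hit.
    """
    for _ in range(max_polls):
        if fetch_status() != "busy":
            return True
        time.sleep(poll_seconds)
    return False
```

A script would call this between successive import submissions so that each import only starts after the previous one has finished.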

Re: Reindex data without creating new index.

2015-01-28 Thread Shawn Heisey
On 1/27/2015 11:54 PM, SolrUser1543 wrote: I want to reindex my data in order to change the value of one field according to the value of another (both fields already exist). For this purpose I run a clue utility in order to get a list of IDs. Then I created an update processor, which can set

Re: Solr facet search improvements

2015-01-28 Thread Shawn Heisey
On 1/28/2015 3:56 AM, thakkar.aayush wrote: I have around 1 million job titles which are indexed on Solr and am looking to improve the faceted search results on job title matches. For example: a job search for *Research Scientist Computer Architecture* is made, and the facet field title

What is the best way to update an index?

2015-01-28 Thread Carl Roberts
Hi, What is the best way to update an index with new data or records? Via this command: curl http://127.0.0.1:8983/solr/nvd-rss/dataimport?command=full-import&clean=false&synchronous=true&entity=cve-2002; or this command: curl

Re: Running multiple full-import commands via curl in a script

2015-01-28 Thread Carl Roberts
Thanks Mikhail - synchronous=true works like a charm...:) On 1/28/15, 5:16 AM, Mikhail Khludnev wrote: Literally, queue can be done by submitting as is (async) and polling command status. However, giving
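A minimal sketch of building one synchronous full-import URL per entity, so a script can request them one after another. The host, core name, and parameters are the ones quoted in this thread; the helper name `full_import_url` and the second entity `cve-2003` are assumptions added for illustration.

```python
from urllib.parse import urlencode


def full_import_url(base, core, entity):
    """Build a DataImportHandler full-import URL for a single entity."""
    params = {
        "command": "full-import",
        "clean": "false",
        "synchronous": "true",  # the parameter suggested in this thread
        "entity": entity,
    }
    return f"{base}/{core}/dataimport?{urlencode(params)}"


# "cve-2003" is a hypothetical second entity, used only to show the loop.
urls = [
    full_import_url("http://127.0.0.1:8983/solr", "nvd-rss", entity)
    for entity in ("cve-2002", "cve-2003")
]
```

Each URL in `urls` can then be requested in order; with synchronous=true the request should not return until that import is done, so no explicit queue is needed.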

Re: Morphology of synonims

2015-01-28 Thread Shawn Heisey
On 1/28/2015 5:11 AM, Reinforcer wrote: Is Solr capable of using morphology for synonyms? For example. Request: "inanely". Indexed text in Solr: "Searching keywords without morphology is fatuously." "inane" and "fatuous" are synonyms. So, inanely --morphology--> inane --synonyms--> fatuous

Re: Reading data from another solr core

2015-01-28 Thread Alvaro Cabrerizo
Hi, I usually use the SolrEntityProcessor for moving/transforming data between cores, it's a piece of cake! Regards. On Wed, Jan 28, 2015 at 8:13 AM, solrk koushikga...@gmail.com wrote: Hi Guys, I have multiple cores set up in my Solr server. I would like to read/import data from one

Morphology of synonims

2015-01-28 Thread Reinforcer
Hi, Is Solr capable of using morphology for synonyms? For example. Request: "inanely". Indexed text in Solr: "Searching keywords without morphology is fatuously." "inane" and "fatuous" are synonyms. So, inanely --morphology--> inane --synonyms--> fatuous --morphology--> fatuously. Is this

Solr facet search improvements

2015-01-28 Thread thakkar.aayush
I have around 1 million job titles which are indexed on Solr and am looking to improve the faceted search results on job title matches. For example: a job search for *Research Scientist Computer Architecture* is made, and the facet field title which is tokenized in solr and gives the following

CoreContainer#createAndLoad, existing cores not loaded

2015-01-28 Thread Clemens Wyss DEV
My problem: I create cores dynamically using container#create( CoreDescriptor ) and then add documents to the very core(s). So far so good. When I restart my app I do container = CoreContainer#createAndLoad(...) but when I then call container.getAllCoreNames() an empty list is returned. What

Re: extract and add fields on the fly

2015-01-28 Thread Mark
Create the SID from the existing doc implies that a document already exists that you wish to add fields to. However if the document is a binary are you suggesting 1) curl to upload/extract passing docID 2) obtain a SID based off docID 3) add additional fields to SID commit I know I'm possibly

Re: extract and add fields on the fly

2015-01-28 Thread Andrew Pawloski
Sorry, I may have misunderstood: Are you talking about adding additional fields at indexing time? (Here I would add the fields first *then* send to solr.) Are you talking about updating a field within an existing document in a solr index? (In that case I would direct you here [1].) Am I still

Re: extract and add fields on the fly

2015-01-28 Thread Mark
Second thoughts SID is purely i/p as its name suggests :) I think a better approach would be 1) curl to upload/extract passing docID 2) curl to update additional fields for that docID On 28 January 2015 at 17:30, Mark javam...@gmail.com wrote: Create the SID from the existing doc implies

Re: extract and add fields on the fly

2015-01-28 Thread Mark
I'm looking to 1) upload a binary document using curl 2) add some additional facets Specifically my question is can this be achieved in 1 curl operation or does it need 2? On 28 January 2015 at 17:43, Mark javam...@gmail.com wrote: Second thoughts SID is purely i/p as its name suggests :)

Re: extract and add fields on the fly

2015-01-28 Thread Andrew Pawloski
I would switch the order of those. Add the new fields and *then* index to solr. We do something similar when we create SolrInputDocuments that are pushed to solr. Create the SID from the existing doc, add any additional fields, then add to solr. On Wed, Jan 28, 2015 at 11:56 AM, Mark

Re: replicas goes in recovery mode right after update

2015-01-28 Thread Vijay Sekhri
Hi Shawn, Thank you so much for the assistance. Building is not a problem. Back in the day I worked with linking, compiling and building C and C++ software. Java is a piece of cake. We have built the new war from the source version 4.10.3 and our preliminary tests have shown that our issue

IndexFormatTooNewException

2015-01-28 Thread Joshi, Shital
Hi, We upgraded our cluster to Solr 4.10.0 for couple days and again reverted back to 4.8.0. However the dashboard still shows Solr 4.10.0. Do you know why? * solr-spec 4.10.0 * solr-impl 4.10.0 1620776 * lucene-spec 4.10.0 * lucene-impl 4.10.0 1620776 We recently added

Re: replica never takes leader role

2015-01-28 Thread Mark Miller
Yes, after 45 seconds a replica should take over as leader. It should likely explain in the logs of the replica that should be taking over why this is not happening. - Mar On Wed Jan 28 2015 at 2:52:32 PM Joshi, Shital shital.jo...@gs.com wrote: When leader reaches 99% physical memory on the

RE: IndexFormatTooNewException

2015-01-28 Thread Joshi, Shital
Thank you for replying. We added new shard to same cluster where some shards are showing Solr version 4.10.0 and this new shard is showing Solr version 4.8.0. All shards source Solr software from same location and use same start up script. I am surprised how older shards are still running

Re: IndexFormatTooNewException

2015-01-28 Thread Chris Hostetter
: We upgraded our cluster to Solr 4.10.0 for couple days and again : reverted back to 4.8.0. However the dashboard still shows Solr 4.10.0. : Do you know why? because you didn't fully revert - you are still running Solr 4.10.0 - the details of what steps you took to try and switch back make a

Re: Reindex data without creating new index.

2015-01-28 Thread SolrUser1543
By rebalancing I mean that such a large number of updates will create a situation that requires optimizing the index, because each document will be added again in place of the original one. But according to what you say it should not be a problem, am I correct? -- View this

Re: Solrcloud open new searcher not happening in slave for deletebyID

2015-01-28 Thread Shawn Heisey
On 1/27/2015 5:50 PM, vsriram30 wrote: I am using Solrcloud 4.6.1 In that if I use CloudSolrServer to add a record to solr, then I see the following commit update command in both master and in slave node : One of the first things to find out is whether it's still a problem in the latest

Re: Stop word suggestions are coming when I indexed sentence using ShingleFilterFactory

2015-01-28 Thread Nitin Solanki
Ok.. I got the solution. Changed the value of maxQueryFrequency from 0.01(1%) to 0.9(90%). It is working. thanks a lot. On Tue, Jan 27, 2015 at 8:55 PM, Dyer, James james.d...@ingramcontent.com wrote: Can you give a little more information as to how you have the spellchecker configured in

RE: replica never takes leader role

2015-01-28 Thread Joshi, Shital
We're using Solr 4.8.0 -Original Message- From: Erick Erickson [mailto:erickerick...@gmail.com] Sent: Tuesday, January 27, 2015 7:47 PM To: solr-user@lucene.apache.org Subject: Re: replica never takes leader role What version of Solr? This is an ongoing area of improvements and several

RE: Suggesting broken words with solr.WordBreakSolrSpellChecker

2015-01-28 Thread fabio.bozzo
I tried increasing my alternativeTermCount to 5 and enabling extended results. I also added a filter fq parameter to clarify what I mean: *Querying for "go pro" is good:* { "responseHeader": { "status": 0, "QTime": 2, "params": { "q": "go pro", "indent": "true", "fq": "marchio:\"GO PRO\"",

Re: IndexFormatTooNewException

2015-01-28 Thread Shawn Heisey
On 1/28/2015 2:51 PM, Joshi, Shital wrote: Thank you for replying. We added new shard to same cluster where some shards are showing Solr version 4.10.0 and this new shard is showing Solr version 4.8.0. All shards source Solr software from same location and use same start up script. I am

Re: Solrcloud open new searcher not happening in slave for deletebyID

2015-01-28 Thread vsriram30
Thanks Shawn. Not sure whether I will be able to test it out with 4.10.3. I will try the workarounds and update. Thanks, V.Sriram -- View this message in context: http://lucene.472066.n3.nabble.com/Solrcloud-open-new-searcher-not-happening-in-slave-for-deletebyID-tp4182439p4182757.html Sent

Re: Reading data from another solr core

2015-01-28 Thread solrk
Thank you Alvaro Cabrerizo! I am going to give a shot. -- View this message in context: http://lucene.472066.n3.nabble.com/Reading-data-from-another-solr-core-tp4182466p4182758.html Sent from the Solr - User mailing list archive at Nabble.com.

Define Id when using db dih

2015-01-28 Thread SolrUser1543
Hi, I am using data import handler and import data from oracle db. I have a problem that the table I am importing from has no one column which is defined as a key. How should I define the key in the data config file ? Thanks -- View this message in context:

AW: CoreContainer#createAndLoad, existing cores not loaded

2015-01-28 Thread Clemens Wyss DEV
Thx Shawn. I am running latest-greatest Solr (4.10.3) Solr home is e.g. /opt/webs/siteX/WebContent/WEB-INF/solr the core(s) reside in /opt/webs/siteX/WebContent/WEB-INF/solr/cores Should these be found by core discovery? If not, how can I configure coreRootDirectory in solr.xml to be cores folder

Re: PostingsFormat block size

2015-01-28 Thread Trym Møller
Hi Thanks for your input. I do not do updates to the existing docs, so that is not relevant in my case, and I have just skipped that test case :-) I have not been able to measure any significant changes to the distributed searches or just doing a direct search for an id. Did I miss

AW: CoreContainer#createAndLoad, existing cores not loaded

2015-01-28 Thread Clemens Wyss DEV
BTW: None of my core folders contains a core.properties file ... ? Could it be due to the fact that I am (so far) running only EmbeddedSolrServer, hence no real Solr-Server? -Original Message- From: Clemens Wyss DEV [mailto:clemens...@mysign.ch] Sent: Thursday, 29 January

Re: replica never takes leader role

2015-01-28 Thread Erick Erickson
This is not the desired behavior at all. I know there have been improvements in this area since 4.8, but can't seem to locate the JIRAs. I'm curious _why_ the nodes are going down though, is it happening at random or are you taking it down? One problem has been that the Zookeeper timeout used to

Re: CoreContainer#createAndLoad, existing cores not loaded

2015-01-28 Thread Shawn Heisey
On 1/28/2015 8:52 AM, Clemens Wyss DEV wrote: My problem: I create cores dynamically using container#create( CoreDescriptor ) and then add documents to the very core(s). So far so good. When I restart my app I do container = CoreContainer#createAndLoad(...) but when I then call

Re: extract and add fields on the fly

2015-01-28 Thread Mark
Use case is use curl to upload/extract/index document passing in additional facets not present in the document e.g. literal.source=old system In this way some fields come from the uploaded extracted content and some fields as specified in the curl URL Hope that's clearer? Regards Mark On 28

Re: extract and add fields on the fly

2015-01-28 Thread Mark
That approach works, although as suspected the schema has to recognise the additional facet (stuff in this case): {"responseHeader":{"status":400,"QTime":1},"error":{"msg":"ERROR: [doc=6252671B765A1748992DF1A6403BDF81A4A15E00] unknown field 'stuff'","code":400}} ..getting closer.. On 28 January 2015 at

Re: extract and add fields on the fly

2015-01-28 Thread Alexandre Rafalovitch
Well, the schema does need to know what type your field is. If you can't add it to schema, use dynamicFields with prefixes/suffixes or dynamic schema (less recommended). Regards, Alex. Sign up for my Solr resources newsletter at http://www.solr-start.com/ On 28 January 2015 at 13:32,
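One way to picture the dynamicFields suggestion: give each ad hoc field a suffix so its name matches a wildcard pattern the schema already declares. The sketch below is an assumption-laden illustration, not part of the thread. It assumes the schema defines dynamicField patterns such as `*_s`, `*_i` and `*_b`; check the actual schema before relying on these suffixes. The helper name `dynamic_name` is hypothetical.

```python
# Assumed suffix conventions; they only work if the schema declares
# matching dynamicField patterns (*_s, *_i, *_b).
SUFFIX_BY_TYPE = {str: "_s", int: "_i", bool: "_b"}


def dynamic_name(name, value):
    """Return a field name carrying the suffix for the value's exact type."""
    return name + SUFFIX_BY_TYPE[type(value)]


# The unknown field 'stuff' from this thread would be sent as 'stuff_s'.
field = dynamic_name("stuff", "some value")
```

With a naming scheme like this, new fields can be added at upload time without editing the schema for each one.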

Re: extract and add fields on the fly

2015-01-28 Thread Alexandre Rafalovitch
Sounds like 'literal.X' syntax from https://cwiki.apache.org/confluence/display/solr/Uploading+Data+with+Solr+Cell+using+Apache+Tika Can you explain your use case as different from what's already documented? May be easier to understand. Regards, Alex. Sign up for my Solr resources

RE: Suggesting broken words with solr.WordBreakSolrSpellChecker

2015-01-28 Thread Dyer, James
Try using something larger than 2 for alternativeTermCount. 5 is probably ok here. If that doesn't work, then post the exact query you are using and the full extended spellcheck results. James Dyer Ingram Content Group -Original Message- From: fabio.bozzo [mailto:f.bo...@3-w.it]

Re: [MASSMAIL]Re: Contextual sponsored results with Solr

2015-01-28 Thread Jorge Luis Betancourt González
We are trying to avoid firing 2 queries per request. I've started to play with a PostFilter to see how it goes, perhaps something in the line of the ReRankQueryQueryParser could be used to avoid using two queries and instead rerank the results? - Original Message - From: Ahmet Arslan

Re: PostingsHighlighter highlighted snippet size (fragsize)

2015-01-28 Thread Zisis Tachtsidis
It seems that a solution has been found. PostingsHighlighter uses by default Java's SENTENCE BreakIterator so it breaks the snippets into fragments per sentence. In my text_en analysis chain though I was using a filter that lowercases input and this seems to mess with the logic of SENTENCE

extract and add fields on the fly

2015-01-28 Thread Mark
Is it possible to use curl to upload a document (for extract indexing) and specify some fields on the fly? sort of: 1) index this document 2) by the way here are some important facets whilst you're at it Regards Mark
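Other messages in this thread point to the `literal.X` parameters of the extracting request handler for this. A rough sketch of the request URL follows; the handler path `/update/extract`, the document id `doc1`, and the helper name `extract_url` are assumptions for illustration, while `literal.source=old system` is the example given in the thread.

```python
from urllib.parse import urlencode


def extract_url(base, doc_id, extra_fields):
    """Build an extract URL that sets extra fields via literal.<name> params."""
    params = {"literal.id": doc_id}
    for name, value in extra_fields.items():
        params[f"literal.{name}"] = value
    params["commit"] = "true"
    return f"{base}/update/extract?{urlencode(params)}"


# Hypothetical base URL and id; the file itself would be sent as the
# request body, for example with curl -F.
url = extract_url("http://127.0.0.1:8983/solr", "doc1", {"source": "old system"})
```

The file content and the extra fields then travel in a single request, provided the schema knows each literal field.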

Re: How to implement Auto complete, suggestion client side

2015-01-28 Thread Olivier Austina
Hi, Thank you Dan Davis and Alexandre Rafalovitch. This is very helpful for me. Regards Olivier 2015-01-27 0:51 GMT+01:00 Alexandre Rafalovitch arafa...@gmail.com: You've got a lot of options depending on what you want. But since you seem to just want _an_ example, you can use mine from

Re: replicas goes in recovery mode right after update

2015-01-28 Thread Erick Erickson
Vijay: Thanks for reporting this back! Could I ask you to post a new patch with your correction? Please use the same patch name (SOLR-5850.patch), and include a note about what you found (I've already added a comment). Thanks! Erick On Wed, Jan 28, 2015 at 9:18 AM, Vijay Sekhri

Re: extract and add fields on the fly

2015-01-28 Thread Mark
Thanks Alexandre, I figured it out with this example, https://wiki.apache.org/solr/ExtractingRequestHandler whereby you can add additional fields at upload/extract time curl

Issue on server restarts with Solr 4.6.0 Cloud

2015-01-28 Thread andrew jenner
Using Solr 4.6.0 on Linux with Java 6 (Oracle JRockit 1.6.0_75 R28.3.2-14-160877-1.6.0_75) We are seeing these issues when doing a restart on a Solr cloud configuration. After restarting each server in sequence none of them will come up. The servers start up after a long time but the cloud status

RE: replica never takes leader role

2015-01-28 Thread Joshi, Shital
When leader reaches 99% physical memory on the box and starts swapping (stops replicating), we forcefully bring down leader (first kill -15 and then kill -9 if kill -15 doesn't work). This is when we are looking up to replica to assume leader's role and it never happens. Zookeeper timeout is