It would probably be better to do entity extraction and normalization of
job titles as a front-end process before ingesting the data into Solr, but
you could also do it as a custom or script update processor. The latter can
be easily coded in JavaScript to run within Solr
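As a rough illustration of the front-end approach, a normalization step might lowercase titles, strip separators, and expand known abbreviations before the documents ever reach Solr. The abbreviation map and titles below are made-up examples, not anything from this thread:

```python
# Illustrative sketch: normalize job titles before they reach Solr.
# The abbreviation map is a made-up example.
CANONICAL = {
    "sr": "senior",
    "swe": "software engineer",
    "mgr": "manager",
}

def normalize_title(raw: str) -> str:
    """Lowercase, strip separators/punctuation, and expand known abbreviations."""
    tokens = raw.lower().replace("/", " ").replace("-", " ").split()
    expanded = [CANONICAL.get(t.strip(".,"), t.strip(".,")) for t in tokens]
    return " ".join(expanded)

print(normalize_title("Sr. SWE / Mgr"))  # senior software engineer manager
```

The same logic could be ported to a JavaScript StatelessScriptUpdateProcessor if you prefer to keep it inside Solr.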
Your first step in any
Literally, queueing can be done by submitting as-is (async) and polling the
command status. However, given
https://github.com/apache/lucene-solr/blob/trunk/solr/contrib/dataimporthandler/src/java/org/apache/solr/handler/dataimport/DataImportHandler.java#L200
you can try to add synchronous=true... that
On 1/27/2015 11:54 PM, SolrUser1543 wrote:
I want to reindex my data in order to change the value of one field according
to the value of another (both fields already exist).
For this purpose I ran the Clue utility in order to get a list of IDs.
Then I created an update processor, which can set
On 1/28/2015 3:56 AM, thakkar.aayush wrote:
I have around 1 million job titles which are indexed on Solr and am looking
to improve the faceted search results on job title matches.
For example: a job search for *Research Scientist Computer Architecture* is
made, and the facet field title
Hi,
What is the best way to update an index with new data or records? Via
this command:
curl
http://127.0.0.1:8983/solr/nvd-rss/dataimport?command=full-import&clean=false&synchronous=true&entity=cve-2002
or this command:
curl
Thanks Mikhail - synchronous=true works like a charm...:)
On 1/28/15, 5:16 AM, Mikhail Khludnev wrote:
Literally, queueing can be done by submitting as-is (async) and polling the
command status. However, given
On 1/28/2015 5:11 AM, Reinforcer wrote:
Is Solr capable of using morphology for synonyms?
For example, request: inanely.
Indexed text in Solr: Searching keywords without morphology is fatuously.
inane and fatuous are synonyms.
So, inanely ---morphology--> inane ---synonyms--> fatuous
Hi,
I usually use the SolrEntityProcessor for moving/transform data between
cores, it's a piece of cake!
Regards.
On Wed, Jan 28, 2015 at 8:13 AM, solrk koushikga...@gmail.com wrote:
Hi Guys,
I have multiple cores set up in my Solr server. I would like to read/import
data from one
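For reference, a minimal data-config.xml using the SolrEntityProcessor might look like the sketch below; the source-core URL, query, and row count are placeholders, not values from this thread:

```xml
<dataConfig>
  <document>
    <!-- Pulls documents from another core's select handler and indexes
         them into the core this handler is configured on. -->
    <entity name="sourceCore"
            processor="SolrEntityProcessor"
            url="http://localhost:8983/solr/source-core"
            query="*:*"
            rows="500"/>
  </document>
</dataConfig>
```

Transformers can be attached to the entity if fields need to be reshaped on the way through.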
Hi,
Is Solr capable of using morphology for synonyms?
For example, request: inanely.
Indexed text in Solr: Searching keywords without morphology is fatuously.
inane and fatuous are synonyms.
So, inanely ---morphology--> inane ---synonyms--> fatuous
---morphology--> fatuously. Is this
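For what it's worth, stemming and synonyms can be combined in a single analysis chain. The field type below is only a sketch (the type name and synonyms file are examples); note that with the stemmer first, the entries in synonyms.txt must be the stemmed forms:

```xml
<fieldType name="text_syn" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Stem first, so "inanely" becomes "inane"... -->
    <filter class="solr.PorterStemFilterFactory"/>
    <!-- ...then map stemmed forms via synonyms.txt (e.g. "inane, fatuous"). -->
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="true"/>
  </analyzer>
</fieldType>
```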
I have around 1 million job titles which are indexed on Solr and am looking
to improve the faceted search results on job title matches.
For example: a job search for *Research Scientist Computer Architecture* is
made, and the facet field title which is tokenized in solr and gives the
following
My problem:
I create cores dynamically using container#create( CoreDescriptor ) and then
add documents to the very core(s). So far so good.
When I restart my app I do
container = CoreContainer#createAndLoad(...)
but when I then call container.getAllCoreNames() an empty list is returned.
What
'Create the SID from the existing doc' implies that a document already
exists that you wish to add fields to.
However, if the document is a binary, are you suggesting:
1) curl to upload/extract, passing the docID
2) obtain a SID based off the docID
3) add additional fields to the SID and commit
I know I'm possibly
Sorry, I may have misunderstood:
Are you talking about adding additional fields at indexing time? (Here I
would add the fields first *then* send to solr.)
Are you talking about updating a field within an existing document in a
Solr index? (In that case I would direct you here [1].)
Am I still
Second thoughts SID is purely i/p as its name suggests :)
I think a better approach would be
1) curl to upload/extract passing docID
2) curl to update additional fields for that docID
On 28 January 2015 at 17:30, Mark javam...@gmail.com wrote:
Create the SID from the existing doc implies
I'm looking to
1) upload a binary document using curl
2) add some additional facets
Specifically my question is can this be achieved in 1 curl operation or
does it need 2?
On 28 January 2015 at 17:43, Mark javam...@gmail.com wrote:
Second thoughts SID is purely i/p as its name suggests :)
I would switch the order of those. Add the new fields and *then* index to
solr.
We do something similar when we create SolrInputDocuments that are pushed
to solr. Create the SID from the existing doc, add any additional fields,
then add to solr.
On Wed, Jan 28, 2015 at 11:56 AM, Mark
Hi Shawn,
Thank you so much for the assistance. Building is not a problem. Back in
the day I worked with linking, compiling and building C and C++
software. Java is a piece of cake.
We built a new WAR from the 4.10.3 source, and our
preliminary tests have shown that our issue
Hi,
We upgraded our cluster to Solr 4.10.0 for a couple of days and then reverted
back to 4.8.0. However, the dashboard still shows Solr 4.10.0. Do you know why?
* solr-spec 4.10.0
* solr-impl 4.10.0 1620776
* lucene-spec 4.10.0
* lucene-impl 4.10.0 1620776
We recently added
Yes, after 45 seconds a replica should take over as leader. The logs of
the replica that should be taking over will likely explain why this is
not happening.
- Mar
On Wed Jan 28 2015 at 2:52:32 PM Joshi, Shital shital.jo...@gs.com wrote:
When leader reaches 99% physical memory on the
Thank you for replying.
We added a new shard to the same cluster, where some shards are showing Solr
version 4.10.0 and this new shard is showing Solr version 4.8.0. All shards
source the Solr software from the same location and use the same startup
script. I am surprised that the older shards are still running
: We upgraded our cluster to Solr 4.10.0 for a couple of days and then
: reverted back to 4.8.0. However, the dashboard still shows Solr 4.10.0.
: Do you know why?
because you didn't fully revert - you are still running Solr 4.10.0 - the
details of what steps you took to try and switch back make a
By rebalancing I mean that such a big amount of updates will create a
situation that requires running an optimization of the index, because each
document will be added again in place of the original one.
But according to what you say it should not be a problem, am I correct?
On 1/27/2015 5:50 PM, vsriram30 wrote:
I am using SolrCloud 4.6.1. If I use CloudSolrServer to add a record
to Solr, then I see the following commit update command on both the master
and the slave node:
One of the first things to find out is whether it's still a problem in
the latest
Ok, I got the solution.
Changed the value of maxQueryFrequency from 0.01 (1%) to 0.9 (90%). It is
working. Thanks a lot.
On Tue, Jan 27, 2015 at 8:55 PM, Dyer, James james.d...@ingramcontent.com
wrote:
Can you give a little more information as to how you have the spellchecker
configured in
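For context, maxQueryFrequency lives in the spellchecker's searchComponent configuration in solrconfig.xml. A sketch (the component, field, and dictionary names are typical defaults, not values from this thread):

```xml
<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <lst name="spellchecker">
    <str name="name">default</str>
    <str name="field">spell</str>
    <str name="classname">solr.DirectSolrSpellChecker</str>
    <!-- Offer suggestions even for terms appearing in up to 90% of docs. -->
    <float name="maxQueryFrequency">0.9</float>
  </lst>
</searchComponent>
```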
We're using Solr 4.8.0
-Original Message-
From: Erick Erickson [mailto:erickerick...@gmail.com]
Sent: Tuesday, January 27, 2015 7:47 PM
To: solr-user@lucene.apache.org
Subject: Re: replica never takes leader role
What version of Solr? This is an ongoing area of improvements and several
I tried increasing my alternativeTermCount to 5 and enable extended results.
I also added a filter fq parameter to clarify what I mean:
*Querying for go pro is good:*
{
  "responseHeader": {
    "status": 0,
    "QTime": 2,
    "params": {
      "q": "go pro",
      "indent": "true",
      "fq": "marchio:\"GO PRO\"",
On 1/28/2015 2:51 PM, Joshi, Shital wrote:
Thank you for replying.
We added a new shard to the same cluster, where some shards are showing Solr
version 4.10.0 and this new shard is showing Solr version 4.8.0. All shards
source the Solr software from the same location and use the same startup script. I am
Thanks Shawn. Not sure whether I will be able to test it out with 4.10.3. I
will try the workarounds and update.
Thanks,
V.Sriram
--
View this message in context:
http://lucene.472066.n3.nabble.com/Solrcloud-open-new-searcher-not-happening-in-slave-for-deletebyID-tp4182439p4182757.html
Sent from the Solr - User mailing list archive at Nabble.com.
Thank you Alvaro Cabrerizo! I am going to give it a shot.
--
View this message in context:
http://lucene.472066.n3.nabble.com/Reading-data-from-another-solr-core-tp4182466p4182758.html
Sent from the Solr - User mailing list archive at Nabble.com.
Hi,
I am using the Data Import Handler to import data from an Oracle DB.
My problem is that the table I am importing from has no single column
defined as a key.
How should I define the key in the data-config file?
Thanks
Thx Shawn. I am running latest-greatest Solr (4.10.3)
Solr home is e.g.
/opt/webs/siteX/WebContent/WEB-INF/solr
the core(s) reside in
/opt/webs/siteX/WebContent/WEB-INF/solr/cores
Should these be found by core discovery?
If not, how can I configure coreRootDirectory in solr.xml to point at the cores folder?
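If it helps, in core-discovery mode solr.xml can point discovery at a subfolder. A sketch, using the path mentioned above:

```xml
<solr>
  <!-- Core discovery walks this directory looking for core.properties files. -->
  <str name="coreRootDirectory">/opt/webs/siteX/WebContent/WEB-INF/solr/cores</str>
</solr>
```

Each core directory underneath it still needs its own core.properties file to be discovered.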
Hi
Thanks for your input.
I do not do updates to the existing docs, so that is not relevant in my
case, and I have just skipped that test case :-)
I have not been able to measure any significant changes to the
distributed searches or just doing a direct search for an id.
Did I miss
BTW:
None of my core folders contains a core.properties file...? Could it be due
to the fact that I am (so far) running only EmbeddedSolrServer, hence no real
Solr server?
-Original Message-
From: Clemens Wyss DEV [mailto:clemens...@mysign.ch]
Sent: Thursday, 29 January
This is not the desired behavior at all. I know there have been
improvements in this area since 4.8, but can't seem to locate the JIRAs.
I'm curious _why_ the nodes are going down though, is it happening at
random or are you taking it down? One problem has been that the Zookeeper
timeout used to
On 1/28/2015 8:52 AM, Clemens Wyss DEV wrote:
My problem:
I create cores dynamically using container#create( CoreDescriptor ) and then
add documents to the very core(s). So far so good.
When I restart my app I do
container = CoreContainer#createAndLoad(...)
but when I then call
Use case is
use curl to upload/extract/index document passing in additional facets not
present in the document e.g. literal.source=old system
In this way some fields come from the uploaded extracted content and some
fields as specified in the curl URL
Hope that's clearer?
Regards
Mark
On 28
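A sketch of such a request against the extract handler; the core name, file, and literal fields here are illustrative, and any literal.* field must exist in the schema or match a dynamicField:

```shell
curl "http://localhost:8983/solr/mycore/update/extract?literal.id=doc1&literal.source=old+system&commit=true" \
  -F "myfile=@report.pdf"
```

Fields extracted by Tika and fields passed as literal.* parameters end up in the same document, so this is one curl operation.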
That approach works, although as suspected the schema has to recognise the
additional facet (stuff in this case):
{"responseHeader":{"status":400,"QTime":1},"error":{"msg":"ERROR:
[doc=6252671B765A1748992DF1A6403BDF81A4A15E00] unknown field
'stuff'","code":400}}
..getting closer..
On 28 January 2015 at
Well, the schema does need to know what type your field is. If you
can't add it to the schema, use dynamicFields with prefixes/suffixes or
a dynamic schema (less recommended).
Regards,
Alex.
Sign up for my Solr resources newsletter at http://www.solr-start.com/
On 28 January 2015 at 13:32,
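A dynamicField rule along these lines would let arbitrary literal.* style fields land without editing the schema each time; the suffix convention is illustrative:

```xml
<!-- Any field name ending in _s is accepted as a single-valued string. -->
<dynamicField name="*_s" type="string" indexed="true" stored="true"/>
```

With that in place you would pass, say, literal.stuff_s instead of literal.stuff.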
Sounds like 'literal.X' syntax from
https://cwiki.apache.org/confluence/display/solr/Uploading+Data+with+Solr+Cell+using+Apache+Tika
Can you explain your use case as different from what's already
documented? May be easier to understand.
Regards,
Alex.
Sign up for my Solr resources
Try using something larger than 2 for alternativeTermCount. 5 is probably ok
here. If that doesn't work, then post the exact query you are using and the
full extended spellcheck results.
James Dyer
Ingram Content Group
-Original Message-
From: fabio.bozzo [mailto:f.bo...@3-w.it]
We are trying to avoid firing 2 queries per request. I've started to play with
a PostFilter to see how it goes; perhaps something along the lines of the
ReRankQParserPlugin could be used to avoid using two queries and instead
rerank the results?
- Original Message -
From: Ahmet Arslan
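For reference, the rerank parser (ReRankQParserPlugin, available from Solr 4.9) is driven entirely by request parameters, so it reranks the top of a single query's result set rather than firing two queries. Parameter values below are illustrative:

```
q=cheap laptop
rq={!rerank reRankQuery=$rqq reRankDocs=1000 reRankWeight=3}
rqq=brand:premium^10
```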
It seems that a solution has been found.
PostingsHighlighter uses by default Java's SENTENCE BreakIterator so it
breaks the snippets into fragments per sentence.
In my text_en analysis chain, though, I was using a filter that lowercases
input, and this seems to mess with the logic of the SENTENCE
Is it possible to use curl to upload a document (for extract indexing)
and specify some fields on the fly?
Sort of:
1) index this document
2) by the way, here are some important facets while you're at it
Regards
Mark
Hi,
Thank you Dan Davis and Alexandre Rafalovitch. This is very helpful for me.
Regards
Olivier
2015-01-27 0:51 GMT+01:00 Alexandre Rafalovitch arafa...@gmail.com:
You've got a lot of options depending on what you want. But since you
seem to just want _an_ example, you can use mine from
Vijay:
Thanks for reporting this back! Could I ask you to post a new patch with
your correction? Please use the same patch name
(SOLR-5850.patch), and include a note about what you found (I've already
added a comment).
Thanks!
Erick
On Wed, Jan 28, 2015 at 9:18 AM, Vijay Sekhri
Thanks Alexandre,
I figured it out with this example,
https://wiki.apache.org/solr/ExtractingRequestHandler
whereby you can add additional fields at upload/extract time
curl
Using Solr 4.6.0 on linux with Java 6 (Oracle JRockit 1.6.0_75
R28.3.2-14-160877-1.6.0_75)
We are seeing these issues when doing a restart on a SolrCloud
configuration. After restarting each server in sequence, none of them
will come up. The servers start up after a long time but the cloud
status
When the leader reaches 99% physical memory on the box and starts swapping
(and stops replicating), we forcefully bring down the leader (first kill -15,
then kill -9 if kill -15 doesn't work). This is when we look to the replica to
assume the leader's role, and it never happens.
Zookeeper timeout is