Thank you Erick,
Fortunately I can modify the data feeding process to start my post-indexing
tasks.
2014-06-30 22:13 GMT+02:00 Erick Erickson erickerick...@gmail.com:
The paradigm is different. In SolrCloud when a client sends an indexing
request to any node in the system, when the
done. There is a bug, remove something is ok.
--
View this message in context:
http://lucene.472066.n3.nabble.com/why-full-import-not-work-well-tp4142193p4144932.html
Sent from the Solr - User mailing list archive at Nabble.com.
I want to run some query benchmarks, so I want to disable all types of caches
in Solr. I commented out filterCache, queryResultCache and documentCache in
solrconfig.xml. I don't care about the Result Window Size because numDocs is 10 in
all cases.
Are there any other hidden caches which I should
Have you also disabled the queries used to initialize searchers after commit?
Personal website: http://www.outerthoughts.com/
Current project: http://www.solr-start.com/ - Accelerating your Solr proficiency
On Tue, Jul 1, 2014 at 3:53 PM, vidit.asthana vidit.astha...@gmail.com wrote:
I want to
Yes, I have also commented out the newSearcher and firstSearcher queries in
solrconfig.xml
--
View this message in context:
http://lucene.472066.n3.nabble.com/Disable-all-caches-in-Solr-tp4144933p4144935.html
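A minimal sketch of what "disabling everything" might look like in solrconfig.xml (the cache sizes and listener bodies are illustrative, not taken from the thread):

```xml
<query>
  <!-- caches commented out for benchmarking -->
  <!--
  <filterCache class="solr.FastLRUCache" size="512" initialSize="512" autowarmCount="0"/>
  <queryResultCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="0"/>
  <documentCache class="solr.LRUCache" size="512" initialSize="512"/>
  -->

  <!-- warming queries disabled as well -->
  <!--
  <listener event="newSearcher" class="solr.QuerySenderListener"> ... </listener>
  <listener event="firstSearcher" class="solr.QuerySenderListener"> ... </listener>
  -->
</query>
```

Even with all of this commented out, the OS disk cache (which Toke raises later in the thread) still affects timings.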
Hello,
here is my configuration, which doesn't work:
schema:
<field name="AllChamp" type="text_general" multiValued="true"
indexed="true"
required="false" stored="false"/>
<dynamicField name="*_en" type="text_en" indexed="true" stored="true"
required="false" multiValued="true"/>
<dynamicField name="*_fr" type="text_fr"
I believe your question was already answered.
If you want to have text parsed/analyzed in different ways, you need
to have it in separate fields with separate analyzer stacks. Then
use disMax/eDisMax to search across those fields.
copyField copies the original content and therefore when you search
the
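As a hedged sketch of that advice (the handler name and qf field list below are hypothetical, though the ContenuDocument_* names come from later in the thread), per-language fields plus an eDisMax handler could look like:

```xml
<!-- schema.xml: one dynamic field per language, each with its own analyzer -->
<dynamicField name="*_en" type="text_en" indexed="true" stored="true" multiValued="true"/>
<dynamicField name="*_fr" type="text_fr" indexed="true" stored="true" multiValued="true"/>

<!-- solrconfig.xml: search across the language variants with eDisMax -->
<requestHandler name="/search" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">edismax</str>
    <str name="qf">ContenuDocument_en ContenuDocument_fr ContenuDocument_ar</str>
  </lst>
</requestHandler>
```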
Hello,
I remember reading somewhere that the id field (uniqueKey) must be a String.
But I cannot find definitive confirmation, just that it should be
non-analyzed.
Can I use a single-valued TrieLongField type, with precisionStep set to 0?
Or am I going to hit issues?
Regards,
Alex.
Personal
Hello,
I have 300 fields which are copied into AllChamp.
If I split them into separate fields, I would need to create 300 * (number of
languages), which is not practical for me.
Is there any other solution?
Best regards,
Anass BENJELLOUN
2014-07-01 11:28 GMT+02:00 Alexandre Rafalovitch [via Lucene]
But aren't you already creating those 300 fields anyway:
<dynamicField name="*_fr" type="text_fr" indexed="true" stored="true"
required="false" multiValued="true"/>
If you mean you have issues specifying them in eDisMax, I believe the 'qf'
parameter allows you to specify a wildcard.
Alternatively, you can look at the
I have documents in Arabic, English and French (ar, en, fr).
I need to index them while keeping the analyzers and filters for each language.
Here are all the fields in the schema, to help you understand my problem:
fields:
<field name="IdDocument" type="string" multiValued="false" indexed="true"
required="true" stored="true"/>
<field name="NomDocument" type="string"
On Tue, 2014-07-01 at 10:53 +0200, vidit.asthana wrote:
Are there any other hidden caches which I should know about before running
my tests?
Clear the disk cache?
- Toke Eskildsen, State and University Library, Denmark
No, you definitely can have an int or long uniqueKey. A lot of Solr's tests
use such a uniqueKey. See
solr/core/src/test-files/solr/collection1/conf/schema.xml
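A minimal sketch of a non-string uniqueKey along the lines Erick describes (the type definition follows standard Trie conventions in Solr 4.x; treat it as illustrative, not copied from the test schema):

```xml
<fieldType name="long" class="solr.TrieLongField" precisionStep="0" positionIncrementGap="0"/>

<field name="id" type="long" indexed="true" stored="true" required="true" multiValued="false"/>

<uniqueKey>id</uniqueKey>
```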
On Tue, Jul 1, 2014 at 3:20 PM, Alexandre Rafalovitch arafa...@gmail.com
wrote:
Hello,
I remember reading somewhere that id field
I have to import my 5 million records from SQL Server into one Solr
index. I am getting the exception below after importing 1 million records. Is
there any configuration or other way to import from SQL Server into Solr?
Below is the exception I am getting in Solr:
Hi,
I have a Solr index on a network path and I want to share this
index (without replication) across more than one Solr instance.
Thanks,
--
View this message in context:
http://lucene.472066.n3.nabble.com/Sharing-single-indexer-for-2-different-solr-instance-tp4144954.html
Any suggestion would be appreciated.
Regards.
On Mon, Jun 30, 2014 at 2:49 PM, Ali Nazemian alinazem...@gmail.com wrote:
Hi,
I used Solr 4.8 for indexing web pages that come from Nutch. I know
that Solr's deduplication works on the uniqueKey field, so I set that
to the URL field.
and I use dynamicFields for NomDocument, ContenuDocument, Postit,
for example: ContenuDocument_fr, ContenuDocument_en, ContenuDocument_ar
<processor
class="org.apache.solr.update.processor.LangDetectLanguageIdentifierUpdateProcessorFactory">
<lst name="defaults">
<str
Hi
Just following up on my previous post about a memory leak when RELOADing cores:
I narrowed it down to the SuggestComponent, specifically '<searchComponent
name="suggest" class="solr.SuggestComponent">...</searchComponent>' in
solrconfig.xml. Comment that out and the leak goes away.
The leak occurs
I will be out of the office starting 01/07/2014 and will not return until
02/07/2014
Please email itsta...@actionimages.com for any urgent queries.
Note: This is an automated response to your message Strategy for removing
an active shard from zookeeper sent on 7/1/2014 0:45:59.
This is the
Alex, maybe you're thinking of constraints put on shard keys?
Michael Della Bitta
Applications Developer
o: +1 646 532 3062
appinions inc.
“The Science of Influence Marketing”
18 East 41st Street
New York, NY 10017
t: @appinions https://twitter.com/Appinions | g+:
plus.google.com/appinions
Hi,
Can anyone please let me know how to integrate
http://code.google.com/p/language-detection/ in Solr 3.6.1. I want several
languages (English, Chinese Simplified, Chinese Traditional, Japanese, and
Korean) to be added in one schema, i.e. multilingual search from a single schema
file.
I tried
Thanks. This looks interesting...
-Michael
-Original Message-
From: Allison, Timothy B. [mailto:talli...@mitre.org]
Sent: Monday, June 30, 2014 8:15 AM
To: solr-user@lucene.apache.org
Subject: RE: Multiterm analysis in complexphrase query
Ahmet, please correct me if I'm wrong, but the
In LUCENE-5472, Lucene was changed to throw an error if a term is too long,
rather than just logging a message. I have fields with terms that are too long,
but I don't care - I just want to ignore them and move on.
The recommended solution in the docs is to use LengthFilterFactory, but this
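The LengthFilterFactory approach could be sketched like this (the field type name and max value are placeholders, not from the post):

```xml
<fieldType name="text_keyword_capped" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <!-- silently drop tokens outside the length range instead of erroring -->
    <filter class="solr.LengthFilterFactory" min="1" max="10000"/>
  </analyzer>
</fieldType>
```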
Well, it's implemented in SignatureUpdateProcessorFactory. Worst case,
you can clone that code and add your preserve-field functionality.
Could even be a nice contribution.
Regards,
Alex.
Personal website: http://www.outerthoughts.com/
Current project: http://www.solr-start.com/ -
I wasn't thinking of shard keys, but may have been confused in the reading.
Thank you everyone, the long key is working just fine for me.
Regards,
Alex.
Personal website: http://www.outerthoughts.com/
Current project: http://www.solr-start.com/ - Accelerating your Solr proficiency
On Tue,
Should be fine. Things to watch:
1> solrconfig.xml has to have the HdfsDirectoryFactory enabled.
2> You probably want to configure ZooKeeper stand-alone;
although it's possible to run embedded ZK, it's just awkward,
since you can't really bounce Solr nodes running embedded
ZK at
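The HdfsDirectoryFactory setting might look like the following in solrconfig.xml (the HDFS host, port, and paths here are hypothetical):

```xml
<directoryFactory name="DirectoryFactory" class="solr.HdfsDirectoryFactory">
  <str name="solr.hdfs.home">hdfs://namenode:8020/solr</str>
  <str name="solr.hdfs.confdir">/etc/hadoop/conf</str>
</directoryFactory>
```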
In addition, KeywordTokenizer can seemingly be used, but it should be avoided
for the unique key field. One of my customers used it and got an OOM
during long-term indexing. As it was difficult to find the problem,
I'd like to share my experience.
Koji
--
OK, back up a bit and consider alternative indexing schemes. For instance,
do you really need all those fields? Could you get away with one field
where you indexed the field _name_ + associated value? (you'd have
to be very careful with your analysis chain, but...) Something like:
C67_val_value1
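That scheme could be sketched as a single catch-all field (the field name is hypothetical; C67_val_value1 is the token format from the post):

```xml
<!-- schema.xml: one multiValued field holding fieldname+value tokens -->
<field name="all_vals" type="string" indexed="true" stored="false" multiValued="true"/>
```

Documents would then index combined tokens such as C67_val_value1 into all_vals, and queries would match the combined token exactly.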
non-String uniqueKey fields have historically popped out
in weird places. I think at one point, for instance,
QueryElevationComponent barfed on non-string types.
So, there may still be edge cases in which this can be a problem.
IMO, they're all bugs though.
Erick
On Tue, Jul 1, 2014 at 7:43 AM,
I defined id as a string in schema.xml and copied the CSV into the example docs
folder. I used the command below to post the data: java
-Dtype=application/csv -jar post.jar import.csv
It's throwing the error below. Please help in this regard.
ERROR - 2014-07-01 19:57:43.902;
Hello Erick,
Unfortunately I can't modify the schema; my team and I analyzed the
problem carefully, so all the fields you are seeing are required in the schema.
Now I just tried using different fields; maybe it could work if I knew the
edismax syntax:
<field name="AllChamp_ar" type="text_ar" multiValued="true"
Ok, firstly: saying "you need to fix your problem but you can't modify the
schema" doesn't really help. If the schema is set up badly, then no amount
of help at search time will ever get you the results you want...
Secondly, from what I can see in the schema, there is no AllChamp_fr,
AllChamp_en,
Hello,
for Cx_val, there are some fields which are multivalued :)
for AllChamp_fr, AllChamp_en, ..., I just added them to the schema to test
whether edismax works.
2014-07-01 17:13 GMT+02:00 Daniel Collins [via Lucene]
ml-node+s472066n4145024...@n3.nabble.com:
Ok, firstly to say you need to fix
You can try giving some more memory to Solr
On Jul 1, 2014 4:41 PM, mskeerthi mskeer...@gmail.com wrote:
I have to download my 5 million records from sqlserver to solr into one
index. I am getting below exception after downloading 1 Million records. Is
there any configuration or another to
If there's enough interest, I might get back into the code and throw a
standalone src (and jar) of the SpanQueryParser and the Solr wrapper onto
github. That would make it more widely available until there's a chance to
integrate it into Lucene/Solr. If you'd be interested in this, let me
We faced similar problems on our side. We found it more reliable to have a
mechanism to extract all data from the database into a flat file, and then
use a Java program to bulk-index into Solr from the file via the SolrJ API.
Let's say I create a Solr collection with multiple shards (say 2 shards) and
set the value of router.field to a field called CompanyName. Now, we
all know that during indexing Solr would compute a hash on the value indexed
into CompanyName and route to the appropriate shard.
Let's say I index a
: I mentioned id as string in schema.xml and i copied the csv into example docs
: folder. I used the below commaand to download the data Java
: -Dtype=application/csv -jar post.jar import.csv
:
: it's throwing the below error.Please help in this regard.
:
: ERROR - 2014-07-01 19:57:43.902;
: I want to run some query benchmarks, so I want to disable all type of caches
Just to be clear: disabling all internal caching because you want to run a
benchmark means you're probably going to wind up running a useless
benchmark.
Solr's internal caching is a key component of its performance
Can anyone explain the difference between these two queries?
text:(+happy) AND -user:(123456789) = numFound 2912224
But
text:(+happy) AND user:(-123456789) = numFound 0
Now, you may just say then just put - in front of your field, duh! Well,
text:(+happy) = numFound 2912224
Hi,
I'm trying to use the MLT request handler in a SolrCloud cluster.
Apparently, it's showing some weird behavior: it only returns results
intermittently for the same query. I'm using the SolrJ
client, which in turn communicates with the cluster using a ZooKeeper ensemble.
Here's
Yeah, there's a known bug that a negative-only query within parentheses
doesn't match properly - you need to add a non-negative term, such as *:*.
For example:
text:(+happy) AND user:(*:* -123456789)
-- Jack Krupansky
-Original Message-
From: Brett Hoerner
Sent: Tuesday, July 1,
Interesting, is there a performance impact to sending the *:*?
On Tue, Jul 1, 2014 at 2:53 PM, Jack Krupansky j...@basetechnology.com
wrote:
Yeah, there's a known bug that a negative-only query within parentheses
doesn't match properly - you need to add a non-negative term, such as
*:*. For
Also, does anyone have the Solr or Lucene bug # for this?
On Tue, Jul 1, 2014 at 3:06 PM, Brett Hoerner br...@bretthoerner.com
wrote:
Interesting, is there a performance impact to sending the *:*?
On Tue, Jul 1, 2014 at 2:53 PM, Jack Krupansky j...@basetechnology.com
wrote:
Yeah, there's
No, that's what Solr would do if the bug were fixed. Matching all documents
(*:*) is a constant score query, so it takes no significant amount of
resources. Personally, I consider this a bug in Lucene, but try convincing
them of that!
The issue was filed as:
SOLR-3744 - Solr LuceneQParser
I need to index documents from a csv file that will have 1000s of rows and
100+ columns. To help the user loading the file I must return useful errors
when indexing fails (schema violations). I'm using SolrJ to read the files
line by line, build the document, and index/commit. This approach allows
I think what you want is what’s described in
https://issues.apache.org/jira/browse/SOLR-445 This has not been committed
because it still doesn’t work with SolrCloud. Hoss gave me the hint to look
at DistributingUpdateProcessorFactory to solve the problem described in the
last comments, but I
Thank you. That's a useful link. Maybe not quite what I'm looking for, as it
appears to do with bulk loads of docs - returning an error for each bad doc.
My question is more about getting all the errors for a single doc. I'm
probably taking a performance hit by adding docs one at a time. I haven't
My vague recollection is that at least at one time there was a limitation
somewhere in SolrCloud, but whether that is still true, I don't know.
-- Jack Krupansky
-Original Message-
From: Alexandre Rafalovitch
Sent: Tuesday, July 1, 2014 9:48 AM
To: solr-user@lucene.apache.org
You could develop an update processor to skip or trim long terms as you see
fit. You can even code a script in JavaScript using the stateless script
update processor.
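A hedged sketch of wiring that up in solrconfig.xml (the chain name and script filename are hypothetical; StatelessScriptUpdateProcessorFactory is the stock class):

```xml
<updateRequestProcessorChain name="trim-long-terms">
  <processor class="solr.StatelessScriptUpdateProcessorFactory">
    <str name="script">trim-long-terms.js</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
```

The referenced JavaScript file would implement a processAdd function that drops or truncates any field value over the chosen limit before the document reaches the index.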
Can you tell us more about the nature of your data? I mean, sometimes
analyzer filters strip or fold accented characters
You would end up with duplicate docs on the two shards.
Solr does its doc-id lookup on the shard the document routes to, not on
the other shards. Routing takes place before this step,
so you're going to end up with two docs.
Best,
Erick
On Tue, Jul 1, 2014 at 9:42 AM, IJ jay...@gmail.com wrote:
Lets say I create a Solr
In this particular case, the fields are just using KeywordTokenizerFactory. I
have other fields that are tokenized, but they use tokenizers with a short
maxTokenLength.
I'm not even all that concerned about my own data, but more curious if there's
a general solution to this problem. I imagine
In trying to determine some subtle scoring differences (causing
occasionally significant ordering differences) among search results, I
wrote a parser to normalize debug.explain.structured JSON output.
It appears that every score that is different comes down to a difference in
fieldNorm, where the
Why is there no comma (,) between text and language in
<str name="mlt.qf">title,textlanguage,caaskey</str>?
On Wed, Jul 2, 2014 at 12:42 AM, Shamik Bandopadhyay sham...@gmail.com
wrote:
Hi,
I'm trying to use mlt request handler in a Solrcloud cluster.
Apparently, its showing some weird behavior.
Any help here
With Regards
Aman Tandon
On Mon, Jun 30, 2014 at 11:00 PM, Aman Tandon amantandon...@gmail.com
wrote:
Hi Alex,
I was trying to learn from these tutorials:
http://www.slideshare.net/teofili/natural-language-search-in-solr
https://wiki.apache.org/solr/OpenNLP: this one is
Not from me, no. I don't have any real examples for this ready. I
suspect the path beyond the basics is VERY dependent on your data and
your business requirements.
I would start by thinking about how YOU (as a human) would do that match.
Where does the 'blue' and 'color' and 'college' and 'bags' come
Hi,
When I am shutting down Solr, I am getting a memory-leak error in the logs.
Jul 02, 2014 10:49:10 AM org.apache.catalina.loader.WebappClassLoader
checkThreadLocalMapForLeaks
SEVERE: The web application [/solr] created a ThreadLocal with key of type
Hi Alex,
Thanks Alex. One more thing I want to ask: do we need to add
extra fields for those entities, e.g. item (bags), color (blue), etc.?
If somehow I manage to implement this NLP, I will definitely publish
it on my blog :)
With Regards
Aman Tandon
On Wed, Jul 2, 2014 at
Sorry, that's a typo from when I copied the MLT definition from my solrconfig;
there is a comma in my test environment. It's not the issue.
--
View this message in context:
http://lucene.472066.n3.nabble.com/MLT-weird-behaviour-in-Solrcloud-tp4145066p4145145.html