In http://www.lucidimagination.com/files/file/LIWP_WhatsNew_Solr1.4.pdf, under
performance, it mentions:
"Queries that don't sort by score can eliminate scoring, which speeds up
queries"
How exactly can I do that? If I don't
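A hedged sketch of what this usually means in practice (field names hypothetical):
put the constraints in fq, since filter queries are never scored, sort on a plain
field, and leave score out of fl:

  /solr/select?q=*:*&fq=category:books&sort=price+asc&fl=id,title

With no score requested and a non-score sort, Solr can skip computing relevancy entirely.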
Hi all.
I have a problem with distributed Solr search. The issue is that
I have 76M documents spread over 76 Solr instances; each instance handles
1M documents.
Previously I put all 76 instances on a single server, and when I tested I found
that each time it runs, it takes several times, most
Hello there,
I'm guessing the sites will be searched separately. In that case I'd recommend
a core for each site.
Otis
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/
- Original Message
> From: "scr...@asia.com"
>
Hi,
The difference indicates deletes. Optimize the index (which expunges docs
marked as deleted) and the difference disappears.
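For reference, a minimal way to trigger the optimize, assuming the default host and port:

  curl http://localhost:8983/solr/update -H 'Content-Type: text/xml' --data-binary '<optimize/>'

Once it completes, numDocs and maxDoc should be equal again.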
Otis
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/
- Original Message
> From: Karthik
I want to load the full text into an external cache, so I added some code
in newSearcher, where I found the warm-up takes place. I added my code
before the Solr warm-up, which is configured in solrconfig.xml like this:
...
public void newSearcher(SolrIndexSearcher newSearcher,
SolrIndexSearcher currentSearcher)
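For context, a listener like this is registered in solrconfig.xml roughly as
follows; the class name here is hypothetical:

  <listener event="newSearcher" class="com.example.FullTextCacheWarmer"/>

The class implements SolrEventListener, and Solr calls its newSearcher method
each time a new searcher is opened.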
Yeah, that happened :( , lost a lot of data because of it.
Can someone explain the terms numDocs and maxDoc? Will the difference
indicate duplicates?
Thank you,
karthik
: Subject: range faceting with integers
: References:
: In-Reply-To:
http://people.apache.org/~hossman/#threadhijack
Thread Hijacking on Mailing Lists
When starting a new discussion on a mailing list, please do not reply to
an existing message; instead, start a fresh email. Even if you change
: I have found that this search crashes:
:
: /solr/select?q=*%3A*&fq=&start=0&rows=1&fl=id
Ouch .. that exception is kind of hairy. It suggests that your index may
have been corrupted in some way. Do you have any idea what happened?
Have you tried using the CheckIndex tool to see what it says?
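For reference, CheckIndex is typically invoked like this; the jar name and paths
vary by version, so treat this as a sketch:

  java -cp lucene-core-2.9.1.jar org.apache.lucene.index.CheckIndex /path/to/solr/data/index

Adding -fix drops unreadable segments, which permanently deletes the documents
in them, so back up the index first.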
If the other suggestions don't work, you need to show us the relevant
portions of your schema.xml, and probably query output with
&debugQuery=on tacked on...
Here are some pointers for getting help...
http://wiki.apache.org/solr/UsingMailingLists
Best
Erick
2010/7/14 Jonathan Rochkind
> "the" soun
The best way to understand how things are parsed is to go to the solr admin
page (Full interface link?) and click the "debug info" box and submit your
query. That'll tell you exactly what happens.
Alternatively, you can put &debugQuery=on on your URL...
HTH
Erick
On Wed, Jul 14, 2010 at 8:48 AM,
: is it possible to use the stored terms of a field for a faceted search?
No, the only thing stored fields can be used for is document-centric
operations (i.e., once you have a small set of individual docIds, you can
access the stored fields to return to the user, or highlight, etc...)
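In other words, faceting reads indexed terms, so the field must be indexed="true";
stored="true" alone is not enough. A minimal request, with a hypothetical field name:

  /solr/select?q=*:*&rows=0&facet=true&facet.field=category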
: I mean
: My question is how do i query that?
: q=text_clean:Nike's new text_orig:"running shoes"
: seems like it would work, but not sure it's the best way.
that's a perfectly good way to do it.
: Is there a way i can tell the parser(or extend it) so that every phrase
: query it will use one field and f
Does your schema have a unique id specified? If so, is it possible that you
indexed many documents that had the same ID, thus deleting previous
documents with the same ID? That would account for it, but it's a shot in
the dark...
Best
Erick
On Tue, Jul 13, 2010 at 6:20 AM, Karthik K wrote:
: I was wondering if anyone was aware of any existing functionality where
: clients/server components could register some search criteria and be
: notified of newly committed data matching the search when it becomes
: available
you can register a "postCommit" listener in your solrconfig.xml file
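For reference, the stock example solrconfig.xml registers one like this (here
running a shell script after each commit):

  <listener event="postCommit" class="solr.RunExecutableListener">
    <str name="exe">snapshooter</str>
    <str name="dir">solr/bin</str>
    <bool name="wait">true</bool>
  </listener>

A custom class implementing SolrEventListener could be plugged in the same way
to push out notifications instead.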
I fixed the path of the queryResponseWriter class in the example
solrconfig.xml. This was successfully applied against Solr 4.0 trunk.
A few quirks:
* When I didn't specify a default delimiter, it printed out null as the
delimiter. I couldn't figure out why, because init(NamedList args)
Hi all,
I can't seem to find a way to query for an empty string that is
simpler than this:
field_name:[* to ""]
Things that don't work:
field_name:""
field_name["" TO ""]
Is the one I'm using the simplest option? If so, is there a particular
reason the other ones I mention don't work? Just cur
I thought of another way to do it, but I still have one thing I don't
know how to do. I could do the search without sorting for the 50th
page, then look at the relevancy score on the first item on that page,
then repeat the search, but add score > that relevancy as a parameter.
Is it possible to do
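One untested sketch of that last step, using the frange parser (the 0.85 cutoff
is hypothetical): filter on the score of the main query via the query() function:

  q=your+query&fq={!frange u=0.85 incu=false}query($q)

This keeps only documents whose score for $q is strictly below 0.85; use l/incl
instead to bound the score from below. The braces and spaces need URL-encoding
in a real request.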
I was hoping for a way to do this purely by configuration and making
the correct GET requests, but if there is a way to do it by creating a
custom Request Handler, I suppose I could plunge into that. Would that
yield the best results, and would that be particularly difficult?
On Wed, Jul 14, 2010
I'm trying to enable clustering in solr 1.4. I'm following these instructions:
http://wiki.apache.org/solr/ClusteringComponent
However, `ant get-libraries` fails for me. Before it tries to download
the 4 jar files, it tries to compile Lucene? Is this necessary?
Has anyone gotten clustering worki
So you want to take the top 1000 sorted by score, then sort those by another
field. It's a strange case, and I can't think of a clean way to accomplish it.
You could do it in two queries, where the first is by score and you only
request your IDs to keep it snappy, then do a second query against
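An illustrative pair of requests, with hypothetical query and field names:

  /solr/select?q=ipod&rows=1000&fl=id
  /solr/select?q=id:(3 OR 17 OR 42 ...)&rows=1000&sort=title+asc

The first returns the top 1000 ids by score (the default sort); the second
restricts to those ids and re-sorts on the other field.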
I'd like to limit the total number of documents that are returned for
a search, particularly when the sort order is not based on relevancy.
In other words, if the user searches for a very common term, they
might get tens of thousands of hits, and if they sort by "title", then
very high relevancy d
Hi,
We are planning to host, on the same server, different websites that will use Solr.
What would be best?
One core, with a field in the schema (site1, site2, etc.) that is then added to
every query?
Or one core per site?
Thanks for your help
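For what it's worth, the single-core option means tagging every document with
its site and appending a filter to every request, e.g. with a hypothetical field name:

  /solr/select?q=some+query&fq=site:site1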
Any other thoughts, Chris? I've been messing with this a bit, and can't seem
to get (?m)^.*$ to do what I want.
1) I don't care how many characters it returns; I'd like entire lines all the
time
2) I just want it to always return 3 lines: the line before, the actual line,
and the line after.
3
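For anyone following along, the regex fragmenter is driven by these parameters;
the pattern below is an untested sketch of "the line plus its neighbors", and the
fragmenter still aims for hl.fragsize, so results may vary:

  hl=true&hl.fl=body&hl.fragmenter=regex&hl.regex.pattern=[^\n]*\n[^\n]*\n[^\n]*&hl.regex.slop=0.5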
Shawn Heisey wrote:
[* TO NOW-2YEARS]^1.0
I also seem to remember seeing something about how to do "less than" in
range queries as well as the "less than or equal to" implied by the
above, but I cannot find it now.
Ranges with square brackets [] are inclusive. Ranges with curly braces {} are exclusive.
I have finally figured out how to turn this off in Thunderbird 3:
Go to Tools, Options, Display, and turn off "Display emoticons as
graphics".
On 4/12/2010 12:04 PM, Shawn Heisey wrote:
On 4/12/2010 11:55 AM, Shawn Heisey wrote:
[NOW-6MONTHS TO NOW]^5.0 ,
[NOW-1YEARS TO NOW-6MONTHS]^3.0
[NO
On Wed, Jul 14, 2010 at 12:59 PM, Blargy wrote:
>
> Nevermind. Apparently my IDE (Netbeans) was set to "No encoding"... wtf.
> Changed it to UTF-8 and recreated the file and all is good now. Thanks!
>
>
fyi I created an issue with your example here:
https://issues.apache.org/jira/browse/SOLR-2003
Re: flexibility.
This boost decays over time; the further it gets from now, the less
of a boost it receives. You are right though, it doesn't allow a fine
degree of control, particularly if you don't want to smoothly decay the
boost. I hadn't considered your suggestion, so I'll keep it in mi
One of the replies I got on a previous thread mentioned range queries,
with this example:
[NOW-6MONTHS TO NOW]^5.0 ,
[NOW-1YEARS TO NOW-6MONTHS]^3.0
[NOW-2YEARS TO NOW-1YEARS]^2.0
[* TO NOW-2YEARS]^1.0
Something like this seems more flexible, and I read into it an
implication that the perform
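For example, those ranges would typically be attached as dismax boost queries;
the query and field name here are hypothetical, the ranges are the ones quoted above:

  defType=dismax&q=ipod
    &bq=post_date:[NOW-6MONTHS TO NOW]^5.0
    &bq=post_date:[NOW-1YEARS TO NOW-6MONTHS]^3.0
    &bq=post_date:[NOW-2YEARS TO NOW-1YEARS]^2.0
    &bq=post_date:[* TO NOW-2YEARS]^1.0

Multiple bq parameters are allowed.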
Nevermind. Apparently my IDE (Netbeans) was set to "No encoding"... wtf.
Changed it to UTF-8 and recreated the file and all is good now. Thanks!
How can I tell and/or create a UTF-8 synonyms file? Do I have to instruct
solr that this file is UTF-8?
I used this before my search term and it works well:
{!boost b=recip(ms(NOW,publishdate),3.16e-11,1,1)}
It's enough that when I search for *:* the articles appear in
chronological order.
Tim
-Original Message-
From: Shawn Heisey [mailto:s...@elyograg.org]
Sent: Wednesday, July 14, 2010
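For reference, recip(x,m,a,b) computes a/(m*x+b), so this boost works out to:

  boost = 1 / (3.16e-11 * ms(NOW,publishdate) + 1)

Since 3.16e-11 is roughly 1 divided by the number of milliseconds in a year, a
brand-new document scores about 1.0, a one-year-old document about 0.5, and a
two-year-old one about 0.33.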
I've started a couple of previous threads on this topic, but I did not
have a good date field in my index to use at the time. I now have a
schema with the document's post_date in tdate format, so I would like to
actually do some implementation. Right now, we are not doing relevancy
ranking a
is your synonyms file in UTF-8 encoding?
On Wed, Jul 14, 2010 at 11:11 AM, Blargy wrote:
>
> Thanks for the reply, but that didn't help.
>
> Tomcat is accepting foreign characters, but for some reason when it reads
> the
> synonyms file and encounters the character ñ, it doesn't appear correctly
Thanks for the reply, but that didn't help.
Tomcat is accepting foreign characters, but for some reason when it reads the
synonyms file and encounters the character ñ, it doesn't appear correctly
in the Field Analysis admin. It shows up as �. If I query exactly for ñ it
will work, but the synonyms
I figured out where the problem was. The destination wildcard was actually
matching the wrong field. I changed the fieldnames around a bit and now
everything works fine. Thanks!
> -Original Message-
> From: kenf_nc [mailto:ken.fos...@realestate.com]
> Sent: Wednesday, July 14, 2
I'm updating my Solr index using a "queue" table in my database. When
records get updated, a row gets inserted into the queue table with pk,
timestamp, deleted flag, and status. DIH made it easy to use this to
identify new/updated records as well as deletes.
I need to do some post-processing how
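For reference, a DIH sketch of wiring such a queue table up with delta imports;
table and column names are hypothetical:

  <entity name="item" pk="id"
          query="SELECT * FROM item"
          deltaQuery="SELECT pk AS id FROM item_queue
                      WHERE ts &gt; '${dataimporter.last_index_time}' AND deleted = 0"
          deltaImportQuery="SELECT * FROM item WHERE id = '${dataimporter.delta.id}'"
          deletedPkQuery="SELECT pk AS id FROM item_queue
                          WHERE ts &gt; '${dataimporter.last_index_time}' AND deleted = 1"/>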
"the" sounds like it might be a stopword. Are you using stopwords in any
of your fields covered by the dismax search? But not in some of the
other fields covered by dismax? The combination of dismax and stopwords
can result in unexpected behavior if you aren't careful.
I wrote about this a bit her
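A hedged illustration of the mismatch: if one qf field's analyzer strips
stopwords, as below, and another qf field's analyzer does not, a term like "the"
yields a clause that can never match on one of the fields, which skews dismax's
mm accounting:

  <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>

The usual fix is to make stopword handling identical across all the qf fields.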
Yep, my schema does this all day long.
--
View this message in context:
http://lucene.472066.n3.nabble.com/MultiValue-dynamicField-and-copyField-tp965941p966536.html
Sent from the Solr - User mailing list archive at Nabble.com.
Sounds like you want the 'text' fieldType (or equivalent) and are using
'string' or 'lowercase'. Those must match exactly (well, case-insensitively
in the case of 'lowercase'). The TextType field types (like 'text') do
tokenization, so matches will occur under many more conditions.
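Schematically, with hypothetical field names:

  <!-- 'string' indexes the whole value as a single term: only exact matches -->
  <field name="title_exact" type="string" indexed="true" stored="true"/>
  <!-- 'text' is tokenized and filtered: individual words match -->
  <field name="title" type="text" indexed="true" stored="true"/>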
Hi Bilgin
You're right that I have the same primary key, but I'm testing with the property
"preImportDeleteQuery" on the entity tag of data-config.xml. So now it is
working: in fact it deletes only the index docs for the entity I run the
full-import for, based on the field I declare for the preImportDelete
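For anyone searching later, the attribute goes on the DIH entity roughly like
this; the names are hypothetical:

  <entity name="entity1" pk="id"
          preImportDeleteQuery="entity_name:entity1"
          query="SELECT id, name, 'entity1' AS entity_name FROM table1"/>

The value is a Solr query; only documents matching it are deleted before that
entity's import runs, instead of the default *:*.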
I have a database field = hello world, and I am indexing it into the *text* field
with the standard analyzer (text is a copy field in Solr).
Now when a user gives the query text:"hello world%", how is the query
interpreted in the background?
Are we actually searching text:hello OR text:world% (
Is it possible that you have the same IDs in both entities?
Could you show us your entity mappings here?
Bilgin Ibryam
On Wed, Jul 14, 2010 at 11:48 AM, Amdebirhan, Samson, VF-Group <
samson.amdebir...@vodafone.com> wrote:
> Hi all,
>
>
>
> Can someone help me in this ?
>
>
>
> Importing 2 differen
Hi all,
Can someone help me with this?
When importing 2 different entities one by one (specified through the entity
parameter), why does the second import delete the index previously created
for the first entity, and vice versa?
The documentation provided on the Solr website reports that:
"enti
I doubt it. A caching system is a key-value store. You have to use a
compression library to compress and decompress your data. A caching system
helps you retrieve data fast. Anyway, please take a look at each caching
system's features.
Regards
Aditya
www.findbestopensource.com
On Wed, Jul 1
> Trying to analyze PositionFilter: didn't understand why the earlier
> search for 'Nina Simone I Put' failed, since at least the phrase 'Nina
> Simone' should have matched against the title_0 field. Any clue?
Please note that I have configured the ShingleFilter as bigrams without unigrams.
[Honestly, I
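For reference, the analysis chain being described looks roughly like this; the
type name is hypothetical:

  <fieldType name="shingled" class="solr.TextField">
    <analyzer>
      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="solr.ShingleFilterFactory" maxShingleSize="2" outputUnigrams="false"/>
      <filter class="solr.PositionFilterFactory"/>
    </analyzer>
  </fieldType>

PositionFilter flattens the position increments of the shingles, which keeps the
query parser from turning several shingles into one strict phrase.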
Thank you. I don't know which cache system to use. In my application,
the cache system must support a compression algorithm with a high
compression ratio and fast decompression speed (because each time it
gets something from the cache, it must decompress it).
2010/7/14 findbestopensource :
> I have just provided yo
Hi Steve,
Thanks, wrapping with PositionFilter actually fixed the search and
scoring -- I made a mistake while re-indexing last time.
Trying to analyze PositionFilter: I didn't understand why the earlier
search for 'Nina Simone I Put' failed, since at least the phrase 'Nina
Simone' should have matched
Hi everyone,
I was wondering if the following is possible:
using copyField to copy a multiValued field into another multiValued
field.
Cheers,
Jan
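Schematically, with hypothetical field names:

  <field name="tag" type="string" indexed="true" stored="true" multiValued="true"/>
  <field name="all_tags" type="text" indexed="true" stored="false" multiValued="true"/>
  <copyField source="tag" dest="all_tags"/>

Each value of the source is appended to the destination, so the destination must
be multiValued as well.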
> I sent this command: curl http://localhost:8081/solr/update -F stream.body='
> ', but it doesn't reload.
>
> It doesn't reload automatically after every commit or
> optimize unless I add
> a new document and then commit.
Hmm. Maybe there is an easier way to force it? (add an empty/dummy doc)
But if y
I have just provided you two options. Since you already store it as part of the
index, you could try external caching. Try using Ehcache / Membase:
http://www.findbestopensource.com/tagged/distributed-caching . The caching
system will do LRU eviction and is much more efficient.
On Wed, Jul 14, 2010 at 12:39 PM
Hi Steve,
Thanks for your kind response. I checked PositionFilterFactory
(re-indexed as well), but that also didn't solve the problem. Interestingly,
the problem is not reproducible from Solr's Field Analysis page; it
manifests only when it's in a query.
I guess the subject of this post is not very c
I have already stored it in the Lucene index. But it is on disk, and when a
query comes, Lucene must seek the disk to get it. I am not familiar with
the Lucene cache. I just want to make full use of my memory: load 10GB
into memory, with an LRU strategy when the cache is full. To load more into
memory, I want to compress
I sent this command: curl http://localhost:8081/solr/update -F stream.body='
', but it doesn't reload.
It doesn't reload automatically after every commit or optimize unless I add
a new document and then commit.
Any idea?
On Tue, Jul 13, 2010 at 4:54 PM, Ahmet Arslan wrote:
> > I'm using solr 1.4 a
You have two options:
1. Store the compressed text as part of a stored field in Solr.
2. Use external caching:
http://www.findbestopensource.com/tagged/distributed-caching
You could use Ehcache / Memcached / Membase.
The problem with external caching is that you need to synchronize the deletions
and