Hi guys!
Thanks a lot for your suggestions and help - I really appreciate that!
As we need e.g. the price for sorting, I think it must be in the index?
Thus, I'm not sure that a key-value store is the thing we are looking for, as we
need a good search engine.
Currently we are using several
On Thu, Mar 18, 2010 at 8:45 AM, Moritz Maedler m...@moritz-maedler.de wrote:
Hi,
I am doing a really simple query on my index (it's running in tomcat):
http://host:8080/solr_er_07_09/select/?q=hash_id:123456
I am getting the following exception:
HTTP Status 500 - null java.lang.IllegalArgumentException at
java.nio.Buffer.limit(Buffer.java:249) at
I'm using StreamingUpdateSolrServer to index a document.
StreamingUpdateSolrServer server = new
    StreamingUpdateSolrServer("http://localhost:8983/solr/core0", 20, 4);
server.setRequestWriter(new BinaryRequestWriter());
SolrInputDocument doc = new SolrInputDocument();
doc.addField("id", 12121212);
The StreamingUpdateSolrServer does not support binary format,
unfortunately.
Erik
On Mar 18, 2010, at 8:15 AM, Tim Terlegård wrote:
I'm using StreamingUpdateSolrServer to index a document.
StreamingUpdateSolrServer server = new
I'm starting to play with Solr. I am looking at the API and see that there
is an addFacetField on the SolrQuery Object that is required to specify
which facet fields you want returned. Is there any way to specify that we
want all facet fields without explicitly having to add them all via
Hey Dominique,
See
http://www.lucidimagination.com/search/document/5ea8054ed8348e6f/highlight_arbitrary_text#3799814845ebf002
Although it might not be a good solution for huge texts or wildcard/phrase queries.
http://issues.apache.org/jira/browse/SOLR-1397
On Mon, Mar 15, 2010 at 4:09 PM, dbejean
I don't think there's a way to do what has come to my mind but want to be
sure.
Let's say I have a doc with 2 fields, one of which is multiValued:
doc1:
name: john
year: 2009; 2010; 2011
And I query for:
q=john&fq=-year:2010
Doc1 won't be in the matching results. Is there a way to make it appear
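The behaviour described above can be sketched in a toy Java model (this is not Solr code, just the set logic a negative filter query applies; the field values are the ones from the example):

```java
import java.util.Set;

// Toy model of why doc1 is filtered out: fq=-year:2010 excludes any
// document whose multivalued "year" field CONTAINS 2010, even if its
// other values (2009, 2011) would otherwise match.
public class NegativeFilterDemo {
    // the doc survives the filter only if NO value equals the excluded one
    static boolean passesFilter(Set<Integer> years, int excluded) {
        return !years.contains(excluded);
    }

    public static void main(String[] args) {
        Set<Integer> doc1Years = Set.of(2009, 2010, 2011);
        System.out.println(passesFilter(doc1Years, 2010)); // false: doc1 excluded
    }
}
```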
You could also do the xpath processing on the oracle end using the extract or
extractValue functions. Here's a good reference:
http://www.psoug.org/reference/xml_functions.html
-Original Message-
From: Neil Chaudhuri [mailto:nchaudh...@potomacfusion.com]
Sent: Wednesday, March 17,
It would be nice if the documentation mentioned this. :)
/Tim
2010/3/18 Erik Hatcher erik.hatc...@gmail.com:
The StreamingUpdateSolrServer does not support binary format, unfortunately.
Erik
On Mar 18, 2010, at 8:15 AM, Tim Terlegård wrote:
I'm using StreamingUpdateSolrServer to
Hello.
I'm looking for a synonym.txt and a spellcheck.txt.
Where can I find them on the laaarge internet?
Or how do you fill these two files with good entries?
--
View this message in context:
http://old.nabble.com/where-can-i-get-an-synonym.txt-and-spellcheck.txt---tp27946812p27946812.html
Sent from the Solr - User mailing list archive at Nabble.com.
No, there isn't. How would one know what all the facet fields are,
though?
One trick, use the luke request handler to get the list of fields,
then use that list to construct the facet fields request parameters.
Erik
On Mar 18, 2010, at 8:40 AM, homerlex wrote:
I'm starting to
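Erik's two-step trick could be sketched like this (plain Java; the field names are hard-coded where a real client would parse them out of the Luke request handler's response):

```java
import java.util.List;
import java.util.stream.Collectors;

// Sketch: once /admin/luke has given you the list of field names,
// turn that list into the facet parameters for the follow-up query.
public class FacetParamBuilder {
    // Builds the query-string fragment: facet=true&facet.field=a&facet.field=b...
    static String buildFacetParams(List<String> fieldNames) {
        return "facet=true&"
            + fieldNames.stream()
                        .map(f -> "facet.field=" + f)
                        .collect(Collectors.joining("&"));
    }

    public static void main(String[] args) {
        // hypothetical field names, standing in for the Luke response
        List<String> fields = List.of("category", "brand", "price_range");
        System.out.println(buildFacetParams(fields));
    }
}
```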
Hi,
Just needed some help to understand the following synonym mapping:
1. aaa =>
Does it mean:
if the user queries for aaa, it is replaced with the mapped terms and only documents
matching those are searched for?
Or does it mean:
if the user queries for aaa, documents with aaa as well
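The two behaviours the question distinguishes can be mimicked in a toy sketch (this is not Solr's actual SynonymFilter; the terms and rules are invented for illustration):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Toy illustration of the two synonym behaviours: an explicit mapping
// line "aaa => bbb" REPLACES the token, while an equivalence line
// "aaa, bbb" (with expand=true) keeps the original AND adds the synonym.
public class SynonymDemo {
    // "aaa => bbb": the original token is dropped in favour of the mapping
    static List<String> replaceMapping(String token, Map<String, String> rules) {
        return List.of(rules.getOrDefault(token, token));
    }

    // "aaa, bbb" with expand=true: the original token is kept
    static List<String> expandMapping(String token, Map<String, String> rules) {
        List<String> out = new ArrayList<>();
        out.add(token);
        if (rules.containsKey(token)) out.add(rules.get(token));
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> rules = Map.of("aaa", "bbb");
        System.out.println(replaceMapping("aaa", rules)); // [bbb]
        System.out.println(expandMapping("aaa", rules));  // [aaa, bbb]
    }
}
```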
Hi,
Thanks for the mail. I had tried the wiki.
My remaining doubts were mainly:
1.
If we have synonyms specified and they replace our search keyword with the
ones specified, wouldn't we face the risk of our original keyword being missed
out? What I meant is: if I have a keyword for search, say
Does anyone have any recommendations on which OS to use when setting up Solr
search server?
Any memory/disk space recommendations?
Thanks
--
View this message in context:
http://old.nabble.com/Recommended-OS-tp27948306p27948306.html
Sent from the Solr - User mailing list archive at Nabble.com.
Most sites allow you to search for some text, and then click on Facets (or
Tags or Taxonomy branches) to drill down into your search.
Most sites also show the search box in these search results, with the text
previously entered, so that you can edit it and resubmit. Perhaps you want
to add a
http://wiki.apache.org/solr/FAQ#What_are_the_Requirements_for_running_a_Solr_server.3F
I have Solr running on CentOS 5.4. It runs fine on the OpenJDK 1.6.0
and Tomcat 5. If I were to do it again, I'd probably just stick with
Jetty.
You really will need to read the docs to get the settings right
aha, okay, thx.
And how do you get your spellcheck words from your product names?
We sometimes have very looong names. How is it possible to use the
spellchecker or autosuggestion function in the right way?
Erick Erickson wrote:
You probably won't find a good synonyms file. The problem
On 2010-03-18, at 1:03 PM, K Wong wrote:
http://wiki.apache.org/solr/FAQ#What_are_the_Requirements_for_running_a_Solr_server.3F
I have Solr running on CentOS 5.4. It runs fine on the OpenJDK 1.6.0
and Tomcat 5. If I were to do it again, I'd probably just stick with
Jetty.
Would you mind
Beat me to the punch with that question.
K Wong, did you happen to install the Apache APR? Wondering if it is even
worth the trouble.
I am thinking about going with RedHat Enterprise 5 unless anyone has any
objections?
Jean-Sebastien Vachon wrote:
On 2010-03-18, at 1:03 PM, K Wong wrote:
We're running Solr to provide search services to a Drupal 6
installation. The site is very low traffic (35 uniques a day) and
search doesn't get used very often. I was thinking that I could get
away with running it on the Jetty that comes with Solr. It's just one
less thing that has to be looked
: It seems that Solr's query parser doesn't pass a single term query
no ... the query parser always uses the analyzer for text regardless of
whether it's a single term or not (it doesn't even know if it's a single
term until the Analyzer tells it)
cases where the analyzer isn't used are things
:
: Thank you, Marco. I see the debug out put that looks like:
: <str name="rawquerystring">title_jpn:2001年</str>
: <str name="querystring">title_jpn:2001年</str>
: <str name="parsedquery">PhraseQuery(title_jpn:"2001 年")</str>
: <str name="parsedquery_toString">title_jpn:"2001 年"</str>
...
: Does this mean the
1) Took care of the first one by a Transformer.
2) Any input on 2, please? I need to store # of views and popularity with
each document, and that can change pretty often. Is it recommended to use a
database, or can this be updated in Solr directly? My issue with a DB is that
with every Solr search hit, will
You'll probably want to influence your relevancy on this popularity number that
is changing often. ExternalFileField looks like a possibility though I haven't
used it. Another would be using an in-memory cache which stores all popularity
numbers for any data that has its popularity updated
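A minimal sketch of the in-memory cache suggested above: popularity counts live in a concurrent map keyed by document id, updated on every view and read at query time to influence ranking. The class and method names here are invented for the example:

```java
import java.util.concurrent.ConcurrentHashMap;

// Sketch only: an in-process view counter. In a real deployment you'd
// still periodically flush these numbers somewhere durable and feed
// them into relevancy (e.g. via ExternalFileField, as mentioned above).
public class PopularityCache {
    private final ConcurrentHashMap<String, Long> views = new ConcurrentHashMap<>();

    // record one more view for a document; safe under concurrent updates
    public long incrementViews(String docId) {
        return views.merge(docId, 1L, Long::sum);
    }

    // read the current count (0 if never viewed)
    public long getViews(String docId) {
        return views.getOrDefault(docId, 0L);
    }

    public static void main(String[] args) {
        PopularityCache cache = new PopularityCache();
        cache.incrementViews("doc-42");
        cache.incrementViews("doc-42");
        System.out.println(cache.getViews("doc-42")); // 2
    }
}
```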
David,
Much appreciated. This gives me enough to work with.
I missed one important point. Our data changes pretty frequently, which means
we may be running deltas every 5-10 minutes. In-memory should work.
thanks
David Smiley @MITRE.org wrote:
You'll probably want to influence your
Thanks for the reply. Can someone point me to a sample on how to use the
luke request handler to get this info?
Erik Hatcher-4 wrote:
No, there isn't. How would one know what all the facet fields are,
though?
One trick, use the luke request handler to get the list of fields,
then
I'm trying to give a super boost to fields that match exactly, but it
doesn't appear to be working. I have this:
<field name="artist_tight" type="string_lower" indexed="true" stored="true"/>
<field name="title_tight" type="string_lower" indexed="true" stored="true"/>
<copyField source="title" dest="title_tight"/>
On Mar 18, 2010, at 2:44 PM, caman wrote:
1) Took care of the first one by Transformer.
This is often also something done by a classifier that is trained to deal with
all the statistical variations in your text. Tools like Weka, Mahout, OpenNLP,
etc. can be applied here.
2) Any input on
Tried following their tutorial for plugging zoie into solr:
http://snaprojects.jira.com/wiki/display/ZOIE/Zoie+Server
It appears it only allows you to search on documents after you do a commit?
Am I missing something here, or does the plugin not do anything?
Their tutorial tells you to do a
Can the trim filter factory work on string fieldtypes?
When I define a trim filter factory on a string fieldtype, I get an
exception:
org.apache.solr.common.SolrException: Unknown fieldtype 'string'
specified on field id
at
Can anyone tell me where I can buy or download a free spelling dictionary for
Solr? I don't need a simple dictionary - I need a very good American-English
spelling dictionary (or American only)!
--
View this message in context:
http://old.nabble.com/good-spell-dictionary-tp27950854p27950854.html
Sent from the Solr - User mailing list archive at Nabble.com.
Below is my data-config.xml file, which I am using to build an index for
my first shard. I have a couple of questions.
Can Solr include the hostname (short version) it's running on in the
query? Alternatively, is there a way to override the query with a URL
parameter before or when doing
On 18.03.2010, at 23:12, Shawn Heisey wrote:
Below is my data-config.xml file, which I am using to build an index for my
first shard. I have a couple of questions.
Can Solr include the hostname (short version) it's running on in the query?
Alternatively, is there a way to override the
: Now, I know how to work-around this, by appending some unique character
: sequence at each end of the field and then include this in my search in
: the front end. However, I wonder if any of you have been planning a
: patch to add a native boundary match feature to Solr that would
:
: Is there a way to get *total count of facets* per field?
sorry, no. you can skip ahead, but the only way to know when you're done
is when you stop getting constraints back for that field.
-Hoss
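Hoss's "keep going until you stop getting constraints back" approach could be sketched like this (the fetchPage function is a stand-in for a real Solr request with facet.offset/facet.limit):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: with no total facet count available, the only way to know
// you're done is to page with facet.offset/facet.limit until a page
// comes back empty.
public class FacetPager {
    interface PageSource {
        // stand-in for one Solr facet request at a given offset/limit
        List<String> fetchPage(int offset, int limit);
    }

    static List<String> fetchAll(PageSource source, int limit) {
        List<String> all = new ArrayList<>();
        int offset = 0;
        while (true) {
            List<String> page = source.fetchPage(offset, limit);
            if (page.isEmpty()) break;   // done: no more constraints
            all.addAll(page);
            offset += limit;
        }
        return all;
    }

    public static void main(String[] args) {
        List<String> data = List.of("a", "b", "c", "d", "e");
        PageSource src = (off, lim) ->
            data.subList(Math.min(off, data.size()),
                         Math.min(off + lim, data.size()));
        System.out.println(fetchAll(src, 2)); // [a, b, c, d, e]
    }
}
```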
: Been testing nutch to crawl for solr and I was wondering if anyone had
: already worked on a system for getting the urls out of solr and generating
: an XML sitemap for Google.
it's pretty easy to just paginate through all docs in solr, so you could
do that -- but I'd be really surprised if
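Once the URLs have been paged out of the index, emitting the sitemap itself is mechanical. A sketch (the XML shape follows the standard sitemap protocol; the URLs are invented):

```java
import java.util.List;

// Sketch: turn a list of URLs (paged out of Solr with start/rows)
// into a Google-style XML sitemap.
public class SitemapBuilder {
    static String buildSitemap(List<String> urls) {
        StringBuilder sb = new StringBuilder();
        sb.append("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
        sb.append("<urlset xmlns=\"http://www.sitemaps.org/schemas/sitemap/0.9\">\n");
        for (String url : urls) {
            sb.append("  <url><loc>").append(url).append("</loc></url>\n");
        }
        sb.append("</urlset>\n");
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(buildSitemap(List.of("http://example.com/doc/1")));
    }
}
```

In practice the URLs would need XML-escaping if they can contain `&` or `<`; that's omitted here for brevity.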
: Can I build a query such as :
:
: -field: A
:
: which will return all documents that do not have exclusively A in
: their field's values. By exclusive I mean that I don't want documents
: that only have A in their list of values. In my sample case, the query
: would return doc A
: For example, in dice.com, the visitor can search by some keyword and filter
: further by Skill, Country, Province, City, Telecommute, Travel Required
: (shown on the left pane on dice.com). We were wondering if there is some
: built-in feature/functionality that can be used from Solr to
I only have time for a quick glance, but what jumps out is
that this part:
title:rude boy^100
probably isn't matching boy against your title field, it's matching rude
against title, but boy against your default field and boosting the boy
part.
Try parenthesizing (at least that works in
That looks very useful. So does this mean that this will work?
URL text:
?command=full-import&numShards=6&modValue=0&minDid=229615984
XML:
query="SELECT * FROM [table] WHERE (did %
${dataimporter.request.numShards}) = ${dataimporter.request.modValue}
AND ${dataimporter.request.minDid} <= did"
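The partitioning in that query is plain modular arithmetic: each shard imports only the rows where `did % numShards` equals its own `modValue`, so the shards cover the table disjointly. A quick sketch:

```java
// Sketch of the modulus sharding scheme from the data-config query:
// shardFor(did, numShards) is the shard a document id lands on.
public class ShardPartitioner {
    static int shardFor(long did, int numShards) {
        return (int) (did % numShards);
    }

    public static void main(String[] args) {
        int numShards = 6;
        // document ids 0..11 cycle through shards 0..5 twice,
        // so every row belongs to exactly one shard
        for (long did = 0; did < 12; did++) {
            System.out.println("did=" + did + " -> shard " + shardFor(did, numShards));
        }
    }
}
```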
I recently switched from posting a file (PDFs in this case) to the Extract
handler, to using the Stream.URL parameter. I've noticed a huge amount of
contention around opening URL connections:
http-8080-Processor36 [BLOCKED] CPU time: 0:47
sun.net.www.protocol.file.Handler.openConnection(URL)
Spellcheck is generally more useful if it's derived from words
already *in* your index. It's of little use to a user to have
spellcheck/autosuggest show terms that aren't in the
index...
HTH
Erick
On Thu, Mar 18, 2010 at 6:00 PM, michaelnazaruk michaelnaza...@gmail.com wrote:
Can anyone tell
It's also possible to try and use the Velocity contrib response writer and
paging it w/ the sitemap elements.
BTW, generating a sitemap was a big reason for our switch from GSA to Solr,
because (for some reason) the map took way too long to generate (even for simple
requests).
If you page
I gave this config idea a try, looks like it works perfectly. I thought
at first that it wasn't working, but as is usual with such things, my
XML was faulty.
Many many thanks!
Shawn
On 3/18/2010 5:19 PM, Shawn Heisey wrote:
That looks very useful. So does this mean that this will work?
Due to some issues with the (lack of) functionality behind the
abortOnConfigurationError option in solrconfig.xml, I'd like to take a
quick poll of the solr-user community...
* If you have never heard of the abortOnConfigurationError
option prior to this message, please ignore this
David - sounds kinda like this one: http://issues.apache.org/jira/browse/SOLR-1280
:)
Maybe you'd be up for rounding this issue out with your enhancements
and get this committable?
Erik
On Mar 18, 2010, at 4:06 PM, Smiley, David W. wrote:
Coincidentally I'm working on something
When I don't do the commit, I cannot search the documents I've
indexed. - that's exactly how Solr without Zoie works, and it's how
Lucene itself works. Gotta commit to see the documents indexed.
Erik
On Mar 18, 2010, at 5:41 PM, brad anderson wrote:
Tried following their
My understanding is that too many facet values will decrease performance
How many is too many? Are there any rules of thumb for this?
2 related questions:
- I expect a facet field to have many values (values are user generated); is there
anything I can do to minimize the performance impact?
- Any way
I tried adding hl.maxAnalyzedChars=-1 to my search query but it didn't
help.
Just wanted to know if there are limitations on certain search terms.
It's a bit strange that Solr is not behaving properly for certain terms
(especially when returning the excerpts in the highlighting dictionary).
The terms
Hi Giovanni,
Let's try and isolate the problem. Can you try parsing the PDF file with
tika-app as a standalone? Take your tika-app jar file then run java -jar
tika-app-0.7-SNAPSHOT.jar -m /path/to/pdf/file
That should give you something like:
Content-Type: application/pdf
created: Thu Sep 06