I am trying to use the file-based and index-based spell checkers together and I get this
exception: "All checkers need to use the same StringDistance."
They work fine individually, but not together.
Any pointers?
-Manasi
Eric,
In freq:termfreq(product,'spider'), freq is an alias for the termfreq function
query, so that field appears with the name 'freq' in the document response.
This is the code I am using to get the document object, and there is no
termfreq field in its fields collection:
DocList docs =
Thanks to Sandeep in this post:
http://lucene.472066.n3.nabble.com/HTTP-Status-503-Server-is-shutting-down-td4065958.html#a4078567
I was able to set up Tomcat 6 with Solr 4.3.1.
However, I need a multicore implementation and am now stuck on how to do so.
Here is what I did based on Sandeep's
Hi,
I see strange behavior while searching my SolrCloud cluster:
for a query like this
http://localhost/solr/my_collection/select?q=my+query;
http://10.1.1.193:7006/solr-madaptive/collection_mapi/select?q=%22Sairauden+sanoma%22
Solr sometimes responds with one document and sometimes with
I have a set of documents with a whitespace-tokenized field. I want to give more
boost when the query match happens in the first 3 token positions of the field.
Is there any way to do that? (I don't want to use payloads, as they mean one
more disk seek and thus lower performance.)
You must implement a SpanFirst query yourself. These are not implemented in any
Solr query parser. You can easily extend the (e)dismax parsers and add support
for it.
-Original message-
From:Anatoli Matuskova anatoli.matusk...@gmail.com
Sent: Thursday 18th July 2013 11:54
To:
Hi all
I am using a custom RequestHandlerBase where I am querying multiple different
Solr instances and aggregating their output as an XML Document using DOM.
Now, in the RequestHandler's handleRequestBody(SolrQueryRequest req,
SolrQueryResponse resp) method, I want to output this XML Document
Thanks for the quick answer, Markus.
Could you give me a guideline, or point me to where in the Solr source code to
look to see how to get it done?
--
View this message in context:
http://lucene.472066.n3.nabble.com/boost-docs-if-token-matches-happen-in-the-first-5-words-tp4078786p4078792.html
This isn't a Solr issue. Maybe ask on the xerces list?
On Thu, Jul 18, 2013 at 3:31 PM, Vineet Mishra clearmido...@gmail.com wrote:
Hi all
I am using a Custom RequestHandlerBase where I am querying from multiple
different Solr instance and aggregating their output as a XML Document
using
Thanks for your response, Shalin.
So does that mean we can't return an XML object in the SolrQueryResponse
through a custom RequestHandler?
On Thu, Jul 18, 2013 at 4:04 PM, Shalin Shekhar Mangar
shalinman...@gmail.com wrote:
This isn't a Solr issue. Maybe ask on the xerces list?
On Thu, Jul
You'll need to import the org.apache.lucene.search.spans package in Solr's
ExtendedDismaxQParserPlugin and add SpanFirstQuery's to the main query.
Something like (note that SpanTermQuery takes a Term, not a raw field/text
pair):
query.add(new SpanFirstQuery(new SpanTermQuery(new Term(field, clause)), distance),
BooleanClause.Occur.SHOULD);
-Original message-
Solr's response writers support only a few known types. Look at the
writeVal method in TextResponseWriter:
https://svn.apache.org/repos/asf/lucene/dev/trunk/solr/core/src/java/org/apache/solr/response/TextResponseWriter.java
On Thu, Jul 18, 2013 at 4:08 PM, Vineet Mishra
Hi
It totally depends on your budget. If you can afford it, go for more RAM, an
SSD drive, and a 64-bit OS.
Benchmark your application with a certain set of docs: how much RAM it
takes, indexing time, search time, etc. Increase the document count and
perform the benchmarking tasks again. This will
Thanks Shawn and Aditya. I really appreciate your help. Based on your advice
and reading the SolrPerformance article Shawn linked me to, I ended up
getting an Intel dual-core i3 3220 3.3GHz with 36GB RAM and 2 x 125GB SSD
drives for $227 per month. It's still expensive for me but I got
it
As detailed in the previous email, termfreq is not a field - it is a
transformer or function. Technically, it is actually a ValueSource.
If you look at the TextResponseWriter.writeVal method you can see how it
kicks off the execution of transformers when writing documents.
-- Jack Krupansky
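For example (a sketch: the host, collection path, and alias name are arbitrary; the field and term come from the earlier message), the pseudo-field is requested through the fl parameter rather than read from the stored fields:

```
http://localhost:8983/solr/collection1/select?q=*:*&fl=id,freq:termfreq(product,'spider')
```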
But it seems there is even something called XMLResponseWriter:
https://svn.apache.org/repos/asf/lucene/dev/trunk/solr/core/src/java/org/apache/solr/response/XMLResponseWriter.java
Won't it be appropriate in my case?
Although I have not implemented it yet, how come there couldn't be any
way to
It would probably be better to integrate the responses (document lists).
Solr response writers do a lot of special processing of the response data,
so you can't just throw arbitrary objects into the response.
You may need to explain your use case a little more clearly.
-- Jack Krupansky
So does that mean there is no way we can write an XML or JSON object to
the SolrQueryResponse and expect it to be formatted?
Okay, let me explain. If you construct your combined response (why are you
doing that again?) in the form of a Solr NamedList or SolrDocumentList, then
the XMLResponseWriter (which, by the way, uses TextResponseWriter) has no
problem writing it out as XML. The problem here is that you are giving it an
object
Hi,
Is it possible to sort search results based on the count of similar documents a
document has? Say we have a document A which has 4 other similar documents in
the index, and a document B which has 10. Then the order in which Solr returns
them should be B, A. Sorting on moreLikeThis counts for each
My case is this: I have a few Solr instances; I query them and get their XML
responses. Out of that XML I have to extract a group of specific XML nodes;
later I combine the other Solr responses into a single XML and make a DOM
document out of it.
So, as you mentioned in your last
I have tried doing this via a custom SearchComponent, where I find all similar
documents for each document in the current search result, then add a new field
to the document, hoping to use the sort parameter (q=*&sort=similarityCount).
I don't understand this part very well, but:
But this will not
Hi Shawn;
This is what I see when I look at mbeans:
<lst name="UPDATEHANDLER">
  <lst name="updateHandler">
    <str name="class">org.apache.solr.update.DirectUpdateHandler2</str>
    <str name="version">1.0</str>
    <str name="description">Update handler that efficiently directly updates the on-disk main lucene index</str>
Not your updateHandler; that only shows numbers about what it is doing, and
those can be reset when it restarts. Check your cores:
host:port/solr/admin/cores
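For example (host and port are placeholders), the STATUS action returns per-core index statistics such as numDocs:

```
http://localhost:8983/solr/admin/cores?action=STATUS
```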
-Original message-
From:Furkan KAMACI furkankam...@gmail.com
Sent: Thursday 18th July 2013 15:46
To: solr-user@lucene.apache.org
Subject: Re:
This sounds like a bad idea. You could have done this much more simply inside
your own application, using libraries that you know well.
That being said, instead of creating a DOM document, create a Solr
NamedList object, which can be serialized by XMLResponseWriter.
On Thu, Jul 18, 2013 at 6:48 PM,
I have a situation which is common in our current use case, where I need to
get a large number (many hundreds) of documents by id. What I'm doing
currently is creating a large query of the form id:12345 OR id:23456 OR
... and sending it off. Unfortunately, this query is taking a long time,
Hey Andre, that isn't a possibility for us right now since we are
terminating nodes using AWS autoscaling policies. We'll have to either
change our policies so that we can have some kind of graceful shutdown
where we get the chance to unload cores, or update ZooKeeper's cluster
state every
Hi all,
I need to execute a Solr query in two steps, executing in the first step a
generic limited-results query ordered by relevance, and in the second step
the ordering of the results of the first step according to a given sorting
criterion (different from relevance).
This two-step query is
Hi Markus;
It doesn't tell me how many documents were updated since the last commit.
2013/7/18 Markus Jelsma markus.jel...@openindex.io
Not your updateHandler; that only shows numbers about what it is doing, and
those can be reset when it restarts. Check your cores:
host:port/solr/admin/cores
-Original
No, nothing will. If you must know, you'll have to track it on the client side
and make sure autocommit is disabled.
-Original message-
From:Furkan KAMACI furkankam...@gmail.com
Sent: Thursday 18th July 2013 17:01
To: solr-user@lucene.apache.org
Subject: Re: How can I learn the total
You could start by using id:(12345 23456) to reduce the query length and
possibly speed up parsing.
You could also move the query from the 'q' parameter to the 'fq' parameter,
since you probably don't care about ranking ('fq' does not rank).
If these are unique every time, you could probably look at not
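A sketch combining both suggestions (host, port, core, and IDs are placeholders; the fq form skips scoring and is cacheable):

```
http://localhost:8983/solr/select?q=*:*&fq=id:(12345+23456+34567)
```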
Solr really isn't designed for that kind of use case. If it happens to work
well for your particular situation, great, but don't complain when you are
well outside the normal usage for a search engine (10, 20, 50, 100 results
paged at a time, with modest sized query strings.)
If you must get
Brian,
Have you tried the realtime get handler? It supports multiple documents.
http://wiki.apache.org/solr/RealTimeGet
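A sketch (host, port, and IDs are placeholders); the /get handler accepts a comma-separated ids parameter:

```
http://localhost:8983/solr/get?ids=12345,23456,34567
```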
Michael Della Bitta
Rajesh,
If you need an integration between Solr and Hadoop or a NoSQL store, I
would recommend using a commercial distribution. I think most are free to
use as long as you don't require support.
I inquired about the Cloudera Search capability, but it seems that so
far it is just preliminary:
Look at the speed of reading the data; likely it takes a long time to assemble
a big response, especially if there are many long fields. You may want to
try SSD disks if you have that option.
Also, to gain a better understanding: start your Solr, start jvisualvm, and
attach it to your running Solr. Start
I'm familiar with and have used the DSE cluster, and I am in the process of
evaluating Cloudera Search. In general, Cloudera Search has tight integration
with HDFS and takes care of replication and sharding transparently by using
the pre-existing HDFS replication and sharding; however
And I guess, if only a subset of fields is being requested but there are
other large fields present, there could be the cost of loading those extra
fields into memory before discarding them. In which case,
using enableLazyFieldLoading may help.
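For reference, the setting lives in the query section of solrconfig.xml:

```
<query>
  <enableLazyFieldLoading>true</enableLazyFieldLoading>
</query>
```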
Regards,
Alex.
Hello,
I am using the Solr nightly version 4.5-2013-07-18_06-04-44, and when I try
to use a document entity in schema.xml I get this exception:
java.lang.RuntimeException: schema fieldtype
string(org.apache.solr.schema.StrField) invalid
arguments:{xml:base=solrres:/commonschema_types.xml}
at
I have now changed some things and the import runs without error. In schema.xml
I don't have the field "text" but "contentsExact". Unfortunately the text (from
the file) isn't indexed even though I mapped it to the proper field. What am I
doing wrong?
data-config.xml:
<dataConfig>
  <dataSource
I have a TrieDateField dynamic field setup in my schema, pretty standard...
<dynamicField name="*_tdt" type="tdate" indexed="true" stored="false"/>
<fieldType name="tdate" class="solr.TrieDateField" omitNorms="true"
  precisionStep="6" positionIncrementGap="0"/>
In my code I only set one field, creation_tdt and
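If the problem turns out to be the date format (a common pitfall with TrieDateField), note that Solr expects ISO-8601 UTC timestamps of the form 1995-12-31T23:59:59Z. A minimal sketch of producing that format from a java.util.Date (the class and method names here are mine, not Solr's):

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

public class SolrDateUtil {
    // Solr's canonical date representation: ISO-8601, always UTC, 'Z' suffix.
    public static String toSolrDate(Date d) {
        SimpleDateFormat fmt = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss'Z'");
        fmt.setTimeZone(TimeZone.getTimeZone("UTC")); // Solr dates are always UTC
        return fmt.format(d);
    }

    public static void main(String[] args) {
        System.out.println(toSolrDate(new Date(0L))); // epoch start
    }
}
```

Sending dates in the local time zone or without the trailing 'Z' is a frequent cause of silent indexing failures for tdate fields.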
On Thu, Jul 18, 2013 at 12:53 PM, JohnRodey timothydd...@yahoo.com wrote:
I have a TrieDateField dynamic field setup in my schema, pretty standard...
<dynamicField name="*_tdt" type="tdate" indexed="true" stored="false"/>
<fieldType name="tdate" class="solr.TrieDateField" omitNorms="true"
Thanks for your reply. Yes, it worked. No more crashes after switching to
1.6.0_30
--
View this message in context:
http://lucene.472066.n3.nabble.com/JVM-Crashed-SOLR-deployed-in-Tomcat-tp4078439p4078906.html
Sent from the Solr - User mailing list archive at Nabble.com.
Hey folks,
I've been migrating an application which indexes about 15M documents from
straight-up Lucene into SolrCloud. We've set up 5 Solr instances with a 3
zookeeper ensemble using HAProxy for load balancing. The documents are
processed on a quad core machine with 6 threads and indexed
Thanks everyone for the response.
On Thu, Jul 18, 2013 at 11:22 AM, Alexandre Rafalovitch
arafa...@gmail.com wrote:
You could start from doing id:(12345 23456) to reduce the query length and
possibly speed up parsing.
I didn't know about this syntax; it looks useful.
You could also move
Hi to all,
Probably this question has a simple answer, but I just want to be sure of
the potential drawbacks. When I run SolrCloud, I start the main Solr instance
with the -numShards option (e.g. 2).
Then as data grows, the number of shards could potentially become huge. If I
had to restart all nodes
I am trying to implement Historical search using SOLR.
Ex:
If I search on the address "800 5th Ave" and provide a time range, it should
list the names of the people who were living at that address during that time
period.
I am trying to figure out a way to store the data without redundancy.
I can do a
I am exploring the various spell checkers in Solr and have a few questions:
1. Which algorithm is used for generating suggestions when using
IndexBasedSpellChecker? I know it's Levenshtein (with edit distance=2 by
default) in DirectSolrSpellChecker.
2. If I have 2 indices, can I set up multiple
Check the link below for more info on IndexBasedSpellChecker:
http://searchhub.org/2010/08/31/getting-started-spell-checking-with-apache-lucene-and-solr/
--
View this message in context:
http://lucene.472066.n3.nabble.com/Spellcheck-questions-tp4078985p4079000.html
Hello,
I send to Solr (to server1 in the cluster of two servers) the following request
Hi,
After upgrading Solr from 3.6 to 4.3, we found that Solr opens a lot more
files compared to Solr 3.6 (when a core is open). Since we have many cores
(more than 2K and still growing), we would like to reduce the number of open
files.
We already used shareSchema and sharedLib; we also shared
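For reference, in the legacy solr.xml format those two options look roughly like this (attribute values here are placeholders):

```
<solr persistent="true" sharedLib="lib">
  <cores adminPath="/admin/cores" shareSchema="true">
    ...
  </cores>
</solr>
```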
Thank you for adding to the wiki! It's always appreciated...
On Wed, Jul 17, 2013 at 5:18 PM, Ali, Saqib docbook@gmail.com wrote:
Thanks Erick!
I have added the instructions for running SolrCloud on JBoss:
http://wiki.apache.org/solr/SolrCloud%20using%20Jboss
I will refine the
Why do you care about redundancy? That's the search engine's architectural
tradeoff (as far as I understand). And, the tokens are all normalized under
the covers, so it does not take as much space as you expect.
Specifically regarding your issue, maybe you should store 'occupancy' as
the record.