Hi all,
I just got started looking at Solr as my search web service, but I
don't know whether Solr has features for these types of queries:
- Startswith
- Exact Match
- Contain
- Doesn't Contain
- In the range
Could anyone guide me how to implement those features in Solr?
Cheers,
Samnang
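(Editor's note, for later readers: assuming the default Lucene query parser, the features listed above map roughly to standard query syntax. The field names here are just examples, and whole-field "exact match" depends on the field type, e.g. an untokenized string field.)

```
title:app*            startswith (prefix query)
title:"exact phrase"  exact match (phrase query)
title:solr            contains (term query)
-title:solr           doesn't contain (negated clause)
price:[10 TO 100]     in the range (inclusive range query)
```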
On Monday 01 June 2009 16:50, Sam Michaels wrote:
So the fix for this problem would be
1. Stop using WordDelimiterFilter for queries (what is the alternative?), OR
2. Disallow any search strings without alphanumeric characters.
We ran into this same problem while replacing all characters
Hi Mark,
I actually got this error because I was using an old version of
Java. Now the problem is solved.
Thanks anyways
Raakhi
On Tue, Jun 9, 2009 at 11:17 AM, Rakhi Khatwani rkhatw...@gmail.com wrote:
Hi Mark,
Yeah, I would like to open a JIRA issue for it. How do I go
On Tue, Jun 9, 2009 at 11:15 AM, revas revas...@gmail.com wrote:
1)Does the spell check component support all languages?
SpellCheckComponent relies on Lucene/Solr analyzers and tokenizers. So if
you can find an analyzer/tokenizer for your language, spell checker can
work.
2) I have a
Ok here it goes:
<?xml version="1.0"?>
<dataConfig>
  <dataSource type="JdbcDataSource" name="dbA"
      driver="com.mysql.jdbc.Driver"
      url="jdbc:mysql://localhost:3306/dbA?zeroDateTimeBehavior=convertToNull"
      user="root" password=""/>
  <document>
    <entity name="dbA.project" dataSource="dbA"
        transformer="TemplateTransformer"
Can you avoid the dots in the entity name and try it out? Dots are
special characters and could be causing the problem.
On Tue, Jun 9, 2009 at 1:37 PM, gateway0 reiterwo...@yahoo.de wrote:
Ok here it goes:
<?xml version="1.0"?>
<dataConfig>
  <dataSource type="JdbcDataSource" name="dbA"
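(Editor's note: a minimal sketch of the suggested change, assuming the entity is simply renamed to `project` — a hypothetical name without dots; all other attributes stay as posted.)

```
<entity name="project" dataSource="dbA"
        transformer="TemplateTransformer">
  <!-- fields/templates as in the original config -->
</entity>
```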
Hi there Samnang!
Please see inline for comments:
On Tue, 09 Jun 2009 08:40:02 +0200, Samnang Chhun
samnang.ch...@gmail.com wrote:
Hi all,
I just got started looking at Solr as my search web service, but I
don't know whether Solr has features for these types of queries:
- Startswith
Hi,
I was looking for ways in which we can use Solr in distributed mode.
Is there any way we can use Solr indexes across machines, or by using the
Hadoop Distributed File System?
It has been mentioned in the wiki that
When an index becomes too large to fit on a single system, or when a single
nope
On Tue, Jun 9, 2009 at 4:59 AM, vaibhav joshi callvaib...@hotmail.com wrote:
Hi,
I am currently using Solr 1.3 and running Solr as an NT service. I need to
store data indexes on a remote filer machine. The filer needs user
credentials in order to access it. Is there a Solr
No, I changed the entity name to dbA:project but still the same problem.
Interesting side note: if I use my data-config as posted (with the id field
in the comment section), none of the other entities work anymore, for
example:
<entity name="user" dataSource="dbA" query="select username from
But the spell check component uses the n-gram analyzer and hence should work
for any language, is this correct? Also, we can refer to an external dictionary
for suggestions; could this be in any language?
The open-files issue is not because of spell check, as we have not implemented
that yet. Every time
Hey there,
Does the lucene-2.9-dev used in the current Solr nightly build (9-6-2009) include
the patch LUCENE-1662 to avoid doubling memory usage in the Lucene FieldCache?
Thanks in advance
On Tue, Jun 9, 2009 at 2:56 PM, revas revas...@gmail.com wrote:
But the spell check component uses the n-gram analyzer and hence should work
for any language, is this correct? Also, we can refer to an external dictionary
for suggestions; could this be in any language?
Yes it does use n-grams but
I have an index with two fields - name and type. I need to perform a search
on the name field so that *an equal number of results is fetched for each
type*.
Currently, I am achieving this by firing multiple queries with a different
type and then merging the results.
In my database driven version, I
There should be no problem if you re-use the same variable.
Are you sure you removed the dots from everywhere?
On Tue, Jun 9, 2009 at 2:55 PM, gateway0 reiterwo...@yahoo.de wrote:
No, I changed the entity name to dbA:project but still the same problem.
Interesting side note: if I use my
I don't know if I follow you correctly, but you are saying that you want X
results per type?
So you do something like limit=X and query = type:Y etc. and merge the
results?
- Aleks
On Tue, 09 Jun 2009 12:33:21 +0200, Avlesh Singh avl...@gmail.com wrote:
I have an index with two fields -
Thanks Shalin. When we use the external file dictionary (if there is
one), then it should work fine for spell check, right? Also, is there any
format for this file?
Regards
Sujatha
On Tue, Jun 9, 2009 at 3:03 PM, Shalin Shekhar Mangar
shalinman...@gmail.com wrote:
On Tue, Jun 9, 2009 at 2:56
Hi all,
I'm trying to figure out how to shard our index as it is growing rapidly
and we want to make our solution scalable.
So, we have documents that are most commonly sorted by their date. My
initial thought is to shard the index by date, but I wonder if you have
any input on this and how
On Tue, Jun 9, 2009 at 4:32 PM, revas revas...@gmail.com wrote:
Thanks Shalin. When we use the external file dictionary (if there is
one), then it should work fine for spell check, right? Also, is there any
format for this file?
The external file should have one token per line. See
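(Editor's note: a quick sketch of that format, using a hypothetical file name `spellings.txt` — plain text, one token per line.)

```python
import os
import tempfile

# One token per line, as the external spellcheck dictionary expects.
tokens = ["solr", "lucene", "facet"]

path = os.path.join(tempfile.mkdtemp(), "spellings.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write("\n".join(tokens) + "\n")

# Reading it back the same way Solr would: one token per line.
with open(path, encoding="utf-8") as f:
    loaded = [line.strip() for line in f if line.strip()]

print(loaded)  # ['solr', 'lucene', 'facet']
```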
I don't know if I follow you correctly, but you are saying that you want X
results per type?
You are right. I need X number of results per type.
So you do something like limit=X and query = type:Y etc. and merge the
results?
That is what the question is! Which means, if I have 4 types, I am
On Tue, Jun 9, 2009 at 4:03 PM, Avlesh Singh avl...@gmail.com wrote:
I have an index with two fields - name and type. I need to perform a search
on the name field so that *an equal number of results is fetched for each
type*.
Currently, I am achieving this by firing multiple queries with a
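(Editor's note: the workaround being discussed — one query per type, then merge — can be sketched as below. `search_fn` stands in for a real Solr call, e.g. q=name:foo with fq=type:&lt;t&gt; and rows=&lt;n&gt;; the data is made up.)

```python
def merge_per_type(search_fn, types, n):
    """Run one query per type and keep up to n docs from each."""
    merged = []
    for t in types:
        merged.extend(search_fn(t)[:n])
    return merged

# Stand-in index: type -> matching doc ids, ranked by score.
index = {
    "book": ["b1", "b2", "b3"],
    "dvd": ["d1"],
}

results = merge_per_type(lambda t: index.get(t, []), ["book", "dvd"], 2)
print(results)  # ['b1', 'b2', 'd1']
```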
Rakhi Khatwani wrote:
Hi,
I was looking for ways in which we can use Solr in distributed mode.
Is there any way we can use Solr indexes across machines, or by using the
Hadoop Distributed File System?
It has been mentioned in the wiki that
When an index becomes too large to fit on a single
Thanks for bringing closure to this Raakhi.
- Mark
Rakhi Khatwani wrote:
Hi Mark,
I actually got this error because I was using an old version of
Java. Now the problem is solved.
Thanks anyways
Raakhi
On Tue, Jun 9, 2009 at 11:17 AM, Rakhi Khatwani rkhatw...@gmail.com wrote:
Noticed this warning in the log file:
Jun 9, 2009 2:53:35 PM
org.apache.solr.handler.dataimport.TemplateTransformer transformRow
WARNING: Unable to resolve variable: dbA.project.id while parsing
expression: ${dbA.project.dbA.project},id:${dbA.project.id}
OK? What's that supposed to mean?
Noticed this warning in the log file:
Jun 9, 2009 2:53:35 PM
org.apache.solr.handler.dataimport.TemplateTransformer transformRow
WARNING: Unable to resolve variable: dbA.project.id while parsing
expression: ${dbA.project.dbA.project},id:${dbA.project.id}
OK? What's that supposed to mean?
And
Yep. CHANGES.txt for Solr has this:
34. Upgraded to Lucene 2.9-dev r779312 (yonik)
And if you click the All tab for LUCENE-1662, it says the committed
revision was 779277.
-Yonik
http://www.lucidimagination.com
On Tue, Jun 9, 2009 at 5:32 AM, Marc Sturlese marc.sturl...@gmail.com wrote:
Hey
Martin Davidsson wrote:
I've tried to read up on how to decide, when writing a query, what
criteria goes in the q parameter and what goes in the fq parameter, to
achieve optimal performance. Is there [...] some kind of rule of thumb
to help me decide how to split things up when querying
Common cache configuration parameters include @size (size attribute).
http://wiki.apache.org/solr/SolrCaching
For each of the following, does this mean the maximum size of:
* filterCache/@size - filter query results?
* queryResultCache/@size - query results?
* documentCache/@size - documents?
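(Editor's note: for reference, the @size attribute appears on each cache element in solrconfig.xml; a typical, purely illustrative configuration looks like this.)

```
<filterCache class="solr.LRUCache" size="512"
             initialSize="512" autowarmCount="128"/>
```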
On Tue, Jun 9, 2009 at 6:39 PM, gateway0 reiterwo...@yahoo.de wrote:
Noticed this warning in the log file:
Jun 9, 2009 2:53:35 PM
org.apache.solr.handler.dataimport.TemplateTransformer transformRow
WARNING: Unable to resolve variable: dbA.project.id while parsing
expression:
I just got the nightly build, and the terms component works great!!!
merci beaucoup
On Mon, Jun 8, 2009 at 8:00 PM, Aleksander M. Stensby
aleksander.sten...@integrasco.no wrote:
You can try out the nightly build of solr (which is the solr 1.4 dev
version) containing all the new nice and shiny
In trying to run the example distributed with Solr 1.3.0 from the command line,
the process seems to stop at the following line:
INFO: [] Registered new searcher searc...@147c1db main
The searcher ID is not always the same, but it repeatedly gets caught at this
line. Any suggestions?
Ryan McKinley wrote:
I am working with an index of ~10 million documents. The index
does not change often.
I need to perform some external search criteria that will return some
number of results -- this search could take up to 5 mins and return
anywhere from 0-10M docs.
If it really
Solr does not support this. You can do it yourself by taking the
highest score and using that as 100% and calculating other percentages
from that number. For example if the max score is 10 and the next
result has a score of 5, you would do (5 / 10) * 100 = 50%. Hope this
helps.
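(Editor's note: that calculation can be sketched as a small helper. This is only an illustration of the math described above; the function name is made up.)

```python
def score_percentages(scores):
    """Scale raw scores so the top score counts as 100%."""
    if not scores:
        return []
    top = max(scores)
    return [s / top * 100 for s in scores]

print(score_percentages([10.0, 5.0, 2.5]))  # [100.0, 50.0, 25.0]
```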
Fer-Bj wrote:
For all the documents we have a field called small_body, which is a
60-char max text field where we store the abstract for each article.
We need to display this small_body, which we want to compress every time.
If this works like compressing individual files, the overhead for
Yao Ge wrote:
The facet query is considerably slower compared to other facets from
structured database fields (with highly repeated values). What I found
interesting is that even after I constrained search results to just a
few hundred hits using other facets, these text facets are still
Hello,
We are indexing approximately 500 documents per day. My benchmark says an
update is done in 0.7 sec just after Solr has been started, but it quickly
degrades to 2.2 secs per update!
I have just been focused on the schema until now, and haven't changed much
in the solrconfig file.
I take it by the deafening silence that this is not possible? :-)
On Mon, Jun 8, 2009 at 11:34 AM, Amit Nithian anith...@gmail.com wrote:
Hi,
I am still using Solr 1.2 with the Lucene 2.2 that came with that version
of Solr. I am interested in taking advantage of the trie filtering to
Yonik Seeley wrote:
Are you using Solr 1.3?
You might want to try the latest 1.4 test build -
faceting has changed a lot.
I found two significant changes (but there may well be more):
[#SOLR-911] multi-select facets - ASF JIRA
https://issues.apache.org/jira/browse/SOLR-911
Yao,
it sounds
On Tue, Jun 9, 2009 at 10:19 PM, Amit Nithian anith...@gmail.com wrote:
I take it by the deafening silence that this is not possible? :-)
Anything is possible :)
However, it might be easier to upgrade to 1.4 instead.
On Mon, Jun 8, 2009 at 11:34 AM, Amit Nithian anith...@gmail.com wrote:
I do not recommend using network storage for indexes. This is almost always
extremely slow. When I tried it, indexing ran 100X slower.
If you don't mind terrible performance, configure your NT service to
run as a specific user. The default user is one that has almost no
privileges. Create a new
On Tue, Jun 9, 2009 at 7:25 PM, Michael Ludwig m...@as-guides.com wrote:
http://wiki.apache.org/solr/SolrCaching - filterCache
A filter query is cached, which means that it is the more useful the
more often it is repeated. We know how often certain queries arise, or
at least have the means
Hi Jens,
Jens Fischer wrote:
I was wondering if there's an option to return statistics about
distances from the query terms to the most frequent terms in the
result documents.
The additional information I'm looking for is the average distance
between these terms and my search term.
So
Shalin Shekhar Mangar wrote:
On Tue, Jun 9, 2009 at 7:25 PM, Michael Ludwig m...@as-guides.com
wrote:
A filter query should probably be orthogonal to the primary query,
which means in plain English: unrelated to the primary query. To give
an example, I have a field category, which is a
On Tue, Jun 9, 2009 at 7:47 PM, Michael Ludwig m...@as-guides.com wrote:
Common cache configuration parameters include @size (size attribute).
http://wiki.apache.org/solr/SolrCaching
For each of the following, does this mean the maximum size of:
* filterCache/@size - filter query results?
On Tue, Jun 9, 2009 at 11:11 PM, Michael Ludwig m...@as-guides.com wrote:
Sorry, I don't understand. I used to think that the engine applies the
filter to the primary query result. What you're saying here sounds as if
it could also pre-filter my document collection to then apply a query to
Shalin Shekhar Mangar wrote:
On Tue, Jun 9, 2009 at 7:47 PM, Michael Ludwig m...@as-guides.com
wrote:
Given the following three filtering scenarios of (a) x:bla, (b)
y:blub, and (c) x:bla AND y:blub, will I end up with two or three
distinct filters? In other words, may filters be composites
Shalin Shekhar Mangar wrote:
No, both filters and queries are computed on the entire index.
My comment was related to the "A filter query should probably be
orthogonal to the primary query..." part. I meant that both kinds of
use-cases are common.
Got it. Thanks :-)
Michael Ludwig
Hi,
I would greatly appreciate a quick response to this question.
Is there a means of passing a local file to the ExtractingRequestHandler (as
the enableRemoteStreaming/stream.file option does with the other handlers) so
the file contents can directly be read from the local disk versus
I haven't tried it, but I thought the enableRemoteStreaming stuff
should work. That stuff is handled by Solr in other places, if I
recall correctly. Have you tried it?
-Grant
On Jun 9, 2009, at 2:28 PM, doraiswamy thirumalai wrote:
Hi,
I would greatly appreciate a quick response to
Hi Aleksander ,
I went through the links below and successfully configured rsync using
Cygwin on Windows XP. The Solr documentation mentions many script files,
like rsync-enable, snapshooter, etc. These are all Unix-based scripts.
Where do I get these script files for Windows?
Any
Yep, all that sounds right.
An additional optimization counts terms for the documents *not* in the
set when the base set is over half the size of the index.
-Yonik
http://www.lucidimagination.com
On Tue, Jun 9, 2009 at 1:01 PM, Michael Ludwig m...@as-guides.com wrote:
Yonik,
from your
Define "caught"? When I start up Solr, here's what I see (and I know it's
working):
2009-06-09 15:18:33.726::INFO: Started SocketConnector @ 0.0.0.0:8983
Jun 9, 2009 3:18:33 PM org.apache.solr.core.SolrCore execute
INFO: [] webapp=null path=null params={q=static+firstSearcher+warming
One option is to hit the Luke request handler (numTerms=0 for best
performance), grab all the field names there, then build the fl list
(or facet.field in the cases I've used this trick for) from the fields
with the prefix you desire.
Erik
On Jun 8, 2009, at 11:40 AM, Manepalli,
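(Editor's note: a sketch of the trick described above. `fields` stands in for the field names the Luke request handler (?numTerms=0) would return; the names are made up.)

```python
def build_fl(field_names, prefix):
    """Build an fl parameter value from the fields matching a prefix."""
    return ",".join(f for f in field_names if f.startswith(prefix))

# Pretend these field names came back from /admin/luke?numTerms=0
fields = ["attr_color", "attr_size", "id", "name"]
print(build_fl(fields, "attr_"))  # attr_color,attr_size
```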
Michael,
Thanks for the update! I definitely need to get a 1.4 build and see if it
makes a difference.
BTW, maybe instead of using faceting for text
mining/clustering/visualization purposes, we could build a separate feature
in Solr for this. Many of the commercial search engines I have experience with
After that comes up in the command line, I can access the localhost address,
but I can't enter anything on the command line.
-Original Message-
From: Grant Ingersoll [mailto:gsing...@apache.org]
Sent: Tuesday, June 09, 2009 3:20 PM
To: solr-user@lucene.apache.org
Subject: Re:
Solr is a server running in the Jetty web container and accepting
requests over HTTP. There is no command line tool, at least not in
Solr itself, for interacting with Solr. Typically people interact
with it programmatically or via a Web Browser.
I'd start by walking through:
Thanks for the quick response, Grant.
We tried it and it seems to work.
The confusion stemmed from the fact that the wiki states that the parameter is
not used - there are also comments in the test cases for the handler that say:
//TODO: stop using locally defined fields once
Neil - when started using the packaged start.jar, Solr runs in the
foreground; that's why you can't type anything in the command line after
starting it.
Mat
On Tue, Jun 9, 2009 at 15:55, Mukerjee, Neiloy (Neil)
neil.muker...@alcatel-lucent.com wrote:
After that comes up in the command line, I
Shalin Shekhar Mangar wrote:
Second question:
If I force an empty commit, like this:
curl
http://localhost:8080/solr_rep_master/core/update?stream.body=%3Ccommit/%3E
then the changed synonym.txt config file is replicated to the slave.
Unfortunately, now I need to do a core RELOAD on both
When 'dismax' queries are used, where is the best place to apply boost
values/factors? While indexing, by supplying the 'boost' attribute to the
field, or in solrconfig.xml, by specifying the 'qf' parameter with the same
boosts? What are the advantages/disadvantages of each? What happens if both
I have a text field from which I remove stop words. As a first approximation
I use facets to see the most common words in the text, but the stopwords are
still there, and if I search for documents containing the stopwords, there
are no documents in the answer.
You can test it in this address (using
Hi All,
I am facing an issue while fetching records from the database when providing
a value like '${prod.prod_cd}' in db-data-config.xml.
It works fine if I provide the exact value of the product code, i.e.
'302437-413'.
Here is the db-data-config.xml I am using:
<dataConfig>
  <dataSource
Hi,
I have to intercept every request to Solr (search and update) and log
some performance numbers. In order to do so, I tried a servlet filter
and added this to Solr's web.xml:
<filter>
  <filter-name>IndexFilter</filter-name>
Hi all,
I am very new to Solr and I want to use Solr to index data without token to
match with my search.
Does anyone know how to index data without token in Solr?
if possible, can you give me an example?
Thanks in advance,
LEE
Hi,
Unfortunately, the problem is that an 'empty' commit does not really
do anything. I mean, it is not a real commit. Solr takes a look to find out
if the index has changed, and if not, the call is ignored.
When we designed it, the choice was to also look at all the changed conf
files to decide if a
If you wish to intercept read calls, a filter is the only way.
On Wed, Jun 10, 2009 at 6:35 AM, vivek sar vivex...@gmail.com wrote:
Hi,
I have to intercept every request to Solr (search and update) and log
some performance numbers. In order to do so, I tried a servlet filter
and added this to
Are you sure prod_cd and reg_id are emitted by the respective entities with
the same name? If not, you may need to alias those fields (using 'as').
Keep in mind, the field names are case-sensitive. To see what values are
emitted, use debug mode or the LogTransformer.
On Wed, Jun 10, 2009 at 4:55
Francis,
If you can wait another month or so, you could skip 1.3.0, and jump to 1.4
which will be released soon.
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
From: Francis Yakin fya...@liquid.com
To: solr-user@lucene.apache.org solr-user@lucene.apache.org
Sent:
Hello,
I don't follow the "index data without token to match with my search" part.
Could you please give an example of what you mean?
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message
From: chem leakhina chem.leakh...@gmail.com
To:
It's like cooking. If you put too much salt in your food, it's kind of hard to
undo that and you end up with a salty meal. Boosting at search time makes it
easy to change boosts (e.g. when trying to find the best boost values), while
boosting at index time hard-codes them. You can use both
Yao,
Solr can already cluster top N hits using Carrot2:
http://wiki.apache.org/solr/ClusteringComponent
I've also done ugly manual counting of terms in top N hits. For example,
look at the right side of this:
http://www.simpy.com/user/otis/tag/%22machine+learning%22
Something like
Aleksander,
In a sense you are lucky you have time-ordered data. That makes it very easy
to shard and cheaper to search - you know exactly which shards you need to
query. The beginning of the year situation should also be easy. Do start with
the latest shard for the current year, and go to
Vincent,
It's hard to tell, but some things to look at are your JVM memory heap size,
the status of various generations in the JVM, possibility of not enough memory
and too frequent GC, etc. All can be seen with jconsole.
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
Hello,
All of this is covered on the Wiki, search for: distributed search
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message
From: Rakhi Khatwani rkhatw...@gmail.com
To: solr-user@lucene.apache.org
Cc: ninad.r...@germinait.com;
Francis,
But that really is an example. It's something that you can try and something
that you can copy and base your own Solr setup on.
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message
From: Francis Yakin fya...@liquid.com
To: