(Sorry, my Lucene java-user access is wonky.)
I would like to verify that my snapshots are not corrupt before I enable
them.
What is the simplest program to verify that a Lucene index is not corrupt?
Or, what is a Solr query that will verify that there is no corruption? With
the minimum amoun
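For the archives: Lucene ships with a checker, org.apache.lucene.index.CheckIndex, that walks every segment and reports corruption. A sketch of invoking it from the command line (the jar path and index directory are placeholders for your setup):

```shell
# Open the index and report any corrupt segments (read-only by default;
# only -fix modifies anything). Paths below are placeholders.
java -cp lucene-core.jar org.apache.lucene.index.CheckIndex /path/to/index
```

It prints a per-segment report; it's safest to run it against the snapshot copy rather than an index with an open writer.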
Solr 1.2 has a bug where if you say "commit after N documents" it does not.
But it does honor the "commit after N milliseconds" directive.
This is fixed in Solr 1.3.
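For anyone hitting this in the archives: both autocommit directives live in solrconfig.xml under the update handler. A sketch of the relevant block (element names as in the stock example config; the values are illustrative placeholders):

```xml
<updateHandler class="solr.DirectUpdateHandler2">
  <!-- Commit automatically after N pending docs or N milliseconds,
       whichever comes first; both values here are placeholders. -->
  <autoCommit>
    <maxDocs>10000</maxDocs>
    <maxTime>60000</maxTime>
  </autoCommit>
</updateHandler>
```

Per the bug described above, maxDocs is the directive that Solr 1.2 ignores.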
-Original Message-
From: Sundar Sankaranarayanan [mailto:[EMAIL PROTECTED]
Sent: Thursday, February 07, 2008 3:30 PM
To:
Question:
Why do constant updates slow down SOLR performance even if I am not executing
a commit? I just noticed this... A thread dump shows something like "Lucene ...
clone()", and significant CPU usage. I did about 5 million updates via HTTP
XML, a single document at a time, without commit, and performance went
> With DisMax, a simple query which is a single double-quote
> character makes SOLR
> respond with
> 500
> org.apache.solr.common.SolrException: Cannot parse '':
...
> It is polite neither to the user's input nor to the HTTP specs...
Ooohh... Sorry again: it is the only case where SOLR is polite with
With DisMax, a simple query which is a single double-quote character makes SOLR
respond with
500
org.apache.solr.common.SolrException: Cannot parse '': Encountered "" at
line 1, column 0. Was expecting one of: ... " " ... "-" ... "(" ... "*" ...
... ... ... ... "[" ... "{" ... ...
org.apache.lucene.q
On Feb 7, 2008 8:35 PM, Fuad Efendi <[EMAIL PROTECTED]> wrote:
> - is it a bug of DisMax?... It happens even before the request reaches dismax.
That's what this whole thread has been about :-)
Stripping unbalanced quotes is part of dismax.
-Yonik
> > I think: no. And 6'2" works just as prescribed:
>
> Not really... it depends on the analyzer. If the index analyzer for
> the field ends up stripping off the trailing quote anyway, then the
> dismax query (which also dropped the quote) will match documents.
> That's why you don't see any issu
On Feb 7, 2008 6:35 PM, Fuad Efendi <[EMAIL PROTECTED]> wrote:
> Anyway, I can't understand where the problem is?! Everything works fine with
> dismax/standard/escaping/encoding.
> Can we use AND operator with dismax by
> the way?
No.
> I think: no. And 6'2" works just as prescribed:
Not really
> (catalina.out file of SOLR,
> http://www.tokenizer.org/armani/price.htm?q=Romeo%2bJuliet
> from production)
> ...
> ... DISMAX queries via CONSOLE do not support
> that...
Oops... my mistake again, sorry.
http://192.168.1.5:18080/apache-solr-1.2.0/select?indent=on&version=2.2&q=Romeo%2BJuliet&s
Hi All,
I am running an application in which I have to index
about 300,000 records from a table which has 6 columns. I am committing to
the Solr server after every 10,000 rows, and I observed that by the
end of about 150,000 rows the process eats up about 1 GB of memory, and
since my se
> while i agree that you don't want to expose your end users
> directly to
> Solr (largely for security reasons) that doesn't mean you *must*
> preprocess user entered strings before handing them to dismax
> ... dismax's
> whole goal is to make it possible for apps to not have to worry about
Hi All,
I have been using Solr for about a couple of months now and am very
satisfied with it. My Solr dev environment runs on a Windows box with
1 GB of memory, and the solr.war is deployed on JBoss 4.0.5. While
investigating a "Solr commit not working sometimes" issue in our
applicati
: It is not a bug/problem of SOLR. SOLR can't be exposed directly to end
: users. For handling user input and generating SOLR-specific query, use
while i agree that you don't want to expose your end users directly to
Solr (largely for security reasons) that doesn't mean you *must*
preprocess us
: http://192.168.1.5:18080/apache-solr-1.2.0/select/?q=*&version=2.2&start=0&rows=10&indent=on
That's using the standard request handler right? ... that's a much different
discussion -- when using standard you must of course be aware of the
syntax and the special characters ... Walter and i hav
This is what appears in Address Bar of IE:
http://localhost:8080/apache-solr-1.2.0/select/?q=item_name%3A%22Romeo%2BJuliet%22%2Bcategory%3A%22books%22&version=2.2&start=0&rows=10&indent=on
Input was:
item_name:"Romeo+Juliet"+category:"books"
Another input which works just fine: item_name:"6'\""
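That address-bar translation can be reproduced with any standard URL-encoding routine; a quick sketch in Python (the query string is the one from the example above):

```python
from urllib.parse import quote

# Raw Solr query as typed by the user
raw = 'item_name:"Romeo+Juliet"+category:"books"'

# Percent-encode everything that is not an unreserved character:
# ':' -> %3A, '"' -> %22, '+' -> %2B
encoded = quote(raw, safe="")
print(encoded)  # item_name%3A%22Romeo%2BJuliet%22%2Bcategory%3A%22books%22
```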
Try this query with asterisk *
http://192.168.1.5:18080/apache-solr-1.2.0/select/?q=*&version=2.2&start=0&rows=10&indent=on
Response:
HTTP Status 400 - Query parsing error: Cannot parse '*': '*' or '?' not
allowed as first character in WildcardQuery
: odd behavior while updating. The use case is that a document gets indexed
: with a status, in this case it's -1 for documents that aren't ready to be
: searched yet and 1 otherwise. Initial indexing works perfectly, and getting
: a result set of documents with the status of -1 works as well.
: Our users can blow up the parser without special characters.
:
: AND THE BAND PLAYED ON
: TO HAVE AND HAVE NOT
Grrr... yeah, i'd forgotten about that problem. I was hoping LUCENE-682
could solve that (by "unregistering" AND/OR/NOT as operators) but that
issue is fairly dead in the water si
I forgot to mention: the default operator is AND; DisMax.
Without URL-encoding, some queries will show exceptions even with dismax.
> -Original Message-
> From: Fuad Efendi [mailto:[EMAIL PROTECTED]
> Sent: Thursday, February 07, 2008 3:31 PM
> To: solr-user@lucene.apache.org
> Subject: RE: Qu
This query works just fine: http://www.tokenizer.org/?q=Romeo+%2B+Juliet
%2B is the URL-encoded representation of +
It shows, for instance, [Romeo & Juliet] in output.
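In form-style query strings a space becomes '+' and a literal '+' becomes %2B, which is exactly what the URL above shows; a sketch in Python:

```python
from urllib.parse import quote_plus, unquote_plus

# Space -> '+', literal '+' -> '%2B' in application/x-www-form-urlencoded
q = quote_plus("Romeo + Juliet")
print(q)                # Romeo+%2B+Juliet

# Decoding restores the original query
print(unquote_plus(q))  # Romeo + Juliet
```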
> -Original Message-
> From: Walter Underwood [mailto:[EMAIL PROTECTED]
> Sent: Thursday, February 07, 2008 3:25 PM
> To: sol
Our users can blow up the parser without special characters.
AND THE BAND PLAYED ON
TO HAVE AND HAVE NOT
Lower-casing in the front end avoids that.
We have auto-complete on titles, so the there are plenty
of chances to inadvertently use special characters:
Romeo + Juliet
Airplane!
Sh
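Walter's lowercasing workaround can be sketched as a tiny preprocessing step; the function and regex below are my own, not from his front end (he lowercases the whole query, while this version only touches the operator words):

```python
import re

def defuse_operators(query: str) -> str:
    """Lowercase bare AND/OR/NOT so the parser treats them as plain terms."""
    return re.sub(r"\b(AND|OR|NOT)\b", lambda m: m.group(0).lower(), query)

print(defuse_operators("AND THE BAND PLAYED ON"))  # and THE BAND PLAYED ON
print(defuse_operators("TO HAVE AND HAVE NOT"))    # TO HAVE and HAVE not
```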
I have the same kind of queries working correctly on my site.
It's probably because I am using URL escaping:
http://www.tokenizer.org/?q=6%272%22
> -Original Message-
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf
> Of Yonik Seeley
> Sent: Thursday, February 07, 2008 12:58 P
: I am pretty new to solr. I was wondering what is this "mm" attribute in
: requestHandler in solrconfig.xml and how it works. Tried to search wiki
: could not find it
Hmmm... yeah, wiki search does mid-word matching, doesn't it?
the key thing to realize is that the requestHandler you were looking a
: How about the query parser respecting backslash escaping? I need
one of the original design decisions was "no user escaping" ... be able
to take in raw query strings from the user with only '+' '-' and '"'
treated as special characters ... if you allow backslash escaping of those
characters
I am pretty new to Solr. I was wondering what this "mm" attribute in
requestHandler in solrconfig.xml is and how it works. I tried to search the
wiki but could not find it.
2<-1 5<-2 6<90%
thanks
Ismail
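For context: "mm" is dismax's minimum-should-match spec. Each "n<v" condition means "if there are more than n optional clauses, require v of them" (negative v = all but |v|; a percentage = that fraction, rounded down); the condition with the highest matching n wins, and below the lowest n every clause is required. A rough sketch of that arithmetic (my own simplified interpreter, integer percentages only; the real logic lives in Solr's dismax handler):

```python
def min_should_match(spec: str, clause_count: int) -> int:
    """Compute how many optional clauses a dismax mm spec requires.

    spec is a space-separated list of "n<value" conditions in
    ascending n, e.g. "2<-1 5<-2 6<90%".  Default: all required.
    """
    required = clause_count
    # Conditions are listed in ascending n; the last one that fires wins.
    for cond in spec.split():
        n, value = cond.split("<")
        if clause_count > int(n):
            if value.endswith("%"):
                required = clause_count * int(value[:-1]) // 100
            else:
                v = int(value)
                required = clause_count + v if v < 0 else v
    return required

# With the spec above:
print(min_should_match("2<-1 5<-2 6<90%", 2))   # 2  (all required)
print(min_should_match("2<-1 5<-2 6<90%", 4))   # 3  (all but one)
print(min_should_match("2<-1 5<-2 6<90%", 10))  # 9  (90%)
```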
Doug Cutting wrote:
Ning,
I am also interested in starting a new project in this area. The
approach I have in mind is slightly different, but hopefully we can come
to some agreement and collaborate.
I'm interested in this too.
My current thinking is that the Solr search API is the appropri
On Feb 7, 2008 2:51 PM, vijay_schi <[EMAIL PROTECTED]> wrote:
> I want to know, what type of analyzers can be used for the data 12345_r,
> 12346_r, 12345_c, 12346_c etc , type of data.
>
> I had text type for that uniqueKey and some query , index analyzers on it. i
> think thats making duplicat
I want to know what type of analyzers can be used for data like 12345_r,
12346_r, 12345_c, 12346_c, etc.
I had the text type for that uniqueKey and some query and index analyzers on
it. I think that's making duplicates.
Yonik Seeley wrote:
>
> On Feb 7, 2008 2:27 PM, vijay_schi <[
Huh? Queries come in through URL parameters and this is all ASCII
anyway. Even in XML, entities and UTF-8 decode to the same characters
after parsing.
The glyph formerly known as Prince belongs in the private use area,
of course.
wunder
On 2/7/08 11:06 AM, "Lance Norskog" <[EMAIL PROTECTED]> wro
On Feb 7, 2008 2:27 PM, vijay_schi <[EMAIL PROTECTED]> wrote:
> I'm new to solr. I have a uniqueKey on string which has the data of
> 12345_r,12346_r etc etc.
> when I'm posting xml with same data second time, it allows the docs to be
> added. when i search for id:12345_r on solr client , i'm getti
Hi,
I'm new to Solr. I have a uniqueKey on a string field which has data like
12345_r, 12346_r, etc.
When I post XML with the same data a second time, it allows the docs to be
added. When I search for id:12345_r on the Solr client, I get multiple
records. What might be the problem?
previously I'
Some people loathe UTF-8 and do all of their text in XML entities. This
might work better for your punctuation needs. But it still won't help you
with Prince :)
-Original Message-
From: Walter Underwood [mailto:[EMAIL PROTECTED]
Sent: Thursday, February 07, 2008 9:25 AM
To: solr-user@luc
Yes, I've seen this bit. Near as I can tell, it's what I want, so that our
Japanese users can search on a double-byte character and get back results
(since they don't use spaces to delineate words, it's impossible in the
default solr configuration to find a single double-byte character somewhere
"
: Thank you so much! I will look into firstSearcher configuration next! thanks
FYI: prompted by this thread, I added some blurbs about firstSearcher,
newSearcher, and FieldCache to the SolrCaching wiki ... as new users
learning about this stuff, please feel free to update that wiki with any
Here are the comments for CJKTokenizer. First, is this what you want?
Remember, there are three Japanese writing systems.
/**
* CJKTokenizer was modified from StopTokenizer which does a decent job for
* most European languages. It performs other token methods for double-byte
* Characters: the
How about the query parser respecting backslash escaping? I need
free-text input, no syntax at all. Right now, I'm escaping every
Lucene special character in the front end. I just figured out that
it breaks for colon, can't search for "12:01" with "12\:01".
wunder
On 2/7/08 11:06 AM, "Chris Hoste
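Walter's front-end escaping can be sketched like this; the helper name and regex are my own, covering the usual Lucene query-parser specials, including the colon that broke for him:

```python
import re

# Characters the Lucene query parser treats specially, plus && and ||
_SPECIALS = re.compile(r'([+\-!(){}\[\]^"~*?:\\]|&&|\|\|)')

def escape_query(text: str) -> str:
    """Backslash-escape query-parser metacharacters in user input."""
    return _SPECIALS.sub(r"\\\1", text)

print(escape_query("12:01"))  # 12\:01
print(escape_query('6\'2"'))  # 6'2\"
```

Whether the query parser then honors that backslash is exactly the open question in this thread.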
: I confirmed this behavior in trunk with the following query:
: http://localhost:8983/solr/select?qt=dismax&q=6'2"&debugQuery=on&qf=cat&pf=cat
:
: The result is that the double quote is dropped:
: +DisjunctionMaxQuery((cat:6'2)~0.01) DisjunctionMaxQuery((cat:6'2)~0.01)
:
: This seems like it's
I hate asking stupid questions immediately after joining a mailing list, but
I'm in a bit of a pinch here.
I'm using Solr/Tomcat for a Ruby on Rails project (acts_as_solr) and I've
had a lot of success getting it working -- for English. The problem I'm
running into is that our primary customer
On Feb 7, 2008 12:24 PM, Walter Underwood <[EMAIL PROTECTED]> wrote:
> We have a movie with this title: 6'2"
>
> I can get that string indexed, but I can't get it through the query
> parser and into DisMax. It goes through the analyzers fine. I can
> run the analysis tool in the admin interface and
Hi All,
Now I am facing a problem with special-character search.
I tried with the following special characters:
(!,@,#,$,%,^,&,*,(,),{,},[,]).
My indexing data is :
!national!
@national@
#national#
$national$
%national%
^national^
&national&
We have a movie with this title: 6'2"
I can get that string indexed, but I can't get it through the query
parser and into DisMax. It goes through the analyzers fine. I can
run the analysis tool in the admin interface and get a match with
that exact string.
These variants don't work:
6'2"
6'2\"
6
I've done some searching through the archives and Google, as well as some
tinkering on my own, to no avail. My goal is to get a list of the fields
that matched a particular query. At first, I thought highlighting was the
solution; however, it's slowly becoming clear that it doesn't do what I need it
I'm working on a French/English site and I want to know which filters
would be nice and are recommended. Should I use 2 stemmers or is there
a way to mark one of them bilingual? I am using the Latin-1 filter;
also, any more ideas?
[]'s
--
Leonardo Santagada
Thanks Chris,
this idea has been discussed before, most notably in this thread...
http://www.nabble.com/Indexing-XML-files-to7705775.html
...as discussed there, the crux of the issue is not a special fieldtype,
but a custom ResponseWriter that outputs the XML you want, and leaves any
field va
Thanks Yonik and Ard.
Yes, it's the stemming problem and I have removed
"solr.EnglishPorterFilterFactory" from the indexing and querying analyzers.
Now it's working fine. Will any other problem occur if I remove this?
Thanks,
nithya.
Yonik Seeley wrote:
>
> It's stemming. Administrator st
Thank you so much! I will look into firstSearcher configuration next! thanks
--
From: "Chris Hostetter" <[EMAIL PROTECTED]>
Sent: Wednesday, February 06, 2008 8:56 PM
To:
Subject: Re: how to improve concurrent request performance and stress
testin