Hi,
I'm trying to spell check a whole field using a lowercasing keyword
tokenizer [1].
For example, if I query for furntree gully I'm hoping to get back
ferntree gully as a suggestion. Unfortunately the spell checker
seems to be recognizing this as two tokens and returning suggestions
for both.
Never mind this one... With a bit more research I discovered I can use
spellcheck.q to provide the correct suggestion.
On 14 September 2010 16:02, Glen Stampoultzis gst...@gmail.com wrote:
Hi,
I'm trying to spell check a whole field using a lowercasing keyword
tokenizer [1].
for example if I
Hi Peter,
this scenario would be really great for us - I didn't know that this was
possible and worked, so: thanks!
At the moment we are doing something similar by replicating to the read-only
instance, but
the replication is somewhat lengthy and resource-intensive at this
data volume ;-)
Regards,
Peter.
Hi Guys,
I encountered a problem when enabling WordDelimiterFilterFactory for both
index and query (pasted the relevant part of schema.xml at the bottom of the email).
*1. Steps to reproduce:*
1.1 The indexed sample document contains only one sentence: This is a
TechNote.
1.2 Query is:
Really well done problem statement by the way
On Tue, Sep 14, 2010 at 5:40 AM, yandong yao yydz...@gmail.com wrote:
Hi Guys,
I encountered a problem when enabling WordDelimiterFilterFactory for both
index and query (pasted the relevant part of schema.xml at the bottom of the
email).
*1. Steps
hi,
it's the second time I've stumbled across some strange behaviour:
in my schema.xml I have defined
<fieldType name="textspell" class="solr.TextField"
    positionIncrementGap="100">
  <analyzer type="index">
    <!-- sg324 inkl. HTMLStrip... -->
    <charFilter
Hello!
The tokenizer is executed before the filters, because the tokenizer
generates tokens and the filters then operate on them.
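A toy sketch of that ordering in plain Java (this is not the real Lucene API -- the regex, the whitespace split, and the lowercasing merely stand in for a char filter, a tokenizer, and a token filter):

```java
import java.util.ArrayList;
import java.util.List;

public class AnalyzerOrderSketch {
    public static void main(String[] args) {
        String raw = "<b>This</b> is a TechNote.";

        // 1. Char filter: runs on the raw character stream, before tokenizing
        //    (here: strip HTML-ish tags, the way an HTMLStrip char filter would).
        String filtered = raw.replaceAll("<[^>]*>", "");

        // 2. Tokenizer: splits the filtered text into tokens.
        List<String> tokens = new ArrayList<>();
        for (String t : filtered.trim().split("\\s+")) {
            tokens.add(t);
        }

        // 3. Token filters: operate on the tokens the tokenizer produced
        //    (here: lowercasing, the way LowerCaseFilter would).
        tokens.replaceAll(String::toLowerCase);

        System.out.println(tokens); // [this, is, a, technote.]
    }
}
```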
hi,
it's the second time I've stumbled across some strange behaviour:
in my schema.xml I have defined
<fieldType name="textspell" class="solr.TextField"
I found
http://www.jarvana.com/jarvana/browse/org/ow2/weblab/service/solr-duplicates-detector/2.0/
Does anybody know how to install and use this lib on an existing Solr instance?
--
View this message in context:
Why do you want to? Perhaps there's a better solution for your underlying
problem
if you'd explain what it is...
Best
Erick
On Tue, Sep 14, 2010 at 8:05 AM, hellboy pbon...@googlemail.com wrote:
I found
CharFilters go before Tokenizers which go before (token) Filters.
Token filters (called just "filter" in the config) operate on tokens, so they
need to go after the tokenizer. WhitespaceTokenizer is a tokenizer.
PatternReplaceFilterFactory is a token filter.
What you probably want instead is
Hi I am using solrCloud which uses an ensemble of 3 zookeeper instances.
I am performing survivability tests:
When taking one of the zookeeper instances down, I would expect the client to
use a different zookeeper server instance.
But as you can see in the below logs attached
Depending on which
Hi Robert,
I am using solr 1.4, will try with 1.4.1 tomorrow.
Thanks very much!
Regards,
Yandong Yao
2010/9/14 Robert Muir rcm...@gmail.com
did you index with solr 1.4 (or are you using solr 1.4)?
at a quick glance, it looks like it might be this:
Hi Shaun,
I think it would be easier to fix this problem if we had more information
about what is going on in your application.
Please, could you provide the CoreAdminResponse returned by car.process()
for us?
Kind regards,
- Mitch
Hi Mitch
Thanks for responding. Not actually sure what you wanted from
CoreAdminResponse but I put the following in:
CoreAdminRequest car = new CoreAdminRequest();
car.setCoreName("live");
car.setOtherCoreName("rebuild");
Hey guys,
Is there a way of doing the following:
We want to get the highest value from a list of multiple fields within a
document.
Example below:
max(field1,field2,field3,field4)
The values are as follow:
field1 = 100
field2 = 300
field3 = 250
field4 = not indexed in document (null)
The
Shawn Heisey wrote:
The one called PatternReplaceFilterFactory (no Char) has been around
forever. It is not mentioned on the Wiki page about analyzers. The one
called PatternReplaceCharFilterFactory is only available from svn.
This seems to be true, which I hadn't realized either. The
The stats component will give you the maximum value within one field:
http://wiki.apache.org/solr/StatsComponent
You're going to have to compute the max amongst several fields
client-side, having StatsComponent return the max for each field, and
then just max-ing them client side. Not hard.
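The client-side step is just a null-safe max over the per-field values (a plain-Java sketch; the numbers are the ones from the example above, with null standing in for field4, which isn't present on the document):

```java
import java.util.Arrays;
import java.util.List;

public class MaxOfFields {
    public static void main(String[] args) {
        // Stored values for one document: field1..field4.
        // null = field not indexed/stored on this document.
        List<Integer> values = Arrays.asList(100, 300, 250, null);

        // Null-safe max across the fields.
        int max = values.stream()
                .filter(v -> v != null)
                .mapToInt(Integer::intValue)
                .max()
                .orElseThrow(() -> new IllegalStateException("no values"));

        System.out.println(max); // 300
    }
}
```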
Oh wait, I misunderstood -- you want just the highest value _for one
document_, from its stored fields? StatsComponent
won't help you there.
Either do it client side, or do it at index time in a single stored
field, that's it. Maybe there's some confusing way to use a
Hey guys,
Has anyone successfully compiled and used Field Collapsing patch (236)
with Solr 1.4.1?
I keep getting this exception when I search:
null
java.lang.NullPointerException
at
Hi,
I'm tweaking my schema, and the LowerCaseTokenizerFactory doesn't create
tokens based solely on lower-casing characters. Is there a way to tell it
NOT to drop non-characters? It's amazingly frustrating that the
TokenizerFactory and the FilterFactory have two entirely different modes of
Can SOLR be configured out of the box to handle rolling log files?
Kind regards,
Vladimir Sutskever
Investment Bank - Technology
JPMorgan Chase, Inc.
Tel: (212) 552.5097
Hi,
I'm interested in using geographic clustering of records in a Solr
search index. Specifically, I want to be able to efficiently produce a
map with clustered bubbles that represent the number of documents that
are indexed with points in that general area. I'd like to combine this
with other
On Tue, Sep 14, 2010 at 1:54 PM, Scott Gonyea sc...@aitrus.org wrote:
Hi,
I'm tweaking my schema and the LowerCaseTokenizerFactory doesn't create
tokens based solely on lower-casing characters. Is there a way to tell it
NOT to drop non-characters? It's amazingly frustrating that the
Hi,
Has anyone come across a situation where they have seen their facet
field values wrap into a new facet entry when the value exceeds 256
characters?
For example:
<lst name="facet_fields">
  <lst name="pub_articletitle">
    <int name="">12302</int>
    <int name="hiv">1403</int>
    <int name="type">1382</int>
  </lst>
  <lst
Faceting on a multi-value field?
I wonder if your positionIncrementGap for your field definition in your
schema is 256. I am not sure what it defaults to. But it seems possible
if it's 256 it could lead to what you observed. Try explicitly defining
it to be really really big maybe? I'm not
On Tue, Sep 14, 2010 at 3:35 PM, Niall O'Connor
ocon...@jimmy.harvard.edu wrote:
Has anyone come across a situation where they have seen their facet field
values wrap into a new facet entry when the value exceeds 256 characters?
Yes, for indexed string fields, there currently is a limit of 256
From: Simon Willnauer [simon.willna...@googlemail.com]
Sent: Tuesday, 14 September 2010 17:47
To: solr-user@lucene.apache.org
Subject: Re: Field names
On Tue, Sep 14, 2010 at 1:39 AM, Peter A. Kirk p...@alpha-solutions.dk wrote:
<result name="response" numFound="9" start="0">
  <doc
So it only finds 9?
Hmmm, were you logged in on the Wiki? If not, you can create a login
pretty easily...
Or someone might pick it up..
Erick
On Tue, Sep 14, 2010 at 12:18 PM, Jonathan Rochkind rochk...@jhu.eduwrote:
Shawn Heisey wrote:
The one called PatternReplaceFilterFactory (no Char) has been around
What does handle mean? Create them or index them?
Erick
On Tue, Sep 14, 2010 at 2:02 PM, Vladimir Sutskever
vladimir.sutske...@jpmorgan.com wrote:
Can SOLR be configured out of the box to handle rolling log files?
Kind regards,
Vladimir Sutskever
Investment Bank - Technology
JPMorgan
On Tue, Sep 14, 2010 at 4:54 PM, h00kpub...@gmail.com
h00kpub...@googlemail.com wrote:
SEVERE: org.apache.solr.common.SolrException: Error while creating field
'metadata_last_modified{type=date,properties=indexed,stored,omitNorms}' from
value '2010-09-14T22:29:24+0200'
Different timezones are
I opened a bug for this issue:
https://issues.apache.org/jira/browse/SOLR-2120
On 09/14/2010 03:51 PM, Yonik Seeley wrote:
On Tue, Sep 14, 2010 at 3:35 PM, Niall O'Connor
ocon...@jimmy.harvard.edu wrote:
Has anyone come across a situation where they have seen their facet field
values wrap
It would be a nice feature if Solr supported queries with time zone support on
an index where all times are UTC. There is some chatter about this in SOLR-750,
but I haven't found an issue that would add support for time zone queries.
Did I do a lousy search, or is the issue missing as of yet?
I went for a different route:
https://issues.apache.org/jira/browse/LUCENE-2644
Scott
On Tue, Sep 14, 2010 at 11:18 AM, Robert Muir rcm...@gmail.com wrote:
On Tue, Sep 14, 2010 at 1:54 PM, Scott Gonyea sc...@aitrus.org wrote:
Hi,
I'm tweaking my schema and the LowerCaseTokenizerFactory
If you're using Java's SimpleDateFormat, try enclosing your Z in the format
string with single quotes, like:
SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss'Z'");
HTH
Erick
On Tue, Sep 14, 2010 at 4:54 PM, h00kpub...@gmail.com
h00kpub...@googlemail.com wrote:
hi... i am using
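A runnable sketch of Erick's suggestion (the 'T' and 'Z' are quoted so SimpleDateFormat treats them as literals rather than pattern letters, and the time zone is set to UTC to match Solr's date handling -- note that the 2010-09-14T22:29:24+0200 from the error is 2010-09-14T20:29:24Z in UTC):

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

public class SolrDateFormat {
    public static void main(String[] args) throws ParseException {
        // Quoted 'T' and 'Z' are literals, not pattern letters.
        SimpleDateFormat sdf = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss'Z'");
        sdf.setTimeZone(TimeZone.getTimeZone("UTC"));

        // Parse a date in the form Solr expects...
        Date d = sdf.parse("2010-09-14T20:29:24Z");
        // ...and round-trip it back to the same string.
        System.out.println(sdf.format(d)); // 2010-09-14T20:29:24Z
    }
}
```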
Why would you want to do that, instead of just using another tokenizer
and a LowerCaseFilter? It's more confusing, less DRY code to leave them
separate -- the LowerCaseTokenizerFactory combines anyway because
someone decided it was such a common use case that it was worth it for
the
Jonathan, you bring up an excellent point.
I think it's worth our time to actually benchmark this LowerCaseTokenizer
versus LetterTokenizer + LowerCaseFilter
This tokenizer is quite old, and although I can understand there is no doubt
it's technically faster than LetterTokenizer + LowerCaseFilter
<fieldType name="text_shingle4" class="solr.TextField"
    positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.HTMLStripWhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.ShingleFilterFactory
Hi,
Still working on extending my proof of concept by working off the example
configuration and modifying the schema.xml. Having trouble with wildcard
searches:
factory OR faction -- 40 results (ok)
factory -- 1 result (ok)
faction -- 39 results (ok)
facti?n -- 39 results (ok)
fact* -- 40
I'd agree with your point entirely. My attacking LowerCaseTokenizer was a
result of not wanting to create yet more Classes.
That said, rightfully dumping LowerCaseTokenizer would probably have me
creating my own Tokenizer.
I could very well be thinking about this wrong... But what if I wanted
I didn't see any open Jira issues for this, so i created one...
https://issues.apache.org/jira/browse/SOLR-2121
: Date: Tue, 7 Sep 2010 01:35:39 -0700 (PDT)
: From: Marc Sturlese marc.sturl...@gmail.com
: Reply-To: solr-user@lucene.apache.org
: To: solr-user@lucene.apache.org
: Subject: Re:
K, just making sure.
Erick
On Tue, Sep 14, 2010 at 5:20 PM, Jonathan Rochkind rochk...@jhu.edu wrote:
Erick Erickson wrote:
Hmmm, were you logged in on the Wiki? If not, you can create a login
pretty easily...
Or someone might pick it up..
I was logged in, created an account just
That was it! Thank you very much.
- Original Message
From: Robert Muir rcm...@gmail.com
To: solr-user@lucene.apache.org
Sent: Tue, September 14, 2010 5:58:03 PM
Subject: Re: wildcard searches not consistent
but
facto?y -- 0 (expecting 1)
Do you have stemming enabled for the field?
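A toy illustration of why stemming would explain this (assuming a Porter-style stemmer that indexes "factory" as "factori"; wildcard terms are not analyzed, so the pattern is matched against the stemmed term in the index):

```java
public class WildcardVsStemming {
    public static void main(String[] args) {
        // Suppose the stemmer indexed "factory" as "factori" (Porter-style).
        String indexedTerm = "factori";

        // Wildcard queries are NOT analyzed, so "facto?y" is matched
        // against the stemmed term as-is ('?' = exactly one character,
        // modeled here as the regex '.').
        String wildcardAsRegex = "facto.y";

        System.out.println(indexedTerm.matches(wildcardAsRegex)); // false
        System.out.println("factory".matches(wildcardAsRegex));   // true
    }
}
```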
You are probably not talking about clusters in the physical structure of data
on this disk, right?
What do YOU mean by clusters if not?
Dennis Gearon
Signature Warning
EARTH has a Right To Life,
otherwise we all die.
Read 'Hot, Flat, and Crowded'
Laugh at
There are a lot of reasons, with the performance hit being notable -- but also
because I feel that using a regex on something this basic amounts to a lazy
hack. I'm typically against regular expressions in XML.
I'm vehemently opposed to them in cases where not using them should
otherwise be quite
Because (just IMO, I'm not an expert here either) the basic framework in
Solr is that tokenizers tokenize, but they don't generally change bytes
inside values. What changes bytes (or adds or removes tokens to the
token stream initially created by a tokenizer, etc) is filters. And
there's
Thanks, Mark, for taking the time to reply. What else could cause this issue to
happen so frequently? We have a master/slave configuration and only one
update server that writes to index. We have plenty of disk space available.
Thanks
Bharat Jain
On Fri, Sep 10, 2010 at 8:19 AM, Mark Miller
Dear All:
I am studying SolrCloud now. I downloaded it
from: https://svn.apache.org/repos/asf/lucene/solr/branches/cloud/
but I found that there is no webapps directory:
https://svn.apache.org/repos/asf/lucene/solr/branches/cloud/example/webapps/
but we need
After upgrading to 1.4.1, it is fixed.
Thanks very much for your help!
Regards,
Yandong Yao
2010/9/14 yandong yao yydz...@gmail.com
Hi Robert,
I am using solr 1.4, will try with 1.4.1 tomorrow.
Thanks very much!
Regards,
Yandong Yao
2010/9/14 Robert Muir rcm...@gmail.com
did you
Hi
when using LukeRequestHandler, I can for example call:
http://localhost:8983/solr/admin/luke?fl=name&fl=cat
which will return data including the frequency of the top 10 search terms in
the specified fields.
I can also add a numTerms parameter to obtain more than the top 10.
But how do I
So, basically, faceting geographically?
within 100 meters
within 300 meters
within 1km
within 3km
within 10km
within 100km
This type of results?
Dennis Gearon