Hi,
Is there any patch available for distributed field collapsing? I need it in my
app. Any ideas, please add...
regards,
V.Sriram
Hi Steve, thanks for the reply. I did not understand which file I need to
rename. I'm working on Solr 1.4. The file in the examples/solr/conf directory is
mapping-ISOLatin1Accent.txt. The schema.xml has the following commented-out
entry:
<charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
Hello list,
if I have a field title that is copied to text, and a field text that is copied
to text.stemmed, am I going to get the copy from the field title into the field
text.stemmed, or should I add that copy myself?
thanks in advance
paul
Field values are copied before being analyzed. There is no cascading of
analyzers.
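A hedged schema.xml sketch of this behaviour (the field names are taken from the question above; the rest of the schema is omitted):

```xml
<!-- copyField copies the raw, pre-analysis source value, and copies do
     not cascade: title -> text does NOT imply title -> text.stemmed. -->
<copyField source="title" dest="text"/>
<copyField source="text" dest="text.stemmed"/>
<!-- If title should also reach text.stemmed, declare it explicitly: -->
<copyField source="title" dest="text.stemmed"/>
```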
Hello list,
if I have a field title that is copied to text, and a field text that is
copied to text.stemmed, am I going to get the copy from the field title into
the field text.stemmed, or should I add that copy myself?
And there is no cascading of copying (as I found by experiment).
I just enriched the wiki:
http://wiki.apache.org/solr/SchemaXml#Copy_Fields
Thanks in advance for proofreading it.
paul
On 8 Feb 2011, at 11:16, Markus Jelsma wrote:
Field values are copied before being analyzed. There is no cascading of
analyzers.
I'm still not quite clear what you are attempting to achieve, and more
so why you need to extend Solr rather than just wrap it.
You have data with title, description and content fields. You make no
mention of an ID field.
Surely, if you want to store some in MySQL and some in Solr, you could
I'm not sure what you mean, but you may be looking for debugQuery=true?
On Tuesday 08 February 2011 08:28:12 Paul Libbrecht wrote:
To be able to see this well, it would be lovely to have a switch that
would activate logging of the query expansion result. The Dismax
QParserPlugin is
Hi Upayavira,
Apologies for the lack of clarity in the mail. The feeds have the following
fields:
id, url, title, content, refererurl, createdDate, author, etc. We need search
functionality on title and content.
As mentioned earlier, storing title and content in Solr takes up a lot of
space.
Hi,
At last, the migration to Solr 1.4.1 does solve this issue :-)
Cheers
--
View this message in context:
http://lucene.472066.n3.nabble.com/Http-Connection-is-hanging-while-deleteByQuery-tp2367405p2451214.html
Sent from the Solr - User mailing list archive at Nabble.com.
The conventional way to do it would be to index your title and content
fields in Solr, along with the ID to identify the document.
You could do a search against solr, and just return an ID field, then
your 'client code' would match that up with the title/content data from
your database. And yes,
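A minimal Python sketch of this pattern (the base URL, field names, and the in-memory dict standing in for the MySQL lookup are all assumptions for illustration):

```python
# Ask Solr only for IDs, then pull title/content from the database.
from urllib.parse import urlencode

def build_id_only_query(base_url, user_query, rows=10):
    """Build a Solr select URL that returns just the id field."""
    params = {"q": user_query, "fl": "id", "rows": rows, "wt": "json"}
    return base_url + "/select?" + urlencode(params)

def merge_with_db(solr_ids, db_rows_by_id):
    """Join Solr hit IDs (kept in ranked order) with rows from the DB."""
    return [db_rows_by_id[i] for i in solr_ids if i in db_rows_by_id]

# A dict stands in for the MySQL lookup here:
db = {"7": {"title": "Solr"}, "3": {"title": "Lucene"}}
print(build_id_only_query("http://localhost:8983/solr", "title:foo"))
print(merge_with_db(["7", "3", "9"], db))  # ranked order preserved
```

Keeping Solr's result order while merging is what preserves relevance ranking in the final page.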
What you are missing is that the analysis page shows what happens when
the text is run through analysis. Wildcards ARE NOT ANALYZED, so you cannot
assume that the analysis page shows you what the search terms will be in that
case. Regardless of whether george* is shown in the analysis page, the term
Thanks for the detailed reply Upayavira.
To answer your question, our index is growing much faster than expected and our
performance is grinding to a halt. Currently, it has over 150 million records.
We're planning to split the index into multiple shards very soon and move the
index creation
Hi,
I agree with Upayavira; it's probably better to create an external app that
retrieves content from a db.
Anyway, if I am not wrong,
finishStage is a method called by the coordinator if you have a distributed
search.
If your Solr is on a single machine, every component should implement only
Hello list,
I have been searching through the 1.4.0 source for a standard requestHandler
plug-in example. I understand that for my purposes, extending
RequestHandlerBase is a starting point; however, I was wondering if there are
any examples of plug-ins which I can view, such as those contained
Hi everybody, please tell me the difference between these two
things. After what processing on filter_queries are the parsed_filter_queries
generated?
Basically, when I am searching city as fq=city:'noida',
then filter_queries and parsed_filter_queries are both the same: 'noida'.
Hi,
The parsed_filter_queries entry contains the value after it has passed through
the analyzer. In this case it remains the same because it was already lowercased
and no synonyms were used.
You're also using single quotes; these have no special meaning, so you're
searching for 'noida' in the first and
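To make the distinction concrete, here is a toy Python stand-in for a query-time analyzer (a real Solr field type may apply many more filters; lowercasing alone is an assumption for this example):

```python
def query_analyzer(value):
    """Toy query-time analyzer that only lowercases its input.
    In debugQuery output, filter_queries shows the raw fq value and
    parsed_filter_queries shows the value after this kind of analysis."""
    return value.lower()

# 'noida' is already lowercase, so raw and parsed come out identical:
print(query_analyzer("noida"))  # -> noida
# With mixed case the two would differ:
print(query_analyzer("Noida"))  # -> noida
```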
Hello,
I am going through the wiki page related to cache configuration
http://wiki.apache.org/solr/SolrCaching and I have a question regarding the
general cache architecture and implementation:
In my understanding, the Current Index Searcher uses a cache instance and
when a New Index
Hi Erick,
If you have time, can you please take a look and provide your comments or
suggestions for this problem?
Please let me know if you need any more information.
Thanks,
Johnny
Inline...
On Feb 5, 2011, at 4:28 AM, Ryan Chan wrote:
Hello all,
I am following this tutorial:
http://lucene.apache.org/solr/tutorial.html. I am playing with the
TermVector; here are my steps:
1. Launch the example server: java -jar start.jar
2. Index monitor.xml: java -jar
Hi folks,
Is there any way to know the size *in bytes* occupied by a cache (filter
cache, document cache, ...)? I can't find such information on the stats page.
Regards
--
Mehdi BEN HAJ ABBES
You can dump the heap and analyze it with a tool like jhat. IBM's heap
analyzer is also a very good tool, and if I'm not mistaken people also use one
that comes with Eclipse.
On Tuesday 08 February 2011 16:35:35 Mehdi Ben Haj Abbes wrote:
Hi folks,
Is there any way to know the size *in
I'm afraid I'll have to pass, I'm absolutely swamped at the moment. Perhaps
someone else can pick it up.
I will say that you should be getting terms back when you pre-lower-case
them, so look in your index via the admin page or Luke to see if what's
really in your index is what you think in the
Hi Anithya,
Yes, that sounds right.
You will want to edit mapping-FoldToASCII.txt, and my suggestion is that you
rename mapping-FoldToASCII.txt to reflect your changes (for example, if your
target language is German, you could rename it to
mapping-German-FoldToASCII.txt); otherwise it would
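A hedged sketch of what the schema entry might look like after the rename (the German file name comes from the example above; the surrounding field type definition is omitted):

```xml
<charFilter class="solr.MappingCharFilterFactory"
            mapping="mapping-German-FoldToASCII.txt"/>
```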
Just wanted to bump this topic.
Regards
Em wrote:
Hi Peter,
I must jump into this discussion: from a logical point of view, what you are
saying only makes sense if both instances do not run on the same machine,
or at least not on the same drive.
When both run on the same machine and the
Hi list,
I wanted to create a Jira issue for the CSVUpdateHandler topic I
started a few days ago. However, I cannot create a Jira account; I do not
receive any mail or anything like that.
Are there any troubles with the Jira?
Regards
Hey everyone,
I have a question about Lucene/Solr scoring in general. There are many
factors at play in the final score for each document, and very often one
factor will completely dominate everything else when that may not be the
intention.
** The question: might there be a way to enforce
Hey everyone,
Tokenization seems inherently fuzzy and imprecise, yet Solr/Lucene does not
appear to provide an easy mechanism to account for this fuzziness.
Let's take an example, where the document I'm indexing is v1.1.0 mr. jones
www.gmail.com
I may want to tokenize this as follows: [v1.1.0,
Hi Tavi,
In my understanding, the scoring formula Lucene (and therefore Solr) uses is
based on a mathematical model which is proven to work for general-purpose
full-text searching.
The real challenge, as you mention, comes when you need to achieve high
quality scoring based on the domain you are
Hi Tavi,
could you please provide an example query for your problem and the
debugQuery's output?
It confuses me that you write score(query apple) = max(score(field1:apple), score(field2:apple))
I think your problem could come from the norms of your request, but I am not
sure.
If you can, show
So I re-indexed some of the content, but no dice. Per Hoss, I tried
disabling the TVC and it worked great. We're not really using tvc right
now since we made a decision to turn off highlighting for the moment, so
this isn't a huge deal. I'll create a new jira issue.
FYI here is my query
Hi Tavi,
if you want to use multiple tokenization strategies (different tokenizers, so
to speak), you have to use different fieldTypes.
Maybe you have to create your own tokenizer to do what you want, or a
PatternTokenizer might help you.
However, your examples for the different positions of
here is the ticket:
https://issues.apache.org/jira/browse/SOLR-2352
On 02/08/2011 11:27 AM, Jed Glazner wrote:
So I re-indexed some of the content, but no dice. Per Hoss, I tried
disabling the TVC and it worked great. We're not really using tvc right
now since we made a decision to turn off
Hi,
I was just after some advice on how to map some relational metadata to a
solr index. The web application I'm working on is based around people
and the searching based around properties of these people. Several
properties are more complex - for example, a person's occupations have
place,
I have no great answer for you; this is, to me, a generally unanswered question.
It is hard to do this sort of thing with Solr, and I think you understand it
properly.
There ARE some interesting new features in trunk (not 1.4) that may be
relevant, although from my perspective none of them provide
Yes, I saw something in the dev stream about compound types as well,
which would also be useful (so in my example an occupation field could
comprise multiple fields of different types), but these are up-and-coming
features. I suspect using multiple document types is probably the
best way for
Thanks for the help Steve, it worked!!!
Hi Anithya,
That's good to hear. Again, please consider donating your work:
http://wiki.apache.org/solr/HowToContribute#Making_Changes.
Steve
-----Original Message-----
From: Anithya [mailto:surysha...@gmail.com]
Sent: Tuesday, February 08, 2011 5:16 PM
To: solr-user@lucene.apache.org
Hello,
Quick question on Solr replication:
What effect does the index reload after replication have on search requests?
Can the server still respond to user queries with the old index?
Especially during the following phases of replication on slaves.
So - how did you end up setting it up? In my reading of the thread, it seems
you could have a search for 'mäcman' hit 'macman' or 'maecman', but not
both, since it seems you could only map the ä to a single replacement.
Or can it be mapped multiple times, generating multiple tokens?
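For reference, the mapping file accepts a single replacement string per source sequence, so one configuration picks one folding (a sketch of the file syntax; emitting multiple tokens per character would need a different mechanism, e.g. synonyms on the folded form):

```text
# One replacement per source sequence; choose one folding per file:
"ä" => "a"
# ... or, in an alternative mapping file:
"ä" => "ae"
```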
Your observation regarding optimisation is an interesting one; it does
at least make sense that reducing the size of a segment will speed up
optimisation and reduce the disk space needed.
In a situation that had multiple shards, we had two 'rows', for
redundancy purposes. In that situation, we
When starting a new discussion on a mailing list, please do not reply to
an existing message; instead, start a fresh email. Even if you change the
subject line of your email, other mail headers still track which thread
you replied to, so your question is hidden in that thread and gets less
Thanks for the suggestions! Using a new field makes sense, except it would
double the size of the index. I'd like to add additional terms, at my
discretion, only when there's ambiguity.
More specifically, do you know of any way to put multiple *tokens sets* at
the same position of the same field?
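Lucene's token streams can stack terms via position increments, which is how synonym-style expansion places several tokens at one position without a second field. Here is a toy Python model of that bookkeeping (not the Solr API; the token values come from the v1.1.0 example above):

```python
def positions(tokens_with_incs):
    """tokens_with_incs: list of (term, position_increment) pairs.
    An increment of 0 stacks a term on the previous token's position."""
    pos, out = -1, []
    for term, inc in tokens_with_incs:
        pos += inc
        out.append((term, pos))
    return out

stream = [("v1.1.0", 1), ("v1", 0), ("1", 1), ("0", 1)]
print(positions(stream))
# "v1.1.0" and "v1" share position 0; "1" and "0" follow at 1 and 2.
```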
Hi, still no luck with this. Is the problem with
the name attribute of the dataSource element in the data config?
On 5 February 2011 10:48, lee carroll lee.a.carr...@googlemail.com wrote:
Ah, should this work, or am I doing something obviously wrong?
In config:
dataSource
A couple of things...
First, you haven't provided any evidence that increasing the index size is a
concern. If your index isn't all that large, it really doesn't matter, and
conserving index size may not be worth the trouble.
WordDelimiterFilterFactory (WDFF) will do the use cases you outlined below,
but
Is it possible to do a query like {!boost b=log(popularity)}foo over sharded
indexes?
I looked at the wiki on distributed search
(http://wiki.apache.org/solr/DistributedSearch) and it has a list of
components that are supported in distributed search. Just wondering what
component does {!boost
: In my understanding, the Current Index Searcher uses a cache instance and
: when a New Index Searcher is registered a new cache instance is used which
: is also auto-warmed. However, what happens when the New Index Searcher is a
: view of an index which has been modified? If the entries
: It looks like you can use a JNDI datasource in the data import handler.
: However, I can't find any syntax for this.
:
: Where is the best place to look for this? (And to confirm whether JNDI does
: work in the DataImportHandler.)
It's been a long time since I used JNDI for anything, and I've never tried
it
I just came across a ~nudge post over in the SIS list on what the status is for
that project. This got me looking more into spatial mods with Solr 4.0. I
found this enhancement in Jira:
https://issues.apache.org/jira/browse/SOLR-2155. In this issue, David mentions
that he's already integrated
Sorry for cross-posting, but that is the only way I could get my question
posted. The Solr mailing server treats my question as spam.
Technical details of permanent failure:
Google tried to deliver your message, but it was rejected by the recipient
domain. We recommend contacting the other email provider
+1 to David's patch from SOLR-2155.
It would be great to implement. Great job using GDAL to convert the WKT, Adam!
Cheers,
Chris
On Feb 8, 2011, at 8:18 PM, Adam Estrada wrote:
I just came across a ~nudge post over in the SIS list on what the status is
for that project. This got me
In the situation that you'd explained, I'm assuming one of the rows is the
master and the other is the slave. How did you continue feeding documents while
the master was down for optimisation?
And thanks for the link to MultiPassIndexSplitter. I shall check it out.
--
Thanks,
Ishwar
Just
Hey guys,
We're migrating from Lucene to Solr. So far the migration has been
smooth; however, there is one feature I'm having trouble adapting. Our calls
to our indexing service are defined in a central interface. Here is an
example of a query executed from a programmatically constructed
Actually, in that situation, we indexed twice, to both, so there was no
master and no slave. Our testing showed that search was not slowed down
unduly by indexing.
Upayavira
On Tue, 08 Feb 2011 22:34 -0800, Ishwar ishwarsridha...@yahoo.com
wrote:
In the situation that you'd explained, I'm