OK, I dug more into this and realized the file extensions can vary depending on
schema, right?
For instance we don't have *.tvx, *.tvd, *.tvf (we're not using term vectors)... and
I suspect the file extensions
may change with future Lucene releases?
Now it seems we can't just count the files using any
For example, I am storing email ids of a person. If the person has 3 email
ids, I want to store them as
email = 'x...@whatever.com'
email = 'a...@blah.com'
email = 'p...@moreblah.com'
How can we do this ?
I know someone will come up with "why don't you store it like email1,
email2, email3" and
Just set up your schema with a string multivalued field...
On Wed, Apr 13, 2011 at 12:47 AM, shrinath.m shrinat...@webyog.com wrote:
For example, I am storing email ids of a person. If the person has 3 email
ids, I want to store them as
email = 'x...@whatever.com'
email = 'a...@blah.com'
Bill Bell wrote:
Just set up your schema with a string multivalued field...
I've this in my schema:
Worked.. Thanks...
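For anyone finding this thread later, a multivalued string field of the sort Bill describes might be declared like this in schema.xml (a sketch; the field and type names are assumptions about your schema):

```xml
<!-- schema.xml: one stored value per email address -->
<field name="email" type="string" indexed="true" stored="true" multiValued="true"/>
```

Each `<field name="email">...</field>` element in the add document then becomes one value of the field.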
.
--
View this message in context:
http://lucene.472066.n3.nabble.com/Is-it-possible-to-create-a-duplicate-field-tp2815029p2815061.html
Sent from the Solr - User
Afternoon,
After an upgrade to Solr 3.1 which has largely been very smooth and
painless, I'm having a minor issue with the ExtractingRequestHandler.
The problem is that it's inserting metadata into the extracted
content, as well as mapping it to a dynamic field. Previously the
same
The current limitation or pause is when the RAM buffer is flushing to disk.
When an optimize starts and runs ~4 hours, as you say, DIH is
flushing the docs into the index during this pause?
-
--- System
One
No, this query returns a few more documents than if I do it with the Lucene query
parser. I'm going to write another query parser that sends a simple term
query and see what the output is; when I have it, I will report back on the list.
Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida
Not sure if the title explains it all, or if what I want is even possible,
but figured I would ask.
Say, I have a series of products I'm selling, and a search of:
Blue Wool Rugs
Comes in. This returns 0 results, as Blue and Rugs match terms that are
indexed, but Wool does not.
Is there a way to
For (a) I don't think anything exists today providing this mechanism.
But (b) is a good description of the dismax handler with a MM parameter of 66%.
Pierre
-Original Message-
From: Mark Mandel [mailto:mark.man...@gmail.com]
Sent: Wednesday, April 13, 2011 10:04 AM
To:
Thanks,
I changed my searching to be triggered on a newSearcher event instead and
use the new searcher to retrieve the documents. This works.
Btw can I assume that a new searcher will always be created soon after a
commit?
Regards,
Reeza
-Original Message-
From: Otis Gospodnetic
Thanks!
I searched high and low for that, couldn't see it in front of my face!
Mark
On Wed, Apr 13, 2011 at 6:32 PM, Pierre GOSSE pierre.go...@arisem.comwrote:
For (a) I don't think anything exists today providing this mechanism.
But (b) is a good description of the dismax handler with a MM
If you are using the dismax query parser, perhaps you could take a look at
the minimum-should-match parameter 'mm':
http://wiki.apache.org/solr/DisMaxQParserPlugin#mm_.28Minimum_.27Should.27_Match.29
Ludovic.
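As an illustration, mm can be set as a default on a dismax handler in solrconfig.xml (a sketch; the handler name and query fields below are placeholders, not from this thread):

```xml
<requestHandler name="/search" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">dismax</str>
    <str name="qf">name description</str>
    <!-- with three optional clauses, 66% means at least two must match -->
    <str name="mm">66%</str>
  </lst>
</requestHandler>
```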
2011/4/13 Mark Mandel [via Lucene]
ml-node+2815186-149863473-383...@n3.nabble.com
Erick,
I was under the misconception that a Solr transaction is ACID.
From what you said, I guess Solr transactions are not isolated.
Thanks,
Phong
On Tue, Apr 12, 2011 at 2:54 PM, Erick Erickson erickerick...@gmail.comwrote:
See below:
On Tue, Apr 12, 2011 at 2:21 PM, Phong Dais
Yes, you can assume this since that's the only
way new content will be searchable, as you've
discovered
Best
Erick
On Wed, Apr 13, 2011 at 4:42 AM, Reeza Edah Tally re...@nova-hub.comwrote:
Thanks,
I changed my searching to be triggered on a newSearcher event instead and
use the new
Hi there,
Just a quick question that the wiki page (
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters) didn't seem to
answer very well.
Given an analyzer that has zero or more Char Filter Factories, one
Tokenizer Factory, and zero or more Token Filter Factories, which value(s)
are
I would like to build a component that during indexing analyses all tokens
in a stream and adds metadata to a new field based on my analysis. I have
different tasks that I would like to perform, like basic classification and
certain more advanced phrase detections. How would I do this? A
On Apr 13, 2011, at 12:06 AM, Liam O'Boyle wrote:
Afternoon,
After an upgrade to Solr 3.1 which has largely been very smooth and
painless, I'm having a minor issue with the ExtractingRequestHandler.
The problem is that it's inserting metadata into the extracted
content, as well as
Or is only the final value after completing the whole chain indexed?
Yes.
Koji
--
http://www.rondhuit.com/en/
Hi,
how do I upgrade from Jetty 6 to Jetty 7?
--
View this message in context:
http://lucene.472066.n3.nabble.com/jetty-update-tp2816084p2816084.html
Sent from the Solr - User mailing list archive at Nabble.com.
I'm using version 1.4.1. It appears that when several documents in a result
set have the same score, the secondary sort is by 'indexed_at' ascending.
Can this be altered in the config xml files? If I wanted the secondary sort
to be indexed_at descending for example, or by a different field, say
Dear list,
after setting echoParams to none, wildcard search isn't working.
Only if I set echoParams to explicit is wildcard search possible.
http://wiki.apache.org/solr/CoreQueryParameters
states that echoParams is for debugging purposes.
We use Solr 3.1.0.
Snippet from solrconfig.xml:
What does the parsed query look like with debugQuery=true for both scenarios?
Any difference? It doesn't make any sense that echoParams would have an effect,
unless somehow your search client is relying on parameters returned to do
something with them?!
Erik
On Apr 13, 2011, at 09:57
It seems that it's a problem with my own query; now I need to investigate whether
there is something different between a normal query and my implementation of
the query, because when used alone, it works properly.
Thanks,
Marco Martínez Bautista
http://www.paradigmatecnologico.com
Avenida de
Hello,
I just updated to Solr 3.1 and am wondering if the phpnative response
writer plugin is part of it?
( https://issues.apache.org/jira/browse/SOLR-1967 )
When I try to compile the source files I get some errors:
PHPNativeResponseWriter.java:57:
Hi
I'm having an error when I import an XML file with DIH.
In this file my id is a URL which looks like this:
http://www.example.com/?cp=30_sst=ac=655
Apparently the issue is with the = character?
Is there any workaround?
Error trace:
rows processed:0 Processing Document # 849
at
This is invalid XML. Entities must be encoded or embedded within CDATA tags.
On Wednesday 13 April 2011 16:10:51 Rosa (Anuncios) wrote:
Hi
I'm having an error when I import an XML file with DIH.
In this file my id is a URL which looks like this:
http://www.example.com/?cp=30_sst=ac=655
Hi Erik,
never mind.
Can't reproduce this strange behavior.
Obviously stopping and starting of solr solved this.
Thanks,
Bernd
On 13.04.2011 16:00, Erik Hatcher wrote:
What does the parsed query look like with debugQuery=true for both scenarios?
Any difference?
Doesn't make any sense that
On Wed, Apr 13, 2011 at 10:00 AM, Marco Martinez
mmarti...@paradigmatecnologico.com wrote:
It seems that it's a problem with my own query; now I need to investigate whether
there is something different between a normal query and my implementation of
the query, because when used alone, it works
We have an ecommerce application (B2C/B2B) with a large number of price lists,
ranging to 2000+ and growing. They want to index prices to have facets and
sorting. That seems like it would be a lot of columns to index; example below:
INDEX COLUMN: NamePrice
Indexing isn't a problem, it's just disk space and space is cheap. But, if
you do facets on all those price columns, that gets put into RAM which isn't
as cheap or plentiful. Your cache buffers may get overloaded a lot and
performance will suffer.
2000 price columns seems like a lot, could the
Thanks both for your replies
Eric,
Yep, I use the Analysis page extensively, but what I was directly looking
for was whether all, or only the last line, of the values given by the analysis
page were eventually indexed.
I think we've concluded it's only the last line.
Cheers,
Ben
On Wed, Apr 13,
Don't know of any other way to organize the documents. We need to have the
specific price that belongs to the user, so I don't think that the facets would
be the issue. The facet querying would be modified to the corresponding price
list field for that user. Let's say the customer belongs to
I found this link after googling for a few minutes.
http://wiki.eclipse.org/Jetty/Howto/Upgrade_from_Jetty_6_to_Jetty_7
I hope that helps
Also, a question like this may be more appropriate for a jetty mailing list.
On Wed, Apr 13, 2011 at 8:44 AM, ramires uy...@beriltech.com wrote:
hi
how to
: Subject: phpnative response writer in SOLR 3.1 ?
: References:
: 15647_1302703023_zzh0o1kefjfix.00_4da5abae.5070...@uni-bielefeld.de
: 0d30a85b-b981-4c27-9dbe-7fc8e0619...@gmail.com
: In-Reply-To: 0d30a85b-b981-4c27-9dbe-7fc8e0619...@gmail.com
Is it necessary to upgrade for Solr?
-
--- System
One Server, 12 GB RAM, 2 Solr Instances, 7 Cores,
1 Core with 31 Million Documents other Cores 100.000
- Solr1 for Search-Requests - commit every Minute - 5GB Xmx
-
Is your current Solr installation with Jetty 6 working well for you in
a production environment?
I don't know enough about Jetty to help you further on this question.
On Wed, Apr 13, 2011 at 10:47 AM, stockii stock.jo...@googlemail.com wrote:
Is it necessary to upgrade for Solr?
-
Is NAME a product name? Why would it be multivalued? And why would it appear
on more than one document? Is each 'document' a package of products? And
the pricing tiers are on the package, not individual pieces?
So sounds like you could, potentially, have a PriceListX column for each
user. As your
Name equals the product name.
Each separate product can have 1 to n prices based upon pricelist.
A single document represents that single product.
<doc>
  <field name="id">1</field>
  <field name="name">The product name.</field>
  <field name="price">1.00</field>
  <field
Hi,
I am a newbie to Solr. I could see that the queries are not cached. I would
like to apply the filterCache to queries in Ruby. Can anyone provide me the
syntax for this, please?
Thanks.
Uncomment the following section in solrconfig.xml:
<!-- An optimization that attempts to use a filter to satisfy a search.
If the requested sort does not include score, then the filterCache
will be checked for a filter matching the query. If found, the filter
will be
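If I'm reading the config right, the option that comment describes is enabled by this element in the <query> section of solrconfig.xml:

```xml
<useFilterForSortedQuery>true</useFilterForSortedQuery>
```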
Thanks for the reply Josh.
And where should I make changes in ruby to add filters?
Soumya
On Wed, Apr 13, 2011 at 11:20 AM, Joshua Bouchair
joshuabouch...@wasserstrom.com wrote:
Uncomment the following section in solrconfig.xml:
<!-- An optimization that attempts to use a filter to
Hi,
As I understand it, fl=*,score means we get all fields plus the score in the
returned search result. And if a field is stored, all its text will be returned as
part of the result.
Now I have 2x fields; some of the field names have no prefix or fixed naming
rule, and what the names will be cannot be predicted.
I
Not cleanly, currently. SOLR-2193 (Re-architect Update Handler) should take care
of this, though.
- Mark
On Apr 12, 2011, at 8:21 AM, stockii wrote:
Hello.
When I start an optimize (which takes more than 4 hours), no updates from
DIH are possible.
I thought Solr copies the whole index and
Is sort order when 'score' is the same a Lucene thing? Should I ask on the
Lucene forum?
--
View this message in context:
http://lucene.472066.n3.nabble.com/Result-order-when-score-is-the-same-tp2816127p2817330.html
Sent from the Solr - User mailing list archive at Nabble.com.
you could just explicitly send multiple sorts...from the tutorial:
sort=inStock asc, price desc
cheers.
On Wed, Apr 13, 2011 at 2:59 PM, kenf_nc ken.fos...@realestate.com wrote:
Is sort order when 'score' is the same a Lucene thing? Should I ask on the
Lucene forum?
--
View this
In real life though, it seems unlikely that the relevancy score will
ever be identical, so the second sort field will never be used. Is the
relevancy score ever identical? Rarely, at any rate.
On 4/13/2011 3:22 PM, Rob Casson wrote:
you could just explicitly send multiple sorts...from the
Au contraire, I have almost 4 million documents, representing businesses in
the US. And having the score be the same is a very common occurrence.
It is quite clear from testing that if score is the same, then it sorts on
indexed_at ascending. It seems silly to make me add a sort on every query,
You should just ask me.
Sent from my iPhone
On Apr 13, 2011, at 11:27 AM, soumya rao soumrao...@gmail.com wrote:
Thanks for the reply Josh.
And where should I make changes in ruby to add filters?
Soumya
On Wed, Apr 13, 2011 at 11:20 AM, Joshua Bouchair
Hey guys, how do you curl-update all the XML files inside a folder, from A-D?
Example: curl http://localhost:8080/solr update
Sent from my iPhone
If you omitNorms and omitTermFreqAndPositions on the query field(s) and use no
funky boost functions, all results will have identical scores in AND queries
(or queries with one search term). IDF has no meaning because of AND,
queryNorm is the same across the result set, fieldNorm is 1, and TF is
Either put all documents in a large file or loop over them with a simple shell
script.
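A minimal sketch of such a loop (the directory path is a stand-in for wherever your files live, and the curl line is commented out so the A-D glob selection can be shown without a running Solr instance):

```shell
#!/bin/sh
# Create a few sample files so the A-D glob has something to match.
mkdir -p /tmp/solr_batch
touch /tmp/solr_batch/Alpha.xml /tmp/solr_batch/Bravo.xml /tmp/solr_batch/Echo.xml

# Post every file whose name starts with A through D.
for f in /tmp/solr_batch/[A-D]*.xml; do
  echo "posting $f"
  # curl 'http://localhost:8080/solr/update' -H 'Content-type: text/xml' --data-binary @"$f"
done
```

Only Alpha.xml and Bravo.xml fall inside the [A-D] range here; Echo.xml is skipped.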
Hey guys, how do you curl-update all the XML files inside a folder, from A-D?
Example: curl http://localhost:8080/solr update
Sent from my iPhone
With post.jar I think you can do something like:
java -jar post.jar A*.xml
java -jar post.jar B*.xml
java -jar post.jar C*.xml
java -jar post.jar D*.xml
(I'm on Windows)
On Wed, Apr 13, 2011 at 4:41 PM, Markus Jelsma
markus.jel...@openindex.iowrote:
Either put all documents in a
Sorting a large set is costly; the more fields you sort on, the more memory is
consumed (and likely cached).
If I remember correctly, the result set will be ordered according to Lucene
docIDs if there's nothing to sort on.
If I read correctly, you don't want to specify those fixed sort
You have to specify the query. In the query you will have the fq parameter, which
means filter query.
http://wiki.apache.org/solr/solr-ruby
-Original Message-
From: soumya rao [mailto:soumrao...@gmail.com]
Sent: Wednesday, April 13, 2011 2:27 PM
To: solr-user@lucene.apache.org
Subject: Re:
Is a new DocID generated every time a doc with the same unique ID is added to
the index? If so, then DocIDs must be incremental and would look like
indexed_at ascending. What I see (and why it's a problem for me) is the
following.
a search brings back the first 5 documents in a result set of say 60.
Hi all,
I'm wondering if there are any knobs or levers i can set in
solrconfig.xml that affect how pdfbox text extraction is performed by
the extraction handler. I would like to take advantage of pdfbox's
ability to normalize diacritics and ligatures [1], but that doesn't
seem to be the default
Is a new DocID generated every time a doc with the same unique ID is added to
the index? If so, then DocIDs must be incremental and would look like
indexed_at ascending. What I see (and why it's a problem for me) is the
following.
Yes, Solr removes the old and inserts a new when updating an
As Hoss mentioned earlier in the thread, you can use the statistics page
from the admin console to view the current number of segments. But if you
want to know by looking at the files, each segment will have a unique
prefix, such as _u. There will be one unique prefix for every segment in
the
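As a sketch of that by-the-files approach, the distinct segment prefixes can be counted from a directory listing (the index path and file names below are fabricated for illustration):

```shell
#!/bin/sh
# Fake a small index directory with two segments, _u and _v.
mkdir -p /tmp/fake_index
touch /tmp/fake_index/_u.fdt /tmp/fake_index/_u.fdx /tmp/fake_index/_v.fdt

# Each distinct _<name> prefix corresponds to one segment;
# this prints the unique prefixes (_u and _v here).
ls /tmp/fake_index | sed -n 's/^\(_[a-z0-9]*\)\..*/\1/p' | sort -u
```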
Hi,
I'm not sure how Solr allows for adjusting these Tika settings to get the
desired output. At least a few desirable Tika subsystems cannot be called from
the ExtractingRequestHandler such as Tika's BoilerPlateContentHandler. I'm
also not really sure if it's a good idea to normalize
all documents. But I would want the sort to be at the system level; I don't
want the overhead of sorting every query I ever make.
How would 'doing it at the system level' avoid the 'overhead of sorting
every query'? Every query has to be sorted, if you want it sorted.
Beyond setting a
Floyd,
You need to explicitly list all fields in fl=...
Otis
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/
- Original Message
From: Floyd Wu floyd...@gmail.com
To: solr-user@lucene.apache.org
Sent: Wed, April
Hi Ken,
It sounds like you want to just sort by time changed/added (reverse chrono
order). I would not worry about issues just yet unless you have some reasons
to
think this is going to cause problems (e.g. giant index, low RAM). Jonathan is
right about commits, and the NRT-ness of search
Hi all,
Does anyone know if there is a Solr/Lucene user group /
birds-of-feather that meets in Seattle?
If not, I'd like to start one up. I'd love to learn and share tricks
pertaining to NRT, performance, distributed solr, etc.
Also, I am planning on attending the Lucene Revolution!
Let's
I have come across an issue with the DIH where I get a null exception when
pre-caching entities. I expect my entity to have null values so this is a bit
of a roadblock for me. The issue was described more succinctly in this
discussion:
: Does anyone know if there is a Solr/Lucene user group /
: birds-of-feather that meets in Seattle?
I don't live in Seattle, but this group used to send meeting announcements
to solr-user promoting Seattle Hadoop/Lucene/NoSQL Meetups. They still
list Solr in their keywords, but not in their
Can Solr list fields in fl=... this way: fl=!fieldName,score ?
Floyd
2011/4/14 Otis Gospodnetic otis_gospodne...@yahoo.com
Floyd,
You need to explicitly list all fields in fl=...
Otis
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search ::
There is a patch that fixes UTF-8 and performance issues with Jetty. So I
would recommend you use the patched version in 3.1/4.0.
On 4/13/11 9:47 AM, stockii stock.jo...@googlemail.com wrote:
Is it necessary to upgrade for Solr?
-
--- System
5G memory per JVM
--
View this message in context:
http://lucene.472066.n3.nabble.com/my-index-has-500-million-docs-how-to-improve-solr-search-performance-tp1902595p2819179.html
Sent from the Solr - User mailing list archive at Nabble.com.