I think one problem is that the featurePath is not set correctly.
Note that you are assuming PoS tags are written somewhere in some annotation
feature, so this means you should have set up the UIMA pipeline to also include,
for example, the HMM Tagger [1] which adds (by default) the posTag feature
to TokenAn
I'm not a solrcloud guru, but why not start your zookeeper quorum separately?
I also believe that you can specify a zoo.cfg file which will create a
zk quorum from solr
example zoo.cfg (from
http://zookeeper.apache.org/doc/current/zookeeperStarted.html#sc_RunningReplicatedZooKeeper)
tickTime=200
Hi All,
My schema consisted of field textForQuery which was defined as
After indexing 10 lakhs of documents I changed the field to
So documents that were indexed after that omitted the position information
of the terms.
As a result I was not able to search the text which rely on position
in
So I tested what I wrote, and man was that wrong. I have updated it
and created a JIRA for this issue. I also attached a patch which will
patch CloudState to address this issue. Feedback is appreciated.
https://issues.apache.org/jira/browse/SOLR-2799
On Wed, Sep 28, 2011 at 11:46 PM, Jamie Joh
Did you find out about this?
2011/8/2 Yury Kats :
> I have multiple SolrCloud instances, each running its own Zookeeper
> (Solr launched with -DzkRun).
>
> I would like to create an ensemble out of them. I know about -DzkHost
> parameter, but can I achieve the same programmatically? Either with
>
Can you monitor the DB side to see what results it returned for that query?
2011/8/30 于浩 :
> I am using Solr 1.3, and I update the Solr index through a Solr delta import every two
> hours, but the delta import wastes database connections.
> So I want to use full-import with entity name instead of delta im
@Darren: I feel that the question itself is misleading. Creating
shards is meant to separate out the data ... not keep the exact same
copy of it.
I think the two node setup that was attempted by Sam misled him and
us into thinking that configuring two nodes which are to be named
"shard1" ... some
At first glance it seems like a simple localization issue as indicated by this:
> org.apache.uima.annotator.dict_annot.impl.DictionaryAnnotatorProcessException:
> EXCEPTION MESSAGE LOCALIZATION FAILED: java.util.MissingResourceException:
> Can't find bundle for base name
> org.apache.uima.annotato
I'll definitely create a JIRA for this. Looking at the code in
CloudState I think we could do the following
as we iterate over shardINames we check to see if the oldCloudState
had the slice already, if so get the state from there, otherwise do
what is already happening. Something like the follow
hi hoss,
This helps..
But as I understand it, TermsComponent does not allow sorting on popularity, just
count|index. Or am I missing something?
If TermsComponent allowed custom sorting I wouldn't even have to use ngrams.
Any thoughts?
abhay
--
View this message in context:
http://lucene.472066.n3.nabble.c
No, we don't have any patches for it yet. You might make a JIRA issue for it?
I think the big win is a fairly easy one - basically, right now when we update
the cloud state, we look at the children of the 'shards' node, and then we read
the data at each node individually. I imagine this is the p
Thanks Mark found the TODO in ZkStateReader.java
// TODO: - possibly: incremental update rather than reread everything
Was there a patch they provided back to address this?
On Tue, Sep 27, 2011 at 9:20 PM, Mark Miller wrote:
>
> On Sep 26, 2011, at 11:42 AM, Jamie Johnson wrote:
>
>> Is there a
: If user starts typing "m" i wil show "mango" as suggestion. And other
: suggestions should come from the document title in index. So if I have a
: document in index with title "Man .." so suggestions would be
: "mango"
: "man"
...
: Is this doable ? any options ?
It's totally doable, an
: it looks to me as if Solr just brings back the URLs. what I want to do is to
: get the actual documents in the answer set, simplify their HTML and remove
: all the javascript, ads, etc., and append them into a single document.
:
: Now ... does Nutch already have the documents? can I get them fr
: I've run into another strange behavior related to LocalParams syntax in
: Solr 1.4.1. If I apply Dismax boosts using bq in LocalParams syntax,
: the contents of the boost queries get used by the highlighter.
: Obviously, when I use bq as a separate parameter, this is not an issue.
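For reference, the two request forms contrasted above can be sketched roughly as follows. This is only an illustration in Python, not something taken from the thread: the helper names are hypothetical, the field names and boost values are placeholder assumptions, and nothing is sent to a server. It just shows bq passed as an ordinary request parameter versus bq embedded in LocalParams syntax inside q.

```python
from urllib.parse import urlencode


def bq_as_separate_param(user_query, qf, boost_queries):
    """Build dismax params with each boost query sent as its own bq parameter."""
    params = [("defType", "dismax"), ("q", user_query), ("qf", qf)]
    params += [("bq", bq) for bq in boost_queries]
    return params


def bq_in_localparams(user_query, qf, boost_query):
    """Build params with qf and bq embedded in LocalParams syntax inside q."""
    q = '{!dismax qf="%s" bq="%s"}%s' % (qf, boost_query, user_query)
    return [("q", q)]


if __name__ == "__main__":
    # Placeholder values, assumed for illustration only.
    qf = "title^500 author^300 allfields"
    print(urlencode(bq_as_separate_param("test", qf, ["format:Book^50"])))
    print(urlencode(bq_in_localparams("test", qf, "format:Book^50")))
```

Comparing the highlighting output of the two requests against the same index would be one way to confirm whether the boost query terms leak into the highlighter only in the LocalParams form.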
Yeah I will change the weight for str_category and make it higher . I
converted it to lowercase because we cannot expect users to type them in
the correct case
Thanks
Balaji
On Thu, Sep 29, 2011 at 3:52 AM, Way Cool wrote:
> I will give str_category more weight than ts_category because we want
: 1.) How should I deal with repeating parameters? If I use multiple
: boost queries, it seems that only the last one listed is used... for
: example:
:
: ((_query_:"{!dismax qf=\"title^500 author^300 allfields\"
bq=\"format:Book^50\" bq=\"format:Journal^150\"}test"))
Hmmm... that's either
Some cache hit problems can be fixed with the Large Pages feature.
http://www.google.com/search?q=large+pages
On Wed, Sep 28, 2011 at 3:30 PM, Federico Fissore wrote:
> On 28/09/2011 23:16, Frederik Kraus wrote:
>
> Yep, I'm not getting more than 50-60% CPU during those load tests.
>>
>>
On 28/09/2011 23:16, Frederik Kraus wrote:
Yep, I'm not getting more than 50-60% CPU during those load tests.
I would try reducing the number of shards. Apart from the memory
discussion, this really seems to me to be a concurrency issue: too many
threads waiting for other threads to compl
I will give str_category more weight than ts_category because we want
str_category to win if they have "exact" matches ( you converted to
lowercase).
On Mon, Sep 26, 2011 at 10:23 PM, Balaji S wrote:
> Hi
>
> You mean to say copy the String field to a Text field or the reverse .
> This is the
(11/09/29 5:38), ntsrikanth wrote:
Hi,
I got a set of values which needs to be mapped to a facet. For example, I
want to map the codes
SC, AC to the facet value 'Catering',
HB to Half Board
AI, IN to All\ inclusive
I tried creating the following in the schema file.
Use KeywordToken
No, this is on a test system that is still smallish, approx 100,000
records of dummy data with Wikipedia articles as content at the time
this occurred.
I wouldn't expect rebuilding the index to stall the entire JVM, that
seems excessive...
Stephen Duncan Jr
www.stephenduncanjr.com
On Wed, Sep
Yep, I'm not getting more than 50-60% CPU during those load tests.
On Wednesday, 28 September 2011 at 23:01, Jaeger, Jay - DOT wrote:
> Yes, that thread waits (in the sense that nothing useful gets done), but
> during that time, from the perspective of the applications and OS, that CPU
> is
On Sep 28, 2011, at 2:11 PM, Jaeger, Jay - DOT wrote:
> cores adminPath="/admij/cores"
>
> Was that a cut and paste? If so, the /admij/cores is presumably incorrect,
> and ought to be /admin/cores
>
No, that was a typo -- the config file is correct with admin/cores. Thanks for
pointin
cores adminPath="/admij/cores"
Was that a cut and paste? If so, the /admij/cores is presumably incorrect, and
ought to be /admin/cores
-Original Message-
From: Jaeger, Jay - DOT [mailto:jay.jae...@dot.wi.gov]
Sent: Wednesday, September 28, 2011 4:10 PM
To: solr-user@lucene.apac
Hi all,
I have the dictionary Annotator UIMA-solr running,
used my own dictionary file and it works,
it will match all the words (Nouns, Verbs and Adjectives) from my dictionary
file.
*but now, if I only want to match "Nouns", (ignore other part of speech)*
how can I configure it?
http://ui
One time when we had that problem, it was because one or more cores had a
broken XML configuration file.
Another time, it was because solr/home was not set right in the servlet
container.
Another time it was because we had an older EAR pointing to a newer release
Solr home directory. Given wha
Yes, that thread waits (in the sense that nothing useful gets done), but during
that time, from the perspective of the applications and OS, that CPU is busy:
it is not "waiting" in such a way that you can dispatch a different process.
The point is, that if this was actually the problem, it would
Is this a huge index? Keep in mind that most spellchecker implementations
rebuild the index which can stall the entire process if there are millions of
full text documents to process.
There is a new implementation called DirectSolrSpellchecker that doesn't do a
complete rebuild but I haven't tr
On 9/28/2011 2:24 PM, Robert Petersen wrote:
Just go to localhost:8983 (or whatever other port you are using) and use
this path to see all the cores available on the box:
In your example this should give you a core list:
http://solrhost:8080/solr/
Now this is interesting.
If I have defaultCo
On Sep 28, 2011, at 1:24 PM, Robert Petersen wrote:
> Just go to localhost:8983 (or whatever other port you are using) and use
> this path to see all the cores available on the box:
>
> In your example this should give you a core list:
>
> http://solrhost:8080/solr/
>
I see "Welcome to Solr!"
Hi,
I got a set of values which needs to be mapped to a facet. For example, I
want to map the codes
SC, AC to the facet value 'Catering',
HB to Half Board
AI, IN to All\ inclusive
I tried creating the following in the schema file.
And in boardbasis_synonyms.txt
SC => Self\ Cater
Just go to localhost:8983 (or whatever other port you are using) and use
this path to see all the cores available on the box:
In your example this should give you a core list:
http://solrhost:8080/solr/
-Original Message-
From: Joshua Miller [mailto:jos...@itsecureadmin.com]
Sent: Wedne
On Sep 28, 2011, at 1:17 PM, Rahul Warawdekar wrote:
> Can you try updating your solr.xml as follows:
> Specify
> "" instead of
> ""
>
> Basically remove the extra text "cores" in the core element from the
> instanceDir attribute.
I gave that a try and it didn't change anything.
Thanks,
Josh
On Sep 28, 2011, at 1:03 PM, Shawn Heisey wrote:
> On 9/28/2011 1:40 PM, Joshua Miller wrote:
>> I am trying to get SOLR working with multiple cores and have a problem
>> accessing the admin page once I configure multiple cores.
>>
>> Problem:
>> When accessing the admin page via http://solrhost
Hi Joshua,
Can you try updating your solr.xml as follows:
Specify
"" instead of
""
Basically remove the extra text "cores" in the core element from the
instanceDir attribute.
Just try and let us know if it works.
On Wed, Sep 28, 2011 at 3:40 PM, Joshua Miller wrote:
> Hello,
>
> I am trying to
On 9/28/2011 1:40 PM, Joshua Miller wrote:
I am trying to get SOLR working with multiple cores and have a problem
accessing the admin page once I configure multiple cores.
Problem:
When accessing the admin page via http://solrhost:8080/solr/admin, I get a 404,
"missing core name in path".
Que
Hello,
I am trying to get SOLR working with multiple cores and have a problem
accessing the admin page once I configure multiple cores.
Problem:
When accessing the admin page via http://solrhost:8080/solr/admin, I get a 404,
"missing core name in path".
Question: when using the multicore opti
: I was worried because when i used to use only Lucene for the same indexing,
: before optimization there are many files but after optimization i always end
: up with just 3 files in my index folder. Just want to find out if this was
: ok.
It sounds like you were most likely using the "Compound F
On 28/09/2011 18:40, Jaeger, Jay - DOT wrote:
That would still show up as the CPU being busy.
i don't know how the program (top, htop, whatever) displays the value
but when the cpu has a cache miss definitely that thread sits and waits
for a number of clock cycles
with 130GB of ram (
We have a separate Java process indexing to Solr using SolrJ. We are
using Solr 3.4.0, and Jetty version 8.0.1.v20110908. We experienced
Solr hanging today. For a period of approximately 10 minutes, it did
not respond to queries. Our indexer sends a query to build a
spellcheck index after commi
Hi Frank,
How is Solr deployed? And how did you upgrade?
The commons-lang library (containing ArrayUtils) is included in the
Solr war file.
Martijn
On 28 September 2011 09:16, Frank Romweber wrote:
> I use Drupal for accessing the Solr search engine. After updating and
> creating my new index ev
Thanks Chris. Yes, changing connector settings not just in solr but also in
all webapps that were sending queries into it solved the problem!
Appreciate the help.
R
On Tue, Sep 13, 2011 at 6:11 PM, Chris Hostetter
wrote:
>
> : Any idea why solr is unable to return the pound sign as-is?
> :
> :
Hi,
We extensively use date faceting in our application, but now that the index
has become very big we are dividing it into shards. Since date/range faceting
doesn't work on shards, I was trying to apply the patch to my Solr, currently
using 3.1 but planning for a 3.4 upgrade.
https://issues.apache
On Wednesday, 28 September 2011 at 16:40, Toke Eskildsen wrote:
> On Wed, 2011-09-28 at 12:58 +0200, Frederik Kraus wrote:
> > - 10 shards per server (needed for response times) running in a single
> > tomcat instance
>
> Have you tested that sharding actually decreases response times in your
Hi Ken,
the HttpConnectionManager was actually the first thing I looked at - and bumped
the Solr default of 20 up to 50, 100, 400, 1 (which should be more or less
unlimited ;) ). Unfortunately didn't really solve anything. I don't know if the
"static" HttpClient is a problem here as it w
That would still show up as the CPU being busy.
-Original Message-
From: Federico Fissore [mailto:feder...@fissore.org]
Sent: Wednesday, September 28, 2011 6:12 AM
To: solr-user@lucene.apache.org
Subject: Re: strange performance issue with many shards on one server
Frederik Kraus, il 28
Trying to add in synonyms at index time but it's not working as
expected. Here's the schema and example from synonyms.txt
synonyms.txt has :
watch, watches, watche, watchs
schema for the field :
positionIncrementGap="100">
words="stopwords_en.txt" enablePositionIncrement="true"/>
ignoreCase
Hi Frederik,
I haven't directly run into this issue with Solr, but I have experienced
similar issues in a related context.
In my case, I had a custom webapp that made SolrJ requests and then generated
some aggregated/analyzed results.
During load testing, we ran into a few different issues...
It will be nice if we can have dissum in addition to dismax. ;-)
On Tue, Sep 27, 2011 at 9:26 AM, lee carroll
wrote:
> see
>
>
> http://lucene.apache.org/java/2_4_0/api/org/apache/lucene/search/Similarity.html
>
>
>
> On 27 September 2011 16:04, Mark wrote:
> > I thought that a similarity class
excellent !
and yes, the weather is very nice in France :)
-
Jouve
France.
--
View this message in context:
http://lucene.472066.n3.nabble.com/FieldCollapsing-don-t-return-every-groups-tp3376036p3376362.html
Sent from the Solr - User mailing list archive at Nabble.com.
You're right, one of the groups is 'ltd'. Thanks :)
I fixed this issue using a field that I know is unique for each merchant
(the merchant id).
Again thanks for your help Ludovic.
Otherwise, is the weather nice in France? :)
On 28 September 2011 16:56, lboutros wrote:
> Ok, thanks for the schema.
>
> the mer
I just checked, you can disable the storing parameter and use this field:
Ludovic.
-
Jouve
France.
--
View this message in context:
http://lucene.472066.n3.nabble.com/FieldCollapsing-don-t-return-every-groups-tp3376036p3376316.html
Sent from the Solr - User mailing list archive at Nabble.c
we had a misunderstanding :)
docs are the docs in the index.
files are the files in the index directory (index parts).
during the optimization you don't delete docs unless they are flagged as
deleted.
but you do merge your index and delete the files in your index directory, that's
right.
after an
Ok, thanks for the schema.
the merchant "Cult Beauty Ltd" should be indexed like this:
cult
beauty
ltd
I think some other merchants contain at least one of these words.
you should try to group with a special field used for field collapsing:
I think you could even disable the stored value f
We tested it so many times.
1st time we optimize, the new index file is created (merged one), but
the existing index files are not deleted (because they might be still
open for reading)
2nd time optimize, other than the new index file, all else gets deleted.
This is happening specifically on Windo
2011/9/28 Manish Bafna
> >>Will it not merge the index?
>
yes
> >>While merging on Windows, the old index files don't get deleted.
> >>(Windows has an issue where the file opened for reading cannot be
> >>deleted)
> >>
> >>So, if you call optimize again, it will delete the older index files.
>
On Wed, 2011-09-28 at 12:58 +0200, Frederik Kraus wrote:
> - 10 shards per server (needed for response times) running in a single tomcat
> instance
Have you tested that sharding actually decreases response times in your
case? I see the idea in decreasing response times with sharding at the
cost o
Hi Ludovic,
I'm not sure I understand which part of my schema exposes the analyzer, so
you will find my schema here:
https://github.com/lbdremy/solr-install/blob/master/conf/schema.xml. Hope
this will be helpful :)
The merchant_name_t is a dynamic field matching the "*_t" pattern so this
field is
Will it not merge the index?
While merging on Windows, the old index files don't get deleted.
(Windows has an issue where the file opened for reading cannot be
deleted)
So, if you call optimize again, it will delete the older index files.
On Wed, Sep 28, 2011 at 6:43 PM, Vadim Kisselmann
wrote:
>
Hey Community.
I am writing my first component and now I have a problem. Here is my code:
@Override
public void prepare(ResponseBuilder rb) throws IOException {
try {
rb.req.getParams().getBool("topoffers.show", true);
String client = rb.req.getParams().get("client", "1");
Hi Remy,
could you paste the analyzer part of the field merchant_name_t please ?
And when you say "it should return more than that", could you explain why
with examples ?
If I'm not wrong, the field collapsing function is based on indexed values,
so if your analyzer is complex (not "string"),
Hello,
I'm using the field collapsing feature to group my products by merchant and
I don't understand why some merchants are missing from the results sent by Solr.
My request is
http:/localhost:8983/solr/select/?q=merchant_name_t:*&version=2.2&start=0&rows=2000&indent=on&group=true&group.field=merchan
Hmm, sorry don't know...
My ideas:
- Tomcat causes this problem (for example: maxThreads, number of
connections...)
- JVM options, especially GC
- index locks, possibly an open issue in JIRA
Regards
Vadim
2011/9/28 Frederik Kraus
> I just had a look at the thread-dump, pasting 3 exampl
On Tue, Sep 27, 2011 at 10:58 PM, tamanjit.bin...@yahoo.co.in <
tamanjit.bin...@yahoo.co.in> wrote:
> Hi,
> 1. Just curious - you have your defaultsearchfield - defaultquery as not
> stored, how do you know that it contains what you think it contains?
> 2. the fieldType of defaultquery is query_te
If numDocs and maxDocs show the same number of docs, nothing will be deleted
on optimize.
You only rebuild your index.
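The check described above can be sketched in a few lines. This is a minimal illustration, assuming the two counters are read from the Solr admin statistics page; the function names are hypothetical. The difference between maxDoc and numDocs is the number of documents flagged as deleted that a merge has not yet removed.

```python
def deleted_docs(num_docs, max_doc):
    """Return how many documents are flagged as deleted but not yet merged away."""
    if num_docs > max_doc:
        raise ValueError("numDocs cannot exceed maxDoc")
    return max_doc - num_docs


def optimize_would_purge(num_docs, max_doc):
    """True when an optimize has deleted documents to expunge."""
    return deleted_docs(num_docs, max_doc) > 0
```

With equal counters, `optimize_would_purge` returns False, which matches the point made above: the optimize then only rewrites the segments.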
Regards
Vadim
2011/9/28 Kissue Kissue
> numDocs and maxDocs are same size.
>
> I was worried because when i used to use only Lucene for the same indexing,
> before optimizati
numDocs and maxDocs are same size.
I was worried because when I used only Lucene for the same indexing,
there were many files before optimization, but after optimization I always ended
up with just 3 files in my index folder. I just want to find out if this was
ok.
Thanks
On Wed, Sep 28, 2011 a
Why should the optimization reduce the number of files?
That only happens when you index docs with the same unique key.
Do numDocs and maxDocs differ after the optimize?
If yes:
what does your optimize command look like?
Regards
Vadim
2011/9/28 Manish Bafna
> Try to do optimize twice.
> The 2nd o
I just had a look at the thread-dump, pasting 3 examples here:
'pool-31-thread-8233' Id=11626, BLOCKED on
lock=org.apache.commons.httpclient.MultiThreadedHttpConnectionManager$ConnectionPool@19dd10d9,
total cpu time=20.ms user time=20.ms
at
org.apache.commons.httpclient.MultiThreadedHt
Try to optimize twice.
The 2nd one will be quick and will delete a lot of files.
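A rough sketch of what "optimize twice" could look like from a script, assuming Solr's XML update handler is reachable under `/update` and accepts an `<optimize/>` message. The base URL is a placeholder and the helper names are hypothetical, so adjust them to your installation.

```python
import urllib.request


def build_optimize_request(base_url):
    """Build (but do not send) a POST asking Solr's update handler to optimize."""
    return urllib.request.Request(
        base_url.rstrip("/") + "/update",
        data=b"<optimize/>",
        headers={"Content-Type": "text/xml"},
    )


def optimize_twice(base_url, send=urllib.request.urlopen):
    """Send the optimize command twice and return both responses."""
    return [send(build_optimize_request(base_url)) for _ in range(2)]
```

Calling `optimize_twice("http://localhost:8983/solr")` would issue the two requests back to back; the `send` argument exists only so the function can be exercised without a running server.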
On Wed, Sep 28, 2011 at 5:26 PM, Kissue Kissue wrote:
> Hi,
>
> I am using Solr 3.3. I noticed that after indexing about 700,000 records
> and running optimization at the end, i still have about 91 files in my ind
Hi,
I am using Solr 3.3. I noticed that after indexing about 700,000 records
and running optimization at the end, I still have about 91 files in my index
directory. I thought that optimization was supposed to reduce the number of
files.
My settings are the default that came with Solr (mergefact
On Wednesday, 28 September 2011 at 13:41, Vadim Kisselmann wrote:
> Hi Fred,
>
> ok, it's strange behavior with the same queries.
> Some more questions:
> -which solr version?
3.3 (might the NIOFSDirectory from 3.4 help?)
> -are you indexing during your load test? (because of index rebuilds)
nope
Hi Fred,
ok, it's strange behavior with the same queries.
Some more questions:
- Which Solr version?
- Are you indexing during your load test? (because of index rebuilds)
- Do you replicate your index?
Regards
Vadim
2011/9/28 Frederik Kraus
> Hi Vadim,
>
> the thing is, that those exact same queries
Hi Vadim,
the thing is, that those exact same queries, that take longer during a load
test, perform just fine when executed at a slower request rate and are also
random, i.e. there is no pattern in bad/slow queries.
My first thought was some kind of contention and/or connection starvation for
Hi Fred,
analyze the queries which take longer.
We monitor our queries and see q-time problems with queries which
are complex, with phrase queries, or with queries which contain numbers or
special characters.
if you don't know it:
http://www.hathitrust.org/blogs/large-scale-search/tuning-search
On 28/09/2011 12:58, Frederik Kraus wrote:
Hi,
I am experiencing a strange issue doing some load tests. Our setup:
just because I've listened to JUG mates talking about that at the last
meeting, could it be that your CPUs are spending their time getting
things from RAM to CPU cache
Hello all,
I'm experimenting with the "Distributed Search" bits in the nightly
builds and I'm facing a problem.
I have on my schema.xml some dynamic fields defined like this:
multiValued="true" />
When hitting a single shard the following query works fine:
http:///select?q=*:*&fl=ts,$
Hi,
I am experiencing a strange issue doing some load tests. Our setup:
- 2 servers, each with 24 CPU cores and 130GB of RAM
- 10 shards per server (needed for response times) running in a single tomcat
instance
- each query queries all 20 shards (distributed search)
- each shard holds about 1.5
Thank you for the reply Chris.
Please find the sample query which is returning results even though id is
not having any value as "" in SOLR 1.4.1
http://localhost/solr/online/select/?q=%28%20state%20%29^1.8%20AND%20%20%28%20%28id:%22%22%29%29%20AND%20%20%28%20%28content_type_s:%22Video%22%29^1.5%2
Thanks a lot for your advice.
What really matters to me is that answers with NAME_ANALYZED=Tour Eiffel
appear first. Then, if "Tour Eiffel Tower By Helicopter" appears before or
after "Hotel la tour Eiffel" doesn't really matter.
Since I send fq=NAME_ANALYZED:tour eiffel, I am sure NAME_ANALYZED
I have been using Solr 3.1 and am planning to update to Solr 3.4. What are the
steps to be followed, and is there anything that needs to be taken care of specifically
for the upgrade?
Regards,
Rohit
I use Drupal for accessing the Solr search engine. After updating and
creating my new index, everything works as before. Then I activate
group=true and group.field=site and Solr delivers me the wanted search
results, but in Drupal nothing appears, just an empty search page. I found
out that the
I use Drupal for accessing the Solr search engine. After updating and
creating my new index, everything works as before. Then I activate
group=true and group.field=site and Solr delivers me the wanted search
results, but in Drupal nothing appears, just an empty search page. I found
out that the gr