Thanks Stefan, that is what I want.
2013/6/12 Stefan Matheis matheis.ste...@gmail.com
The ticket for the legend is SOLR-3915, the definition came up in
SOLR-3174:
On Wed, 2013-06-12 at 23:05 +0200, smanad wrote:
Is this a limitation of Solr/Lucene? Should I be considering another
option like Elasticsearch (which is also based on Lucene)?
I am sure that searching across multiple indexes is a fairly common problem.
You try to treat separate sources as
I also apologize for my obscure questions, and I thank you and the list for
your help so far and for the very clear explanations you have given about the
behaviour of Solr and SolrCell.
I am effectively an intermediary between the list and the dev, because our
development process is not efficient. The full
Hi ,
I am trying to index data from Apache Nutch 2.2 into Solr 4.3. For that I
have copied schema-solr4.xml from the Nutch 2.2 runtime/local/conf directory
into my Solr home at solr/collection1/conf.
My Solr 4.3 is hosted in Tomcat. And initially when I tried
I did a little searching around and did not find anything interesting. Does
anyone know if any analyzers exist to better index source code (e.g. C#, C++,
Java, etc.)?
The standard analyzer is quite good, but I would like to know if there are
more specific analyzers that can do a better job of indexing. E.g. I did a
Erick, I think he didn't add the validate=false to a field, but globally to
the schema.xml/solrconfig.xml (I don't remember where exactly this is
defined globally).
From: Erick Erickson [via Lucene]
[mailto:ml-node+s472066n4070067...@n3.nabble.com]
Sent: Thursday, June 13, 2013 00:51
To: uwe72
How can I load these custom properties with SolrJ?
From: Erick Erickson [via Lucene]
[mailto:ml-node+s472066n4070068...@n3.nabble.com]
Sent: Thursday, June 13, 2013 00:53
To: uwe72
Subject: Re: SOLR-4641: Schema now throws exception on illegal field
parameters.
But see Steve Rowe's
What would be the process to update a new record in an existing db using DIH?
Thanks
Daniel, DIH JdbcDataSource does not support integrated security. You must
provide a username and password for it to work.
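For reference, a sketch of what the DIH data source definition with explicit credentials might look like in data-config.xml (the driver, URL, user, and password values here are placeholders, not taken from your setup):

```xml
<!-- JdbcDataSource needs explicit credentials; integrated
     (Windows) security is not supported. -->
<dataSource type="JdbcDataSource"
            driver="com.microsoft.sqlserver.jdbc.SQLServerDriver"
            url="jdbc:sqlserver://dbhost:1433;databaseName=exampledb"
            user="solr_user"
            password="secret"/>
```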
On Wed, Jun 12, 2013 at 11:06 PM, Daniel Mosesson daniel.moses...@ipreo.com
wrote:
I currently have the following:
I am running the example-DIH instance of solr, and it
Hi,
I have a Solr 1.4.1 server with an index size of 428GB. When I upgraded
Solr 1.4.1 to Solr 3.5.0 via the replication method, the size remained the
same. But when I optimize the index on the Solr 3.5.0 instance, its size
reaches 791GB. What is the solution so that the size remains the same or
gets smaller?
I optimize Solr 3.5 with
Shane:
You've covered all the config stuff that I can think of. There's one
other possibility: do you have soft commits turned on, and are
they very short? Although soft commits shouldn't invalidate any
segment-level caches (I'm not sure whether the sorting buffers
are low-level or not).
Well, WordDelimiterFilterFactory would split on the punctuation, so
you could add it to the analyzer chain along with StandardAnalyzer.
You could use one of the regex filters to break up tokens that make it
through the analyzer as you see fit.
But in general, this will be a bunch of compromises
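To make the suggested chain concrete, here is a hedged sketch of such a field type (the name and exact parameter choices are illustrative, not prescriptive):

```xml
<fieldType name="text_code" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- Split tokens on punctuation and case changes (camelCase),
         keeping the original token so exact matches still work -->
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1"
            splitOnCaseChange="1" preserveOriginal="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```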
Hello!
The optimize command needs to rewrite the segments, so while it is
still working you may see the index size double. However, after it
finishes, the index size will usually be lower than it was before the
optimize.
--
Regards,
Rafał Kuć
Sematext :: http://sematext.com/
Thanks Rafal for the reply...
I agree with you. But actually, after optimization it does not reduce the
size; it remains double. Is there anything we missed or need to do to
achieve the index size reduction?
Is there any special setting we need to configure for replication?
On 13 June 2013
Look further down in the stack trace in the Solr log for the final Caused
By:.
And it's better to start with the Solr 4.3 schema and config files and then
merge in your Nutch changes one line at a time.
-- Jack Krupansky
-Original Message-
From: Tony Mullins
Sent: Thursday, June 13,
Hello!
Do you have some backup-after-commit in your configuration? It would
also be good to see what your index directory looks like; can you list
it?
--
Regards,
Rafał Kuć
Sematext :: http://sematext.com/ :: Solr - Lucene - ElasticSearch
Thanks Rafal for the reply...
I agree with you. But
Thanks. And I apologize for the fact that Solr doesn't have a clean and true
REST API (like ElasticSearch!) - even though it's not my fault!
An app-specific REST API is the way to go. Solr is too much of a beast for
average app developers to master.
Let us know of any additional, specific
And sometimes useful projects come out of annoying, confusing
corner situations like yours.
See if you can get permission to open-source your implementation, and
you may find more people interested in the same thing. It could also
be good visibility for your consultancy. Worst case, there
Hi.
I was hoping that replacing my Solr schema with the Nutch-provided schema
(as described by the Nutch documentation) would solve all my problems.
So you are suggesting I edit my existing Solr schema and just add the
additional information found in the Nutch-Solr schema, line by line.
Thanks,
Tony.
On
I don't know what choice you have until somebody on the Nutch project takes
the time to do the same thing and update their schema to 4.3. They should
keep a schema for every Solr release.
-- Jack Krupansky
-Original Message-
From: Tony Mullins
Sent: Thursday, June 13, 2013 9:30 AM
Erick,
We do have soft commits turned on. Initially, autoCommit was set at 15000 and
autoSoftCommit at 1000. We did up those to 120 and 60
respectively. However, since the core in question is a slave, we don't
actually do writes to the core but rely on replication only to populate the
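For context, the commit settings mentioned above live in solrconfig.xml; a sketch using the initial values from this thread (in milliseconds):

```xml
<updateHandler class="solr.DirectUpdateHandler2">
  <!-- Hard commit: flushes to stable storage; need not open a new searcher -->
  <autoCommit>
    <maxTime>15000</maxTime>
    <openSearcher>false</openSearcher>
  </autoCommit>
  <!-- Soft commit: makes newly indexed documents visible to searchers -->
  <autoSoftCommit>
    <maxTime>1000</maxTime>
  </autoSoftCommit>
</updateHandler>
```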
Hi.
I was hoping that replacing my Solr schema with the Nutch-provided schema
(as described by the Nutch documentation) would solve all my problems.
So you are suggesting I edit my existing Solr schema and just add the
additional information found in the Nutch-Solr schema, line by line
I hate to tell
OK. Thanks.
Tony.
On Thu, Jun 13, 2013 at 7:02 PM, Shawn Heisey s...@elyograg.org wrote:
Hi.
I was hoping that replacing my Solr schema with the Nutch-provided schema
(as described by the Nutch documentation) would solve all my problems.
So you are suggesting I edit my existing Solr
I am using CloudSolrServer in my application. When I look at the output I
see a bunch of these messages:
17:16:33.205 [tion(localhost:9983)] DEBUG ClientCnxn - Got ping response
for sessionid: 0x13f3c94662c0026 after 0ms
17:16:36.542 [tion(localhost:9983)] DEBUG ClientCnxn - Got ping response
for sessionid:
It could be pretty complicated to do well.
I'm pretty sure that Krugle is based on Solr: http://opensearch.krugle.org/
You might also look at the UI for Ohloh (used to be Koders):
http://code.ohloh.net/
wunder
On Jun 13, 2013, at 1:19 AM, Gian Maria Ricci wrote:
I did a little search around
I wrote a blog post about this stuff here:
http://searchhub.org/2013/05/21/schemaless-solr-part-1/. - Steve
On Jun 12, 2013, at 3:26 PM, Chris Hostetter hossman_luc...@fucit.org wrote:
: Dynamically adding fields to schema is yet to get released..
:
:
That was my thought exactly. Contribute a REST request handler. --wunder
On Jun 13, 2013, at 6:04 AM, Alexandre Rafalovitch wrote:
And sometimes useful projects come out from the annoying, confusing
corner situations like yours.
See if you can get permission to open-source your
Hello All,
How can I get the docID of results from Solr?
What I am doing currently is:
I do a search request in Solr.
I get a certain number of records (say 10).
solrurl/?start=0&rows=10
Now, again I do a search request with the below:
solrurl/?start=10&rows=10
So I get the next 10 records.
Now the new records are
I'm trying to make Brüno come up in my results when the user types in
Bruno.
What's the best way to accomplish this?
Using Solr 4.2
--
View this message in context:
http://lucene.472066.n3.nabble.com/Best-way-to-match-umlauts-tp4070256.html
Sent from the Solr - User mailing list archive at Nabble.com.
<charFilter class="solr.MappingCharFilterFactory"
mapping="mapping-ISOLatin1Accent.txt"/>
-- Jack Krupansky
-Original Message-
From: jimtronic
Sent: Thursday, June 13, 2013 11:31 AM
To: solr-user@lucene.apache.org
Subject: Best way to match umlauts
I'm trying to make Brüno come up in my
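For context, the char filter mentioned above goes at the top of an analyzer, before the tokenizer; a hedged sketch of a field type using it (the field type name is illustrative):

```xml
<fieldType name="text_folded" class="solr.TextField">
  <analyzer>
    <!-- Char filters run on the raw character stream, before tokenization -->
    <charFilter class="solr.MappingCharFilterFactory"
                mapping="mapping-ISOLatin1Accent.txt"/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

With the mapping applied at both index and query time, Brüno and Bruno analyze to the same tokens.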
You can use a Solr transformer to get the Lucene docID in the fl parameter:
fl=id,[docid],score,my-field,...
But... you can't use the Lucene docId in a query.
Relevancy and sorting, not to mention updating of existing documents, can
change the order of results so that docId is not a good
On 6/13/2013 8:19 AM, Furkan KAMACI wrote:
17:16:56.560 [tion(localhost:9983)] DEBUG ClientCnxn - Got ping response
for sessionid: 0x13f3c94662c0026 after 0ms
17:16:59.897 [tion(localhost:9983)] DEBUG ClientCnxn - Got ping response
for sessionid: 0x13f3c94662c0026 after 0ms
17:17:03.232
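Those repeating lines are ZooKeeper client heartbeats logged at DEBUG level. Assuming log4j is the logging backend, an entry along these lines in log4j.properties would quiet them (a sketch, not a verified fix for this setup):

```properties
# Raise the ZooKeeper client logger above DEBUG to hide the ping messages
log4j.logger.org.apache.zookeeper=WARN
```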
Hello,
I just gave the parameter timeAllowed a try and noticed that in some cases
the actual query time exceeds the timeout specified by the timeAllowed
parameter, e.g., having set timeAllowed to 100 the actual query time is
300ms. Unfortunately, the documentation of the timeAllowed parameter is
I think the problem is the desire for idempotent search results
across paging calls. Not sure if that explains it any better than the
original poster did, though. :-)
Basically, if the repeated search gets different documents returned,
the offsets become somewhat problematic. Specifically, a
Someone tweeted with the #solr hashtag that Solr 5.0 is released.
https://twitter.com/nadr
This is not correct. At this time, version 4.3.0 is the current
release. I expect that the announcement for 4.3.1 will appear within
the next couple of days.
Right now, there is no timeframe for the
Thanks Jack, below is the actual problem.
Suppose currently 4 records are in the Solr engine: A, B, C and D.
The query returns:
start=0&rows=1  A
start=1&rows=1  B
start=2&rows=1
Thanks! Sorry for the basic question, but I was having trouble finding the
results through google.
On Thu, Jun 13, 2013 at 10:39 AM, Jack Krupansky-2 [via Lucene]
ml-node+s472066n4070262...@n3.nabble.com wrote:
<charFilter class="solr.MappingCharFilterFactory"
Should we worry about Stock Options being shorted?
Just kidding. :-)
Regards,
Alex.
P.s. As an aside, being relatively new, I do wonder what kind of
event/discussion will trigger version 5 branch-off. I guess it would
actually be more of a Lucene decision these days.
Personal website:
On 6/13/2013 10:02 AM, Alexandre Rafalovitch wrote:
P.s. As an aside, being relatively new, I do wonder what kind of
event/discussion will trigger version 5 branch-off. I guess it would
actually be more of a Lucene decision these days.
I foresee two likely reasons for a branch-off. Based on
Thanks for the suggestions, I'll try WordDelimiterFilterFactory. My
aim is not to have perfect analysis, just a way to quickly search for words
in the whole history of a codebase. :)
--
Gian Maria Ricci
Mobile: +39 320 0136949
I see that the threads parameter has been removed from DIH in all versions
starting with Solr 4.x. Can someone let me know the best way to initiate
indexing in multi-threaded mode when using DIH now? Is there a way to do that?
--
View this message in context:
Just to confirm: solr.ASCIIFoldingFilterFactory should also serve the
purpose.
Am I correct?
--
View this message in context:
http://lucene.472066.n3.nabble.com/Best-way-to-match-umlauts-tp4070256p4070317.html
Also you might want to check this blog post, just went up today.
http://searchhub.org/2013/06/13/solr-cloud-document-routing/
On Wed, Jun 12, 2013 at 2:18 PM, James Thomas jtho...@camstar.com wrote:
This page has some good information on custom document routing:
On 6/13/2013 12:08 PM, bbarani wrote:
I see that the threads parameter has been removed from DIH in all versions
starting with Solr 4.x. Can someone let me know the best way to initiate
indexing in multi-threaded mode when using DIH now? Is there a way to do that?
That parameter was removed
Hi Gian Maria,
OpenGrok http://opengrok.github.io/OpenGrok/ has a bunch of JFlex-based
computer language tokenizers for Lucene:
https://github.com/OpenGrok/OpenGrok/tree/master/src/org/opensolaris/opengrok/analysis.
Not sure how much work it would be to use them in another project, though.
Yes, but it's the third-best choice. It's a token filter, while the issue at
hand is a character filtering issue.
A second-best choice would be to map for full ASCII folding at the character
level:
<charFilter class="solr.MappingCharFilterFactory"
mapping="mapping-FoldToASCII"/>
-- Jack
Hi folks,
I am trying to update multiple fields (assume q=id:*) and add a field to
all of them. Is this possible?
If yes, what would be the syntax?
I am using the JSON update interface - /update/json ...
Thanks,
Siamak
On Jun 13, 2013, at 3:48 PM, Jack Krupansky j...@basetechnology.com wrote:
<charFilter class="solr.MappingCharFilterFactory"
mapping="mapping-FoldToASCII"/>
The mapping attribute above is missing the .txt file extension:
<charFilter class="solr.MappingCharFilterFactory" mapping="mapping-FoldToASCII.txt"/>
That is not a feature available in Solr.
You can update a full document or do a partial update of a single document
based on its unique key, and you can update a batch of documents using those
two techniques.
You probably could implement custom code to do it.
Maybe even using a script
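For reference, a single-document partial (atomic) update through the /update/json interface looks roughly like this; the id and field name are illustrative:

```json
[
  {
    "id": "doc1",
    "new_field": {"set": "some value"}
  }
]
```

Each document in the batch is addressed by its unique key, so updating "all documents matching a query" still means one entry per document.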
Thanks Jack.
I have currently written a script which does that - effectively
retrieving all those documents and updating them one by one, atomically.
But I was hoping Solr had a more efficient implementation internally.
Is there any thinking about implementing such a feature in the future?
I haven't heard any mention of it, but it seems like a reasonable
enhancement.
There have been cases where people want to do things like add a new value to
every document.
I'll have to check into how easy it is to perform a query from an update
processor.
-- Jack Krupansky
-Original
Hi,
I am attempting to transform the XML output of Solr using the
XsltResponseWriter http://wiki.apache.org/solr/XsltResponseWriter to HTML.
This works, but I am wondering if there is a way for me to debug my creation
of XSL. If there is any problem in the XSL you simply get a stack trace
Your phrasing of the question may be convoluting things -- you referred to
DocID, but it's not clear if you mean...
* the low-level internal Lucene doc id
* the uniqueKey field of your schema.xml
* some identifier whose provenance you don't care about.
In the first case, you can use the
I've dug through the code and have narrowed the delay down
to TopFieldCollector$OneComparatorNonScoringCollector.setNextReader() at
the point where the comparator's setNextReader() method is called (line 98
in the lucene_solr_4_3 branch). That line is actually two method calls so
I'm not yet
Use command-line Xalan and debug the stylesheet outside of Solr. You can
save the XML output to disk, and then transform that with Xalan.
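Assuming a Xalan jar is available, the command line looks something like this (the file names are placeholders):

```shell
# Transform a saved Solr XML response with command-line Xalan
java -cp xalan.jar org.apache.xalan.xslt.Process \
     -IN solr-response.xml -XSL mystyle.xsl -OUT out.html
```

Any XSLT error is then reported directly on the console instead of being buried in a Solr stack trace.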
Upayavira
On Thu, Jun 13, 2013, at 10:45 PM, O. Olson wrote:
Hi,
I am attempting to transform the XML output of Solr using the
XsltResponseWriter
Hi Solr Guru's
I am trying to implement auto-suggest, where Solr would suggest several
phrases that would return results as the user types a query (as distinct
from autocomplete). E.g., say the user starts typing 'br' and we have
documents that contain 'brake pads' and 'left disc brake'; Solr would
Hello,
I am evaluating Solr for indexing about 45M product catalog entries. The
catalog mainly contains title and description, which take most of the space
(other attributes are brand, category, price, etc.).
The data is stored in Cassandra and I am using DataStax's Solr (DSE 3.0.2)
which handles
This might be a dumb question, but can you please point me to the key
differences between the ASCIIFoldingFilter and a character filter using a
mapping file?
thanks
Aditya
--
View this message in context:
http://lucene.472066.n3.nabble.com/Best-way-to-match-umlauts-tp4070256p4070398.html
Hi,
I think you are talking about wanting instant search?
See https://github.com/fergiemcdowall/solrstrap
Otis
--
Solr ElasticSearch Support
http://sematext.com/
On Thu, Jun 13, 2013 at 7:43 PM, Brendan Grainger
brendan.grain...@gmail.com wrote:
Hi Solr Guru's
I am trying to implement
Gian,
Lucene in Action has a case study from Krugle about their analysis for a
code search engine, if you want to look there.
Otis
--
Solr ElasticSearch Support
http://sematext.com/
On Thu, Jun 13, 2013 at 4:19 AM, Gian Maria Ricci
alkamp...@nablasoft.comwrote:
I did a little search
Hi,
Hard to tell, but here are some tips:
* Those are massive caches. Rethink their size. More specifically,
plug in some monitoring tool and see what you are getting out of them.
Just today I looked at one of Sematext's clients' caches - 200K entries,
0 evictions == needless waste of JVM heap.
On 6/13/2013 5:53 PM, Utkarsh Sengar wrote:
*Problems:*
The initial training pulls 2000 documents from Solr to find the most
probable matches and calculates a score (PMI/NPMI). This query is extremely
slow. Also, a regular query takes 3-4 seconds.
I am running solr currently on just one VM
Otis,Shawn,
Thanks for reply.
You can find my schema.xml and solrconfig.xml here:
https://gist.github.com/utkarsh2012/5778811
To answer your questions:
Those are massive caches. Rethink their size. More specifically,
plug in some monitoring tool and see what you are getting out of them.
Hi,
Changing cache sizes doesn't require reindexing.
You have high IO wait - waiting on your disks? Ideally your index
will be cached. Lower those caches, possibly reduce heap size, and
leave more RAM to the OS for caching, and IO wait will hopefully go
down. I'd try with just -Xmx4g and see.
On 6/13/2013 7:51 PM, Utkarsh Sengar wrote:
Sure, I will reduce the count and see how it goes. The problem I have is,
after such a change, I need to reindex everything again, which again is
slow and takes time (40-60 hours).
There should be no need to reindex after changing most things in
Token filter vs. character filter is the key difference.
-- Jack Krupansky
-Original Message-
From: adityab
Sent: Thursday, June 13, 2013 8:17 PM
To: solr-user@lucene.apache.org
Subject: Re: Best way to match umlauts
this might be a dumb question. But can you please point me some key
Does anyone know why the query is very slow just before the optimize ends,
for a few minutes?
While Solr optimizes, I run a query in a loop (curl the query URL and sleep
one second) to check the query speed. Normally the query time is
acceptable, but it is always very slow just before the
Hi,
What you pasted from the console didn't come across well. Yes, optimizing
a static index is OK, and yes, if your index is very unoptimized then
it will be slower than when it is optimized. Not sure if that
addresses your concerns...
Otis
--
Solr ElasticSearch Support --
If it is query suggestion you are looking for, what we've done is store the
user queries in a separate core and pull the suggestions from there.
- Original Message -
From: Brendan Grainger brendan.grain...@gmail.com
To: solr-user@lucene.apache.org
Sent: Thursday, June 13
Hi Christof,
In short: yes, known behaviour, you can't rely on timeAllowed as you'd
think - it is limited to only a portion of total execution.
See http://search-lucene.com/?q=timeallowed&sort=newestOnTop&fc_project=Solr
for previous answers to this Q.
Otis
--
Solr ElasticSearch Support --
Aditya,
Char filters are applied prior to tokenization, so they can affect
tokenization, but I can't think of any tokenization changes that accent
stripping would cause.
Token filters can be re-ordered to achieve certain objectives. For example, if
you want to use a stemmer that only
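To make the ordering concrete, here is a sketch of an analyzer where accent handling happens at the character level and the token filters run in a deliberate order (the specific filter choices are illustrative):

```xml
<analyzer>
  <!-- 1. Char filter: applied to the raw character stream, before tokenization -->
  <charFilter class="solr.MappingCharFilterFactory"
              mapping="mapping-ISOLatin1Accent.txt"/>
  <!-- 2. Tokenizer -->
  <tokenizer class="solr.StandardTokenizerFactory"/>
  <!-- 3. Token filters: order matters, e.g. lowercase before stemming -->
  <filter class="solr.LowerCaseFilterFactory"/>
  <filter class="solr.PorterStemFilterFactory"/>
</analyzer>
```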
Hi Otis,
Sorry, it did not format well.
Time     queryTime(ms)  CPU%  r/s     w/s  rMB/s  wMB/s  IO%
...
7:30:24  12             89    156.44  0    16.40         94.06
7:30:25  18             91    157     0    15.35  0      98.1
7:30:26  9              91    194     0    19.62  0      96.1
Thanks Barani. That could also work, provided we start with a large
set of suggestions initially to increase the likelihood of getting some
matches when filtering down with the second query.
On Wed, Jun 12, 2013 at 10:51 PM, bbarani bbar...@gmail.com wrote:
I would suggest you to take