Re: Multi-words synonyms matching
Are you sure about LUCENE_33 (use of BitVector)? On 31.05.2012 17:20, O. Klein wrote: > I have been struggling with this as well and found that using LUCENE_33 gives > the best results. > > But as it will be deprecated this is no everlasting solution. Maybe somebody > knows one? >
Re: How can I remove the home page priority of site home page from search results
Add &debugQuery=true to your query and check how the home page is scored. That should give you a clue why the title is not boosting the score enough. Maybe you simply need a higher boost for title, but let the debugQuery scoring be your guide. Actually, if you are explicitly referencing a field in your query ("title:abc"), that won't pick up the title boost from the "qf" field list. You would need an explicit boost in the query itself. But, I'm not sure I understand how your query gets expanded: q=title:'.$keywords.' Maybe you wanted: q=title:(.$keywords.), because otherwise spaces between the keywords would end the first "fielded term" and then proceed to reference the dismax field list (qf). -- Jack Krupansky -Original Message- From: Shameema Umer Sent: Friday, June 01, 2012 1:46 AM To: solr-user@lucene.apache.org Subject: How can I remove the home page priority of site home page from search results My query is like this: ?q=title:'.$keywords.'&defType=edismax&qf=title^10 url^9 content^5&start=0&rows=10&version=2.2&indent=on&hl=true&hl.fl=content&hl.fragsize=300 My results show site home page as the first result even though there are other pages with title scoring more for the given keywords. I need to give less priority to site home page than other pages. Please help. Thanks Shameema
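The two suggestions in the reply above (group the keywords so they all stay on the title field, and turn on debugQuery to inspect the scoring) can be sketched in a few lines of Python. This is only an illustration: build_select_params is a hypothetical helper, the explicit boost of 10 on the title clause is an assumed value, and the remaining parameters are copied from the original query. Keywords are not escaped for query-parser special characters here.

```python
from urllib.parse import urlencode


def build_select_params(keywords, title_boost=10):
    """Build edismax parameters with a grouped, explicitly boosted title clause.

    Hypothetical helper for illustration; the boost value is an assumption.
    """
    # Parentheses keep every keyword attached to the title field. Without
    # them, only the first keyword is fielded and the rest fall back to qf.
    terms = " ".join(keywords.split())
    q = "title:({})^{}".format(terms, title_boost)
    return {
        "q": q,
        "defType": "edismax",
        "qf": "title^10 url^9 content^5",
        "debugQuery": "true",  # asks Solr to include the score explanation
        "start": 0,
        "rows": 10,
    }


params = build_select_params("solr  enterprise search")
query_string = urlencode(params)
print(query_string)
```

The resulting query string can be appended to the select URL; the debug section of the response then shows how much the title match contributes for the home page compared with the other pages.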
Re: Stop Words in SpellCheckComponent
Your earlier email had this option in your spellcheck.de field type analyzer for the StopFilterFactory: words="german_stop_long.txt" But your most recent email referred to "stopword.txt". So, either add "the" to german_stop_long.txt, or change the "words" option of your stopfilter to refer to "stopwords.txt". BTW, I think you can actually have a comma-separated list of stopword files, so you can write: words="german_stop_long.txt,stopwords.txt" -- Jack Krupansky -Original Message- From: Matthias Müller Sent: Friday, June 01, 2012 1:44 AM To: solr-user@lucene.apache.org Subject: Re: Stop Words in SpellCheckComponent spellcheck_de That should reference a field, not a field type. Thanks for your help. But I did that, too. Here I'll show that even the solr example webapp makes suggestions for stopwords: I've ... 1. added "the" to the stopwords.txt 2. added "thex" to an example document (field name) 3. startet solr 4. indexed the example files (sh post.sh *.xml) 5. searched for "the solr" http://myhost:8983/solr/select?q=the+solr&spellcheck=true&wt=json 6. got the desired result, but also the wrong suggestion "thex" { "response" : { "docs" : [ {... "name" : "Solr, thex Enterprise Search Server", .. } ], "numFound" : 1, ... }, ... "spellcheck" : { "suggestions" : [ "the", {..."suggestion" : [ "thex" ] } ] } } Here's the complete diff between the original download and my 3 modifications: diff -r apache-solr-3.6.0/example/exampledocs/solr.xml apache-solr-3.6.0x/example/exampledocs/solr.xml 21c21 < Solr, the Enterprise Search Server --- Solr, thex Enterprise Search Server diff -r apache-solr-3.6.0/example/solr/conf/solrconfig.xml apache-solr-3.6.0x/example/solr/conf/solrconfig.xml 781a782,785 spellcheck 1122a1127 true diff -r apache-solr-3.6.0/example/solr/conf/stopwords.txt apache-solr-3.6.0x/example/solr/conf/stopwords.txt 14a15,16 the
Re: Stop Words in SpellCheckComponent
> spellcheck_de > > That should reference a field, not a field type. Thanks for your help. But I did that, too. Here I'll show that even the solr example webapp makes suggestions for stopwords: I've ... 1. added "the" to the stopwords.txt 2. added "thex" to an example document (field name) 3. startet solr 4. indexed the example files (sh post.sh *.xml) 5. searched for "the solr" http://myhost:8983/solr/select?q=the+solr&spellcheck=true&wt=json 6. got the desired result, but also the wrong suggestion "thex" { "response" : { "docs" : [ {... "name" : "Solr, thex Enterprise Search Server", .. } ], "numFound" : 1, ... }, ... "spellcheck" : { "suggestions" : [ "the", {..."suggestion" : [ "thex" ] } ] } } Here's the complete diff between the original download and my 3 modifications: diff -r apache-solr-3.6.0/example/exampledocs/solr.xml apache-solr-3.6.0x/example/exampledocs/solr.xml 21c21 < Solr, the Enterprise Search Server --- > Solr, thex Enterprise Search Server diff -r apache-solr-3.6.0/example/solr/conf/solrconfig.xml apache-solr-3.6.0x/example/solr/conf/solrconfig.xml 781a782,785 > >spellcheck > > 1122a1127 > true diff -r apache-solr-3.6.0/example/solr/conf/stopwords.txt apache-solr-3.6.0x/example/solr/conf/stopwords.txt 14a15,16 > > the
Re: index special characters solr
Special characters are filtered out of (most) "text" fields, but are preserved in "string" fields. String fields might suit your needs, but are inconvenient for keyword searching. You may be able to use the "types" option of the WordDelimiterFilterFactory to pass in a custom character type table that has the special characters treated as alphabetic characters. Otherwise, you may have to customize the code yourself. -- Jack Krupansky -Original Message- From: KPK Sent: Thursday, May 31, 2012 7:38 PM To: solr-user@lucene.apache.org Subject: index special characters solr Hi all Can somebody please tell me how can I build an index in solr where one of my field contains special characters like $ , % I would also like to search on the same characters on that particular field. Any advice would be appreciated. Thanks -- View this message in context: http://lucene.472066.n3.nabble.com/index-special-characters-solr-tp3987157.html Sent from the Solr - User mailing list archive at Nabble.com.
Challenge: Is dynamic data source possible for DataImportHandler JdbcDataSource?
Hi, The challenge I'm facing is some sort of dynamic data source. Your valuable input is highly appreciated. Below is my data-config.xml. I have one user database and two company databases. The user table in the user database has four columns which are id + name + company_dbname + company_id. Depending on the company_dbname, I need to look up either companydb0 or companydb1 to get the company name by the company_id. Is it doable to set the data source dynamically for the child entity? In my case, I would like to set company entity dataSource to "${USER.company_dbname}" which is returned from USER entity query. If it's not doable with current implementation, I would like to download the source code and customize it for my needs. Which source java file I should start with? Many many thanks, Kevin
index special characters solr
Hi all, Can somebody please tell me how I can build an index in Solr where one of my fields contains special characters like $ , % ? I would also like to search on the same characters in that particular field. Any advice would be appreciated. Thanks
Re: Fwd: Data Import Handler fields with different values in column and name
Hi Jack, Thanks for your help. I delete conf/data/* on every restart to make sure I work with clean data. Is there any other configuration I should change? Maybe another XML file? Kind regards On Thu, May 31, 2012 at 5:18 PM, Jack Krupansky wrote: > It looks okay; renaming a column is fine. > > Maybe... maybe when you re-run it DIH is not replacing any documents that > already have id's in Solr, leaving them with their old field values. Maybe > you need to manually delete the old Solr documents and run a fresh full > import. > > > -- Jack Krupansky > > -Original Message- From: Rafael Taboada > Sent: Thursday, May 31, 2012 5:13 PM > To: solr-user@lucene.apache.org > Subject: Fwd: Data Import Handler fields with different values in column > and name > > > Please, > > Can anyone guide me through this issue? Thanks > > > > -- Forwarded message -- > From: Rafael Taboada > Date: Thu, May 31, 2012 at 12:30 PM > Subject: Data Import Handler fields with different values in column and > name > To: solr-user@lucene.apache.org > > > Hi folks, > > I'm using Solr 3.6 and I'm trying to import data from my database to solr > using Data Import Handler. My db-config is like this: > > > url="jdbc:oracle:thin:@localhost:1521:XE" user="admin" password="admin" > /> > > > > > > > > > > My problem is when I'm trying to use a different values in the field tag, > for example > > > > When I use different name from column, this field is omitted. Please can > you help me with this issue? > > My schema.xml is: > > > /> > > > > > required="true" /> > /> > stored="true" /> > > > Thanks in advance! > > -- > Rafael Taboada > > > > > > > -- > Rafael Taboada > > /* > * Phone >> 992 741 026 > */ > -- Rafael Taboada /* * Phone >> 992 741 026 */
Re: Fwd: Data Import Handler fields with different values in column and name
It looks okay; renaming a column is fine. Maybe... maybe when you re-run it DIH is not replacing any documents that already have id's in Solr, leaving them with their old field values. Maybe you need to manually delete the old Solr documents and run a fresh full import. -- Jack Krupansky -Original Message- From: Rafael Taboada Sent: Thursday, May 31, 2012 5:13 PM To: solr-user@lucene.apache.org Subject: Fwd: Data Import Handler fields with different values in column and name Please, Can anyone guide me through this issue? Thanks -- Forwarded message -- From: Rafael Taboada Date: Thu, May 31, 2012 at 12:30 PM Subject: Data Import Handler fields with different values in column and name To: solr-user@lucene.apache.org Hi folks, I'm using Solr 3.6 and I'm trying to import data from my database to solr using Data Import Handler. My db-config is like this: My problem is when I'm trying to use a different values in the field tag, for example When I use different name from column, this field is omitted. Please can you help me with this issue? My schema.xml is: Thanks in advance! -- Rafael Taboada -- Rafael Taboada /* * Phone >> 992 741 026 */
Re: Solr with UIMA
Is it failing on the first document? I see "uid 5", suggests that it is not. If not, how is this document different from the others? I see the exception org.apache.uima.resource.ResourceInitializationException, suggesting that some file cannot be loaded. It sounds like it may be having trouble loading "aePath" ("analysisEngine"). Or maybe some other file? -- Jack Krupansky -Original Message- From: debdoot Sent: Thursday, May 31, 2012 11:59 AM To: solr-user@lucene.apache.org Subject: Re: Solr with UIMA Hi Tommaso, I have followed the steps you have listed to try to deploy the example RoomNumberAnnotator with Solr 3.5. Here is the error trace that I get: org.apache.solr.common.SolrException: processing error: null. uid=5, text="Test Room HAW GN-K35..." at org.apache.solr.uima.processor.UIMAUpdateRequestProcessor.processAdd(UIMAUpdateRequestProcessor.java:107) at org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:158) at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:79) at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:58) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129) at org.apache.solr.core.SolrCore.execute(SolrCore.java:1372) at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:356) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:252) at com.ibm.ws.webcontainer.filter.FilterInstanceWrapper.doFilter(FilterInstanceWrapper.java:192) at com.ibm.ws.webcontainer.filter.WebAppFilterChain.doFilter(WebAppFilterChain.java:89) at com.ibm.ws.webcontainer.filter.WebAppFilterManager.doFilter(WebAppFilterManager.java:919) at com.ibm.ws.webcontainer.filter.WebAppFilterManager.invokeFilters(WebAppFilterManager.java:1016) at com.ibm.ws.webcontainer.webapp.WebApp.handleRequest(WebApp.java:3703) at com.ibm.ws.webcontainer.webapp.WebGroup.handleRequest(WebGroup.java:304) at 
com.ibm.ws.webcontainer.WebContainer.handleRequest(WebContainer.java:953) at com.ibm.ws.webcontainer.WSWebContainer.handleRequest(WSWebContainer.java:1655) at com.ibm.ws.webcontainer.channel.WCChannelLink.ready(WCChannelLink.java:195) at com.ibm.ws.http.channel.inbound.impl.HttpInboundLink.handleDiscrimination(HttpInboundLink.java:452) at com.ibm.ws.http.channel.inbound.impl.HttpInboundLink.handleNewRequest(HttpInboundLink.java:511) at com.ibm.ws.http.channel.inbound.impl.HttpInboundLink.processRequest(HttpInboundLink.java:305) at com.ibm.ws.http.channel.inbound.impl.HttpInboundLink.ready(HttpInboundLink.java:276) at com.ibm.ws.tcp.channel.impl.NewConnectionInitialReadCallback.sendToDiscriminators(NewConnectionInitialReadCallback.java:214) at com.ibm.ws.tcp.channel.impl.NewConnectionInitialReadCallback.complete(NewConnectionInitialReadCallback.java:113) at com.ibm.ws.tcp.channel.impl.AioReadCompletionListener.futureCompleted(AioReadCompletionListener.java:165) at com.ibm.io.async.AbstractAsyncFuture.invokeCallback(AbstractAsyncFuture.java:217) at com.ibm.io.async.AsyncChannelFuture.fireCompletionActions(AsyncChannelFuture.java:161) at com.ibm.io.async.AsyncFuture.completed(AsyncFuture.java:138) at com.ibm.io.async.ResultHandler.complete(ResultHandler.java:204) at com.ibm.io.async.ResultHandler.runEventProcessingLoop(ResultHandler.java:775) at com.ibm.io.async.ResultHandler$2.run(ResultHandler.java:905) at com.ibm.ws.util.ThreadPool$Worker.run(ThreadPool.java:1650) Caused by: org.apache.uima.resource.ResourceInitializationException at org.apache.solr.uima.processor.ae.OverridingParamsAEProvider.getAE(OverridingParamsAEProvider.java:86) at org.apache.solr.uima.processor.UIMAUpdateRequestProcessor.processText(UIMAUpdateRequestProcessor.java:144) at org.apache.solr.uima.processor.UIMAUpdateRequestProcessor.processAdd(UIMAUpdateRequestProcessor.java:77) ... 
30 more Caused by: java.lang.NullPointerException at org.apache.uima.util.XMLInputSource.(XMLInputSource.java:118) at org.apache.solr.uima.processor.ae.OverridingParamsAEProvider.getAE(OverridingParamsAEProvider.java:58) ... 32 more at com.ibm.ws.webcontainer.webapp.WebAppDispatcherContext.sendError(WebAppDispatcherContext.java:624) at com.ibm.ws.webcontainer.webapp.WebAppDispatcherContext.sendError(WebAppDispatcherContext.java:642) at com.ibm.ws.webcontainer.srt.SRTServletResponse.sendError(SRTServletResponse.java:1235) at org.apache.solr.servlet.SolrDispatchFilter.sendError(SolrDispatchFilter.java:380) at org.apache.solr.servlet.SolrDispatchFilter.writeResponse(SolrDispatchFilter.java:326) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:265) Please let me know if you have any insights on what could be the issue. Thanks in advance, Debdoot -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-with-UIMA-tp3863324p3987056.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Strip html
: I make a transformation XSLT which return : : --- : si les ruches d’abeilles prouvent la : monarchie, les fourmillières, les troupes d’éléphants ou : de castors prouvent la république. : --- : i put this html in solr: $doc->addField('body_strip_html', $body_norm); ... : But this don't work! : I want to return this xml files (look exemple) if i search "castor". I'm confused. a) You said you've already transformed your input XML into plain text, so I don't see why you need HTML stripping at all. b) Your current problem doesn't seem to have anything to do with HTML or XML: you're asking why a document containing "castors" (plural) doesn't match a query for "castor" (singular), but the field type you say you are using has a very simple analyzer that doesn't do any stemming of any kind. Since there is no HTML in your input, HTMLStripCharFilterFactory is a no-op, which leaves StandardTokenizerFactory, which just does tokenization. It seems like all you need to do is add a stemmer (and, for efficiency, remove the HTMLStripCharFilterFactory). I'm no expert, but it looks like you are indexing French, so I would suggest using a French stemmer: https://wiki.apache.org/solr/LanguageAnalysis#French -Hoss
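To make the point about stemming concrete, here is a toy Python sketch of why "castor" does not match "castors" when the analyzer only tokenizes. The tokenize and toy_stem functions are hypothetical stand-ins, not Solr's StandardTokenizerFactory or its French stemmer; the plural-stripping rule is deliberately naive and only serves the illustration.

```python
import re


def tokenize(text):
    """Lowercase and split on non-word characters (rough tokenizer stand-in)."""
    return [t for t in re.split(r"[^\w]+", text.lower()) if t]


def toy_stem(token):
    """Strip a trailing plural 's'. Toy rule only, NOT a real French stemmer."""
    return token[:-1] if len(token) > 3 and token.endswith("s") else token


def matches(query, document, stem=False):
    """Return True if every query token appears among the document tokens."""
    doc_tokens = tokenize(document)
    query_tokens = tokenize(query)
    if stem:
        # The same analysis has to be applied at index time and query time.
        doc_tokens = [toy_stem(t) for t in doc_tokens]
        query_tokens = [toy_stem(t) for t in query_tokens]
    return all(q in doc_tokens for q in query_tokens)


text = "les troupes d’éléphants ou de castors prouvent la république"
print(matches("castor", text))             # tokenization only
print(matches("castor", text, stem=True))  # with the toy stemmer
```

Without stemming the indexed token is "castors", so the exact-token lookup for "castor" fails; once both sides are reduced to the same stem, the lookup succeeds.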
Re: possible status codes from solr during a (DIH) data import process
There is at least one scenario where no error is reported when it should be: if the host runs out of disk space while optimizing, the failure is not reported. I think there is a JIRA issue open for this.
Fwd: Data Import Handler fields with different values in column and name
Please, Can anyone guide me through this issue? Thanks -- Forwarded message -- From: Rafael Taboada Date: Thu, May 31, 2012 at 12:30 PM Subject: Data Import Handler fields with different values in column and name To: solr-user@lucene.apache.org Hi folks, I'm using Solr 3.6 and I'm trying to import data from my database to solr using Data Import Handler. My db-config is like this: My problem is when I'm trying to use a different values in the field tag, for example When I use different name from column, this field is omitted. Please can you help me with this issue? My schema.xml is: Thanks in advance! -- Rafael Taboada -- Rafael Taboada /* * Phone >> 992 741 026 */
Re: possible status codes from solr during a (DIH) data import process
Hi, That's correct. For failure, you have to check for the text *"Indexing failed. Rolled back changes"* under the tag. One more thing to note is that there may be a period during the indexing process when indexing is complete but the index has not yet been committed and optimized. To call the import a complete success, you would need to check that the response listed below is present along with the success message. *2012-05-31 15:10:45 2012-05-31 15:10:45* On Thu, May 31, 2012 at 3:42 PM, geeky2 wrote: > hello all, > > i have been asked to write a small polling script (bash) to periodically > check the status of an import on our Master. our import times are small, > but there are business reasons why we want to know the status of an import > after a specified amount of time. > > i need to perform certain actions based on the "status" of the import, and > therefore need to quantify which tags to check and their appropriate > states. > > i am using the command from the DataImportHandler HTTP API to get the > status > of the import: > > OUTPUT=$(curl -v > http://${SERVER}:${PORT}/somecore/dataimport?command=status) > > > > > can someone tell me if i have these rules correct? > > 1) during an import - the status tag will have a busy state: > > example: > > busy > > 2) at the completion of an import (regardless of failure or success) the > status tag will have an "idle" state: > > example: > > idle > > > 3) to determine if an import failed or succeeded - you must interrogate the > tags underand specifically look for : > > success: > Indexing completed. Added/Updated: 603378 documents. Deleted 0 > documents. > > failure: > Indexing completed. Added/Updated: 603378 documents. Deleted 0 > documents. > > thank you, > > > -- > View this message in context: > http://lucene.472066.n3.nabble.com/possible-status-codes-from-solr-during-a-DIH-data-import-process-tp3987110.html > Sent from the Solr - User mailing list archive at Nabble.com. > -- Thanks and Regards Rahul A. Warawdekar
RE: possible status codes from solr during a (DIH) data import process
You've got it right. Here's a summary: - "status" = "busy" means it's in progress. - "status" = "idle" means it's finished (success or failure). - You can drill down further by looking at sub-elements under "statusMessages": > if there is , it means the last import was cancelled with "command=abort" > look at the body of . o If it begins with "Indexing completed.", then it finished with a success. o If it begins with "Indexing failed.", then it finished with a failure. Just be careful to test your script whenever you change DIH versions. This status screen isn't the best, and no doubt it will change sometime in the future. Also, keep in mind that as soon as the next import begins the old statuses are lost, so you'll need to plan your script runs around that. Someday it'll be nice if we can come up with a better way than this to programmatically interact with DIH... James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311 -Original Message- From: geeky2 [mailto:gee...@hotmail.com] Sent: Thursday, May 31, 2012 2:43 PM To: solr-user@lucene.apache.org Subject: possible status codes from solr during a (DIH) data import process hello all, i have been asked to write a small polling script (bash) to periodically check the status of an import on our Master. our import times are small, but there are business reasons why we want to know the status of an import after a specified amount of time. i need to perform certain actions based on the "status" of the import, and therefore need to quantify which tags to check and their appropriate states. i am using the command from the DataImportHandler HTTP API to get the status of the import: OUTPUT=$(curl -v http://${SERVER}:${PORT}/somecore/dataimport?command=status) can someone tell me if i have these rules correct?
1) during an import - the status tag will have a busy state: example: busy 2) at the completion of an import (regardless of failure or success) the status tag will have an "idle" state: example: idle 3) to determine if an import failed or succeeded - you must interrogate the tags underand specifically look for : success: Indexing completed. Added/Updated: 603378 documents. Deleted 0 documents. failure: Indexing completed. Added/Updated: 603378 documents. Deleted 0 documents. thank you, -- View this message in context: http://lucene.472066.n3.nabble.com/possible-status-codes-from-solr-during-a-DIH-data-import-process-tp3987110.html Sent from the Solr - User mailing list archive at Nabble.com.
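The rules confirmed in this thread (busy means running, idle means finished, and the message text decides between success and failure) can be sketched as a small Python function. This is a hedged illustration, not part of DIH: classify_import is a hypothetical helper, and it assumes the caller has already extracted the status value and the message bodies under statusMessages from the response, since the exact element names are not preserved in the archived messages.

```python
def classify_import(status, messages):
    """Classify a DIH status response using the rules from this thread.

    `status` is the text of the status element ("busy" or "idle");
    `messages` is a list of message bodies found under statusMessages.
    Hypothetical helper; extracting these values is left to the caller.
    """
    if status == "busy":
        return "running"
    if status != "idle":
        return "unknown"
    # Check for failure first so a rolled-back import is never reported as OK.
    for message in messages:
        if message.startswith("Indexing failed."):
            return "failed"
    for message in messages:
        if message.startswith("Indexing completed."):
            return "succeeded"
    # Idle with no recognizable message, e.g. no import has run yet.
    return "unknown"


print(classify_import("busy", []))
print(classify_import(
    "idle",
    ["Indexing completed. Added/Updated: 603378 documents. Deleted 0 documents."],
))
print(classify_import("idle", ["Indexing failed. Rolled back changes"]))
```

As noted in the replies, the message wording may change between DIH versions and the previous statuses disappear when the next import starts, so a script built on this kind of prefix matching should be re-tested after every upgrade.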
Re: index merge
Hi All, I have a basic doubt about index merging in Solr. The setup that I have followed is as follows: Setup: I used the schema.xml that comes with the solr example. I had three cores - core0, core1 and core2. I tried merging the indexes of core 0 and core 1 to core2. I copied the same schema.xml from SOLR_HOME/example/solr/conf to core 0 and core 1 but changed the name field alone as core0 and core1 respectively. Operations: I indexed different files to core0 and core1. The search *:* in Solr showed 6 files and 9 files for core0 and core1 respectively. Then merged the indexes of core0 and core1 to core2. As expected the search *:* showed 15 files for core2. I added 2 new files to the index of core0 and 1 file to core1 and merged again to core2. This time to my surprise "*" showed the total number of files showed to be 33 = (15+18) instead of just 18. This duplication continued for each merge operation which is not efficient. Also the merged files were available for search only after restarting the Jetty server. Am I missing something or doing things wrongly? Is there a way to restart only a specific core to read the new index/reflect the merged changes? Please explain the merge operation. Thanks, Sudarshan -- View this message in context: http://lucene.472066.n3.nabble.com/index-merge-tp472904p3987121.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Cannot get highlighting to work
Try a query that uses a term that doesn't split an alphanumeric term into two terms. Then check to see what field type you used for the symbol and marker_symbol fields and whether the analyzer for that field type has changed in 3.6. -- Jack Krupansky -Original Message- From: Asfand Qazi Sent: Thursday, May 31, 2012 12:32 PM To: solr-user@lucene.apache.org Subject: Cannot get highlighting to work Hello, I am having problems doing highlighting a Solr 3.6 instance, while it was working just fine before on our 1.4 instance. The solrconfig.xml and schema.xml files are located here: https://github.com/mpi2/mpi2_solr/blob/master/multicore/main/conf/schema.xml (please note the incorrect line wrapping - it should be on one line) https://github.com/mpi2/mpi2_solr/blob/master/multicore/main/conf/solrconfig.xml (please note the incorrect line wrapping - it should be on one line) The query I fire off (which worked on the 1.4 instance) is: /solr/main/select?q=Cbx1&wt=json&hl=true&hl.fl=*&hl.usePhraseHighlighter=true (please note the incorrect line wrapping - it should be on one line) I expect a section like: { MGI:105369: { symbol: [ "Cbx1" ], marker_symbol: [ "Cbx1" ] } } I get: { MGI:105369: { } } Can anyone help? Thanks -- Regards, Asfand Yar Qazi Team 87 - High Throughput Gene Targeting Wellcome Trust Sanger Institute -- The Wellcome Trust Sanger Institute is operated by Genome Research Limited, a charity registered in England with number 1021457 and a company registered in England with number 2742969, whose registered office is 215 Euston Road, London, NW1 2BE.
Re: Using Data Import Handler to invoke a stored procedure with output (cursor) parameter
Can you add a new stored procedure that uses your current one? It would operate like the DIH expects. I don't remember if DB cursors are a standard part of JDBC. If they are, it would be a great addition to the DIH if they work right. On Thu, May 31, 2012 at 10:44 AM, Niran Fajemisin wrote: > Thanks for your response, Michael. Unfortunately changing the stored > procedure is not really an option here. > > From what I'm seeing, it would appear that there's really no way of somehow > instructing the Data Import Handler to get a handle on the output parameter > from the stored procedure. It's a bit surprising though that no one has ran > into this scenario but I suppose most people just work around it. > > Anyone else care to shed some more light on alternative approaches? Thanks > again. > > > >> >> From: Michael Della Bitta >>To: solr-user@lucene.apache.org >>Sent: Thursday, May 31, 2012 9:40 AM >>Subject: Re: Using Data Import Handler to invoke a stored procedure with >>output (cursor) parameter >> >>I could be wrong about this, but Oracle has a table() function that I >>believe turns the output of a function as a table. So possibly you >>could wrap your procedure in a function that returns the cursor, or >>convert the procedure to a function. >> >>Michael Della Bitta >> >> >>Appinions, Inc. -- Where Influence Isn’t a Game. >>http://www.appinions.com >> >> >>On Thu, May 31, 2012 at 8:00 AM, Niran Fajemisin wrote: >>> Hi all, >>> >>> I've seen a few questions asked around invoking stored procedures from >>> within Data Import Handler but none of them seem to indicate what type of >>> output parameters were being used. >>> >>> I have a stored procedure created in Oracle database that takes a couple >>> input parameters and has an output parameter that is a reference cursor. >>> The cursor is expected to be used as a way of iterating through the >>> returned table rows. 
I'm using the following format to invoke my stored >>> procedure in the Data Import Handler's data config XML: >>> >>> ... >>> >>> I have tested that this query works prior to attempting to use it from >>> within the DIH. But when I attempt to invoke this stored procedure, it >>> naturally complains that the output parameter is not specified (essentially >>> a mismatch in the number of parameters). >>> >>> I don't know of anyway to pass in a cursor parameter (or any output >>> parameter for that matter) to the stored procedure invocation from within >>> the definition. I would greatly appreciate if anyone could >>> provide any pointers or hints on how to proceed. >>> >>> Thanks so much for your time >>> >> >> >> -- Lance Norskog goks...@gmail.com
Re: Merging Remote Solr Indexes?
Merging indexes is not really useful- it won't make distributed search any faster. There are features that don't work with distributed search. Really, you are better off having shards with enough documents so that relevance scoring is balanced. On Thu, May 31, 2012 at 11:04 AM, sudarshan wrote: > Hi All, > I'm new to Solr. I saw this post relating to Merging of indexes. I > have a similar doubt. From the post, I understand that merging of indexes > across different cores is possible only if the cores exist o a single > machine. I want to merge indexes of different machines. Can you please > explain me the different ways of doing this? > > Say I have N+1 Solr engines of which there are N different masters and the > remaining 1 is meant for merging all N indexes together. How I have decided > to merge N indexes to 1 is this. > > 1. Dynamically edit the solrconfig.xml file of the N+1st system to point as > a slave to different master each time. Hence a total of N trials would be > needed to cover all N masters. > 2. During every trial I shall replicate the index of the master and store it > in a different folder. Say index1 from master1, index2 from master2 . > indexn from masterN. > 3. After all indexes are replicated and moved/renamed to local directory, I > shall perform a merge of all indexes. > > > What problems will I have in implementing this? How efficient would be this? > I believe all index folders will have to be available locally to perform > merging. If not, please tell me how better can I do merge remote indexes. > > Another question I have is about MergeFactor. If I set the mergefactor as 5, > will Solr automatically takes care of merging the segments to 1 if the > number of segments reach 5? How this can be exploited? > > Your assistance is sincerely appreciated. 
> > Regards, > Sudarshan > > > > -- > View this message in context: > http://lucene.472066.n3.nabble.com/Merging-Remote-Solr-Indexes-tp3434412p3987090.html > Sent from the Solr - User mailing list archive at Nabble.com. -- Lance Norskog goks...@gmail.com
Re: Is optimize needed on slaves if it replicates from optimized master?
http://wiki.apache.org/solr/SolrPerformanceFactors#mergeFactor The defaults are very good. I have never changed them, and I've had Solr in production at two major sites, Netflix and Chegg. Don't spend any more time worrying about merges. wunder On May 31, 2012, at 10:51 AM, sudarshan wrote: > Walter, > Thanks again. Can you specify the criteria based on which Solr > optimizes/force merges segments automatically. Is this defined by the > MergeFactor parameter - like if the mergefactor is 10, then merge happens > for every 10 segments? Please explain. > > Thanks, > Sudarshan > > -- > View this message in context: > http://lucene.472066.n3.nabble.com/Is-optimize-needed-on-slaves-if-it-replicates-from-optimized-master-tp3241604p3987086.html > Sent from the Solr - User mailing list archive at Nabble.com.
possible status codes from solr during a (DIH) data import process
hello all, i have been asked to write a small polling script (bash) to periodically check the status of an import on our Master. our import times are small, but there are business reasons why we want to know the status of an import after a specified amount of time. i need to perform certain actions based on the "status" of the import, and therefore need to quantify which tags to check and their appropriate states. i am using the command from the DataImportHandler HTTP API to get the status of the import: OUTPUT=$(curl -v http://${SERVER}:${PORT}/somecore/dataimport?command=status) can someone tell me if i have these rules correct? 1) during an import - the status tag will have a busy state: example: busy 2) at the completion of an import (regardless of failure or success) the status tag will have an "idle" state: example: idle 3) to determine if an import failed or succeeded - you must interrogate the tags underand specifically look for : success: Indexing completed. Added/Updated: 603378 documents. Deleted 0 documents. failure: Indexing completed. Added/Updated: 603378 documents. Deleted 0 documents. thank you, -- View this message in context: http://lucene.472066.n3.nabble.com/possible-status-codes-from-solr-during-a-DIH-data-import-process-tp3987110.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Stop Words in SpellCheckComponent
Spellcheck wants a field, not a field type. You have a spellcheck_de field type, but you need a field as well. spellcheck_de That should reference a field, not a field type. -- Jack Krupansky -Original Message- From: Matthias Müller Sent: Thursday, May 31, 2012 3:23 PM To: solr-user@lucene.apache.org Subject: Re: Stop Words in SpellCheckComponent is it possible to configure a stopword list to the SpellCheckComponent? Add a stopwordfilter to your spellcheck field. Hmm, I did. Could it be another mistake? This is the schema definition: This is the solrconfig: edismax 10 text_de title_de^5 text_de title_de^5 true 0 spellcheck_de textSpell default spellcheck_de spellchecker_de true true
Fwd: Strip html
If I'm not mistaken, that's TEI, and I suggest you consult with the TEI community for strategies for document indexing, as there are a lot of branching-style tags in TEI. My guess is that you'll hear that it's best to perform some sort of term expansion on the document as a preprocessing step. Michael Della Bitta Appinions, Inc. -- Where Influence Isn’t a Game. http://www.appinions.com -Original Message- From: Tigunn Sent: Thursday, May 31, 2012 11:30 AM To: solr-user@lucene.apache.org Subject: Strip html Hello, I have an index full text on xml files. Exemple: --- si les ruches d’abeilles > > prouvent la > monarchie, les fourmillières, les troupes d’éléphants ou > de > > C > c > astors prouvent la > république. --- I use solr 1.4.1 to make full text search with php. When i search "castor", i can't fund this one. But if i search "c astor" it's ok: problem I make a transformation XSLT which return : --- si les ruches d’abeilles prouvent la monarchie, les fourmillières, les troupes d’éléphants ou de castors prouvent la république. --- i put this html in solr: $doc->addField('body_strip_html', $body_norm); In schema.xml: AND But this don't work! I want to return this xml files (look exemple) if i search "castor". Can you help me, please? thanks. -- View this message in context: http://lucene.472066.n3.nabble.com/Strip-html-tp3987051.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Stop Words in SpellCheckComponent
>> is it possible to configure a stopword list to the SpellCheckComponent? > Add a stopwordfilter to your spellcheck field. Hmm, I did. Could it be another mistake? This is the schema definition: This is the solrconfig: edismax 10 text_de title_de^5 text_de title_de^5 true 0 spellcheck_de textSpell default spellcheck_de spellchecker_de true true
Re: Strip html
There is no option in the Strip HTML filter to discard whitespace between elements. And it certainly doesn't know the semantics of some XML schema for "choice". You'll have to pre-process that semantics before Solr ingestion, or do your own custom filter. -- Jack Krupansky -Original Message- From: Tigunn Sent: Thursday, May 31, 2012 11:30 AM To: solr-user@lucene.apache.org Subject: Strip html Hello, I have an index full text on xml files. Exemple: --- si les ruches d’abeilles prouvent la monarchie, les fourmillières, les troupes d’éléphants ou de C c astors prouvent la république. --- I use solr 1.4.1 to make full text search with php. When i search "castor", i can't fund this one. But if i search "c astor" it's ok: problem I make a transformation XSLT which return : --- si les ruches d’abeilles prouvent la monarchie, les fourmillières, les troupes d’éléphants ou de castors prouvent la république. --- i put this html in solr: $doc->addField('body_strip_html', $body_norm); In schema.xml: AND But this don't work! I want to return this xml files (look exemple) if i search "castor". Can you help me, please? thanks. -- View this message in context: http://lucene.472066.n3.nabble.com/Strip-html-tp3987051.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Data Import Handler fields with different values in column and name
Jack, Thanks for your help. I restarted solr when I was changing schema.xml anytime. Any doc about this mentions it is possible to map the column with another name value. But I can't. Thanks again. Rafael On Thu, May 31, 2012 at 1:27 PM, Jack Krupansky wrote: > Is there any chance that you added the "anotherasunto" field and then > forgot to shut down and reload Solr? Any time you edit schema.xml or > solrconfig.xml you need to reload Solr for the changes to take effect. > > -- Jack Krupansky > > -Original Message- From: Rafael Taboada > Sent: Thursday, May 31, 2012 1:30 PM > To: solr-user@lucene.apache.org > Subject: Data Import Handler fields with different values in column and > name > > > Hi folks, > > I'm using Solr 3.6 and I'm trying to import data from my database to solr > using Data Import Handler. My db-config is like this: > > > url="jdbc:oracle:thin:@**localhost:1521:XE" user="admin" password="admin" > /> > > > > > > > > > > My problem is when I'm trying to use a different values in the field tag, > for example > > > > When I use different name from column, this field is omitted. Please can > you help me with this issue? > > My schema.xml is: > > > /> > > > > > required="true" /> > /> > stored="true" /> > > > Thanks in advance! > > -- > Rafael Taboada > -- Rafael Taboada /* * Phone >> 992 741 026 */
Re: Data Import Handler fields with different values in column and name
Is there any chance that you added the "anotherasunto" field and then forgot to shut down and reload Solr? Any time you edit schema.xml or solrconfig.xml you need to reload Solr for the changes to take effect. -- Jack Krupansky -Original Message- From: Rafael Taboada Sent: Thursday, May 31, 2012 1:30 PM To: solr-user@lucene.apache.org Subject: Data Import Handler fields with different values in column and name Hi folks, I'm using Solr 3.6 and I'm trying to import data from my database to solr using Data Import Handler. My db-config is like this: My problem is when I'm trying to use a different values in the field tag, for example When I use different name from column, this field is omitted. Please can you help me with this issue? My schema.xml is: Thanks in advance! -- Rafael Taboada
Re: Merging Remote Solr Indexes?
Hi All, I'm new to Solr. I saw this post relating to merging of indexes, and I have a similar question. From the post, I understand that merging indexes across different cores is possible only if the cores exist on a single machine. I want to merge indexes from different machines. Can you please explain the different ways of doing this? Say I have N+1 Solr engines, of which N are different masters and the remaining one is meant for merging all N indexes together. This is how I have decided to merge the N indexes into one: 1. Dynamically edit the solrconfig.xml file of the (N+1)th system so that it acts as a slave of a different master each time. A total of N rounds would be needed to cover all N masters. 2. During every round, replicate the index of the master and store it in a different folder: index1 from master1, index2 from master2, ..., indexN from masterN. 3. After all indexes are replicated and moved/renamed to a local directory, perform a merge of all indexes. What problems will I have in implementing this? How efficient would this be? I believe all index folders will have to be available locally to perform the merge. If not, please tell me a better way to merge remote indexes. Another question I have is about mergeFactor. If I set the mergeFactor to 5, will Solr automatically take care of merging the segments into 1 once the number of segments reaches 5? How can this be exploited? Your assistance is sincerely appreciated. Regards, Sudarshan -- View this message in context: http://lucene.472066.n3.nabble.com/Merging-Remote-Solr-Indexes-tp3434412p3987090.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Is optimize needed on slaves if it replicates from optimized master?
Walter, Thanks again. Can you specify the criteria based on which Solr optimizes/force merges segments automatically. Is this defined by the MergeFactor parameter - like if the mergefactor is 10, then merge happens for every 10 segments? Please explain. Thanks, Sudarshan -- View this message in context: http://lucene.472066.n3.nabble.com/Is-optimize-needed-on-slaves-if-it-replicates-from-optimized-master-tp3241604p3987086.html Sent from the Solr - User mailing list archive at Nabble.com.
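The "merge happens for every 10 segments" idea in the question above can be explored with a toy model. This is a deliberately simplified sketch of a log-style merge policy, not Solr's actual implementation: the function name `add_segments` and the notion of discrete "levels" are assumptions made for illustration, and the merge policy that a given Solr version really uses may behave differently. What the model does show is that mergeFactor governs when similarly sized segments are combined; it does not by itself collapse the whole index into one segment, which is what an explicit optimize does.

```python
def add_segments(n_flushes, merge_factor=10):
    """Simulate n_flushes new segments under a simplified log-merge model.

    levels[i] holds the number of segments at level i; a level-(i+1) segment
    is the result of merging merge_factor segments of level i.
    Returns (levels, number_of_merges).
    """
    levels = [0]
    merges = 0
    for _ in range(n_flushes):
        levels[0] += 1          # a flush creates one new small segment
        i = 0
        # Cascade: whenever a level fills up, merge it into the next level.
        while levels[i] >= merge_factor:
            levels[i] -= merge_factor
            if i + 1 == len(levels):
                levels.append(0)
            levels[i + 1] += 1
            merges += 1
            i += 1
    return levels, merges


if __name__ == "__main__":
    for flushes in (10, 25, 100):
        levels, merges = add_segments(flushes, merge_factor=10)
        print(flushes, "flushes ->", sum(levels), "segments after", merges, "merges")
```

In this model, 25 flushes with a merge factor of 10 leave seven segments (five small, two larger), which is why the segment count normally stays above one between optimizes.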
RE: Stop Words in SpellCheckComponent
Add a stopwordfilter to your spellcheck field. -Original message- > From:Matthias Müller > Sent: Thu 31-May-2012 18:39 > To: solr-user@lucene.apache.org > Subject: Stop Words in SpellCheckComponent > > Hi, > > is it possible to configure a stopword list to the SpellCheckComponent? > > For example: > When searching for "the indexs" "the" is filtered, because it is a stopword. > The SpellCheckComponent gives me a false suggestion for "the". > But the SpellCheckComponent should only give a suggestion for "index" > because "the" is a stopword. > > Kind Regards > > Matthias >
Re: Solr with UIMA
Further observation on the error: All requests to add documents through the /update URL land up with the same error, irrespective of the fields contained in the document. If I don't use the UIMAUpdateRequestProcessor, I can add/update documents successfully. Here are the snippets relevant to updateRequestProcessor declarations in my solrconfig.xml uima C:\ex1\RoomNumberAnnotator.xml false false content org.apache.uima.tutorial.RoomNumber building UIMAname Please help. Thanks Debdoot -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-with-UIMA-tp3863324p3987083.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Using Data Import Handler to invoke a stored procedure with output (cursor) parameter
Thanks for your response, Michael. Unfortunately changing the stored procedure is not really an option here. From what I'm seeing, it would appear that there's really no way of somehow instructing the Data Import Handler to get a handle on the output parameter from the stored procedure. It's a bit surprising though that no one has ran into this scenario but I suppose most people just work around it. Anyone else care to shed some more light on alternative approaches? Thanks again. > > From: Michael Della Bitta >To: solr-user@lucene.apache.org >Sent: Thursday, May 31, 2012 9:40 AM >Subject: Re: Using Data Import Handler to invoke a stored procedure with >output (cursor) parameter > >I could be wrong about this, but Oracle has a table() function that I >believe turns the output of a function as a table. So possibly you >could wrap your procedure in a function that returns the cursor, or >convert the procedure to a function. > >Michael Della Bitta > > >Appinions, Inc. -- Where Influence Isn’t a Game. >http://www.appinions.com > > >On Thu, May 31, 2012 at 8:00 AM, Niran Fajemisin wrote: >> Hi all, >> >> I've seen a few questions asked around invoking stored procedures from >> within Data Import Handler but none of them seem to indicate what type of >> output parameters were being used. >> >> I have a stored procedure created in Oracle database that takes a couple >> input parameters and has an output parameter that is a reference cursor. The >> cursor is expected to be used as a way of iterating through the returned >> table rows. I'm using the following format to invoke my stored procedure in >> the Data Import Handler's data config XML: >> >> ... >> >> I have tested that this query works prior to attempting to use it from >> within the DIH. But when I attempt to invoke this stored procedure, it >> naturally complains that the output parameter is not specified (essentially >> a mismatch in the number of parameters). 
>> >> I don't know of anyway to pass in a cursor parameter (or any output >> parameter for that matter) to the stored procedure invocation from within >> the definition. I would greatly appreciate if anyone could provide >> any pointers or hints on how to proceed. >> >> Thanks so much for your time >> > > >
Data Import Handler fields with different values in column and name
Hi folks, I'm using Solr 3.6 and I'm trying to import data from my database to solr using Data Import Handler. My db-config is like this: My problem is when I'm trying to use a different values in the field tag, for example When I use different name from column, this field is omitted. Please can you help me with this issue? My schema.xml is: Thanks in advance! -- Rafael Taboada
Strip html
Hello, I have a full-text index on XML files. Example: --- si les ruches d’abeilles prouvent la monarchie, les fourmillières, les troupes d’éléphants ou de C c astors prouvent la république. --- I use Solr 1.4.1 to do full-text search with PHP. When I search for "castor", I can't find this one. But if I search for "c astor" it works: that is the problem. I wrote an XSLT transformation which returns: --- si les ruches d’abeilles prouvent la monarchie, les fourmillières, les troupes d’éléphants ou de castors prouvent la république. --- I put this HTML into Solr: $doc->addField('body_strip_html', $body_norm); In schema.xml: AND But this doesn't work! I want this XML file (see the example) to be returned if I search for "castor". Can you help me, please? Thanks. -- View this message in context: http://lucene.472066.n3.nabble.com/Strip-html-tp3987051.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Solr with UIMA
Hi Tommaso, I have followed the steps you have listed to try to deploy the example RoomNumberAnnotator with Solr 3.5. Here is the error trace that I get: org.apache.solr.common.SolrException: processing error: null. uid=5, text="Test Room HAW GN-K35..." at org.apache.solr.uima.processor.UIMAUpdateRequestProcessor.processAdd(UIMAUpdateRequestProcessor.java:107) at org.apache.solr.handler.XMLLoader.processUpdate(XMLLoader.java:158) at org.apache.solr.handler.XMLLoader.load(XMLLoader.java:79) at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:58) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129) at org.apache.solr.core.SolrCore.execute(SolrCore.java:1372) at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:356) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:252) at com.ibm.ws.webcontainer.filter.FilterInstanceWrapper.doFilter(FilterInstanceWrapper.java:192) at com.ibm.ws.webcontainer.filter.WebAppFilterChain.doFilter(WebAppFilterChain.java:89) at com.ibm.ws.webcontainer.filter.WebAppFilterManager.doFilter(WebAppFilterManager.java:919) at com.ibm.ws.webcontainer.filter.WebAppFilterManager.invokeFilters(WebAppFilterManager.java:1016) at com.ibm.ws.webcontainer.webapp.WebApp.handleRequest(WebApp.java:3703) at com.ibm.ws.webcontainer.webapp.WebGroup.handleRequest(WebGroup.java:304) at com.ibm.ws.webcontainer.WebContainer.handleRequest(WebContainer.java:953) at com.ibm.ws.webcontainer.WSWebContainer.handleRequest(WSWebContainer.java:1655) at com.ibm.ws.webcontainer.channel.WCChannelLink.ready(WCChannelLink.java:195) at com.ibm.ws.http.channel.inbound.impl.HttpInboundLink.handleDiscrimination(HttpInboundLink.java:452) at com.ibm.ws.http.channel.inbound.impl.HttpInboundLink.handleNewRequest(HttpInboundLink.java:511) at com.ibm.ws.http.channel.inbound.impl.HttpInboundLink.processRequest(HttpInboundLink.java:305) at 
com.ibm.ws.http.channel.inbound.impl.HttpInboundLink.ready(HttpInboundLink.java:276) at com.ibm.ws.tcp.channel.impl.NewConnectionInitialReadCallback.sendToDiscriminators(NewConnectionInitialReadCallback.java:214) at com.ibm.ws.tcp.channel.impl.NewConnectionInitialReadCallback.complete(NewConnectionInitialReadCallback.java:113) at com.ibm.ws.tcp.channel.impl.AioReadCompletionListener.futureCompleted(AioReadCompletionListener.java:165) at com.ibm.io.async.AbstractAsyncFuture.invokeCallback(AbstractAsyncFuture.java:217) at com.ibm.io.async.AsyncChannelFuture.fireCompletionActions(AsyncChannelFuture.java:161) at com.ibm.io.async.AsyncFuture.completed(AsyncFuture.java:138) at com.ibm.io.async.ResultHandler.complete(ResultHandler.java:204) at com.ibm.io.async.ResultHandler.runEventProcessingLoop(ResultHandler.java:775) at com.ibm.io.async.ResultHandler$2.run(ResultHandler.java:905) at com.ibm.ws.util.ThreadPool$Worker.run(ThreadPool.java:1650) Caused by: org.apache.uima.resource.ResourceInitializationException at org.apache.solr.uima.processor.ae.OverridingParamsAEProvider.getAE(OverridingParamsAEProvider.java:86) at org.apache.solr.uima.processor.UIMAUpdateRequestProcessor.processText(UIMAUpdateRequestProcessor.java:144) at org.apache.solr.uima.processor.UIMAUpdateRequestProcessor.processAdd(UIMAUpdateRequestProcessor.java:77) ... 30 more Caused by: java.lang.NullPointerException at org.apache.uima.util.XMLInputSource.(XMLInputSource.java:118) at org.apache.solr.uima.processor.ae.OverridingParamsAEProvider.getAE(OverridingParamsAEProvider.java:58) ... 
32 more at com.ibm.ws.webcontainer.webapp.WebAppDispatcherContext.sendError(WebAppDispatcherContext.java:624) at com.ibm.ws.webcontainer.webapp.WebAppDispatcherContext.sendError(WebAppDispatcherContext.java:642) at com.ibm.ws.webcontainer.srt.SRTServletResponse.sendError(SRTServletResponse.java:1235) at org.apache.solr.servlet.SolrDispatchFilter.sendError(SolrDispatchFilter.java:380) at org.apache.solr.servlet.SolrDispatchFilter.writeResponse(SolrDispatchFilter.java:326) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:265) Please let me know if you have any insights on what could be the issue. Thanks in advance, Debdoot -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-with-UIMA-tp3863324p3987056.html Sent from the Solr - User mailing list archive at Nabble.com.
Stop Words in SpellCheckComponent
Hi, is it possible to configure a stopword list for the SpellCheckComponent? For example: when searching for "the indexs", "the" is filtered out because it is a stopword, yet the SpellCheckComponent gives me a false suggestion for "the". The SpellCheckComponent should only give a suggestion for "index", because "the" is a stopword. Kind Regards Matthias
Cannot get highlighting to work
Hello, I am having problems getting highlighting to work on a Solr 3.6 instance, while it was working just fine before on our 1.4 instance. The solrconfig.xml and schema.xml files are located here: https://github.com/mpi2/mpi2_solr/blob/master/multicore/main/conf/schema.xml (please note the incorrect line wrapping - it should be on one line) https://github.com/mpi2/mpi2_solr/blob/master/multicore/main/conf/solrconfig.xml (please note the incorrect line wrapping - it should be on one line) The query I fire off (which worked on the 1.4 instance) is: /solr/main/select?q=Cbx1&wt=json&hl=true&hl.fl=*&hl.usePhraseHighlighter=true (please note the incorrect line wrapping - it should be on one line) I expect a section like: { MGI:105369: { symbol: [ "Cbx1" ], marker_symbol: [ "Cbx1" ] } } I get: { MGI:105369: { } } Can anyone help? Thanks -- Regards, Asfand Yar Qazi Team 87 - High Throughput Gene Targeting Wellcome Trust Sanger Institute
Re: per-fieldtype similarity not working
On Thu, May 31, 2012 at 11:23 AM, Markus Jelsma wrote: > We simply declare the following in our fieldType: > That's not enough, see the example: http://svn.apache.org/repos/asf/lucene/dev/trunk/solr/core/src/test-files/solr/conf/schema-sim.xml -- lucidimagination.com
per-fieldtype similarity not working
Hi, We intend to use different similarity implementations for some field types, configured according to SOLR-2338. I double-checked against the schema in test-files and everything seems fine. However, the result is not correct, and debugQuery shows that the default configured similarity implementation is being used. We simply declare the following in our fieldType: Thanks, Markus
Re: Multi-words synonyms matching
I have been struggling with this as well and found that using LUCENE_33 gives the best results. But as it will be deprecated, this is not a lasting solution. Maybe somebody knows one? -- View this message in context: http://lucene.472066.n3.nabble.com/Multi-words-synonyms-matching-tp3898950p3987048.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Accent Characters
Hello, guys. Now it's working. Thank you both Jack and Sami. I fixed my issue by just using server.query(query, METHOD.POST) in solrJ and yes, I was using HttpSolrServer. I have to move on to CommonsHttpSolrServer. Thank you very much. -- View this message in context: http://lucene.472066.n3.nabble.com/Accent-Characters-tp3985931p3987046.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Using Data Import Handler to invoke a stored procedure with output (cursor) parameter
I could be wrong about this, but Oracle has a table() function that I believe turns the output of a function as a table. So possibly you could wrap your procedure in a function that returns the cursor, or convert the procedure to a function. Michael Della Bitta Appinions, Inc. -- Where Influence Isn’t a Game. http://www.appinions.com On Thu, May 31, 2012 at 8:00 AM, Niran Fajemisin wrote: > Hi all, > > I've seen a few questions asked around invoking stored procedures from within > Data Import Handler but none of them seem to indicate what type of output > parameters were being used. > > I have a stored procedure created in Oracle database that takes a couple > input parameters and has an output parameter that is a reference cursor. The > cursor is expected to be used as a way of iterating through the returned > table rows. I'm using the following format to invoke my stored procedure in > the Data Import Handler's data config XML: > > ... > > I have tested that this query works prior to attempting to use it from within > the DIH. But when I attempt to invoke this stored procedure, it naturally > complains that the output parameter is not specified (essentially a mismatch > in the number of parameters). > > I don't know of anyway to pass in a cursor parameter (or any output parameter > for that matter) to the stored procedure invocation from within the > definition. I would greatly appreciate if anyone could provide any pointers > or hints on how to proceed. > > Thanks so much for your time >
RE: spellcheck collate with fq parameters SOLR-2010
Thanks James, that works nicely! -Original message- > From:Dyer, James > Sent: Thu 31-May-2012 16:05 > To: solr-user@lucene.apache.org > Subject: RE: spellcheck collate with fq parameters SOLR-2010 > > Markus, > > When you set "spellcheck.maxCollationTries" to a value greater than zero, the > spellchecker will query each collation candidate to determine how many hits > it would return. If the collation will not yield any hits, it throws it away > then tries some more (up to whatever value you set). You can verify the > correctness of this by setting "spellcheck.maxCollationTries" to zero (no > checking) and then re-trying the collation(s) it suggests by hand (with the > same "fq" params, etc). > > James Dyer > E-Commerce Systems > Ingram Content Group > (615) 213-4311 > > -Original Message- > From: Markus Jelsma [mailto:markus.jel...@openindex.io] > Sent: Thursday, May 31, 2012 8:45 AM > To: solr-user@lucene.apache.org > Subject: spellcheck collate with fq parameters SOLR-2010 > > Hi, > > It seems it doesn't work or i cannot get it to work. I've tried both the > IndexSpellchecker in Solr 3.2 and the DirectSpellchecker of trunk. The > correctly spelled flag is correct when considering the fq parameters but the > collation is never when using a filter. I've also tried > spellcheck.maxCollationTries on trunk but any value higher than 0 (even very > high) makes the collation element to disappear. Are there any (open) issues > that i'm not aware of? > > Thanks, > Markus >
RE: spellcheck collate with fq parameters SOLR-2010
Markus, When you set "spellcheck.maxCollationTries" to a value greater than zero, the spellchecker will query each collation candidate to determine how many hits it would return. If the collation will not yield any hits, it throws it away then tries some more (up to whatever value you set). You can verify the correctness of this by setting "spellcheck.maxCollationTries" to zero (no checking) and then re-trying the collation(s) it suggests by hand (with the same "fq" params, etc). James Dyer E-Commerce Systems Ingram Content Group (615) 213-4311 -Original Message- From: Markus Jelsma [mailto:markus.jel...@openindex.io] Sent: Thursday, May 31, 2012 8:45 AM To: solr-user@lucene.apache.org Subject: spellcheck collate with fq parameters SOLR-2010 Hi, It seems it doesn't work or i cannot get it to work. I've tried both the IndexSpellchecker in Solr 3.2 and the DirectSpellchecker of trunk. The correctly spelled flag is correct when considering the fq parameters but the collation is never when using a filter. I've also tried spellcheck.maxCollationTries on trunk but any value higher than 0 (even very high) makes the collation element to disappear. Are there any (open) issues that i'm not aware of? Thanks, Markus
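The behaviour James describes can be sketched as a small loop. This is a sketch of the idea only, not the spellchecker's actual code: `pick_collations`, `count_hits` and `max_collations` are hypothetical names, and `count_hits` stands in for "run this collation as a query with the same fq parameters and return the number of hits".

```python
def pick_collations(candidates, count_hits, max_tries, max_collations=1):
    """Keep only collation candidates that would return hits.

    candidates     -- collation strings, best guess first
    count_hits     -- hypothetical callable: collation -> hit count (with fq applied)
    max_tries      -- how many candidates may be test-queried; 0 disables checking
    max_collations -- how many verified collations to return
    """
    if max_tries <= 0:
        # No verification: the first candidates are returned as they are.
        return candidates[:max_collations]

    kept = []
    for tries, candidate in enumerate(candidates):
        if tries >= max_tries or len(kept) >= max_collations:
            break
        if count_hits(candidate) > 0:
            kept.append(candidate)   # this collation is known to yield results
        # otherwise it is thrown away and the next candidate is tried
    return kept


if __name__ == "__main__":
    fake_index = {"solr index": 0, "solar index": 3, "solr indexes": 5}
    cands = ["solr index", "solar index", "solr indexes"]
    print(pick_collations(cands, fake_index.__getitem__, max_tries=5))
    print(pick_collations(cands, fake_index.__getitem__, max_tries=0))
```

In this model an empty result is a legitimate outcome: if every candidate tried returns zero hits under the filter, no collation is reported at all, which matches the manual check James suggests.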
Re: XInclude Multiple Elements
I've also tried a lot of tricks to get xpointer working with multiple child elements, to no success. In the end, I've resorted to a less pretty, other-way-around solution. I do something like this: solrconfig_common.xml -> no xml declaration, no root tag, no nothing ... For each file that I need the common stuff into, I'd do something like this: solrconfig_master.xml/solrconfig_slave.xml/etc. ]> &solrconfigcommon; Solr starts with 0 warnings, the configuration is properly loaded, etc. Property substitution also works, including inside the solrconfig_common.xml. Hope it helps anyone. -- View this message in context: http://lucene.472066.n3.nabble.com/XInclude-Multiple-Elements-tp3167658p3987029.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Hightlighting and excerpt
> I need something like http://cl.ly/2o2E0g0S422d2p1X203h . See how TCMB > was stressed? Hi Tolga, I think you can easily learn the basics using one of the following books. http://lucene.apache.org/solr/books.html
Re: Hightlighting and excerpt
The Solr example. As in the Solr tutorial. See: http://lucene.apache.org/solr/api/doc-files/tutorial.html Index books.json from exampledocs and then enter a /browse request in your web browser. Add the "&wt=xml" query parameter so that you can see the raw XML response that shows the "highlighting" section rather than the VelocityWriter output. Since you said that highlighting was working for you, please post an example of the "highlighting" section of a Solr response. -- Jack Krupansky -Original Message- From: Tolga Sent: Thursday, May 31, 2012 9:42 AM To: solr-user@lucene.apache.org Subject: Re: Hightlighting and excerpt You mean http:///www.example.com:8983/solr/browse? It says "unknown field 'cat'" On 5/31/12 4:16 PM, Jack Krupansky wrote: Yes, that is what highlighting does - it extracts an excerpt and highlights search terms. You said you have highlighting working, so what else is it that you need? Try "/browse" in the Solr example. It does exactly what your example shows. So, what else is it that you are trying to do? Or if something isn't working, what specifically isn't working? -- Jack Krupansky -Original Message- From: Tolga Sent: Thursday, May 31, 2012 9:08 AM To: solr-user@lucene.apache.org Subject: Re: Hightlighting and excerpt I need something like http://cl.ly/2o2E0g0S422d2p1X203h . See how TCMB was stressed? On 5/31/12 3:54 PM, Jack Krupansky wrote: Since highlighting, by definition, does highlight terms in "excerpts" (snippets or fragments from a text field), what else is it that you need? -- Jack Krupansky -Original Message- From: Tolga Sent: Thursday, May 31, 2012 4:55 AM To: solr-user@lucene.apache.org Subject: Hightlighting and excerpt Hi, Two separate things asked in one thread... I am crawling my websites with nutch. When I index them, I'd like to be able to highlight my keyword and display en excerpt containing that keyword. I found a solution with highlight, but what can I about excerpt? Thanks and regards,
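To make the "highlighting" section mentioned above concrete, here is a minimal sketch of pairing each result document with its excerpt when the response is requested as JSON (wt=json). The sample response is made up for illustration, and the field name `content`, the key field `id`, and the helper name `excerpts` are assumptions; check them against your own schema and an actual response.

```python
import json

# Hypothetical wt=json response: snippets are keyed by the document's unique key.
SAMPLE = json.loads("""
{
  "response": {"numFound": 2, "docs": [{"id": "doc1"}, {"id": "doc2"}]},
  "highlighting": {
    "doc1": {"content": ["... rates announced by <em>TCMB</em> today ..."]},
    "doc2": {}
  }
}
""")


def excerpts(response, field, id_field="id"):
    """Return (doc id, list of highlighted snippets) for every result document."""
    highlighting = response.get("highlighting", {})
    result = []
    for doc in response["response"]["docs"]:
        doc_id = doc[id_field]
        # A document may have no snippet for the field; fall back to an empty list.
        snippets = highlighting.get(doc_id, {}).get(field, [])
        result.append((doc_id, snippets))
    return result


if __name__ == "__main__":
    for doc_id, snippets in excerpts(SAMPLE, "content"):
        print(doc_id, snippets)
```

The snippet text is the excerpt and the `<em>` markup inside it is the highlight, so a separate excerpt mechanism is usually not needed.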
spellcheck collate with fq parameters SOLR-2010
Hi, It seems it doesn't work or i cannot get it to work. I've tried both the IndexSpellchecker in Solr 3.2 and the DirectSpellchecker of trunk. The correctly spelled flag is correct when considering the fq parameters but the collation is never when using a filter. I've also tried spellcheck.maxCollationTries on trunk but any value higher than 0 (even very high) makes the collation element to disappear. Are there any (open) issues that i'm not aware of? Thanks, Markus
Re: Hightlighting and excerpt
You mean http:///www.example.com:8983/solr/browse? It says "unknown field 'cat'" On 5/31/12 4:16 PM, Jack Krupansky wrote: Yes, that is what highlighting does - it extracts an excerpt and highlights search terms. You said you have highlighting working, so what else is it that you need? Try "/browse" in the Solr example. It does exactly what your example shows. So, what else is it that you are trying to do? Or if something isn't working, what specifically isn't working? -- Jack Krupansky -Original Message- From: Tolga Sent: Thursday, May 31, 2012 9:08 AM To: solr-user@lucene.apache.org Subject: Re: Hightlighting and excerpt I need something like http://cl.ly/2o2E0g0S422d2p1X203h . See how TCMB was stressed? On 5/31/12 3:54 PM, Jack Krupansky wrote: Since highlighting, by definition, does highlight terms in "excerpts" (snippets or fragments from a text field), what else is it that you need? -- Jack Krupansky -Original Message- From: Tolga Sent: Thursday, May 31, 2012 4:55 AM To: solr-user@lucene.apache.org Subject: Hightlighting and excerpt Hi, Two separate things asked in one thread... I am crawling my websites with nutch. When I index them, I'd like to be able to highlight my keyword and display en excerpt containing that keyword. I found a solution with highlight, but what can I about excerpt? Thanks and regards,
Re: Hightlighting and excerpt
Yes, that is what highlighting does - it extracts an excerpt and highlights search terms. You said you have highlighting working, so what else is it that you need? Try "/browse" in the Solr example. It does exactly what your example shows. So, what else is it that you are trying to do? Or if something isn't working, what specifically isn't working? -- Jack Krupansky -Original Message- From: Tolga Sent: Thursday, May 31, 2012 9:08 AM To: solr-user@lucene.apache.org Subject: Re: Hightlighting and excerpt I need something like http://cl.ly/2o2E0g0S422d2p1X203h . See how TCMB was stressed? On 5/31/12 3:54 PM, Jack Krupansky wrote: Since highlighting, by definition, does highlight terms in "excerpts" (snippets or fragments from a text field), what else is it that you need? -- Jack Krupansky -Original Message- From: Tolga Sent: Thursday, May 31, 2012 4:55 AM To: solr-user@lucene.apache.org Subject: Hightlighting and excerpt Hi, Two separate things asked in one thread... I am crawling my websites with nutch. When I index them, I'd like to be able to highlight my keyword and display en excerpt containing that keyword. I found a solution with highlight, but what can I about excerpt? Thanks and regards,
Efficiently mining or parsing data out of XML source files
I'm just wondering what the general consensus is on indexing XML data to Solr in terms of parsing and mining the relevant data out of the file and putting them into Solr fields. Assume that this is the XML file and resulting Solr fields: XML data: foo garbage data Solr Fields: Id=1234 Title=foo Bar=val1 I'd previously set this process up using XSLT and have since tested using XMLBeans, JAXB, etc. to get the relevant data. The speed at which this occurs, however, is not acceptable. 2800 objects take 11 minutes to parse and index into Solr. The big slowdown appears to be that I'm parsing the data with an XML parser. So, now I'm testing mining the data by opening the file as just a text file (using Groovy) and picking out relevant data using regular expression matching. I'm now able to parse (mine) the data and index the 2800 files in 72 seconds. So I'm wondering if the typical solution people use is to go with a non-XML solution. It seems to make sense considering the search index would only want to store (as much data) as possible and not rely on the incoming documents being xml compliant. Thanks in advance for any thoughts on this! -Kristian
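One middle ground between a full object-binding parser and regular expressions is a streaming parse that keeps only the wanted elements. The sketch below is an illustration, not a benchmark: the record layout (`id`, `title`, `bar`, `junk` elements), the `WANTED` mapping and the function name `extract_fields` are all assumptions, since the original XML sample did not survive in the post, and no claim is made about how its speed compares with the timings quoted above.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical mapping from XML element name to Solr field name.
WANTED = {"id": "Id", "title": "Title", "bar": "Bar"}


def extract_fields(stream):
    """Stream through one XML document and collect only the wanted elements."""
    fields = {}
    for _event, elem in ET.iterparse(stream, events=("end",)):
        solr_name = WANTED.get(elem.tag)
        if solr_name is not None and elem.text:
            fields[solr_name] = elem.text.strip()
        # Drop the element's content once seen so memory use stays small.
        elem.clear()
    return fields


if __name__ == "__main__":
    sample = (b"<record><id>1234</id><title>foo</title>"
              b"<bar>val1</bar><junk>garbage data</junk></record>")
    print(extract_fields(io.BytesIO(sample)))
```

Unlike a regular expression, this still fails loudly on malformed input, which may or may not be desirable given the point about not depending on the incoming documents being XML compliant.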
Re: Hightlighting and excerpt
I need something like http://cl.ly/2o2E0g0S422d2p1X203h . See how TCMB was stressed? On 5/31/12 3:54 PM, Jack Krupansky wrote: Since highlighting, by definition, does highlight terms in "excerpts" (snippets or fragments from a text field), what else is it that you need? -- Jack Krupansky -Original Message- From: Tolga Sent: Thursday, May 31, 2012 4:55 AM To: solr-user@lucene.apache.org Subject: Hightlighting and excerpt Hi, Two separate things asked in one thread... I am crawling my websites with Nutch. When I index them, I'd like to be able to highlight my keyword and display an excerpt containing that keyword. I found a solution with highlight, but what can I do about the excerpt? Thanks and regards,
Re: Hightlighting and excerpt
Since highlighting, by definition, does highlight terms in "excerpts" (snippets or fragments from a text field), what else is it that you need? -- Jack Krupansky -Original Message- From: Tolga Sent: Thursday, May 31, 2012 4:55 AM To: solr-user@lucene.apache.org Subject: Hightlighting and excerpt Hi, Two separate things asked in one thread... I am crawling my websites with Nutch. When I index them, I'd like to be able to highlight my keyword and display an excerpt containing that keyword. I found a solution with highlight, but what can I do about the excerpt? Thanks and regards,
Re: Query elevation / boosting or something else to guarantee document position
Hi Wenca, I'm a bit late, but maybe you're still interested. There's no such functionality in standard Solr. With sorting, this is not possible, because sort functions only rank each single document; they know nothing about the position of the others. And query elevation is similar: you raise the score of independent documents. To achieve this, you'll need your own QueryComponent. This isn't too complicated. You can't easily change the SolrIndexSearcher, which does the actual search job. But you can subclass org.apache.solr.handler.component.QueryComponent and override process(). Alas, the single main line - searcher.search() - is buried deeply in the huge monster method process(), and you first have to check for shards, grouping and twenty thousand other parameters until you've arrived at the code line you may want to expand. Before calling search(), set the GET_DOCSET flag in your QueryCommand object, then execute the search. To check whether there's a document of the particular manufacturer in the result list, you can either a) fetch the appropriate field value from the default field cache for every single result document until you find one; or b) call getDocSet() on the SolrIndexSearcher with the manufacturer query as the parameter, and perform an and() operation (i.e. an intersection) on the resulting DocSet with the DocSet of your main query. (That's why you set the flag before.) You can then check which document that matches both the manufacturer and the main query fits best. If you found a matching document, but it's behind pos. 5 in the resulting DocList, then you simply have to re-order your list. If there's no such document within the DocList (which is limited by your rows parameter), but there are some in the joined DocSet from strategy b), then you can simply choose one of them and ignore the fact that this is probably not the best matching one. Or you have to patch Solr and modify getDocListNC() in SolrIndexSearcher (or one of the Collector classes), which is much more complicated.
Good luck! -Kuli On 29.05.2012 at 14:26, Wenca wrote: Hi all, I have an index with thousands of products with various fields (manufacturer, price, popularity, type, color, ...) and I want to guarantee at least one product by a particular manufacturer to be within the first 5 results. The search is done mainly by using filter params and results are ordered by function e.g.: "product(price, popularity) asc" or by "discount desc" And I need to guarantee that if there is any product matching the given filters made by a concrete manufacturer, then it will be on the 5th position at worst, even if the position by the order function is worse. It seems to me that the Query elevation component is not the right thing for me. I don't know the query in advance (or the set of filter criteria) and I don't know the concrete product that will be the best for the criteria within the order. And also I don't think that I can construct a function with such requirements to use it directly for ordering the results. Of course I can make a second query in case there is no desired product on the first page of results and put it there, but it requires an additional request to Solr and complicates results processing and further pagination. Can anybody suggest any solution? Thanks Wenca
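The re-ordering step Kuli describes can be sketched independently of Solr. The Python below is only an illustration of the logic, not Solr code: ranked_ids stands in for the DocList (already limited by rows) and manufacturer_ids for the intersection of the manufacturer DocSet with the main query's DocSet. Both names, and the choice of the smallest id as the fallback candidate, are assumptions made for the example.

```python
def promote(ranked_ids, manufacturer_ids, max_pos=5):
    """Ensure one manufacturer document sits within the first max_pos results.

    ranked_ids: document ids in sort order (the current page of results).
    manufacturer_ids: ids matching both the main query and the manufacturer.
    """
    ranked = list(ranked_ids)
    if not manufacturer_ids or any(d in manufacturer_ids for d in ranked[:max_pos]):
        return ranked  # nothing to promote, or the guarantee already holds

    later_hits = [d for d in ranked[max_pos:] if d in manufacturer_ids]
    if later_hits:
        # Best-ranked matching document on this page: move it up.
        candidate = later_hits[0]
        ranked.remove(candidate)
    else:
        # None on this page: pick one from the set, accepting that it is
        # probably not the best match (arbitrary but deterministic choice).
        candidate = min(manufacturer_ids)
        if len(ranked) >= max_pos:
            ranked.pop()  # keep the page size constant
    ranked.insert(max_pos - 1, candidate)
    return ranked
```

For example, promote([1, 2, 3, 4, 5, 6, 7], {7}) moves document 7 into the fifth slot and shifts 5 and 6 down by one. Later pages would need the same adjustment applied consistently so that the promoted document is not shown twice.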
Using Data Import Handler to invoke a stored procedure with output (cursor) parameter
Hi all, I've seen a few questions asked around invoking stored procedures from within Data Import Handler but none of them seem to indicate what type of output parameters were being used. I have a stored procedure created in an Oracle database that takes a couple of input parameters and has an output parameter that is a reference cursor. The cursor is expected to be used as a way of iterating through the returned table rows. I'm using the following format to invoke my stored procedure in the Data Import Handler's data config XML: ... I have tested that this query works prior to attempting to use it from within the DIH. But when I attempt to invoke this stored procedure, it naturally complains that the output parameter is not specified (essentially a mismatch in the number of parameters). I don't know of any way to pass in a cursor parameter (or any output parameter for that matter) to the stored procedure invocation from within the definition. I would greatly appreciate it if anyone could provide any pointers or hints on how to proceed. Thanks so much for your time.
Re: difference between Katta and SolrCloud (replicator factor)
Hi, responses please -- Jamel E
Re: Creating custom Filter / Tokenizer / Request Handler for integration of NER-Framework
Thanks for all the responses. I went with the UpdateRequestProcessor and it works. -Original Message- From: Lance Norskog [mailto:goks...@gmail.com] Sent: Saturday, May 26, 2012 01:53 To: solr-user@lucene.apache.org Subject: Re: Creating custom Filter / Tokenizer / Request Handler for integration of NER-Framework Another problem (just discovered this): TokenizerFactories do not get resource handlers. So, you can't go read config or model files for your Tokenizer. TokenFilters do, so you can use the KeywordTokenizer (make one big term) and do your work in a TokenFilter that gets the whole thing. On Thu, May 24, 2012 at 7:33 AM, Jan Høydahl wrote: > As Ahmet says, The Update Chain is probably the place to integrate such > document oriented processing. > See http://www.cominvent.com/2011/04/04/solr-architecture-diagram/ for how it > integrates with Solr. > > -- > Jan Høydahl, search solution architect Cominvent AS - > www.facebook.com/Cominvent Solr Training - www.solrtraining.com > > On 24 May 2012, at 14:04, Wunderlich, Tobias wrote: > >> Hey Guys, >> >> I have recently been working on a project to integrate a >> Named-Entity-Recognition-Framework (NER) in an existing search platform based >> on Solr. The Platform uses ManifoldCF to automatically gather the content >> from various repositories. The NER-Framework creates Annotations/Metadata >> from given content which I then want to integrate into the search-platform >> as metadata to use for faceting. Since MCF handles all content gathering, I >> need a way to integrate the NER-Framework directly into Solr. The Goal is to >> get all Annotations per document into a multivalued field. My first thought >> was to create a custom filter, which just takes the content and gives back >> only the Annotations. But as I understand it, a filter only processes >> predetermined Tokens, which is useless for my purpose, since the >> NER-Framework needs to process the whole content of a document. What about a >> custom Tokenizer? 
Would it be possible to process the whole text and give >> back only the Annotations as Tokens? A third thought was to manipulate the >> ExtractRequestHandler (Solr Cell) used by MCF to somehow add the Annotations >> as Metadata when the content and metadata is distributed to the different >> fields. >> >> I hope my problem description is sufficient. Does anybody have any thoughts >> on that subject? >> >> Best regards, >> Tobias > -- Lance Norskog goks...@gmail.com
Hightlighting and excerpt
Hi, Two separate things asked in one thread... I am crawling my websites with Nutch. When I index them, I'd like to be able to highlight my keyword and display an excerpt containing that keyword. I found a solution with highlight, but what can I do about the excerpt? Thanks and regards,
Re: Poll: What do you use for Solr performance monitoring?
Hi Otis, done :) So far we use Graphite, Ganglia and Zabbix, and JStatsD for our JVM monitoring. Best regards Vadim 2012/5/31 Otis Gospodnetic : > Hi, > > Super quick poll: What do you use for Solr performance monitoring? > Vote here: > http://blog.sematext.com/2012/05/30/poll-what-do-you-use-for-solr-performance-monitoring/ > > > I'm collecting data for my Berlin Buzzwords talk that will touch on Solr, so > your votes will be greatly appreciated! > > Thanks, > Otis
Re: how to read fieldValueCacheStatistics
ok, thanks a lot for the answer. Elisabeth 2012/5/31 Chris Hostetter > > : When I read fieldValueCache statistics I have something that looks like > : > : item_ABC_FACET : > : > {field=ABC_FACET,memSize=4224,tindexSize=32,time=92,phase1=92,nTerms=0,bigTerms=0,termInstances=0,uses=11} > : > : > : is there a doc somewhere that explains what are > > ...technically that's one stat, showing you an "UnInvertedField" > instance in the cache (that's the string-ification of that > UnInvertedField) > > the specifics of what those numbers mean are definitely what i would > consider "expert level" ... off the top of my head the only ones i am > fairly sure of are: > > memSize - how many bytes of ram it's using > time - how long it took to build > nTerms - number of unique terms in that field > bigTerms - number of "big" terms, ie: terms that have such a high docFreq, > they weren't un-inverted because it would be too inefficient. > > In general, this level of detail is the kind of thing where you should > probably review the code. > > > -Hoss >
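For anyone who wants to track these numbers over time, the stat string quoted above is easy to split into key/value pairs. This is a small Python sketch that assumes the string keeps the {key=value,...} shape shown in the thread. Only memSize, time, nTerms and bigTerms have a meaning given by Hoss; the other keys are passed through without interpretation.

```python
def parse_uninverted_stats(text):
    """Split a stringified UnInvertedField stat into a dict.

    Assumes the '{key=value,key=value}' shape quoted in the thread.
    """
    body = text.strip().strip("{}")
    stats = {}
    for pair in body.split(","):
        key, _, value = pair.partition("=")
        # Numeric values become ints; everything else stays a string.
        stats[key] = int(value) if value.isdigit() else value
    return stats


SAMPLE = (
    "{field=ABC_FACET,memSize=4224,tindexSize=32,time=92,phase1=92,"
    "nTerms=0,bigTerms=0,termInstances=0,uses=11}"
)
stats = parse_uninverted_stats(SAMPLE)
```

With the sample from Elisabeth's message, stats["memSize"] is the integer 4224 and stats["field"] is the string "ABC_FACET".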