Re: Use of solr.ASCIIFoldingFilterFactory
Hello, thanks: your response solved my problem. Thanks for everything.

On Sun, Feb 7, 2010 at 4:00 PM, Sven Maurmann sven.maurm...@kippdata.de wrote:

Hi,

you might have run into an encoding problem. If you use Tomcat as the container for Solr you should probably consult the following: http://wiki.apache.org/solr/SolrTomcat#URI_Charset_Config

Cheers,
Sven

--On Friday, 5 February 2010 15:41 +0100 Yann PICHOT ypic...@gmail.com wrote:

Hi,

I have defined this type in my schema.xml file:

  <fieldType name="text" class="solr.TextField" positionIncrementGap="100">
    <analyzer type="index">
      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      <filter class="solr.ASCIIFoldingFilterFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
    <analyzer type="query">
      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      <filter class="solr.ASCIIFoldingFilterFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldType>

Field definitions:

  <fields>
    <field name="id" type="string" indexed="true" stored="true" required="true"/>
    <field name="idProd" type="string" indexed="false" stored="false" required="false"/>
    <field name="description" type="text" indexed="true" stored="true" required="false"/>
    <field name="artiste" type="text" indexed="true" stored="true" required="false"/>
    <field name="collection" type="text" indexed="true" stored="true" required="false"/>
    <field name="titre" type="text" indexed="true" stored="true" required="false"/>
    <field name="all" type="text" indexed="true" stored="true" required="false"/>
  </fields>

  <copyField source="description" dest="all"/>
  <copyField source="collection" dest="all"/>
  <copyField source="artiste" dest="all"/>
  <copyField source="titre" dest="all"/>

I imported my documents with DataImportHandler (my original documents are in an RDBMS). I tested this query string on the Solr web application: all:chateau.

Results (content of the field "all"):

  CHATEAU D'AMBOISE [CHATEAU EN FRANCE, BABELON]
  ope dvd rene chateau
  CHATEAU DE LA LOIRE DE CHATEAU EN CHATEAU ENTRE LA LOIRE ET LE CHER
  [LE CHATEAU AMBULANT, HAYAO MIYAZAKI]
  [Chambres d'hôtes au château, Moreau]
  [ARCHIMEDE, LA VIE DE CHATEAU, KRAHENBUHL]
  [NEUF, NAISSANCE D UN CHATEAU FORT, MACAULAY]
  [ARCHIMEDE, LA VIE DE CHATEAU, KRAHENBUHL]

Now I try this query string: all:château. No results :( I don't understand: I would expect the second query to return the same results as the first, but that is not the case.

I am using Solr 1.4 (Solr Implementation Version: 1.4.0 833479 - grantingersoll - 2009-11-06 12:33:40).
Java 32-bit: Java(TM) SE Runtime Environment (build 1.6.0_17-b04)
OS: Windows 7, 64-bit

Regards,
-- Yann
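As an aside: the folding that solr.ASCIIFoldingFilterFactory performs on Latin-1 accents can be approximated with the JDK's java.text.Normalizer. The sketch below (class and method names are mine, not Solr's) shows that "chateau" and "château" reduce to the same token once the query actually reaches the analyzer intact, which is why a URI-encoding problem, as Sven suggests, is the likely culprit:

```java
import java.text.Normalizer;

public class AsciiFoldDemo {
    // Decompose accented characters (NFD) and strip the combining marks,
    // approximating what solr.ASCIIFoldingFilterFactory does for accented Latin letters.
    static String fold(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFD)
                .replaceAll("\\p{InCombiningDiacriticalMarks}+", "");
    }

    public static void main(String[] args) {
        System.out.println(fold("château")); // chateau
    }
}
```

If the container decodes the query URL with the wrong charset, the analyzer never sees the real "é" at all, and no amount of folding can recover it.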
DataImportHandler - case sensitivity of column names
I encountered a problem with Oracle converting column names to upper case. As a result the SolrInputDocument is created with field names in upper case, and a "Document [null] missing required field: id" exception is thrown (although the id field is defined). I do not specify <field> elements explicitly. I know that I can rewrite all my queries into the select id as "id", body as "body" from document format, but is there any other workaround for this? A case-insensitive option or something? Here's my data-config:

  <dataConfig>
    <dataSource convertType="true" driver="oracle.jdbc.driver.OracleDriver"
                password="oracle" url="jdbc:oracle:thin:@localhost:1521:xe" user="SYSTEM"/>
    <document name="items">
      <entity name="root" pk="id" preImportDeleteQuery="db:db1"
              query="select id, body from document" transformer="TemplateTransformer">
        <entity name="nested1" query="select category from document_category where doc_id='${root.id}'"/>
        <entity name="nested2" query="select tag from document_tag where doc_id='${root.id}'"/>
        <field column="db" template="db1"/>
      </entity>
    </document>
  </dataConfig>

Alexey
How to configure multiple data import types
I have got a dataimport request handler configured to index data by selecting data from a DB view. I now need to index additional data sets from other views so that I can support other search queries. I defined additional <entity ...> definitions within the <document ...> section of my data-config.xml, but I only seem to pull in data for the first <entity ...> and not both. Is there an XSD (or DTD) for data-config.xml, schema.xml, or solrconfig.xml? These might help with understanding how to construct usable conf files.

Regards
Stefan Maric
BT Innovate Design | Collaboration Platform - Customer Innovation Solutions
Re: SV: Running Solr (LucidWorks) as a Windows Server
Assuming you have the example running from the example folder in the standard distribution by doing "java -jar start.jar", this is what I did to get the same running as a service:

1. Download the Jetty distribution (I used 6.1.21).
2. Copy the bin folder over to example.
3. Copy etc\jetty-win32-service.xml to example\etc.
4. Copy the lib\win32 folder over to example\lib.
5. In example\bin\jetty-service.conf, add this after wrapper.java.additional.2=-Djetty.logs=../logs:

     wrapper.java.additional.3=-Dsolr.solr.home=../solr

6. In example\solr\conf, change this line:

     <dataDir>${solr.data.dir:./solr/data}</dataDir>

   to

     <dataDir>${solr.data.dir:../solr/data}</dataDir>

   (double .. instead of a single . before /solr)

7. Then, as administrator, from a cmd prompt in the bin folder, run:

     Jetty-Service.exe --install jetty-service.conf
     net start jetty6-service

HTH
Ron

- Original Message -
From: Roland Villemoes r...@alpha-solutions.dk
To: solr-user@lucene.apache.org
Sent: Friday, 5 February, 2010 6:07:03 PM
Subject: SV: Running Solr (LucidWorks) as a Windows Server

Hi All,

Thanks a lot for your help on this. I have tried to use the Win32Wrapper and the Jetty-Service.exe, but still no success. I was actually hoping that some of you guys out there had a running copy, so I could see how to configure it. Looks like it must go the Tomcat way...

Roland

-Original Message-
From: Ron Chan [mailto:rc...@i-tao.com]
Sent: 5 February 2010 12:55
To: solr-user@lucene.apache.org
Subject: Re: Running Solr (LucidWorks) as a Windows Server

jetty can be run as a Windows Service, see http://docs.codehaus.org/display/JETTY/Win32Wrapper

- Original Message -
From: Roland Villemoes r...@alpha-solutions.dk
To: solr-user@lucene.apache.org
Sent: Thursday, 4 February, 2010 7:18:57 PM
Subject: Running Solr (LucidWorks) as a Windows Server

Hi,

I need to have Solr/Jetty running as a Windows Service. I am using the Lucid distribution. Does anyone have a running example and tool for this?

med venlig hilsen/best regards
Re: How to configure multiple data import types
Are you referring to nested entities? http://wiki.apache.org/solr/DIHQuickStart#Index_data_from_multiple_tables_into_Solr

On Mon, Feb 8, 2010 at 5:42 PM, stefan.ma...@bt.com wrote:

I have got a dataimport request handler configured to index data by selecting data from a DB view. I now need to index additional data sets from other views so that I can support other search queries. I defined additional <entity ...> definitions within the <document ...> section of my data-config.xml, but I only seem to pull in data for the first <entity ...> and not both. Is there an XSD (or DTD) for data-config.xml, schema.xml, or solrconfig.xml? These might help with understanding how to construct usable conf files.

Regards
Stefan Maric
BT Innovate Design | Collaboration Platform - Customer Innovation Solutions

--
- Noble Paul | Systems Architect | AOL | http://aol.com
RE: How to configure multiple data import types
No, my views have already taken care of pulling the related data together. I've indexed my first data set and now want to configure a second (non-related) data set, so that a user can issue a query for data set #1 whilst another user might be querying data set #2.

Should I be defining multiple <document ...> or <entity ...> entries? Or what?

Thanks
Stefan Maric
Re: How to configure multiple data import types
On Mon, Feb 8, 2010 at 6:03 PM, stefan.ma...@bt.com wrote:

No, my views have already taken care of pulling the related data together. I've indexed my first data set and now want to configure a second (non-related) data set, so that a user can issue a query for data set #1 whilst another user might be querying data set #2. Should I be defining multiple <document ...> or <entity ...> entries? Or what?

You can define multiple entities (all at the root level) to import all your views at once.

--
Regards,
Shalin Shekhar Mangar.
Re: DataImportHandler - case sensitivity of column names
On Mon, Feb 8, 2010 at 3:59 PM, Alexey Serba ase...@gmail.com wrote:

I encountered a problem with Oracle converting column names to upper case. As a result the SolrInputDocument is created with field names in upper case, and a "Document [null] missing required field: id" exception is thrown (although the id field is defined). I do not specify <field> elements explicitly. I know that I can rewrite all my queries into the select id as "id", body as "body" from document format, but is there any other workaround for this? A case-insensitive option or something? Here's my data-config:

  <dataConfig>
    <dataSource convertType="true" driver="oracle.jdbc.driver.OracleDriver"
                password="oracle" url="jdbc:oracle:thin:@localhost:1521:xe" user="SYSTEM"/>
    <document name="items">
      <entity name="root" pk="id" preImportDeleteQuery="db:db1"
              query="select id, body from document" transformer="TemplateTransformer">
        <entity name="nested1" query="select category from document_category where doc_id='${root.id}'"/>
        <entity name="nested2" query="select tag from document_tag where doc_id='${root.id}'"/>
        <field column="db" template="db1"/>
      </entity>
    </document>
  </dataConfig>

Fields are imported in a case-insensitive manner as long as they are not specified explicitly. In this case, however, the problem is that ${root.id} is case sensitive. There is no way right now to resolve variables in a case-insensitive manner.

--
Regards,
Shalin Shekhar Mangar.
Request time out in solr
Hi,

I have indexed the Solr index with DIH. I am using the WebService::Solr Perl module to update/delete my Solr index at run time from the frontend. I want to know how I can set a request timeout, either on the Perl (WebService::Solr) end or the Solr end, so that I can handle request-timeout exceptions.

--
Thank you,
Vijayant Kumar
Software Engineer
Website Toolbox Inc.
http://www.websitetoolbox.com
1-800-921-7803 x211
Multi-word synonyms containing commas
Hi,

is it possible to have a synonym file where single synonyms can also contain commas, e.g. names like "Washington, George"? Perhaps it would suffice to tell the SynonymFilterFactory to use another separator character (instead of the comma)? I tried this and changed the line where the parseRules method is called in the original implementation of SynonymFilterFactory (simply replacing "," with "#"), but this didn't work as expected.

Thanks
Matthias
unloading a solr core doesn't free any memory
To me it doesn't look like unloading a Solr core frees the memory that the core was using. Is this how it should be?

I have a big index with 50 million documents. After loading a core, Solr takes 300 MB of RAM. After a query with a couple of sort fields, Solr takes about 8 GB of RAM. Then I unload the core (CoreAdminRequest.unloadCore); the core is no longer shown in /solr/, yet Solr still takes 8 GB of RAM. Creating new cores is super slow because I have hardly any memory left. Do I need to free the memory explicitly somehow?

/Tim
RE: How to configure multiple data import types
It sounds like you are doing it correctly, Stefan; it must be something syntactical. The schema.xml and solrconfig.xml do not factor into your problem, only the data-config. I do the same thing you are trying to do. A watered-down version is:

  <dataConfig>
    <dataSource type="JdbcDataSource" name="bdb-1"
                driver="oracle.jdbc.driver.OracleDriver"
                url="jdbc:oracle:thin:@(DESCRIPTION = (LOAD_BALANCE = on) (FAILOVER = on) (ADDRESS_LIST = (ADDRESS = (PROTOCOL = TCP)(HOST = server.domain.com)(PORT = 1528))) (CONNECT_DATA = (SERVICE_NAME = instance.domain.COM)))"
                user="scott" password="tiger"/>
    <document name="monitors">
      <entity name="bdbmon" dataSource="bdb-1" query="SELECT column from table">
      </entity>
      <entity name="bug" dataSource="bdb-1" query="SELECT another_column from another_table">
      </entity>
    </document>
  </dataConfig>

Hope this helps...

-Original Message-
From: stefan.ma...@bt.com [mailto:stefan.ma...@bt.com]
Sent: Monday, February 08, 2010 7:34 AM
To: solr-user@lucene.apache.org; noble.p...@gmail.com
Subject: RE: How to configure multiple data import types

No, my views have already taken care of pulling the related data together. I've indexed my first data set and now want to configure a second (non-related) data set, so that a user can issue a query for data set #1 whilst another user might be querying data set #2. Should I be defining multiple <document ...> or <entity ...> entries? Or what?

Thanks
Stefan Maric
trouble with DTD
hi @all,

using Solr and the dataimport stuff, importing ends in a RuntimeException:

Caused by: java.lang.RuntimeException: [com.ctc.wstx.exc.WstxLazyException] com.ctc.wstx.exc.WstxParsingException: Undeclared general entity "eacute" at [row,col {unknown-source}]: [49,23]

Browsing the code shows that DTD support is disabled. Is there any other way to get entity parsing to work? Am I the only one using entities in XML? I'm trying to import DBLP XML data into Solr.

--
Jens Kapitza
Re: trouble with DTD
Are you sure this isn't just a typo, "eacute" for "execute"?

On Mon, Feb 8, 2010 at 9:15 AM, Jens Kapitza j.kapi...@schwarze-allianz.de wrote:

hi @all,

using Solr and the dataimport stuff, importing ends in a RuntimeException:

Caused by: java.lang.RuntimeException: [com.ctc.wstx.exc.WstxLazyException] com.ctc.wstx.exc.WstxParsingException: Undeclared general entity "eacute" at [row,col {unknown-source}]: [49,23]

Browsing the code shows that DTD support is disabled. Is there any other way to get entity parsing to work? Am I the only one using entities in XML? I'm trying to import DBLP XML data into Solr.

--
Jens Kapitza
Re: trouble with DTD
On 2/8/2010 3:15 PM, Jens Kapitza wrote:

hi @all, using Solr and the dataimport stuff, importing ends in a RuntimeException:

Caused by: java.lang.RuntimeException: [com.ctc.wstx.exc.WstxLazyException] com.ctc.wstx.exc.WstxParsingException: Undeclared general entity "eacute" at [row,col {unknown-source}]: [49,23]

&eacute; is an entity defined for (X)HTML. XML itself only defines &quot;, &amp;, &apos;, &lt; and &gt;. So if you want to use the é character you'll have to either use the character itself or a numeric character reference such as &#x00E9;.

Regards,
gwk
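gwk's point can be checked with the JDK's own XML parser: a numeric character reference parses fine, while the HTML-only entity &eacute; is rejected as undeclared (class and method names below are mine):

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class XmlEntityDemo {
    // Parse a small XML snippet and return the root element's text content.
    static String textOf(String xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        // Numeric character references are always legal XML:
        System.out.println(textOf("<title>Caf&#xE9;</title>"));
        // &eacute; is an (X)HTML entity, not XML; the parser throws:
        try {
            textOf("<title>Caf&eacute;</title>");
        } catch (org.xml.sax.SAXException e) {
            System.out.println("undeclared entity rejected");
        }
    }
}
```

So for DBLP-style input, either pre-substitute the HTML entities with their characters (or numeric references) before handing the XML to DataImportHandler, or supply a DTD that declares them.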
Dynamic fields with more than 100 fields inside
Hey,

I'm thinking about using dynamic fields. I need one or more user-specific fields in my schema, for example concept_user_*, and maybe more than 200 users will be using this feature. Each user will send and retrieve values from its own field, which will then be used to filter results. How would this impact query performance?

Thanks,
Xavier S.
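For reference, the kind of declaration described above is a single dynamicField entry in schema.xml; a sketch (the pattern follows the poster's naming, the type is illustrative):

```xml
<!-- one pattern covers concept_user_1, concept_user_2, ... -->
<dynamicField name="concept_user_*" type="string" indexed="true" stored="true"/>
```

Each concrete field (concept_user_42, concept_user_43, ...) behaves like an ordinary indexed field at query time, e.g. fq=concept_user_42:some_value; the dynamic-field pattern only changes how the schema is declared, not how individual fields are searched.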
Re: unloading a solr core doesn't free any memory
What garbage collection parameters is the JVM using? The memory will not always be freed immediately after an event like unloading a core or starting a new searcher.

2010/2/8 Tim Terlegård tim.terleg...@gmail.com:

To me it doesn't look like unloading a Solr core frees the memory that the core was using. Is this how it should be? I have a big index with 50 million documents. After loading a core, Solr takes 300 MB of RAM. After a query with a couple of sort fields, Solr takes about 8 GB of RAM. Then I unload the core (CoreAdminRequest.unloadCore); the core is no longer shown in /solr/, yet Solr still takes 8 GB of RAM. Creating new cores is super slow because I have hardly any memory left. Do I need to free the memory explicitly somehow?

/Tim
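When deciding whether that memory is actually leaked or merely not yet collected, it can help to suggest a GC and then read the heap figures the JVM itself reports; a rough sketch (class and method names are mine):

```java
public class HeapCheck {
    // Approximate used heap in MB after suggesting a collection.
    static long usedHeapMB() {
        Runtime rt = Runtime.getRuntime();
        System.gc(); // only a hint; the JVM is free to ignore it
        return (rt.totalMemory() - rt.freeMemory()) / (1024 * 1024);
    }

    public static void main(String[] args) {
        System.out.println("used heap: " + usedHeapMB() + " MB");
    }
}
```

If the used-heap figure drops well below 8 GB after a forced collection, the memory was simply uncollected garbage; if it stays high, something is still holding references to the unloaded core's caches.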
Call URL, simply parse the results using SolrJ
Sorry for the poorly worded title... For SOLR-1761 I want to pass in a URL and parse the query response... However it's non-obvious to me how to do this using the SolrJ API, hence asking the experts here. :)
Indexing / querying multiple data types
OK - so I've now got my data-config.xml sorted so that I'm pulling in the expected number of indexed documents for my two data sets. I've defined two entities (name1 and name2) and they both make use of the same fields; I'm not sure if this is a good thing to have done.

When I run a query I include qt=name1 (or qt=name2) and am expecting to only get results from the appropriate data set; in fact I'm getting the sum total from both. Does <entity name="name1"> equate to the query parameter qt=name1?

In my solrconfig.xml I have defined two requestHandlers (name1 and name2) using the common set of fields. So how do I ensure that my query

  http://localhost:7001/solr/select/?q=food&qt=name1
or
  http://localhost:7001/solr/select/?q=food&qt=name2

will operate on the correct data set as loaded via the data import (<entity name="name1"> or <entity name="name2">)?

Thanks
Stefan Maric
BT Innovate Design | Collaboration Platform - Customer Innovation Solutions
Re: Multi-word synonyms containing commas
Hi, is it possible to have a synonym file where single synonyms can also contain commas, e.g. names like "Washington, George"?

Sure, you just need to escape that comma, e.g.:

  Washington\, George, wg
  a\,a => b\,b
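The escaping rule above can be illustrated with a small splitter that honors backslash-escaped commas; this is a sketch of the behavior, not Solr's actual SynonymFilterFactory parser:

```java
import java.util.ArrayList;
import java.util.List;

public class SynonymRuleSplit {
    // Split a synonym rule on commas, treating "\," as a literal comma
    // inside a synonym (a sketch of the escaping behavior, not Solr code).
    static List<String> splitRule(String rule) {
        List<String> parts = new ArrayList<>();
        StringBuilder cur = new StringBuilder();
        for (int i = 0; i < rule.length(); i++) {
            char c = rule.charAt(i);
            if (c == '\\' && i + 1 < rule.length()) {
                cur.append(rule.charAt(++i));     // keep the escaped character literally
            } else if (c == ',') {
                parts.add(cur.toString().trim()); // unescaped comma ends a synonym
                cur.setLength(0);
            } else {
                cur.append(c);
            }
        }
        parts.add(cur.toString().trim());
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(splitRule("Washington\\, George, wg"));
        // two synonyms: "Washington, George" and "wg"
    }
}
```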
Re: Call URL, simply parse the results using SolrJ
So here's what happens if I pass in a URL with parameters; SolrJ chokes:

Exception in thread "main" java.lang.RuntimeException: Invalid base url for solrj. The base URL must not contain parameters: http://locahost:8080/solr/main/select?q=video&qt=dismax
    at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.<init>(CommonsHttpSolrServer.java:205)
    at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.<init>(CommonsHttpSolrServer.java:180)
    at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.<init>(CommonsHttpSolrServer.java:152)
    at org.apache.solr.util.QueryTime.main(QueryTime.java:20)

On Mon, Feb 8, 2010 at 9:32 AM, Jason Rutherglen jason.rutherg...@gmail.com wrote:

Sorry for the poorly worded title... For SOLR-1761 I want to pass in a URL and parse the query response. However it's non-obvious to me how to do this using the SolrJ API, hence asking the experts here. :)
Re: Call URL, simply parse the results using SolrJ
Here's what I did to resolve this:

    XMLResponseParser parser = new XMLResponseParser();
    URL urlo = new URL(url);
    InputStreamReader isr = new InputStreamReader(urlo.openConnection().getInputStream());
    NamedList<Object> namedList = parser.processResponse(isr);
    QueryResponse response = new QueryResponse(namedList, null);

On Mon, Feb 8, 2010 at 10:03 AM, Jason Rutherglen jason.rutherg...@gmail.com wrote:

So here's what happens if I pass in a URL with parameters; SolrJ chokes:

Exception in thread "main" java.lang.RuntimeException: Invalid base url for solrj. The base URL must not contain parameters: http://locahost:8080/solr/main/select?q=video&qt=dismax
    at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.<init>(CommonsHttpSolrServer.java:205)
    at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.<init>(CommonsHttpSolrServer.java:180)
    at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.<init>(CommonsHttpSolrServer.java:152)
    at org.apache.solr.util.QueryTime.main(QueryTime.java:20)

On Mon, Feb 8, 2010 at 9:32 AM, Jason Rutherglen jason.rutherg...@gmail.com wrote:

Sorry for the poorly worded title... For SOLR-1761 I want to pass in a URL and parse the query response. However it's non-obvious to me how to do this using the SolrJ API, hence asking the experts here. :)
Re: Call URL, simply parse the results using SolrJ
So here's what happens if I pass in a URL with parameters; SolrJ chokes:

Exception in thread "main" java.lang.RuntimeException: Invalid base url for solrj. The base URL must not contain parameters: http://locahost:8080/solr/main/select?q=video&qt=dismax

You can't pass a URL with parameters to the CommonsHttpSolrServer constructor. You need to create a SolrQuery representing your parameters and values. Your URL can be translated into something like:

    server = new CommonsHttpSolrServer("http://locahost:8080/solr/main/");
    final SolrQuery query = new SolrQuery();
    query.setQueryType("dismax");
    query.setQuery("video");
    final QueryResponse rsp = server.query(query);
RE: Multi-word synonyms containing commas
OK, that works (now I've found it in the example synonyms file too...). But what if I overwrite the synonyms file after Solr startup? Is core reloading the only way to do this? I'm thinking of these steps:

1. Generate the new synonyms file
2. Reload the core and wait a minute
3. Re-index (as I'm using synonyms at index time)

-Original Message-
From: Ahmet Arslan [mailto:iori...@yahoo.com]
Sent: Monday, 08 February 2010 18:48
To: solr-user@lucene.apache.org
Subject: Re: Multi-word synonyms containing commas

Hi, is it possible to have a synonym file where single synonyms can also contain commas, e.g. names like "Washington, George"?

Sure, you just need to escape that comma, e.g.:

  Washington\, George, wg
  a\,a => b\,b
Re: Call URL, simply parse the results using SolrJ
Ahmet,

Thanks, though that isn't quite what I was going for, and it's resolved besides...

On Mon, Feb 8, 2010 at 10:24 AM, Ahmet Arslan iori...@yahoo.com wrote:

So here's what happens if I pass in a URL with parameters; SolrJ chokes:

Exception in thread "main" java.lang.RuntimeException: Invalid base url for solrj. The base URL must not contain parameters: http://locahost:8080/solr/main/select?q=video&qt=dismax

You can't pass a URL with parameters to the CommonsHttpSolrServer constructor. You need to create a SolrQuery representing your parameters and values. Your URL can be translated into something like:

    server = new CommonsHttpSolrServer("http://locahost:8080/solr/main/");
    final SolrQuery query = new SolrQuery();
    query.setQueryType("dismax");
    query.setQuery("video");
    final QueryResponse rsp = server.query(query);
Trouble parsing XML from replication?command=status
Via Firefox on Ubuntu I downloaded the results of replication?command=status to a file, then wrote a little app to parse out the XML. Unfortunately it's not parsing. I'm wondering if it's because it's in XML, which nothing in Solr itself seems to parse for this handler (SnapPuller, for example, uses javabin).

Caused by: javax.xml.stream.XMLStreamException: ParseError at [row,col]:[3,1908]
Message: error reading value:LST
    at org.apache.solr.client.solrj.impl.XMLResponseParser.readArray(XMLResponseParser.java:319)
    at org.apache.solr.client.solrj.impl.XMLResponseParser.readNamedList(XMLResponseParser.java:240)
    at org.apache.solr.client.solrj.impl.XMLResponseParser.readNamedList(XMLResponseParser.java:239)
    at org.apache.solr.client.solrj.impl.XMLResponseParser.processResponse(XMLResponseParser.java:125)
    ... 3 more
Re: Collating results from multiple indexes
Hi,

There is no JOIN functionality in Solr. The common solution is either to accept the high-volume update churn, or to add client-side code that builds a join layer on top of the two indices. I know that Attivio (www.attivio.com) has built some kind of JOIN functionality on top of Solr in their AIE product, but I do not know the details or the actual performance. Why not open a JIRA issue, if there is no such issue already, to request this as a feature?

--
Jan Høydahl - search architect
Cominvent AS - www.cominvent.com

On 25. jan. 2010, at 22.01, Aaron McKee wrote:

Is there any somewhat convenient way to collate/integrate fields from separate indices during result writing, if the indices use the same unique keys? Basically, some sort of cross-index JOIN?

As a bit of background, I have a rather heavyweight dataset of every US business (~25m records, an on-disk index footprint of ~30 GB, and 5-10 hours to fully index on a decent box). Given the size and relative stability of the dataset, I generally only update this monthly. However, I have separate advertising-related datasets that need to be updated either hourly or daily (e.g. today's coupon, click revenue remaining, etc.). These advertiser feeds reference the same keyspace that I use in the main index, but are otherwise significantly lighter weight; importing and indexing them discretely only takes a couple of minutes. Given that Solr/Lucene doesn't support updating a field without dropping and re-adding the entire document, it doesn't seem practical to integrate this data into the main index (the system would be under a constant state of churn if we did document re-inserts, and the performance impact would probably be debilitating). It may be nice if this data could participate in filtering (e.g. only show advertisers), but it doesn't need to participate in scoring/ranking. I'm guessing that someone else has had a similar need at some point?
I can have our front-end query the smaller indices separately, using the keys returned by the primary index, but would prefer to avoid the extra sequential roundtrips. I'm hoping to also avoid a coding solution, if only to avoid the maintenance overhead as we drop in new builds of Solr, but that's also feasible. Thank you for your insight, Aaron
DataImportHandler
It looks like the dataimporter.functions.escapeSql(String) function escapes quotes but fails to escape '\' characters, which are problematic especially when the field value ends in a \. Also, on failure, I get an alarming notice of a possible resource leak. I couldn't find Jira issues for either.

-Sean

(field names and data below have been sanitized)

Config query line:

query="SELECT SUM(fielda) AS A, SUM(fieldb) AS B FROM tablea where fieldc='${dataimporter.functions.escapeSql(outer_entity.fieldc)}'"

SEVERE: Full Import failed
org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to execute query: SELECT SUM(fielda) AS A, SUM(fieldb) AS B FROM tablea where fieldc='somedata\' Processing Document # 1587
    at org.apache.solr.handler.dataimport.DataImportHandlerException.wrapAndThrow(DataImportHandlerException.java:72)
    at org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.<init>(JdbcDataSource.java:253)
    at org.apache.solr.handler.dataimport.JdbcDataSource.getData(JdbcDataSource.java:210)
    at org.apache.solr.handler.dataimport.JdbcDataSource.getData(JdbcDataSource.java:39)
    at org.apache.solr.handler.dataimport.SqlEntityProcessor.initQuery(SqlEntityProcessor.java:58)
    at org.apache.solr.handler.dataimport.SqlEntityProcessor.nextRow(SqlEntityProcessor.java:71)
    at org.apache.solr.handler.dataimport.EntityProcessorWrapper.nextRow(EntityProcessorWrapper.java:237)
    at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:357)
    at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:383)
    at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:242)
    at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:180)
    at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:331)
    at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:389)
    at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:370)
Caused by: com.mysql.jdbc.exceptions.MySQLSyntaxErrorException: You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near ''somedata\'' at line 1
    at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:936)
    at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:2985)
    at com.mysql.jdbc.MysqlIO.sendCommand(MysqlIO.java:1631)
    at com.mysql.jdbc.MysqlIO.sqlQueryDirect(MysqlIO.java:1723)
    at com.mysql.jdbc.Connection.execSQL(Connection.java:3277)
    at com.mysql.jdbc.Connection.execSQL(Connection.java:3206)
    at com.mysql.jdbc.Statement.execute(Statement.java:727)
    at org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.<init>(JdbcDataSource.java:246)
    ... 12 more
Feb 8, 2010 3:22:51 PM org.apache.solr.update.DirectUpdateHandler2 rollback
INFO: start rollback
Feb 8, 2010 3:22:51 PM org.apache.solr.update.DirectUpdateHandler2 rollback
INFO: end_rollback
Feb 8, 2010 3:22:53 PM org.apache.solr.update.SolrIndexWriter finalize
SEVERE: SolrIndexWriter was not closed prior to finalize(), indicates a bug -- POSSIBLE RESOURCE LEAK!!!
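A sketch of what a backslash-aware variant of escapeSql could look like for MySQL-style string literals (this is a hypothetical helper, not the actual DataImportHandler code): escape backslashes first so they are not double-escaped, then double up single quotes.

```java
public class SqlEscape {
    // Hypothetical backslash-aware escapeSql: backslashes first,
    // then single quotes, following MySQL string-literal rules.
    static String escapeSql(String s) {
        return s.replace("\\", "\\\\").replace("'", "''");
    }

    public static void main(String[] args) {
        // A value ending in a backslash no longer swallows the closing quote:
        System.out.println("fieldc='" + escapeSql("somedata\\") + "'");
        // prints: fieldc='somedata\\'
    }
}
```

Using a JDBC PreparedStatement with bind parameters would avoid the escaping problem entirely, but DIH's templated query strings don't offer that here.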
Re: Trouble parsing XML from replication?command=status
javabin parses fine, which leads me to believe there's a bug lurking... though I'm not going to spend time solving it.

On Mon, Feb 8, 2010 at 12:18 PM, Jason Rutherglen jason.rutherg...@gmail.com wrote:

Via Firefox on Ubuntu I downloaded the results of replication?command=status to a file, then wrote a little app to parse out the XML. Unfortunately it's not parsing. I'm wondering if it's because it's in XML, which nothing in Solr itself seems to parse for this handler (SnapPuller, for example, uses javabin).

Caused by: javax.xml.stream.XMLStreamException: ParseError at [row,col]:[3,1908]
Message: error reading value:LST
    at org.apache.solr.client.solrj.impl.XMLResponseParser.readArray(XMLResponseParser.java:319)
    at org.apache.solr.client.solrj.impl.XMLResponseParser.readNamedList(XMLResponseParser.java:240)
    at org.apache.solr.client.solrj.impl.XMLResponseParser.readNamedList(XMLResponseParser.java:239)
    at org.apache.solr.client.solrj.impl.XMLResponseParser.processResponse(XMLResponseParser.java:125)
    ... 3 more
TermInfosReader.get ArrayIndexOutOfBoundsException
Hello all,

After optimizing rather large indexes on 10 shards (each index holds about 500,000 documents and is about 270-300 GB in size), we started getting intermittent TermInfosReader.get() ArrayIndexOutOfBoundsExceptions. The exceptions sometimes occur on all 10 shards at the same time, and sometimes on one shard but not the others. We also sometimes get an Internal Server Error, but that might be either a cause or an effect of the array index out of bounds. Here is the top part of the message:

java.lang.ArrayIndexOutOfBoundsException: -14127432
    at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java:246)

Any suggestions for troubleshooting would be appreciated. Trace from the Tomcat logs appended below.

Tom Burton-West

---
Feb 5, 2010 8:09:02 AM org.apache.solr.common.SolrException log
SEVERE: java.lang.ArrayIndexOutOfBoundsException: -14127432
    at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java:246)
    at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java:218)
    at org.apache.lucene.index.SegmentReader.docFreq(SegmentReader.java:943)
    at org.apache.solr.search.SolrIndexReader.docFreq(SolrIndexReader.java:308)
    at org.apache.lucene.search.IndexSearcher.docFreq(IndexSearcher.java:144)
    at org.apache.lucene.search.Similarity.idf(Similarity.java:481)
    at org.apache.lucene.search.TermQuery$TermWeight.<init>(TermQuery.java:44)
    at org.apache.lucene.search.TermQuery.createWeight(TermQuery.java:146)
    at org.apache.lucene.search.BooleanQuery$BooleanWeight.<init>(BooleanQuery.java:186)
    at org.apache.lucene.search.BooleanQuery.createWeight(BooleanQuery.java:366)
    at org.apache.lucene.search.Query.weight(Query.java:95)
    at org.apache.lucene.search.Searcher.createWeight(Searcher.java:230)
    at org.apache.lucene.search.Searcher.search(Searcher.java:171)
    at org.apache.solr.search.SolrIndexSearcher.getDocSetNC(SolrIndexSearcher.java:651)
    at org.apache.solr.search.SolrIndexSearcher.getDocSet(SolrIndexSearcher.java:545)
    at org.apache.solr.search.SolrIndexSearcher.getDocSet(SolrIndexSearcher.java:581)
    at org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:903)
    at org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:884)
    at org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:341)
    at org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:176)
    at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:195)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1299)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:215)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188)
    at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:213)
    at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:172)
    at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:548)
    at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
    at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:117)
    at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:108)
    at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:174)
    at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:875)
    at org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(Http11BaseProtocol.java:665)
    at org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(PoolTcpEndpoint.java:528)
    at org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt(LeaderFollowerWorkerThread.java:81)
    at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:689)
    at java.lang.Thread.run(Thread.java:619)
Feb 5, 2010 8:09:02 AM org.apache.solr.common.SolrException log
SEVERE: org.apache.solr.common.SolrException: Internal Server Error

Internal Server Error

request: http://solr-sdr-search-10:8081/serve-10/select
    at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:423)
    at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:242)
    at
Re: Embedded Solr problem
Hi Ranveer, I assume that you have enough knowledge of Java. You should essentially write the code for instantiating the server once (depending on what you intend to do, this may be done in a separate class or in a method of the class doing the queries). Then you use this instance to handle all the queries, using for example the query method of SolrServer. For further information you may want to consult either the API documentation or the page http://wiki.apache.org/solr/Solrj on the wiki. Cheers, Sven --On Monday, 8 February 2010 08:53 +0530 Ranveer Kumar ranveer.s...@gmail.com wrote: Hi Sven, thanks for the reply. Yes, I noticed that a new instance of the Solr server is created on every request. Could you please guide me on how to do the initialization so that an instance of SolrServer is created once, during the first request? On Mon, Feb 8, 2010 at 2:11 AM, Sven Maurmann sven.maurm...@kippdata.dewrote: Hi, would it be possible that you instantiate a new instance of your SolrServer every time you do a query? You should use the code you quoted in your mail once during initialization to create an instance of SolrServer (the interface being implemented by EmbeddedSolrServer) and subsequently use the query method of SolrServer to do the query. Cheers, Sven --On Sunday, 7 February 2010 21:54 +0530 Ranveer Kumar ranveer.s...@gmail.com wrote: Hi All, I am still very new to Solr. Currently I am facing a problem using EmbeddedSolrServer.
Following is my code: File home = new File("D:/ranveer/java/solr_home/solr/first"); CoreContainer coreContainer = new CoreContainer(); SolrConfig config = null; config = new SolrConfig(home + "/core1", "solrconfig.xml", null); CoreDescriptor descriptor = new CoreDescriptor(coreContainer, "core1", home + "/core1"); SolrCore core = new SolrCore("core1", home + "/core1/data", config, new IndexSchema(config, "schema.xml", null), descriptor); coreContainer.register(core.getName(), core, true); final EmbeddedSolrServer server = new EmbeddedSolrServer(coreContainer, "core1"); Now my problem is that every time I make a search request, SolrCore initializes the core again. I want it to reuse the already-started core if one exists. Because of this, searching currently takes too much time. I tried closing the core after the search, but then the next search makes Solr start again from scratch. Please help. Thanks
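A minimal sketch of the "create once, reuse for every query" pattern Sven describes. The class and member names below are hypothetical; SearchService stands in for the class that would hold the EmbeddedSolrServer, and the expensive setup (CoreContainer, SolrCore registration, EmbeddedSolrServer construction) would live in the private constructor so it runs exactly once rather than per request.

```java
// Hypothetical sketch: lazily initialize the server handle exactly once.
class SearchService {
    private static int constructions = 0; // counts how often setup ran

    // Initialization-on-demand holder: the JVM class loader guarantees
    // INSTANCE is built lazily, exactly once, and in a thread-safe way.
    private static class Holder {
        static final SearchService INSTANCE = new SearchService();
    }

    private SearchService() {
        // In the real code this is where the CoreContainer / SolrCore /
        // EmbeddedSolrServer would be built; here we just record that
        // the heavyweight setup happened.
        constructions++;
    }

    static SearchService getInstance() {
        return Holder.INSTANCE;
    }

    static int timesConstructed() {
        return constructions;
    }

    String query(String q) {
        // Real code would delegate to the held SolrServer instance,
        // e.g. server.query(...); here we just echo the query string.
        return "results for: " + q;
    }
}
```

Every caller then uses `SearchService.getInstance().query(...)`, and the core is started only on the first request.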
Re: Indexing / querying multiple data types
Hi, could you be a little more precise about your configuration? It may be much easier to answer your question then. Cheers, Sven --On Monday, 8 February 2010 17:39 + stefan.ma...@bt.com wrote: OK - so I've now got my data-config.xml sorted so that I'm pulling in the expected number of indexed documents for my two data sets. So I've defined two entities (name1 & name2) and they both make use of the same fields -- I'm not sure if this is a good thing to have done. When I run a query I include qt=name1 (or qt=name2) and am expecting to only get the number of results from the appropriate data set -- in fact I'm getting the sum total from both. Does the entity name=name1 equate to the query qt=name1? In my solrconfig.xml I have defined two requestHandlers (name1 & name2) using the common set of fields. So how do I ensure that my query http://localhost:7001/solr/select/?q=food&qt=name1 or http://localhost:7001/solr/select/?q=food&qt=name2 will operate on the correct data set as loaded via the data import -- entity name=name1 or entity name=name2? Thanks Stefan Maric BT Innovate Design | Collaboration Platform - Customer Innovation Solutions
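For context: qt= only selects a request handler; it does not by itself restrict which imported entity's documents the query matches. One common pattern (sketched below with hypothetical field and handler names) is to add a field per entity recording its source, and have each handler append a filter query on that field:

```xml
<!-- Hypothetical sketch for solrconfig.xml: each handler appends an fq
     on a "source" field that data-config.xml would populate per entity. -->
<requestHandler name="name1" class="solr.SearchHandler">
  <lst name="appends">
    <str name="fq">source:name1</str>
  </lst>
</requestHandler>
<requestHandler name="name2" class="solr.SearchHandler">
  <lst name="appends">
    <str name="fq">source:name2</str>
  </lst>
</requestHandler>
```

With this in place, q=food&qt=name1 would only return documents imported by the name1 entity.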
DataImportHandler can't understand query
I have a complex query (it runs fine in the database) which I am trying to include in a DataImportHandler query. The query has case statements in it, e.g. case when (ASSIGNED_TO <> '' and TRANSLATE(ASSIGNED_TO, '', '0123456789')='') DataImportHandler fails to parse the query with the following error, complaining about the '<' symbol. How do I go about this? Note: the query is valid and runs fine in the database. [Fatal Error] :26:26: The value of attribute query associated with an element type entity must not contain the '<' character. Feb 8, 2010 6:02:09 PM org.apache.solr.handler.dataimport.DataImportHandler inform SEVERE: Exception while loading DataImporter org.apache.solr.handler.dataimport.DataImportHandlerException: Exception occurred while initializing context at org.apache.solr.handler.dataimport.DataImporter.loadDataConfig(DataImporter.java:190) Thanks, -- View this message in context: http://old.nabble.com/DataImportHandler-can%27t-understand-query-tp27507918p27507918.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: DataImportHandler can't understand query
Note I already tried to escape the character with \ but it still throws the same error. Any idea? Thanks, javaxmlsoapdev wrote: I have a complex query (it runs fine in the database) which I am trying to include in a DataImportHandler query. The query has case statements in it, e.g. case when (ASSIGNED_TO <> '' and TRANSLATE(ASSIGNED_TO, '', '0123456789')='') DataImportHandler fails to parse the query with the following error, complaining about the '<' symbol. How do I go about this? Note: the query is valid and runs fine in the database. [Fatal Error] :26:26: The value of attribute query associated with an element type entity must not contain the '<' character. Feb 8, 2010 6:02:09 PM org.apache.solr.handler.dataimport.DataImportHandler inform SEVERE: Exception while loading DataImporter org.apache.solr.handler.dataimport.DataImportHandlerException: Exception occurred while initializing context at org.apache.solr.handler.dataimport.DataImporter.loadDataConfig(DataImporter.java:190) Thanks, -- View this message in context: http://old.nabble.com/DataImportHandler-can%27t-understand-query-tp27507918p27508214.html Sent from the Solr - User mailing list archive at Nabble.com.
RE: DataImportHandler can't understand query
Did you try &lt; &gt; ? Nirmal Shah Remedy Consultant|Column Technologies|Cell: (630) 244-1648 -Original Message- From: javaxmlsoapdev [mailto:vika...@yahoo.com] Sent: Monday, February 08, 2010 5:42 PM To: solr-user@lucene.apache.org Subject: Re: DataImportHandler can't understand query Note I already tried to escape the character with \ but it still throws the same error. Any idea? Thanks, javaxmlsoapdev wrote: I have a complex query (it runs fine in the database) which I am trying to include in a DataImportHandler query. The query has case statements in it, e.g. case when (ASSIGNED_TO <> '' and TRANSLATE(ASSIGNED_TO, '', '0123456789')='') DataImportHandler fails to parse the query with the following error, complaining about the '<' symbol. How do I go about this? Note: the query is valid and runs fine in the database. [Fatal Error] :26:26: The value of attribute query associated with an element type entity must not contain the '<' character. Feb 8, 2010 6:02:09 PM org.apache.solr.handler.dataimport.DataImportHandler inform SEVERE: Exception while loading DataImporter org.apache.solr.handler.dataimport.DataImportHandlerException: Exception occurred while initializing context at org.apache.solr.handler.dataimport.DataImporter.loadDataConfig(DataImporter.java:190) Thanks, -- View this message in context: http://old.nabble.com/DataImportHandler-can%27t-understand-query-tp27507918p27508214.html Sent from the Solr - User mailing list archive at Nabble.com.
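To illustrate Nirmal's suggestion: data-config.xml is plain XML, so backslash escaping does not apply — <, >, and & inside the query attribute must be written as the XML entities &lt;, &gt;, and &amp;. A hedged sketch with a hypothetical entity and table:

```xml
<!-- Hypothetical sketch: the SQL operator <> must appear as XML
     entities inside the attribute value, since the config is parsed
     as XML before DataImportHandler ever sees the SQL. -->
<entity name="issue"
        query="select id, assigned_to,
               case when (assigned_to &lt;&gt; ''
                          and translate(assigned_to, '', '0123456789') = '')
                    then 'numeric' else 'other' end as kind
               from issues"/>
```

After the XML parser resolves the entities, the database driver receives the original SQL with the literal <> operator.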
Re: solr multicore and nfs
Solr generally does not work well over NFS. This looks like a transient NFS error; apps have to assume that NFS will randomly fail and that they have to try again. This may be due to a locking problem. There is a LockFactory class in Lucene that controls how indexes are shared between programs. Solr includes a control for this in solrconfig.xml. The Java apps and Solr apps have to agree on the lock strategy. And it has to work over NFS. http://wiki.apache.org/lucene-java/AvailableLockFactories solrconfig.xml: <!-- As long as Solr is the only process modifying your index, it is safe to use Lucene's in process locking mechanism. But you may specify one of the other Lucene LockFactory implementations in the event that you have a custom situation. none = NoLockFactory (typically only used with read only indexes) single = SingleInstanceLockFactory (suggested) native = NativeFSLockFactory simple = SimpleFSLockFactory ('simple' is the default for backwards compatibility with Solr 1.2) --> <lockType>single</lockType> On Thu, Feb 4, 2010 at 6:54 AM, Valérie TAESCH v.tae...@greenivory.com wrote: Hello, We are using Solr (v 1.3.0 694707 with Lucene version 2.4-dev 691741) in multicore mode with an average of 400 indexes (all indexes have the same structure). These indexes are stored on an NFS disk. A Java process writes continuously to these indexes while Solr is only used to read them.
We often got this exception : HTTP Status 500 - No such file or directory java.io.IOException: No such file or directory at java.io.RandomAccessFile.readBytes(Native Method) at java.io.RandomAccessFile.read(RandomAccessFile.java:322) at org.apache.lucene.store.FSDirectory$FSIndexInput.readInternal(FSDirectory.java:596) at org.apache.lucene.store.BufferedIndexInput.readBytes(BufferedIndexInput.java:136) at org.apache.lucene.index.CompoundFileReader$CSIndexInput.readInternal(CompoundFileReader.java:247) at org.apache.lucene.store.BufferedIndexInput.refill(BufferedIndexInput.java:157) at org.apache.lucene.store.BufferedIndexInput.readByte(BufferedIndexInput.java:38) at org.apache.lucene.store.IndexInput.readVInt(IndexInput.java:78) at org.apache.lucene.index.TermBuffer.read(TermBuffer.java:64) at org.apache.lucene.index.SegmentTermEnum.next(SegmentTermEnum.java:127) at org.apache.lucene.index.SegmentTermEnum.scanTo(SegmentTermEnum.java:158) at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java:270) at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java:217) at org.apache.lucene.index.SegmentReader.docFreq(SegmentReader.java:744) at org.apache.lucene.index.MultiSegmentReader.docFreq(MultiSegmentReader.java:375) at org.apache.lucene.search.IndexSearcher.docFreq(IndexSearcher.java:87) at org.apache.lucene.search.Similarity.idf(Similarity.java:457) at org.apache.lucene.search.TermQuery$TermWeight.init(TermQuery.java:44) at org.apache.lucene.search.TermQuery.createWeight(TermQuery.java:146) at org.apache.lucene.search.Query.weight(Query.java:95) at org.apache.lucene.search.Searcher.createWeight(Searcher.java:185) at org.apache.lucene.search.Searcher.search(Searcher.java:126) at org.apache.lucene.search.Searcher.search(Searcher.java:105) at org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:966) at org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:838) at 
org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:269) at org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:160) at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:169) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131) at org.apache.solr.core.SolrCore.execute(SolrCore.java:1204) at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:303) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:232) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206) at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233) at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191) at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128) at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102) at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109) at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:286) at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:845) at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583) at
Re: Is it posible to exclude results from other languages?
There is On Thu, Feb 4, 2010 at 10:07 AM, Raimon Bosch raimon.bo...@gmail.com wrote: Yes, it's true that we could do it at index time if we had a way to know. I was thinking of some solution at search time, maybe measuring the % of stopwords in each document. Normally, a document in another language won't have any stopword of the index's main language. If you know some external software to detect the language of a source text, that would be useful too. Thanks, Raimon Bosch. Ahmet Arslan wrote: In our indexes, sometimes we have some documents written in languages different from the index's most common language. Is there any way to give less boosting to these documents? If you are aware of those documents, at index time you can boost them with a value less than 1.0: <add> <doc boost="0.5"> <!-- document written in another language --> <field name="..">..</field> <field name="..">..</field> </doc> </add> http://wiki.apache.org/solr/UpdateXmlMessages#Optional_attributes_on_.22doc.22 -- View this message in context: http://old.nabble.com/Is-it-posible-to-exclude-results-from-other-languages--tp27455759p27457165.html Sent from the Solr - User mailing list archive at Nabble.com. -- Lance Norskog goks...@gmail.com
Re: source tree for lucene
The Solr trunk/lib directory contains Lucene libs as of January 18. (They were checked in on that date.) On Thu, Feb 4, 2010 at 4:28 PM, Joe Calderon calderon@gmail.com wrote: I want to recompile Lucene with http://issues.apache.org/jira/browse/LUCENE-2230, but I'm not sure which source tree to use. I tried using the implied trunk revision from the admin/system page, but Solr fails to build with the generated jars, even if I exclude the patches from 2230... I'm wondering if there is another Lucene tree I should grab to use to build Solr? --joe -- Lance Norskog goks...@gmail.com
Re: Fundamental questions of how to build up solr for huge portals
In general, search is disk-intensive. Lots of RAM (up to 32G at today's prices) and fast hard disks matter. For administration, the single biggest disruptor is updating the index. If you can keep index updates to off-peak hours it will be ok. If not, index on one server and serve queries from another. Use local hard disks. http://wiki.apache.org/solr/SolrPerformanceData http://wiki.apache.org/solr/SolrPerformanceFactors On Fri, Feb 5, 2010 at 5:19 AM, Fuad Efendi f...@efendi.ca wrote: - what's the best way to use Solr to get the best performance for a huge portal with 5000 users that might expand quickly? 5000 users: 200 TPS, for instance, is equal to 1200 concurrent users (each user makes 1 request per minute); so a single SOLR instance is more than enough. Why 200 TPS? It is the bottom line, for fuzzy search (I recently improved it). In real life, on real hardware, 1000 TPS (using caching, not frequently using fuzzy search, etc.), which is equal to 6 concurrent users, and subsequently to more than 600,000 total users. The rest depends on your design... If you have separate portals A, B, C - create a field with values A, B, C. Liferay Portal nicely integrates with SOLR... each kind of Portlet object (Forum Post, Document, Journal Article, etc.) can implement searchable and be automatically indexed. But Liferay is Java-based, JSR-168, JSR-286 (and it supports PHP portlets, but I never tried them). Fuad Efendi +1 416-993-2060 http://www.linkedin.com/in/liferay -Original Message- From: Peter [mailto:zarato...@gmx.net] Sent: January-16-10 10:17 AM To: solr-user@lucene.apache.org Subject: Fundamental questions of how to build up solr for huge portals Hello! Our team wants to use Solr for a community portal built out of 3 and more sub-portals. We are unsure how we should build up the whole architecture, because we have more than one portal and we want to make them all connected and searchable by Solr. Could some experts help us with these questions?
- what's the best way to use Solr to get the best performance for a huge portal with 5000 users that might expand quickly? - which client to use (Java, PHP...)? Right now the portal is almost entirely PHP/MySQL based. But we want to make Solr as good as it can be in all ways (performance, accessibility, good programming practice, using all the features of Lucene - like tagging, faceting and so on...) We are thankful for any suggestions :) Thanks, Peter -- Lance Norskog goks...@gmail.com
Re: Slow QueryComponent.process() when queries have numbers in them
The single-digit numbers are probably in all of the docs. You might want to rip them out with a SynonymFilter. The more docs a query finds, the longer the query takes. On Fri, Feb 5, 2010 at 1:23 PM, Simon Wistow si...@thegestalt.org wrote: On Wed, Feb 03, 2010 at 07:38:13PM -0800, Lance Norskog said: The debugQuery parameter shows you how the query is parsed into a tree of Lucene query objects. Well, that's kind of what I'm asking - I know how the query is being parsed: <str name="rawquerystring">myers 8e psychology chapter 9</str> <str name="querystring">myers 8e psychology chapter 9</str> <str name="parsedquery">+((DisjunctionMaxQuery((content:myer^0.8 | title:myer^1.5)~0.01) DisjunctionMaxQuery((content:"8 e"~2^0.8 | title:"8 e"~2^1.5)~0.01) DisjunctionMaxQuery((content:psycholog^0.8 | title:psycholog^1.5)~0.01) DisjunctionMaxQuery((content:chapter^0.8 | title:chapter^1.5)~0.01) DisjunctionMaxQuery((content:9^0.8 | title:9^1.5)~0.01))~4) ()</str> <str name="parsedquery_toString">+(((content:myer^0.8 | title:myer^1.5)~0.01 (content:"8 e"~2^0.8 | title:"8 e"~2^1.5)~0.01 (content:psycholog^0.8 | title:psycholog^1.5)~0.01 (content:chapter^0.8 | title:chapter^1.5)~0.01 (content:9^0.8 | title:9^1.5)~0.01)~4) ()</str> But that's sort of beside the point - I was really asking whether this is a known issue (i.e. queries with numbers in them can be very slow) and whether there are any workarounds -- Lance Norskog goks...@gmail.com
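If the bare digits carry no relevance signal, an alternative to Lance's SynonymFilter idea is to stop them out at analysis time so they never match the bulk of the index. A hedged schema.xml sketch (the field type mirrors a typical text type; digits.txt is a hypothetical file listing 0 through 9, one per line):

```xml
<!-- Sketch: drop bare single-digit tokens at both index and query time
     so queries like "chapter 9" don't scan nearly every document. -->
<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" words="digits.txt" ignoreCase="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

The trade-off is that digit-only terms become unsearchable in that field, so this only fits if they are genuinely noise.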
Re: unloading a solr core doesn't free any memory
The 'jconsole' program lets you monitor GC operation in real time. http://java.sun.com/developer/technicalArticles/J2SE/jconsole.html On Mon, Feb 8, 2010 at 8:44 AM, Simon Rosenthal simon_rosent...@yahoo.com wrote: What garbage collection parameters is the JVM using? The memory will not always be freed immediately after an event like unloading a core or starting a new searcher. 2010/2/8 Tim Terlegård tim.terleg...@gmail.com To me it doesn't look like unloading a Solr core frees the memory that the core has used. Is this how it should be? I have a big index with 50 million documents. After loading a core it takes 300 MB of RAM. After a query with a couple of sort fields Solr takes about 8 GB of RAM. Then I unload the core (CoreAdminRequest.unloadCore). The core is not shown in /solr/ anymore. Solr still takes 8 GB of RAM. Creating new cores is super slow because I have hardly any memory left. Do I need to free the memory explicitly somehow? /Tim -- Lance Norskog goks...@gmail.com
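Besides jconsole, GC logging shows directly whether the collector ever reclaims the 8 GB after the unload. These JVM options apply to the Sun/HotSpot 1.6 JVMs of that era; add them to whatever command starts your servlet container:

```
-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps
```

If a full GC runs and the heap stays near 8 GB, something (for example a cached searcher or field cache) is still holding references to the unloaded core's data.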
Re: TermInfosReader.get ArrayIndexOutOfBoundsException
The index is corrupted. In some places ArrayIndex and NPE are not wrapped as CorruptIndexException. Try running your code with the Lucene assertions on. Add this to the JVM arguments: -ea:org.apache.lucene... On Mon, Feb 8, 2010 at 1:02 PM, Burton-West, Tom tburt...@umich.edu wrote: Hello all, After optimizing rather large indexes on 10 shards (each index holds about 500,000 documents and is about 270-300 GB in size) we started getting intermittent TermInfosReader.get() ArrayIndexOutOfBounds exceptions. The exceptions sometimes seem to occur on all 10 shards at the same time and sometimes on one shard but not the others. We also sometimes get an Internal Server Error but that might be either a cause or an effect of the array index out of bounds. Here is the top part of the message: java.lang.ArrayIndexOutOfBoundsException: -14127432 at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java:246) Any suggestions for troubleshooting would be appreciated. Trace from tomcat logs appended below. 
Tom Burton-West --- Feb 5, 2010 8:09:02 AM org.apache.solr.common.SolrException log SEVERE: java.lang.ArrayIndexOutOfBoundsException: -14127432 at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java:246) at org.apache.lucene.index.TermInfosReader.get(TermInfosReader.java:218) at org.apache.lucene.index.SegmentReader.docFreq(SegmentReader.java:943) at org.apache.solr.search.SolrIndexReader.docFreq(SolrIndexReader.java:308) at org.apache.lucene.search.IndexSearcher.docFreq(IndexSearcher.java:144) at org.apache.lucene.search.Similarity.idf(Similarity.java:481) at org.apache.lucene.search.TermQuery$TermWeight.init(TermQuery.java:44) at org.apache.lucene.search.TermQuery.createWeight(TermQuery.java:146) at org.apache.lucene.search.BooleanQuery$BooleanWeight.init(BooleanQuery.java:186) at org.apache.lucene.search.BooleanQuery.createWeight(BooleanQuery.java:366) at org.apache.lucene.search.Query.weight(Query.java:95) at org.apache.lucene.search.Searcher.createWeight(Searcher.java:230) at org.apache.lucene.search.Searcher.search(Searcher.java:171) at org.apache.solr.search.SolrIndexSearcher.getDocSetNC(SolrIndexSearcher.java:651) at org.apache.solr.search.SolrIndexSearcher.getDocSet(SolrIndexSearcher.java:545) at org.apache.solr.search.SolrIndexSearcher.getDocSet(SolrIndexSearcher.java:581) at org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:903) at org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:884) at org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:341) at org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:176) at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:195) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131) at org.apache.solr.core.SolrCore.execute(SolrCore.java:1299) at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:338) at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:241) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:215) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188) at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:213) at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:172) at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:548) at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127) at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:117) at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:108) at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:174) at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:875) at org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(Http11BaseProtocol.java:665) at org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(PoolTcpEndpoint.java:528) at org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt(LeaderFollowerWorkerThread.java:81) at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:689) at java.lang.Thread.run(Thread.java:619) Feb 5, 2010 8:09:02 AM org.apache.solr.common.SolrException log SEVERE:
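Following up on the corruption suggestion above: Lucene also ships a standalone CheckIndex tool that can confirm (and, destructively, repair at the cost of losing documents in bad segments) a damaged index. The jar name and index path below are placeholders for your installation:

```
java -ea:org.apache.lucene... -cp lucene-core.jar \
     org.apache.lucene.index.CheckIndex /path/to/index
```

Run it against a copy of one affected shard's index directory while Solr is stopped, and only use its -fix option on a backup.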
Re: Dynamic fields with more than 100 fields inside
On Mon, Feb 8, 2010 at 9:47 PM, Xavier Schepler xavier.schep...@sciences-po.fr wrote: Hey, I'm thinking about using dynamic fields. I need one or more user-specific fields in my schema, for example concept_user_*, and I will have maybe more than 200 users using this feature. One user will send and retrieve values from his own field. It will then be used to filter results. How would it impact query performance? Can you give an example of such a query? -- Regards, Shalin Shekhar Mangar.
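For context, the schema side of what Xavier describes is a single dynamic field declaration; the field name and type below are assumptions based on his mail. Each user's field (concept_user_1, concept_user_2, ...) then appears in filter queries such as fq=concept_user_42:someValue.

```xml
<!-- Sketch: one declaration in schema.xml covers every per-user field
     matching the pattern, e.g. concept_user_1 ... concept_user_200. -->
<dynamicField name="concept_user_*" type="string"
              indexed="true" stored="true" multiValued="true"/>
```

Performance-wise the main cost is that each distinct field carries its own term dictionary entries and, if used for sorting or faceting, its own cache entries.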