Re: Missing required field: id Using ExtractingRequestHandler
Doh! I get it. Ignore my questions in the previous e-mail. The XML files have the id in them. For Word/Excel/PDF etc., it's up to the client (crawler or whatever) to create a unique id if I want one.

Thanks again for pointing me in the right direction. I'm really impressed with how easy it's been for a non-Java/web app guy to get Solr going. Excellent work!

On Thu, 2009-03-19 at 16:51 -0700, Chris Harris wrote:

Unless there's a regression in the ExtractingRequestHandler, this should be happening because both A) you have an id field defined in your Solr schema file that's marked as a required field and B) you did not specify an id parameter when you submitted your document to the handler.

If you don't want your Solr docs to have an id field, then mark that field as not required in your schema. If you *do* want your Solr docs to have a required field called id, then you'll need to specify the id when you submit your document. One way is using an ext.literal parameter, more or less like this:

startofURL...ext.literal.id=13...restofURL

Alternatively, you can try the field mapping mechanism, which is hopefully described on the wiki page.

Cheers,
Chris

On Thu, Mar 19, 2009 at 3:46 PM, Larry Reid lcr...@jadesystems.ca wrote:

I am trying to index Word, PDF and other documents with Solr. I installed the latest nightly build of Solr on March 17. I followed the instructions in the Wiki for ExtractingRequestHandler at http://wiki.apache.org/solr/ExtractingRequestHandler#head-c95841f9eda007b6b4e4594ead12a04223cf7b6e. I have produced text output from Tika in the nightly build directories from PDF files. When I try the suggested test curl commands in the "Getting Started with the Solr Example" section of the Wiki page, I get the following. Any idea what I've done wrong? Thanks in advance for your help.
$ curl http://localhost:8983/solr/update/extract?ext.idx.attr=true \ext.def.fl=text -F myfi...@tutorial.pdf html head meta http-equiv=Content-Type content=text/html; charset=ISO-8859-1/ titleError 500 /title /head bodyh2HTTP ERROR: 500/h2preorg.apache.solr.common.SolrException: Document [null] missing required field: id org.apache.solr.common.SolrException: org.apache.solr.common.SolrException: Document [null] missing required field: id at org.apache.solr.handler.extraction.ExtractingDocumentLoader.load(ExtractingDocumentLoader.java:169) at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:54) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131) at org.apache.solr.core.SolrCore.execute(SolrCore.java:1333) at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:303) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:232) at org.mortbay.jetty.servlet.ServletHandler $CachedChain.doFilter(ServletHandler.java:1089) at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365) at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216) at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181) at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712) at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405) at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:211) at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114) at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139) at org.mortbay.jetty.Server.handle(Server.java:285) at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502) at org.mortbay.jetty.HttpConnection $RequestHandler.content(HttpConnection.java:835) at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:641) at 
org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:202) at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378) at org.mortbay.jetty.bio.SocketConnector $Connection.run(SocketConnector.java:226) at org.mortbay.thread.BoundedThreadPool $PoolThread.run(BoundedThreadPool.java:442) Caused by: org.apache.solr.common.SolrException: Document [null] missing required field: id at org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:292) at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:59) at org.apache.solr.handler.extraction.ExtractingDocumentLoader.doAdd(ExtractingDocumentLoader.java:90) at org.apache.solr.handler.extraction.ExtractingDocumentLoader.addDoc(ExtractingDocumentLoader.java:95) at
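Chris Harris's suggestion in the thread above (pass the id as a literal parameter) can be sketched as follows. This is only an illustration: the parameter names ext.idx.attr, ext.def.fl and ext.literal.id come from the messages themselves, while the helper name build_extract_url and the choice of Python are assumptions made for the example.

```python
from urllib.parse import urlencode

def build_extract_url(base_url, doc_id):
    """Build an /update/extract URL that carries the required id as a literal field.

    build_extract_url is a hypothetical helper, not part of Solr or any client library.
    """
    params = [
        ("ext.idx.attr", "true"),    # taken from the original curl command
        ("ext.def.fl", "text"),      # taken from the original curl command
        ("ext.literal.id", doc_id),  # supplies the value for the required "id" field
    ]
    return base_url + "/update/extract?" + urlencode(params)

url = build_extract_url("http://localhost:8983/solr", "13")
print(url)
```

The document itself would still be uploaded as a multipart form field, as the curl -F option does in the original command; only the query string changes.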
Issue with Facet Query
Hi,

I am searching the indexes with a facet query. Below is the query:

q=Answer&version=2.2&start=0&rows=10&indent=on&qt=dismaxrequest&facet=true&facet.field=productPrice_product_str_s:[0%20TO%2020]

It is giving me an exception saying:

<str name="exception">org.apache.solr.common.SolrException: undefined field productPrice_product_str_s:[0 TO 20]
at org.apache.solr.schema.IndexSchema.getField(IndexSchema.java:994)
at org.apache.solr.request.SimpleFacets.getTermCounts(SimpleFacets.java:152)
at org.apache.solr.request.SimpleFacets.getFacetFieldCounts(SimpleFacets.java:182)
at org.apache.solr.request.SimpleFacets.getFacetCounts(SimpleFacets.java:96)
at org.apache.solr.handler.component.FacetComponent.process(FacetComponent.java:70)
at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:169)
at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
at org.apache.solr.core.SolrCore.execute

Can someone please guide me on how to prevent this exception? I guess I am missing some entries in a config file like solrconfig.xml or schema.xml. I would appreciate it if someone could tell me the specific entries I need to make.

Thanks a lot.

Thanks,
Amit Garg

--
View this message in context: http://www.nabble.com/Issue-with-Facet-Query-tp22615577p22615577.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: Issue with Facet Query
On Fri, Mar 20, 2009 at 1:14 PM, dabboo ag...@sapient.com wrote:

Hi, I am searching the indexes with a facet query. Below is the query:

q=Answer&version=2.2&start=0&rows=10&indent=on&qt=dismaxrequest&facet=true&facet.field=productPrice_product_str_s:[0%20TO%2020]

facet.field takes a field name. It does not accept queries. Use facet.query to get the count for a query. Use fq to restrict facets by a certain query.

See http://wiki.apache.org/solr/SimpleFacetParameters

--
Regards,
Shalin Shekhar Mangar.
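The difference between the three parameters Shalin names can be sketched by building the request query string. The field name and range come from the thread; the helper facet_params is hypothetical, and the sketch only assembles parameters, it does not talk to Solr.

```python
from urllib.parse import urlencode, parse_qs

# Range expression from the thread, written unencoded.
PRICE_RANGE = "productPrice_product_str_s:[0 TO 20]"

def facet_params(restrict_results):
    """Return a query string that uses each facet parameter for its intended purpose.

    facet_params is a hypothetical helper written for this example.
    """
    params = [
        ("q", "Answer"),
        ("qt", "dismaxrequest"),
        ("facet", "true"),
        ("facet.field", "productPrice_product_str_s"),  # a bare field name, never a query
        ("facet.query", PRICE_RANGE),                   # asks for a count of docs matching the range
    ]
    if restrict_results:
        params.append(("fq", PRICE_RANGE))              # filters the result set (and the facets)
    return urlencode(params)

print(facet_params(restrict_results=True))
```

With restrict_results set to False the response would carry the range count under facet_queries while the document list is still driven by q alone, which matches what Amit reports later in the thread.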
Re: Issue with Facet Query
Thanks a lot for this information. But is there any way I can impose a range on the facet? E.g., if I want to search the data within a specific range, how should I form my query? Do I need to make some entries somewhere?

Thanks,
Amit Garg

Shalin Shekhar Mangar wrote:

On Fri, Mar 20, 2009 at 1:14 PM, dabboo ag...@sapient.com wrote:

Hi, I am searching the indexes with a facet query. Below is the query:

q=Answer&version=2.2&start=0&rows=10&indent=on&qt=dismaxrequest&facet=true&facet.field=productPrice_product_str_s:[0%20TO%2020]

facet.field takes a field name. It does not accept queries. Use facet.query to get the count for a query. Use fq to restrict facets by a certain query.

See http://wiki.apache.org/solr/SimpleFacetParameters

--
Regards,
Shalin Shekhar Mangar.

--
View this message in context: http://www.nabble.com/Issue-with-Facet-Query-tp22615577p22615979.html
Re: Problem with UTF-8 and Solr ISOLatin1AccentFilterFactory
== where are you seeing it as Solène as opposed to the correct way of solène?

I have Solène in my MySQL database! So I don't know if this is correct or not. I guess that Solène is solène in UTF-8?!

I've tried the analysis page at http://localhost:8983/solr/admin/analysis.jsp. When I try with solène everything is OK, but when I try with Solène (which is what I have in the DB) the analysis converts Ã to A and deletes ¨, so I get SolAne!

I think that ISOLatin1AccentFilterFactory only takes strings in the ISO-8859-1 charset. So is there any way to transform my string to ISO-8859-1 before the indexing process? Maybe by creating a transformer in DataImportHandler? (I have never coded in Java :( )

Thank you all.

Koji Sekiguchi-2 wrote:

aerox7 wrote:

Hi, I have a MySQL database in UTF-8. I have a row with Solène (solène). I want to transform this to solene, so I use Solr's ISOLatin1AccentFilterFactory to perform this task, but it doesn't work?! I guess that Solène is solène in UTF-8?! I also set Tomcat to UTF-8, so normally ISOLatin1AccentFilterFactory should replace the accent... Any ideas? I use DataImportHandler.

If a mapping rule è to e is always true in your field, you can try to use MappingCharFilter instead of ISOLatin1AccentFilter. Add the following line to mapping-ISOLatin1Accent.txt:

"è" => "e"

and add the following fieldType:

<fieldType name="textCharNorm" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
    <tokenizer class="solr.CharStreamAwareWhitespaceTokenizerFactory"/>
  </analyzer>
</fieldType>

MappingCharFilter and mapping-ISOLatin1Accent.txt are in nightly build.

Koji

--
View this message in context: http://www.nabble.com/Problem-with-UTF-8-and-Solr-ISOLatin1AccentFilterFactory-tp22607642p22616220.html
Re: Issue with Facet Query
Shalin, thanks a lot. One quick question: now, after putting the query the way you suggested, I am getting:

<lst name="facet_counts">
  <lst name="facet_queries">
    <int name="productPrice_product_s:[0 TO 20]">23315</int>
  </lst>
  <lst name="facet_fields"/>
  <lst name="facet_dates"/>
</lst>

But it is not returning any records. Do I need to add this field to schema.xml, or anywhere else, to get the records?

Thanks,
Amit Garg

Shalin Shekhar Mangar wrote:

On Fri, Mar 20, 2009 at 1:49 PM, dabboo ag...@sapient.com wrote:

Thanks a lot for this information. But is there any way I can impose a range on the facet? E.g., if I want to search the data within a specific range, how should I form my query?

Use a filter query, fq=productPrice_product_str_s:[0 TO 20]

--
Regards,
Shalin Shekhar Mangar.

--
View this message in context: http://www.nabble.com/Issue-with-Facet-Query-tp22615577p22616536.html
Search transparently with Solr with multiple cores, different indexes, common response type
Hello all,

here I am with another question... :-) I figured that I have to change my approach to implement the requirements I have :-(

Here is what I have to index:

1) data A in an Oracle DB Table A
2) data B in an Oracle DB Table B
3) data C in different files

Data A, B, and C are slightly different, thus they are indexed differently; obviously the client receives the search results for all data types in a consistent/common format. The client application shall be able to search among each or all data types (A, B, C). The order will be configurable, like: return the first 5 from data A, the first 10 from B, all C.

At first I thought of using only one Solr with different datasources and one huge index, but I figured that delta imports would be very hard/expensive/impossible. Reading some other posts I thought that maybe a better approach would be as follows:

1) one Solr core for each data type (one for A, one for B, one for C)
2) one index for each data type, thus one document type for A, one for B, and one for C
3) client applications shall be able to search on one or all cores
4) the cores shall return search results in a common XML format
5) search results shall be aggregated in a configurable way

Can you please tell me if this architecture is possible with Solr? Obviously I am not looking for an out-of-the-box solution, I just need to understand what I have to develop myself and what is already available.

1) is a multicore architecture: I know it is possible and I tested that it works great
2) same as above, no problems here :-)
3) I want to hide the different cores from the client application; the client application should send the requests to one guy that parses the request and forwards it to the cores. Is this a custom RequestHandler? Any link (to the Wiki?) to understand better? Or is there anything already available to achieve this?
4) The guy that parses the request and forwards it to the cores shall aggregate and return results in a common XML format: is this a custom ResponseHandler?
5) I know this is just my business logic :-)

Any thoughts/warnings/advice about this?

Thanks a lot in advance!
Giovanni
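The aggregation rule Giovanni describes ("the first 5 from data A, the first 10 from B, all C") can be sketched independently of Solr. Everything here is hypothetical: the function name aggregate, the core names and the shape of the per-core result lists are assumptions for illustration, and in a real setup each list would come from a separate request to that core.

```python
def aggregate(results_by_core, limits, order):
    """Merge per-core hit lists into one list, honouring a per-core limit.

    results_by_core: dict mapping a core name to its ordered list of hits
    limits:          dict mapping a core name to a maximum count (None means "all")
    order:           core names in the order their hits should appear
    """
    merged = []
    for core in order:
        hits = results_by_core.get(core, [])
        limit = limits.get(core)            # a missing or None limit keeps every hit
        selected = hits if limit is None else hits[:limit]
        # Tag each hit with its source core so the client sees one common shape.
        merged.extend({"core": core, "doc": doc} for doc in selected)
    return merged

# Made-up hits standing in for the three per-core responses.
sample = {
    "A": ["a%d" % i for i in range(7)],
    "B": ["b%d" % i for i in range(3)],
    "C": ["c0", "c1"],
}
merged = aggregate(sample, {"A": 5, "B": 10, "C": None}, order=["A", "B", "C"])
print(len(merged))
```

Keeping the limits and the order as plain configuration data is one way to make point 5 configurable without touching the code that queries the cores.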
Re: Issue with Facet Query
On Fri, Mar 20, 2009 at 2:27 PM, dabboo ag...@sapient.com wrote:

Shalin, thanks a lot. One quick question: now, after putting the query the way you suggested, I am getting:

<lst name="facet_counts">
  <lst name="facet_queries">
    <int name="productPrice_product_s:[0 TO 20]">23315</int>
  </lst>
  <lst name="facet_fields"/>
  <lst name="facet_dates"/>
</lst>

But it is not returning any records. Do I need to add this field to schema.xml, or anywhere else, to get the records?

facet.query returns the number of documents matching that query after applying any filters (fq) that you may have specified. Can you tell us your use-case?

--
Regards,
Shalin Shekhar Mangar.
Re: Issue with Facet Query
Thanks Shalin, thanks a lot. I appreciate your help in resolving this issue.

Thanks,
Amit

Shalin Shekhar Mangar wrote:

On Fri, Mar 20, 2009 at 2:27 PM, dabboo ag...@sapient.com wrote:

Shalin, thanks a lot. One quick question: now, after putting the query the way you suggested, I am getting:

<lst name="facet_counts">
  <lst name="facet_queries">
    <int name="productPrice_product_s:[0 TO 20]">23315</int>
  </lst>
  <lst name="facet_fields"/>
  <lst name="facet_dates"/>
</lst>

But it is not returning any records. Do I need to add this field to schema.xml, or anywhere else, to get the records?

facet.query returns the number of documents matching that query after applying any filters (fq) that you may have specified. Can you tell us your use-case?

--
Regards,
Shalin Shekhar Mangar.

--
View this message in context: http://www.nabble.com/Issue-with-Facet-Query-tp22615577p22616724.html
RE: Special character indexing
Hi Shalin,

Thanks for the suggestion. I tried the following code (not sure about the exact usage):

CommonsHttpSolrServer ess = new CommonsHttpSolrServer("http://localhost:8983/solr");
ess.setRequestWriter(new BinaryRequestWriter());
SolrInputDocument solrdoc = new SolrInputDocument();
solrdoc.addField("id", "Kimi");
solrdoc.addField("name", "03 Kimi Räikkönen");
ess.add(solrdoc);

But I got the following exception on the server:

WARNING: The @Deprecated SolrUpdateServlet does not accept query parameters: wt=javabin
If you are using solrj, make sure to register a request handler to /update rather then use this servlet.
Add: <requestHandler name="/update" class="solr.XmlUpdateRequestHandler"> to your solrconfig.xml
Mar 20, 2009 3:14:48 PM org.apache.solr.common.SolrException log
SEVERE: Error processing legacy update command:com.ctc.wstx.exc.WstxUnexpectedCharException: Illegal character ((CTRL-CHAR, code 1)) at [row,col {unknown-source}]: [1,1]
at com.ctc.wstx.sr.StreamScanner.throwInvalidSpace(StreamScanner.java:675)
at com.ctc.wstx.sr.StreamScanner.throwInvalidSpace(StreamScanner.java:660)
at com.ctc.wstx.sr.BasicStreamReader.readSpacePrimary(BasicStreamReader.java:4916)
at com.ctc.wstx.sr.BasicStreamReader.nextFromProlog(BasicStreamReader.java:2003)
at com.ctc.wstx.sr.BasicStreamReader.next(BasicStreamReader.java:1069)
at org.apache.solr.handler.XmlUpdateRequestHandler.processUpdate(XmlUpdateRequestHandler.java:148)
at org.apache.solr.handler.XmlUpdateRequestHandler.doLegacyUpdate(XmlUpdateRequestHandler.java:393)
at org.apache.solr.servlet.SolrUpdateServlet.doPost(SolrUpdateServlet.java:78)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:727)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:487)
at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1098)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:295)
at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1089)
at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:365)
at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:712)
at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:211)
at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:139)
at org.mortbay.jetty.Server.handle(Server.java:285)
at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:502)
at org.mortbay.jetty.HttpConnection$RequestHandler.content(HttpConnection.java:835)
at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:723)
at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:208)
at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378)
at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:226)
at org.mortbay.thread.BoundedThreadPool$PoolThread.run(BoundedThreadPool.java:442)

Thanks in advance for help.
Siddharth

-----Original Message-----
From: Shalin Shekhar Mangar [mailto:shalinman...@gmail.com]
Sent: Friday, March 20, 2009 10:35 AM
To: solr-user@lucene.apache.org
Subject: Re: Special character indexing

On Fri, Mar 20, 2009 at 10:17 AM, Gargate, Siddharth sgarg...@ptc.com wrote:

I tried with Jetty but the same issue. Just a guess, but looks like the fix for SOLR-973 might have introduced this issue.

I'm not sure how SOLR-973 can cause this issue. Can you try using the BinaryRequestWriter and see if it succeeds?

http://wiki.apache.org/solr/Solrj#head-ddc28af4033350481a3cbb27bc1d25bffd801af0

--
Regards,
Shalin Shekhar Mangar.
Re: Problem with UTF-8 and Solr ISOLatin1AccentFilterFactory
Hi,

My guess is that *although* your DB is in UTF-8, the database engine sends you the rows in ISO-Latin1, so before doing *anything* after receiving the data, you should transcode from ISO-Latin1 to UTF-8 and then send that to Solr.

I'm no Java expert, but in Perl (MySQL DB in UTF-8) I have to do this with every row:

$row = decode("iso-8859-1", $row);

... and before building the XML to invoke and add the document to Solr:

$row = encode("utf8", $row);

On Fri, Mar 20, 2009 at 10:55 AM, aerox7 amyne.berr...@me.com wrote:

I added "è" => "e" to mapping-ISOLatin1Accent.txt and added the following fieldType:

<fieldType name="textCharNorm" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
    <tokenizer class="solr.CharStreamAwareWhitespaceTokenizerFactory"/>
  </analyzer>
</fieldType>

But I still have the same problem! It only works when I store an ISO string in the UTF-8 database (ex: store solène not solène) :,(

aerox7 wrote:

== where are you seeing it as Solène as opposed to the correct way of solène?

I have Solène in my MySQL database! So I don't know if this is correct or not. I guess that Solène is solène in UTF-8?!

I've tried the analysis page at http://localhost:8983/solr/admin/analysis.jsp. When I try with solène everything is OK, but when I try with Solène (which is what I have in the DB) the analysis converts Ã to A and deletes ¨, so I get SolAne!

I think that ISOLatin1AccentFilterFactory only takes strings in the ISO-8859-1 charset. So is there any way to transform my string to ISO-8859-1 before the indexing process? Maybe by creating a transformer in DataImportHandler? (I have never coded in Java :( )

Thank you all.

Koji Sekiguchi-2 wrote:

aerox7 wrote:

Hi, I have a MySQL database in UTF-8. I have a row with Solène (solène). I want to transform this to solene, so I use Solr's ISOLatin1AccentFilterFactory to perform this task, but it doesn't work?! I guess that Solène is solène in UTF-8?! I also set Tomcat to UTF-8, so normally ISOLatin1AccentFilterFactory should replace the accent... Any ideas? I use DataImportHandler.

If a mapping rule è to e is always true in your field, you can try to use MappingCharFilter instead of ISOLatin1AccentFilter. Add the following line to mapping-ISOLatin1Accent.txt:

"è" => "e"

and add the following fieldType:

<fieldType name="textCharNorm" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
    <tokenizer class="solr.CharStreamAwareWhitespaceTokenizerFactory"/>
  </analyzer>
</fieldType>

MappingCharFilter and mapping-ISOLatin1Accent.txt are in nightly build.

Koji

--
View this message in context: http://www.nabble.com/Problem-with-UTF-8-and-Solr-ISOLatin1AccentFilterFactory-tp22607642p22617278.html

--
“I may not believe in myself, but I believe in what I'm doing.” -- Jimmy Page
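The charset mismatch discussed in this message can be reproduced in a few lines. This is a sketch of the general effect, not a translation of the Perl above and not a diagnosis of aerox7's actual setup; the function names are made up for the example.

```python
def to_mojibake(text):
    """Encode text as UTF-8, then wrongly read those bytes back as ISO-8859-1."""
    return text.encode("utf-8").decode("iso-8859-1")

def repair_mojibake(text):
    """Undo the mistake: recover the original bytes, then decode them as UTF-8."""
    return text.encode("iso-8859-1").decode("utf-8")

broken = to_mojibake("Sol\u00e8ne")   # "Solène" written with an escape to avoid editor issues
print(broken)                          # SolÃ¨ne
print(repair_mojibake(broken))         # Solène
```

The single character è becomes the two characters Ã and ¨ because its UTF-8 encoding is two bytes, and each byte is a separate character in ISO-8859-1. That is consistent with the analysis page turning the value into SolAne.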
Re: Issue with Facet Query
On Fri, Mar 20, 2009 at 1:49 PM, dabboo ag...@sapient.com wrote:

Thanks a lot for this information. But is there any way I can impose a range on the facet? E.g., if I want to search the data within a specific range, how should I form my query?

Use a filter query, fq=productPrice_product_str_s:[0 TO 20]

--
Regards,
Shalin Shekhar Mangar.
Re: Problem with UTF-8 and Solr ISOLatin1AccentFilterFactory
I added "è" => "e" to mapping-ISOLatin1Accent.txt and added the following fieldType:

<fieldType name="textCharNorm" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
    <tokenizer class="solr.CharStreamAwareWhitespaceTokenizerFactory"/>
  </analyzer>
</fieldType>

But I still have the same problem! It only works when I store an ISO string in the UTF-8 database (ex: store solène not solène) :,(

aerox7 wrote:

== where are you seeing it as Solène as opposed to the correct way of solène?

I have Solène in my MySQL database! So I don't know if this is correct or not. I guess that Solène is solène in UTF-8?!

I've tried the analysis page at http://localhost:8983/solr/admin/analysis.jsp. When I try with solène everything is OK, but when I try with Solène (which is what I have in the DB) the analysis converts Ã to A and deletes ¨, so I get SolAne!

I think that ISOLatin1AccentFilterFactory only takes strings in the ISO-8859-1 charset. So is there any way to transform my string to ISO-8859-1 before the indexing process? Maybe by creating a transformer in DataImportHandler? (I have never coded in Java :( )

Thank you all.

Koji Sekiguchi-2 wrote:

aerox7 wrote:

Hi, I have a MySQL database in UTF-8. I have a row with Solène (solène). I want to transform this to solene, so I use Solr's ISOLatin1AccentFilterFactory to perform this task, but it doesn't work?! I guess that Solène is solène in UTF-8?! I also set Tomcat to UTF-8, so normally ISOLatin1AccentFilterFactory should replace the accent... Any ideas? I use DataImportHandler.

If a mapping rule è to e is always true in your field, you can try to use MappingCharFilter instead of ISOLatin1AccentFilter. Add the following line to mapping-ISOLatin1Accent.txt:

"è" => "e"

and add the following fieldType:

<fieldType name="textCharNorm" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
    <tokenizer class="solr.CharStreamAwareWhitespaceTokenizerFactory"/>
  </analyzer>
</fieldType>

MappingCharFilter and mapping-ISOLatin1Accent.txt are in nightly build.

Koji

--
View this message in context: http://www.nabble.com/Problem-with-UTF-8-and-Solr-ISOLatin1AccentFilterFactory-tp22607642p22617278.html
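A plausible reason the mapping rule makes no difference for aerox7 is that the rule targets the character è, while the stored value contains the two characters Ã and ¨ instead. The sketch below is a simplified, single-character stand-in for such a rule, not Solr's MappingCharFilter; apply_mapping is a hypothetical helper.

```python
# One rule, equivalent in spirit to the mapping-file line quoted above.
MAPPING = {"\u00e8": "e"}   # è -> e

def apply_mapping(text, mapping):
    """Replace each character that has a rule; leave every other character alone."""
    return "".join(mapping.get(ch, ch) for ch in text)

clean = apply_mapping("sol\u00e8ne", MAPPING)             # correctly decoded input
garbled = apply_mapping("sol\u00c3\u00a8ne", MAPPING)     # mojibake input, as stored in the DB

print(clean)     # solene
print(garbled)   # unchanged, because it contains no è at all
```

Under that assumption the data has to be decoded correctly before analysis; adding more mapping rules would not help.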
how can I check field which are indexed but not stored?
Hi,

I have an issue: some data comes up in my results even though I've applied a filter that should exclude it. When I check my MySQL database, the document has clearly been updated, so I would like to see how it looks in Solr.

If I do /solr/video/select?q=id:8582006 I only see the fields which have been stored. Is there a way to see how the data is indexed for the other fields of my schema, which are indexed but not stored? Something like the DataImportHandler console, where with verbose activated I can see every field of my schema.

Otherwise, what would you suspect in this case: a document which has not been updated? How can I sort it out?

Thanks a lot guys for your excellent help

--
View this message in context: http://www.nabble.com/how-can-I-check-field-which-are-indexed-but-not-stored--tp22617914p22617914.html
FW: Special character indexing
Thanks Shalin,

Adding BinaryUpdateRequestHandler solved the issue. Thank you very much. Just one question: shouldn't XmlUpdateRequestHandler also work for these characters? I saw another user mentioning the same issue, and it was working with DirectXmlRequest.

-----Original Message-----
From: Shalin Shekhar Mangar [mailto:shalinman...@gmail.com]
Sent: Friday, March 20, 2009 3:58 PM
To: solr-user@lucene.apache.org
Subject: Re: Special character indexing

On Fri, Mar 20, 2009 at 3:19 PM, Gargate, Siddharth sgarg...@ptc.com wrote:

Hi Shalin, Thanks for the suggestion. I tried the following code (not sure about the exact usage):

CommonsHttpSolrServer ess = new CommonsHttpSolrServer("http://localhost:8983/solr");
ess.setRequestWriter(new BinaryRequestWriter());
SolrInputDocument solrdoc = new SolrInputDocument();
solrdoc.addField("id", "Kimi");
solrdoc.addField("name", "03 Kimi Räikkönen");
ess.add(solrdoc);

But I got the following exception on the server:

WARNING: The @Deprecated SolrUpdateServlet does not accept query parameters: wt=javabin
If you are using solrj, make sure to register a request handler to /update rather then use this servlet.
Add: <requestHandler name="/update" class="solr.XmlUpdateRequestHandler"> to your solrconfig.xml

Yes, you need to add the following to your solrconfig.xml:

<requestHandler name="/update/javabin" class="solr.BinaryUpdateRequestHandler" />

--
Regards,
Shalin Shekhar Mangar.
Re: alternative lucene directories support
Otis,

The fact is that some code instantiates FSDirectory indirectly by using deprecated constructors. I provided a patch here https://issues.apache.org/jira/browse/SOLR-465 but I don't have rights to re-open the issue.

Also there is logic in Solr which is tied to file system usage even if a file system index is not used:

- SolrCore checks index existence by looking at file system directory existence, which is incorrect in the case of a non-FS directory
- The spell checker has FSDirectory hard-coded

IMO this code ought to be changed too. I'm ready to contribute all these changes if it's appropriate. Is it better to write to the dev mailing list for that?

On Thu, Mar 19, 2009 at 8:58 PM, Otis Gospodnetic otis_gospodne...@yahoo.com wrote:

My quick grep of the sources and scan of the results doesn't see any problematic areas, but if you see some places that still need a fix, yes, please reopen the issue and submit the patch. Do you also plan on submitting the actual alternative Directory impl?

$ ffjg FSDire | egrep 'SolrIndexW|SolrCore|UpdateH'
./src/java/org/apache/solr/core/SolrCore.java:import org.apache.lucene.store.FSDirectory;
./src/java/org/apache/solr/core/SolrCore.java://return new SolrIndexSearcher(this, schema, main, IndexReader.open(FSDirectory.getDirectory(getIndexDir()), readOnly), true, false);

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message -----
From: Andrey Klochkov akloch...@griddynamics.com
To: solr-user@lucene.apache.org
Sent: Thursday, March 19, 2009 10:22:57 AM
Subject: alternative lucene directories support

Hi all

We want to use Solr with a Lucene Directory implementation which places the index into a Coherence data grid. In fact I managed to run Solr in such a configuration, although I had to patch it. I think that the issue about alternate directories support (SOLR-465) should be re-opened because there are some places in the source code where FSDirectory hard-coding is still present (SolrCore, SolrIndexWriter and UpdateHandler). I can provide a patch to fix it. WDYT?

--
Andrew Klochkov

--
Andrew Klochkov
Facet Query Results Issue
Hi,

this is my facet query:

facet.field=productPrice_product_str_s&facet.query=productPrice_product_str_s:[0%20TO%20100]

These are the results I am getting:

<int name="100">202</int>
<int name="10">57</int>
<int name="10.6">14</int>
<int name="10.2">11</int>
<int name="10.67">10</int>
<int name="10.8">9</int>
<int name="10.99">9</int>
<int name="1.33">7</int>
<int name="1">6</int>
<int name="10.4">5</int>
<int name="10.34">4</int>
<int name="1.01">2</int>
<int name="1.2">2</int>
<int name="1.66">2</int>
<int name="10.63">2</int>
<int name="10.66">2</int>
<int name="1.4">1</int>
<int name="1.7">1</int>
<int name="1.8">1</int>
<int name="10.33">1</int>
<int name="10.75">1</int>
<int name="10.9">1</int>
<int name=".01">0</int>
<int name=".2">0</int>
<int name="100.05">0</int>
<int name="100.07">0</int>
<int name="100.13">0</int>
<int name="100.2">0</int>
<int name="100.25">0</int>
<int name="100.33">0</int>
<int name="100.4">0</int>
<int name="100.45">0</int>
<int name="100.53">0</int>
<int name="100.6">0</int>
<int name="100.67">0</int>
<int name="100.73">0</int>
<int name="100.8">0</int>
<int name="100.87">0</int>
<int name="100.95">0</int>
<int name="100.96">0</int>
<int name="101">0</int>
<int name="101.1">0</int>
<int name="101.13">0</int>
<int name="101.2">0</int>
<int name="101.27">0</int>
<int name="101.33">0</int>
<int name="101.4">0</int>
<int name="101.47">0</int>
<int name="101.6">0</int>
<int name="101.67">0</int>
<int name="101.73">0</int>
<int name="101.8">0</int>
<int name="101.87">0</int>
<int name="102">0</int>
<int name="102.07">0</int>
<int name="102.19">0</int>
<int name="102.2">0</int>
<int name="102.27">0</int>
<int name="102.33">0</int>
<int name="102.4">0</int>
<int name="102.53">0</int>
<int name="102.6">0</int>
<int name="102.67">0</int>
<int name="102.8">0</int>
<int name="102.87">0</int>
<int name="102.93">0</int>
<int name="1022.4">0</int>
<int name="103">0</int>

It is only returning results whose values start with 2, 3, 4 or some other integer instead of only 1. It is not returning records in which the value is 10 and 100. Please suggest.

thanks,
Amit

--
View this message in context: http://www.nabble.com/Facet-Query-Results-Issue-tp22617883p22617883.html
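One possible factor, which nobody in this thread confirms: if productPrice_product_str_s is indexed as a plain string type (its name suggests so, but that is an assumption), then the bounds of a range such as [0 TO 100] are compared as text rather than as numbers. The sketch below only illustrates that difference with made-up prices; it does not query Solr.

```python
def in_text_range(value, low, high):
    """Range check the way plain strings compare: character by character."""
    return low <= value <= high

def in_numeric_range(value, low, high):
    """Range check on the numeric value of the same string."""
    return low <= float(value) <= high

# Hypothetical price values, stored as strings.
prices = ["1", "2", "10.6", "20", "99", "100", "100.05", "1022.4"]

as_text = [p for p in prices if in_text_range(p, "0", "100")]
as_numbers = [p for p in prices if in_numeric_range(p, 0, 100)]

print(as_text)      # ['1', '10.6', '100']
print(as_numbers)   # ['1', '2', '10.6', '20', '99', '100']
```

Compared as text, "2" sorts after "100" because the first characters already differ, so a textual range from 0 to 100 silently drops most two-digit prices. If this applies, indexing the price in a numeric field type that supports range queries would be the direction to investigate.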
Re: Issue with Facet Query
Hi Shalin,

One more thing:

facet.field=productPrice_product_str_s&facet.query=productPrice_product_str_s:[0%20TO%20100]

This is my query, and these are the results I am getting:

<int name="100">202</int>
<int name="10">57</int>
<int name="10.6">14</int>
<int name="10.2">11</int>
<int name="10.67">10</int>
<int name="10.8">9</int>
<int name="10.99">9</int>
<int name="1.33">7</int>
<int name="1">6</int>
<int name="10.4">5</int>
<int name="10.34">4</int>
<int name="1.01">2</int>
<int name="1.2">2</int>
<int name="1.66">2</int>
<int name="10.63">2</int>
<int name="10.66">2</int>
<int name="1.4">1</int>
<int name="1.7">1</int>
<int name="1.8">1</int>
<int name="10.33">1</int>
<int name="10.75">1</int>
<int name="10.9">1</int>
<int name=".01">0</int>
<int name=".2">0</int>
<int name="100.05">0</int>
<int name="100.07">0</int>
<int name="100.13">0</int>
<int name="100.2">0</int>
<int name="100.25">0</int>
<int name="100.33">0</int>
<int name="100.4">0</int>
<int name="100.45">0</int>
<int name="100.53">0</int>
<int name="100.6">0</int>
<int name="100.67">0</int>
<int name="100.73">0</int>
<int name="100.8">0</int>
<int name="100.87">0</int>
<int name="100.95">0</int>
<int name="100.96">0</int>
<int name="101">0</int>
<int name="101.1">0</int>
<int name="101.13">0</int>
<int name="101.2">0</int>
<int name="101.27">0</int>
<int name="101.33">0</int>
<int name="101.4">0</int>
<int name="101.47">0</int>
<int name="101.6">0</int>
<int name="101.67">0</int>
<int name="101.73">0</int>
<int name="101.8">0</int>
<int name="101.87">0</int>
<int name="102">0</int>
<int name="102.07">0</int>
<int name="102.19">0</int>
<int name="102.2">0</int>
<int name="102.27">0</int>
<int name="102.33">0</int>
<int name="102.4">0</int>
<int name="102.53">0</int>
<int name="102.6">0</int>
<int name="102.67">0</int>
<int name="102.8">0</int>
<int name="102.87">0</int>
<int name="102.93">0</int>
<int name="1022.4">0</int>
<int name="103">0</int>

It is only returning results whose values start with 2, 3, 4 or some other integer instead of only 1. It is not returning records in which the value is 10 and 100. Please suggest.

thanks,
Amit

dabboo wrote:

Thanks Shalin, thanks a lot. I appreciate your help in resolving this issue.

Thanks,
Amit

Shalin Shekhar Mangar wrote:

On Fri, Mar 20, 2009 at 2:27 PM, dabboo ag...@sapient.com wrote:

Shalin, thanks a lot. One quick question: now, after putting the query the way you suggested, I am getting:

<lst name="facet_counts">
  <lst name="facet_queries">
    <int name="productPrice_product_s:[0 TO 20]">23315</int>
  </lst>
  <lst name="facet_fields"/>
  <lst name="facet_dates"/>
</lst>

But it is not returning any records. Do I need to add this field to schema.xml, or anywhere else, to get the records?

facet.query returns the number of documents matching that query after applying any filters (fq) that you may have specified. Can you tell us your use-case?

--
Regards,
Shalin Shekhar Mangar.

--
View this message in context: http://www.nabble.com/Issue-with-Facet-Query-tp22615577p22617745.html
Re: Special character indexing
On Fri, Mar 20, 2009 at 3:19 PM, Gargate, Siddharth sgarg...@ptc.com wrote:
> Hi Shalin,
> Thanks for the suggestion. I tried the following code (not sure about the exact usage):
>
>     CommonsHttpSolrServer ess = new CommonsHttpSolrServer("http://localhost:8983/solr");
>     ess.setRequestWriter(new BinaryRequestWriter());
>     SolrInputDocument solrdoc = new SolrInputDocument();
>     solrdoc.addField("id", "Kimi");
>     solrdoc.addField("name", "03 Kimi Räikkönen");
>     ess.add(solrdoc);
>
> But I got the following exception on the server:
>
>     WARNING: The @Deprecated SolrUpdateServlet does not accept query parameters: wt=javabin
>     If you are using solrj, make sure to register a request handler to /update rather then use this servlet.
>     Add: <requestHandler name="/update" class="solr.XmlUpdateRequestHandler"> to your solrconfig.xml

Yes, you need to add the following to your solrconfig.xml:

    <requestHandler name="/update/javabin" class="solr.BinaryUpdateRequestHandler" />

-- Regards, Shalin Shekhar Mangar.
Re: how can I check field which are indexed but not stored?
On Fri, 2009-03-20 at 03:41 -0700, sunnyfr wrote:
> Hi,
> I have an issue: some data comes up in my results even though I've applied a filter that should exclude it. When I check in my MySQL database, the document has obviously been updated, so I would like to see how it looks in Solr. If I do /solr/video/select?q=id:8582006 I only see the fields which have been stored. Is there a way to see how the data is indexed for the other fields of my schema, which are indexed but not stored?

/solr/admin/luke will show you a lot of information concerning stored and indexed fields. Hope this is what you meant.

> Something like the DataImportHandler console, where with verbose activated I can see every field of my schema. Otherwise, what would you reckon in this case, a document which has not been updated? How can I sort it out? Thanks a lot guys for your excellent help.
Re: Problem with UTF-8 and Solr ISOLatin1AccentFilterFactory
I'm using DataImportHandler to send my data to Solr! So you mean it is possible to apply a transformer in db-config.xml with a Perl script?

Óscar Marín Miró wrote:
> Hi,
> My guess is that *although* your DB is in UTF-8, the database engine sends you the rows in ISO-Latin1, so before doing *anything* after receiving the data, you should transcode from ISO-Latin1 to UTF-8 and then send that to Solr. I'm no Java expert, but in Perl (MySQL DB in UTF-8) I have to do this with every row:
>
>     $row = decode("iso-8859-1", $row);
>
> ... and before building the XML to invoke Solr and add the document:
>
>     $row = encode("utf8", $row);
>
> -- “I may not believe in myself, but I believe in what I'm doing.” -- Jimmy Page
>
> On Fri, Mar 20, 2009 at 10:55 AM, aerox7 amyne.berr...@me.com wrote:
>> I added "è" => "e" to mapping-ISOLatin1Accent.txt and added the following fieldType:
>>
>>     <fieldType name="textCharNorm" class="solr.TextField" positionIncrementGap="100">
>>       <analyzer>
>>         <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
>>         <tokenizer class="solr.CharStreamAwareWhitespaceTokenizerFactory"/>
>>       </analyzer>
>>     </fieldType>
>>
>> But I still have the same problem! It only works when I store the ISO string in the UTF-8 database (e.g. store solène, not solÃ¨ne) :,(
>>
>> aerox7 wrote:
>>>> where are you seeing it as SolÃ¨ne as opposed to the correct way of solène?
>>> I have SolÃ¨ne in my MySQL database, so I don't know if this is correct or not. I guess that SolÃ¨ne is solène in UTF-8?! I've tried the analysis page at http://localhost:8983/solr/admin/analysis.jsp: when I try with solène everything is OK, but when I try with SolÃ¨ne (what I have in the DB), the analysis converts Ã to A and deletes ¨, so I get SolAne! I think that ISOLatin1AccentFilterFactory only takes strings with charset ISO-8859-1. So is there any way to transform my string to ISO-8859-1 before the indexing process? Maybe by creating a transformer in DataImportHandler? (I have never coded in Java :( ) Thank you all.
>>>
>>> Koji Sekiguchi-2 wrote:
>>>> aerox7 wrote:
>>>>> Hi, I have a MySQL database in UTF-8. I have a row with SolÃ¨ne (solène). I want to transform this to solene, so I use Solr's ISOLatin1AccentFilterFactory to perform this task, but it doesn't work! I guess that SolÃ¨ne is solène in UTF-8?! I also set Tomcat to UTF-8, so normally ISOLatin1AccentFilterFactory should replace the accent... Any ideas? I use DataImportHandler.
>>>>
>>>> If a mapping rule è to e is always true in your field, you can try to use MappingCharFilter instead of ISOLatin1AccentFilter. Add the following line to mapping-ISOLatin1Accent.txt:
>>>>
>>>>     "è" => "e"
>>>>
>>>> and add the following fieldType:
>>>>
>>>>     <fieldType name="textCharNorm" class="solr.TextField" positionIncrementGap="100">
>>>>       <analyzer>
>>>>         <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
>>>>         <tokenizer class="solr.CharStreamAwareWhitespaceTokenizerFactory"/>
>>>>       </analyzer>
>>>>     </fieldType>
>>>>
>>>> MappingCharFilter and mapping-ISOLatin1Accent.txt are in the nightly build.
>>>>
>>>> Koji
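As a rough illustration of the idea behind Koji's suggestion (map characters first, then tokenize on whitespace), here is a minimal Python sketch. It is not Solr's implementation; `ACCENT_MAP` and `analyze` are hypothetical names, and the extra "é" rule is an assumption added only for the example.

```python
# Hypothetical stand-in for a character-mapping step followed by
# whitespace tokenization. This is NOT how Solr implements it.
ACCENT_MAP = str.maketrans({"è": "e", "é": "e"})  # "é" rule is an assumption


def analyze(text):
    """Apply the character mapping first, then split on whitespace."""
    return text.translate(ACCENT_MAP).split()


print(analyze("solène et hélène"))  # ['solene', 'et', 'helene']
print(analyze("solÃ¨ne"))           # ['solÃ¨ne']
```

The second call shows why the rule appears to do nothing on the data described in this thread: a string that arrives as solÃ¨ne contains no "è" character for the mapping to match.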
Re: FW: Special character indexing
On Fri, Mar 20, 2009 at 4:13 PM, Gargate, Siddharth sgarg...@ptc.com wrote:
> Thanks Shalin,
> Adding BinaryUpdateRequestHandler solved the issue. Thank you very much. Just one query: shouldn't XmlUpdateRequestHandler also work for these characters? I saw another user mentioning the same issue, and it was working with DirectXmlRequest.

It should. I'll run a few tests to see where the problem is.

-- Regards, Shalin Shekhar Mangar.
Re: Issue with Facet Query
On Fri, Mar 20, 2009 at 4:00 PM, dabboo ag...@sapient.com wrote:
> Hi Shalin, One more thing,
> facet.field=productPrice_product_str_s&facet.query=productPrice_product_str_s:[0%20TO%20100]
> This is my query and these are the results I am getting:
> [snip: facet counts, identical to those in the original post]
> It is only returning results whose values start with 2, 3, 4 or some other integer instead of only 1. It is not returning records in which the value is 10 or 100.

Please do not send duplicate mails. It will not help you get an answer faster.

If you need to filter results to a specific range, then you should use filter queries through the fq parameter:

fq=productPrice_product_str_s:[0%20TO%20100]

-- Regards, Shalin Shekhar Mangar.
Re: Problem with UTF-8 and Solr ISOLatin1AccentFilterFactory
What I mean is that unless solène travels to Solr in strict UTF-8, mapping-ISOLatin1Accent won't do anything, and possibly your DB query returns data in ISO-Latin1 (I always have this issue with UTF-8 MySQL). So unless you transcode your data from Latin1 to UTF-8 before sending it to Solr, mapping-ISOLatin1Accent won't know how to interpret it. Does that make any sense? :P

On Fri, Mar 20, 2009 at 11:53 AM, aerox7 amyne.berr...@me.com wrote:
> I'm using DataImportHandler to send my data to Solr! So you mean it is possible to apply a transformer in db-config.xml with a Perl script?
>
> [snip: earlier messages in this thread, quoted in full above]

-- “I may not believe in myself, but I believe in what I'm doing.” -- Jimmy Page
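The encoding mismatch Óscar describes can be reproduced in a few lines. This is a minimal Python sketch, not code from the thread: it assumes the stored bytes are valid UTF-8 that get read as ISO-8859-1 somewhere between the database and Solr, and `misread_as_latin1` and `repair` are hypothetical helper names.

```python
def misread_as_latin1(text):
    """Encode as UTF-8, then (wrongly) decode the bytes as ISO-8859-1."""
    return text.encode("utf-8").decode("latin-1")


def repair(mojibake):
    """Undo the mistake: recover the raw bytes, then decode them as UTF-8."""
    return mojibake.encode("latin-1").decode("utf-8")


garbled = misread_as_latin1("solène")
print(garbled)          # solÃ¨ne
print(repair(garbled))  # solène
```

The "è" is two bytes in UTF-8 (0xC3 0xA8); read one byte at a time as Latin-1 they become "Ã" and "¨", which matches the SolÃ¨ne value reported in the database and explains why an accent filter then produces SolAne.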
Re: Issue with Facet Query
I am using this query only but I am getting the same results.

facet=true&facet.field=productPrice_product_str_s&fq=productPrice_product_str_s:[1%20TO%20100]

<lst name="facet_fields">
<lst name="productPrice_product_str_s">
<int name="100">202</int>
<int name="10">57</int>
<int name="10.6">14</int>
<int name="10.2">11</int>
<int name="10.67">10</int>
<int name="10.8">9</int>
<int name="10.99">9</int>
<int name="1.33">7</int>
<int name="1">6</int>
<int name="10.4">5</int>
<int name="10.34">4</int>
<int name="1.01">2</int>
<int name="1.2">2</int>
<int name="1.66">2</int>
<int name="10.63">2</int>
<int name="10.66">2</int>
<int name="1.4">1</int>
<int name="1.7">1</int>
<int name="1.8">1</int>
<int name="10.33">1</int>
<int name="10.75">1</int>
<int name="10.9">1</int>
<int name=".01">0</int>
<int name=".2">0</int>
<int name="0">0</int>
<int name="100.05">0</int>
<int name="100.07">0</int>
<int name="100.13">0</int>
<int name="100.2">0</int>
<int name="100.25">0</int>
<int name="100.33">0</int>
<int name="100.4">0</int>
<int name="100.45">0</int>
<int name="100.53">0</int>
<int name="100.6">0</int>
<int name="100.67">0</int>
<int name="100.73">0</int>
<int name="100.8">0</int>
<int name="100.87">0</int>
<int name="100.95">0</int>
<int name="100.96">0</int>
<int name="101">0</int>
<int name="101.1">0</int>
<int name="101.13">0</int>
<int name="101.2">0</int>
<int name="101.27">0</int>
<int name="101.33">0</int>
<int name="101.4">0</int>
<int name="101.47">0</int>
<int name="101.6">0</int>
<int name="101.67">0</int>
<int name="101.73">0</int>
<int name="101.8">0</int>
<int name="101.87">0</int>
...

It still is not showing the other values. Do I need to make any entry in the schema or solrconfig XML files? Do I need to convert the string into numeric values, etc.? Please suggest.

Thanks,
Amit

Shalin Shekhar Mangar wrote:
> On Fri, Mar 20, 2009 at 4:00 PM, dabboo ag...@sapient.com wrote:
>> [snip: previous query and facet counts]
>
> Please do not send duplicate mails. It will not help you get an answer faster.
>
> If you need to filter results to a specific range, then you should use filter queries through the fq parameter:
>
> fq=productPrice_product_str_s:[0%20TO%20100]
>
> -- Regards, Shalin Shekhar Mangar.
Re: Issue with Facet Query
And you'll need to re-index once you make the schema change.

On Fri, Mar 20, 2009 at 5:24 PM, Shalin Shekhar Mangar shalinman...@gmail.com wrote:
> What is the type of the productPrice_product_str field? I'm guessing that it is a string type. Since it is a float value and you need range search, you should change this to 'sfloat' or 'sdouble' in your schema.xml.
>
> On Fri, Mar 20, 2009 at 5:11 PM, dabboo ag...@sapient.com wrote:
>> I am using this query only but I am getting the same results.
>> facet=true&facet.field=productPrice_product_str_s&fq=productPrice_product_str_s:[1%20TO%20100]
>> [snip: facet counts and earlier messages in this thread]
>> It still is not showing the other values. Do I need to make any entry in the schema or solrconfig XML files? Do I need to convert the string into numeric values, etc.? Please suggest.

-- Regards, Shalin Shekhar Mangar.
Re: Problem with UTF-8 and Solr ISOLatin1AccentFilterFactory
Ah, got you :) Sorry. Correct, I use a Perl client. But sorry to say, I don't use DataImportHandler. I just make the queries to the DB, filter the results, and build the Solr XML 'by hand' in the Perl script :(

On Fri, Mar 20, 2009 at 1:04 PM, aerox7 amyne.berr...@me.com wrote:
> Yes! I completely understand the problem. I'm just asking about your solution to resolve it. I guess that you use a Solr Perl client to index your database. In my case I use DataImportHandler, so the only solution I have is to create a transformer for DataImportHandler and try to convert my rows from Latin1 to UTF-8 (see http://wiki.apache.org/solr/DataImportHandler#head-27fcc2794bd71f7d727104ffc6b99e194bdb6ff9). So I just want to know whether you use DataImportHandler too, with a Perl script as a transformer?
>
> [snip: earlier messages in this thread]
Re: Problem with UTF-8 and Solr ISOLatin1AccentFilterFactory
On Fri, Mar 20, 2009 at 5:34 PM, aerox7 amyne.berr...@me.com wrote:
> Yes! I completely understand the problem. I'm just asking about your solution to resolve it. I guess that you use a Solr Perl client to index your database. In my case I use DataImportHandler, so the only solution I have is to create a transformer for DataImportHandler and try to convert my rows from Latin1 to UTF-8 (see http://wiki.apache.org/solr/DataImportHandler#head-27fcc2794bd71f7d727104ffc6b99e194bdb6ff9). So I just want to know whether you use DataImportHandler too, with a Perl script as a transformer?

No, but you can use any language which is available on the Java VM. For example, JavaScript (available by default on JDK 6), JRuby, Jython, Groovy, BeanShell etc.

But you may not need to do so much. Look at http://www.mysqlperformanceblog.com/2009/03/17/converting-character-sets/

-- Regards, Shalin Shekhar Mangar.
Re: Facet Query Results Issue
On Mar 20, 2009, at 6:39 AM, dabboo wrote:
> this is my facet query.
> facet.field=productPrice_product_str_s&facet.query=productPrice_product_str_s:[0%20TO%20100]
> This is my query and these are the results I am getting: It is only returning results whose values start with 2, 3, 4 or some other integer instead of only 1. It is not returning records in which the value is 10 or 100. Please suggest.

If you want the counts filtered, use fq (instead of or in addition to facet.query). facet.query/facet.field are for generating counts for documents that match q/fq parameters, but do not themselves filter.

Erik
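To make Erik's distinction concrete, here is a small Python sketch that builds the two request variants. It only assembles query strings; `build_query` is a hypothetical helper, and the `q=*:*` parameter is an assumption, since the thread never shows the main query.

```python
from urllib.parse import urlencode, parse_qs


def build_query(field, low, high, filter_results):
    """Return a Solr-style query string for a range on `field`.

    filter_results=True  -> range goes into fq (restricts matching docs)
    filter_results=False -> range goes into facet.query (only adds a count)
    """
    range_expr = f"{field}:[{low} TO {high}]"
    params = [
        ("q", "*:*"),  # assumption: match-all main query
        ("facet", "true"),
        ("facet.field", field),
        ("fq" if filter_results else "facet.query", range_expr),
    ]
    return urlencode(params)


print(build_query("productPrice_product_str_s", 0, 100, filter_results=True))
print(build_query("productPrice_product_str_s", 0, 100, filter_results=False))
```

With the range in fq, both the returned documents and the facet.field counts are limited to the range; with facet.query, the response only gains one extra count for that range.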
Re: Problem with UTF-8 and Solr ISOLatin1AccentFilterFactory
My database is already in UTF-8 (collation and charset). I have already set the Tomcat connector to UTF-8, and the MySQL default charset to UTF-8. How do I force MySQL to send UTF-8 (or maybe I have to do this for Tomcat)? I'm going crazy... :)

Shalin Shekhar Mangar wrote:
> On Fri, Mar 20, 2009 at 5:34 PM, aerox7 amyne.berr...@me.com wrote:
>> [snip: question about using a Perl script as a DataImportHandler transformer]
>
> No, but you can use any language which is available on the Java VM. For example, JavaScript (available by default on JDK 6), JRuby, Jython, Groovy, BeanShell etc.
>
> But you may not need to do so much. Look at http://www.mysqlperformanceblog.com/2009/03/17/converting-character-sets/
>
> -- Regards, Shalin Shekhar Mangar.
Re: Issue with Facet Query
What is the type of the productPrice_product_str field? I'm guessing that it is a string type. Since it is a float value and you need range search, you should change this to 'sfloat' or 'sdouble' in your schema.xml.

On Fri, Mar 20, 2009 at 5:11 PM, dabboo ag...@sapient.com wrote:
> I am using this query only but I am getting the same results.
> facet=true&facet.field=productPrice_product_str_s&fq=productPrice_product_str_s:[1%20TO%20100]
> [snip: facet counts]
> It still is not showing the other values. Do I need to make any entry in the schema or solrconfig XML files? Do I need to convert the string into numeric values, etc.? Please suggest.
> Thanks, Amit
>
> [snip: earlier messages in this thread]

-- Regards, Shalin Shekhar Mangar.
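The reason a string-typed field gives these results is that a range on strings is compared character by character, not numerically. The following Python sketch is only an analogy for that ordering, not Solr code; `string_range` and `numeric_range` are hypothetical helpers, and the values "2", "20" and "55.5" are made up for illustration (the others appear in the facet output in this thread).

```python
prices = ["1", "1.33", "10", "10.6", "100", "100.05", "2", "20", "55.5", ".2"]


def string_range(values, low, high):
    """Keep values whose *text* sorts between low and high (lexicographic)."""
    return [v for v in values if low <= v <= high]


def numeric_range(values, low, high):
    """Keep values whose *number* lies between low and high."""
    return [v for v in values if low <= float(v) <= high]


print(string_range(prices, "1", "100"))
# ['1', '1.33', '10', '10.6', '100']
print(numeric_range(prices, 1, 100))
# ['1', '1.33', '10', '10.6', '100', '2', '20', '55.5']
```

As text, "2" sorts after "100", so it falls outside the range [1 TO 100]; this is consistent with the reported facet counts, where only values beginning with 1 have non-zero counts. A numeric field type compares the values as numbers instead.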
Re: Problem with UTF-8 and Solr ISOLatin1AccentFilterFactory
Yes ! i completely understand the problem. I'm just asking about your solution to resolvre this problem. I gess that you use Solar PERL Client to index your DATABASE. for my case i use DataImportHandler, so to only solution that i have with this is to create a transformer for DataImportHandler and try to convert my row from latin to UTF-8. (see http://wiki.apache.org/solr/DataImportHandler#head-27fcc2794bd71f7d727104ffc6b99e194bdb6ff9) So i just wanna know if you use DataImportHandler two with a perl script like a transformer ? Óscar Marín Miró wrote: What I mean is that unless solène travels to Solr in strict UTF-8, mapping-ISOLatin1Accent won't do anything, and posibly your DB query returns data in ISO-Latin1 (I always have this issue with UTF8-Mysql), so unless you transcode your data from Latin1 to UTF8 before sending it to SolR, mapping-ISOLatin1Accent won't know how to interpret it. Does it make any sense? :P On Fri, Mar 20, 2009 at 11:53 AM, aerox7 amyne.berr...@me.com wrote: I'm using DataImportHandler to send my data to Solr ! so you mean it possible to apply a transformer in db-config.xml with a perl script ? Óscar Marín Miró wrote: Hi, My guess is that *although* your DB is in UTF-8, the database engine sends you the rows in ISO-Latin1, so before doing *anything* after receiving the data, you should transcode from ISO-Latin1 to UTF-8 and then send that to SolR. I'm no Java expert, but in perl (MySQL DB in utf-8) I have to do with any row: $row=decode(iso-8859-1,$row); ... 
and before building the XML to invoke and add the document to Solr: $row = encode("utf8", $row);

On Fri, Mar 20, 2009 at 10:55 AM, aerox7 amyne.berr...@me.com wrote:
I added "è" => "e" to mapping-ISOLatin1Accent.txt and added the following fieldType:

<fieldType name="textCharNorm" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
    <tokenizer class="solr.CharStreamAwareWhitespaceTokenizerFactory"/>
  </analyzer>
</fieldType>

But I still have the same problem! It only works when I store the ISO string in the UTF-8 database (ex: store solène, not solÃ¨ne) :,(

aerox7 wrote:
== where are you seeing it as SolÃ¨ne as opposed to the correct way of solène?
I have SolÃ¨ne in my MySQL database, so I don't know if this is correct or not. I guess that SolÃ¨ne is solène in UTF-8?! I've tried the analysis page at http://localhost:8983/solr/admin/analysis.jsp: when I try with solène everything is OK, but when I try with SolÃ¨ne (like what I have in the DB) the analysis converts Ã into A and deletes ¨, so I get SolAne! I think that ISOLatin1AccentFilterFactory only takes strings in charset ISO-8859-1. So, is there any solution to transform my string to ISO-8859-1 before the indexing process? Maybe by creating a transformer in DataImportHandler? (I have never coded in Java :( ) Thank you all.

Koji Sekiguchi-2 wrote:
aerox7 wrote:
Hi, I have a MySQL database in UTF-8. I have a row with SolÃ¨ne (solène). I want to transform this to solene, so I use Solr's ISOLatin1AccentFilterFactory to perform this task, but it doesn't work! I guess that SolÃ¨ne is solène in UTF-8?! I also set Tomcat to UTF-8, so normally ISOLatin1AccentFilterFactory should replace the accent ... any ideas? I use DataImportHandler.

If a mapping rule è to e is always true in your field, you can try to use MappingCharFilter instead of ISOLatin1AccentFilter.
Add the following line to mapping-ISOLatin1Accent.txt:

"è" => "e"

and add the following fieldType:

<fieldType name="textCharNorm" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
    <tokenizer class="solr.CharStreamAwareWhitespaceTokenizerFactory"/>
  </analyzer>
</fieldType>

MappingCharFilter and mapping-ISOLatin1Accent.txt are in nightly build.

Koji

-- View this message in context: http://www.nabble.com/Problem-with-UTF-8-and-Solr-ISOLatin1AccentFilterFactory-tp22607642p22617278.html
-- “I may not believe in myself, but I believe in what I'm doing.” -- Jimmy Page
-- View this message in context: http://www.nabble.com/Problem-with-UTF-8-and-Solr-ISOLatin1AccentFilterFactory-tp22607642p22618085.html
-- View this message in context: http://www.nabble.com/Problem-with-UTF-8-and-Solr-ISOLatin1AccentFilterFactory-tp22607642p22618999.html
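The symptom discussed in this thread (an è that the analysis page turns into "A" plus a dropped "¨", giving SolAne) is what one would expect if UTF-8 bytes were decoded as ISO-8859-1 somewhere between MySQL and Solr. The Python sketch below only illustrates that mechanism; it is not a statement about where the mistake happens in this particular setup, and the function names are made up for the example.

```python
def garble(text):
    """Simulate UTF-8 bytes being read as ISO-8859-1 (Latin-1)."""
    return text.encode("utf-8").decode("iso-8859-1")

def repair(text):
    """Undo the mistake above: back to the raw bytes, then decode as UTF-8."""
    return text.encode("iso-8859-1").decode("utf-8")

original = "sol\u00e8ne"        # solène
broken = garble(original)       # è is bytes C3 A8, which Latin-1 shows as Ã and ¨

# ascii() keeps the output readable on consoles that are not UTF-8
print(ascii(broken))
print(ascii(repair(broken)))
```

If the stored value really is double-encoded like this, an accent filter sees two unrelated Latin-1 characters instead of one è, which would explain why neither ISOLatin1AccentFilterFactory nor the mapping rule has any effect.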
Error in identifying the primary key
Hi, I am new to Solr. I am trying to index SQL table rows. I am getting the below error. Can anyone help me in resolving this issue. Mar 20, 2009 6:03:38 PM org.apache.solr.handler.dataimport.DataImporter verifyWithSchema INFO: id is a required field in SolrSchema . But not found in DataConfig Mar 20, 2009 6:03:38 PM org.apache.solr.handler.dataimport.DataImportHandler inform SEVERE: Exception while loading DataImporter org.apache.solr.handler.dataimport.DataImportHandlerException: There are errors in the Schema The field :age present in DataConfig does not have a counterpart in Solr Schema The field :firstname present in DataConfig does not have a counterpart in Solr Schema The field :lastName present in DataConfig does not have a counterpart in Solr Schema at org.apache.solr.handler.dataimport.DataImporter.init(DataImporter.java:108) at org.apache.solr.handler.dataimport.DataImportHandler.inform(DataImportHandler.java:95) at org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:388) at org.apache.solr.core.SolrCore.init(SolrCore.java:571) at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:121) at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:69) at org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:221) at org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:302) at org.apache.catalina.core.ApplicationFilterConfig.init(ApplicationFilterConfig.java:78) at org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:3635) at org.apache.catalina.core.StandardContext.start(StandardContext.java:4222) at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:760) at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:740) at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:544) at org.apache.catalina.startup.HostConfig.deployWAR(HostConfig.java:831) at 
org.apache.catalina.startup.HostConfig.deployWARs(HostConfig.java:720) at org.apache.catalina.startup.HostConfig.deployApps(HostConfig.java:490) at org.apache.catalina.startup.HostConfig.start(HostConfig.java:1150) at org.apache.catalina.startup.HostConfig.lifecycleEvent(HostConfig.java:311) at org.apache.catalina.util.LifecycleSupport.fireLifecycleEvent(LifecycleSupport.java:120) at org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1022) at org.apache.catalina.core.StandardHost.start(StandardHost.java:736) at org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1014) at org.apache.catalina.core.StandardEngine.start(StandardEngine.java:443) at org.apache.catalina.core.StandardService.start(StandardService.java:448) at org.apache.catalina.core.StandardServer.start(StandardServer.java:700) at org.apache.catalina.startup.Catalina.start(Catalina.java:552) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:585) at org.apache.catalina.startup.Bootstrap.start(Bootstrap.java:295) at org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:433) Mar 20, 2009 6:03:38 PM org.apache.solr.servlet.SolrDispatchFilter init SEVERE: Could not start SOLR. Check solr/home property org.apache.solr.common.SolrException: FATAL: Could not create importer. DataImporter config invalid at org.apache.solr.handler.dataimport.DataImportHandler.inform(DataImportHandler.java:103) at org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:388) at org.apache.solr.core.SolrCore.init(SolrCore.java:571) at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:121) at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:69) Thanks
Re: Problem with UTF-8 and Solr ISOLatin1AccentFilterFactory
Hi, Maybe this info is handy for you: http://dev.mysql.com/doc/refman/5.0/en/charset-connection.html The fact is that MySQL can have UTF8 in its storage engine (or defined per database), as you have, but the *connection* to the mysql client can be set to latin1. In fact, here are my character_set variables:

character_set_client = latin1
character_set_connection = latin1
character_set_database = utf8
character_set_filesystem = binary
character_set_results = latin1
character_set_server = latin1
character_set_system = utf8
character_sets_dir = /usr/share/mysql/charsets/

As you see, the database is in utf8, *but* the client, connection, results and server expect latin1. You can see these variables through a mysql console, just typing:

$ mysql -u user -p
Enter password:
Welcome to the MySQL monitor. Commands end with ; or \g.
Your MySQL connection id is 8114
Server version: 5.0.32-Debian_7etch5-log Debian etch distribution
Type 'help;' or '\h' for help. Type '\c' to clear the buffer.

mysql> SHOW VARIABLES LIKE 'character_set%';
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | latin1                     |
| character_set_connection | latin1                     |
| character_set_database   | latin1                     |
| character_set_filesystem | binary                     |
| character_set_results    | latin1                     |
| character_set_server     | latin1                     |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
8 rows in set (0.00 sec)

and change them like this:

mysql> SET character_set_client = utf8;
Query OK, 0 rows affected (0.00 sec)

mysql> SHOW VARIABLES LIKE 'character_set%';
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | utf8                       |
| character_set_connection | latin1                     |
| character_set_database   | latin1                     |
| character_set_filesystem | binary                     |
| character_set_results    | latin1                     |
| character_set_server     | latin1                     |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
8 rows in set (0.00 sec)

So...
maybe after setting all variables that are set to latin1 to utf8 can solve your problem? If they are set to latin1, of course ;) If this is not the problem, hell, we escaped from work just for a few minutes :P On Fri, Mar 20, 2009 at 1:25 PM, aerox7 amyne.berr...@me.com wrote: My DATABASE is already in UTF-8 (Collation and Charset). I already set Tomcat connector to UTF-8, and Mysql default charset to UTF-8 How to force mysql to send on UTF-8 (Or may be i have to do this for TomCat ?) i'm going crazy... :) Shalin Shekhar Mangar wrote: On Fri, Mar 20, 2009 at 5:34 PM, aerox7 amyne.berr...@me.com wrote: Yes ! i completely understand the problem. I'm just asking about your solution to resolvre this problem. I gess that you use Solar PERL Client to index your DATABASE. for my case i use DataImportHandler, so to only solution that i have with this is to create a transformer for DataImportHandler and try to convert my row from latin to UTF-8. (see http://wiki.apache.org/solr/DataImportHandler#head-27fcc2794bd71f7d727104ffc6b99e194bdb6ff9 ) So i just wanna know if you use DataImportHandler two with a perl script like a transformer ? No, but you can use any language which is available on the Java VM. For example, Javascript (available by default on JDK6), JRuby, Jython, Groovy, BeanShell etc. But you may not need to do so much. Look at http://www.mysqlperformanceblog.com/2009/03/17/converting-character-sets/ -- Regards, Shalin Shekhar Mangar. -- View this message in context: http://www.nabble.com/Problem-with-UTF-8-and-Solr-ISOLatin1AccentFilterFactory-tp22607642p22619285.html Sent from the Solr - User mailing list archive at Nabble.com. -- “I may not believe in myself, but I believe in what I'm doing.” -- Jimmy Page
Re: Problem with UTF-8 and Solr ISOLatin1AccentFilterFactory
Usually, when I see characters like this, it means you aren't viewing/ handling the UTF-8 correctly when bringing it into Java. I would first check that your DB or JDBC driver is getting the chars out right. It may even be the case that they did not go into the DB correctly in the first place. On Mar 20, 2009, at 4:36 AM, aerox7 wrote: == where are you seeing it as Solène as opposed to the correct way of solène? I have Solène in my Mysql DATA BASE ! so i don't know if this is correct or not ? i gess that Solène is solène in UTF-8 ?! I'vz tryed analysis in http://localhost:8983/solr/admin/ analysis.jsp, so when i try with solène everything is ok ! but when i try with Solène (like what i have in DB) analysis convert à in A delete ¨ so i get SolAne !!! I think that ISOLatin1AccentFilterFactory take only string with Charset ISO-8859-1 . So any solution to transform my string to ISO-8859-1 before indexing process. May be by creating transformer in DataImportHandler ? (Never code in java :( ) Thank you all. Koji Sekiguchi-2 wrote: aerox7 wrote: Hi, I have a mysql data base in UTF-8. I have a row with Solène (solène). I want to transforme this to solene, so i use Solr ISOLatin1AccentFilterFactory to perform this task but it dosn't work ?!! i gess that Solène is solène in UTF-8 ?! i also set tomcat to utf-8 so normaly ISOLatin1AccentFilterFactory have to replace the accent ... any ideas ? i use DataImportHandler. If a mapping rule è to e is always true in your field, you can try to use MappingCharFilter instead of ISOLatin1AccentFilter. Add the following line to mapping-ISOLatin1Accent.txt: è = e and add the following fieldType: fieldType name=textCharNorm class=solr.TextField positionIncrementGap=100 analyzer charFilter class=solr.MappingCharFilterFactory mapping=mapping-ISOLatin1Accent.txt/ tokenizer class=solr.CharStreamAwareWhitespaceTokenizerFactory/ /analyzer /fieldType MappingCharFilter and mapping-ISOLatin1Accent.txt are in nightly build. 
Koji -- View this message in context: http://www.nabble.com/Problem-with-UTF-8-and-Solr-ISOLatin1AccentFilterFactory-tp22607642p22616220.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: solrj : probleme with utf-8 content
Hi, I have that problem too. But I notice that it only happens if I send my data via solrj. If I send it via the solr-ruby gem, everything is fine (http://wiki.apache.org/solr/solr-ruby). Here is my jruby script:
---
require 'rubygems'
require 'solr'
require 'rexml/document'
include Java

def send_via_solrj(text, url)
  doc = org.apache.solr.common.SolrInputDocument.new
  doc.addField('id', '1')
  doc.addField('text', text)
  server = org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.new(url)
  server.add(doc);
  server.commit();
end

def send_via_gem(text, url)
  solr_doc = Solr::Document.new
  solr_doc['id'] = '2'
  solr_doc['text'] = text
  options = { :autocommit => :on }
  conn = Solr::Connection.new(url, options)
  conn.add(solr_doc)
end

host = 'localhost'
port = ''
path = '/solr/core0'
url = "http://#{host}:#{port}#{path}"
text = "eaiou with circumflexes: êâîôû"
send_via_solrj(text, url)
send_via_gem(text, url)
puts "done!"
---
If I watch the HTTP messages with tcpmon, I see that the data sent via solrj is encoded in cp1252 while the data sent via the gem is UTF-8. Does anyone have an idea of how we can configure solrj to send in UTF-8? Thanks in advance.

Walid ABDELKABIR wrote:
when executing this code I got in my index the field includes with this value : ? ? ? :
---
String content = "eaiou with circumflexes: êâîôû";
SolrInputDocument doc = new SolrInputDocument();
doc.addField( "id", 123, 1.0f );
doc.addField( "includes", content, 1.0f );
server.add( doc );
---
but this code works fine :
---
String addContent = "<add><doc boost=\"1.0\">"
    + "<field name=\"id\">123</field><field name=\"includes\">eaiou with circumflexes:âîôû</field>"
    + "</doc></add>";
DirectXmlRequest up = new DirectXmlRequest( "/update", addContent );
server.request( up );
---
thanks for help

-- View this message in context: http://www.nabble.com/solrj-%3A-probleme-with-utf-8-content-tp22577377p22620317.html Sent from the Solr - User mailing list archive at Nabble.com.
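To recognise the two encodings in a tcpmon capture, it helps to know what the bytes look like. The Python sketch below encodes the circumflex test characters both ways and shows what a UTF-8 decoder makes of the cp1252 bytes. That the "? ? ?" in the index comes from exactly this mismatch is an assumption; the thread does not confirm it.

```python
text = "\u00ea\u00e2\u00ee\u00f4\u00fb"   # the test characters: e, a, i, o, u with circumflex

as_utf8 = text.encode("utf-8")      # two bytes per character, each pair starting with c3
as_cp1252 = text.encode("cp1252")   # one byte per character

print(as_utf8.hex(" "))
print(as_cp1252.hex(" "))

# A receiver expecting UTF-8 cannot decode the single cp1252 bytes and
# substitutes the replacement character U+FFFD for each of them.
misread = as_cp1252.decode("utf-8", errors="replace")
print(ascii(misread))
```

In a capture, a run of c3-prefixed byte pairs therefore points to UTF-8, while single bytes in the e0 to ff range point to cp1252 or Latin-1.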
Re: delta-import commit=false doesn't seems to work
Thanks I gave more information there : http://www.nabble.com/Problem-for-replication-%3A-segment-optimized-automaticly-td22601442.html thanks a lot Paul Noble Paul നോബിള് नोब्ळ् wrote: sorry, the whole thing was commented . I did not notice that. I'll look into that 2009/3/20 Noble Paul നോബിള് नोब्ळ् noble.p...@gmail.com: you have set autoCommit every x minutes . it must have invoked commit automatically On Thu, Mar 19, 2009 at 4:17 PM, sunnyfr johanna...@gmail.com wrote: Hi, Even if I hit command=delta-importcommit=falseoptimize=false I still have commit set in my logs and sometimes even optimize=true, About optimize I wonder if it comes from commitment too close and one is not done, but still I don't know really. Any idea? Thanks a lot, -- View this message in context: http://www.nabble.com/delta-import-commit%3Dfalse-doesn%27t-seems-to-work-tp22597630p22597630.html Sent from the Solr - User mailing list archive at Nabble.com. -- --Noble Paul -- --Noble Paul -- View this message in context: http://www.nabble.com/Re%3A-delta-import-commit%3Dfalse-doesn%27t-seems-to-work-tp22614216p22620439.html Sent from the Solr - User mailing list archive at Nabble.com.
Unknown FieldType: 'string' used in QueryElevationComponent
Hi, My schema.xml is below; I did not define any string field, but I am getting the error below when I start Tomcat. Can anyone please suggest what the issue is here?

WARNING: No queryConverter defined, using default converter
Mar 20, 2009 7:31:55 PM org.apache.solr.core.QuerySenderListener newSearcher
INFO: QuerySenderListener sending requests to searc...@fe135d main
Mar 20, 2009 7:31:55 PM org.apache.solr.servlet.SolrDispatchFilter init
SEVERE: Could not start SOLR. Check solr/home property
org.apache.solr.common.SolrException: Unknown FieldType: 'string' used in QueryElevationComponent
	at org.apache.solr.handler.component.QueryElevationComponent.inform(QueryElevationComponent.java:151)
	at org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:388)
	at org.apache.solr.core.SolrCore.init(SolrCore.java:571)

<schema name="example">
  <types>
    <fieldType name="text" class="solr.TextField" positionIncrementGap="100"/>
    <fieldType name="integer" class="solr.IntField" omitNorms="true"/>
  </types>
  <fields>
    <!-- BOOKS -->
    <field name="person_id" type="integer" indexed="true" stored="true" multivalued="false" required="true"/>
    <field name="first_name" type="text" indexed="true" stored="true" multivalued="false"/>
    <field name="last_name" type="text" indexed="true" stored="true" multivalued="false"/>
    <field name="_age" type="integer" indexed="true" stored="true" multivalued="false"/>
    <field name="all" type="text" indexed="true" stored="true" multivalued="true"/>
  </fields>
  <uniqueKey>person_id</uniqueKey>
  <defaultSearchField>all</defaultSearchField>
  <solrQueryParser defaultOperator="OR"/>
  <copyField source="first_name" dest="all"/>
  <copyField source="last_name" dest="all"/>
  <copyField source="_age" dest="all"/>
</schema>
Re: how can I check field which are indexed but not stored?
Cool, I was just having a look at it, but it doesn't seem to show fields which are not stored. I just tried /admin/luke?id=8582006&fl=description but it doesn't seem to work :( It finds this id but shows only stored fields. Did I make a mistake? Thanks a lot.

Markus Jelsma - Buyways B.V. wrote:
On Fri, 2009-03-20 at 03:41 -0700, sunnyfr wrote:
Hi, I have an issue: some data comes up even though I've applied a filter on it and it shouldn't. When I check in my MySQL database, the document has obviously been updated, so I would like to see how it looks in Solr. If I do /solr/video/select?q=id:8582006 I just see the fields which have been stored. Is there a way to see how data is indexed for the other fields of my schema, which are indexed but not stored?

/solr/admin/luke will show you a lot of information concerning stored and indexed fields. Hope this is what you meant.

Something like the DataImportHandler console, where with verbose activated I can see every field of my schema. Otherwise, what would you reckon in this case, a document which has not been updated? How can I sort it out? Thanks a lot guys for your excellent help

-- View this message in context: http://www.nabble.com/how-can-I-check-field-which-are-indexed-but-not-stored--tp22617914p22621773.html Sent from the Solr - User mailing list archive at Nabble.com.
q.alt and highlights
Is there any way to activate highlighting when using q.alt with dismax? I have hl configured and working for a normal q on the field content (in the solr.xml). For q.alt, I tried: http://localhost:8080/solr/select/?q=&q.alt=my_id:475836&start=0&rows=10&hl=true But no highlighting is shown... Any advice? -- View this message in context: http://www.nabble.com/q.alt-and-highlights-tp22621774p22621774.html Sent from the Solr - User mailing list archive at Nabble.com.
JVM exception_access_violation
I'm running Solr on Tomcat 6.0.18 with Java 6 update 7 on Windows 2003 64 bit. Over the past month or so, my JVM has crashed twice with the error below. Has anyone experienced this? My system is not heavily loaded, and the crash seems to coincide with an update (via DIH). I'm running trunk code from late January. Note that I update my index ~50 times per day, and this crash has happened twice in the past month (so 2 of 1500 updates seem to have triggered the crash). This Windows deployment is for demos, so I'm not too concerned about it. Interestingly, my production deployment is on a 64 bit Linux system (same versions of everything) and I haven't been able to reproduce the bug there. # # An unexpected error has been detected by Java Runtime Environment: # # EXCEPTION_ACCESS_VIOLATION (0xc005) at pc=0x080e51c3, pid=4404, tid=956 # # Java VM: Java HotSpot(TM) 64-Bit Server VM (10.0-b23 mixed mode windows-amd64) # Problematic frame: # V [jvm.dll+0xe51c3] # # If you would like to submit a bug report, please visit: # http://java.sun.com/webapps/bugreport/crash.jsp # --- T H R E A D --- Current thread (0x01de2000): GCTaskThread [stack: 0x,0x] [id=956] siginfo: ExceptionCode=0xc005, reading address 0x Registers: EAX=0x3000, EBX=0x01e40330, ECX=0x000184b49821, EDX=0x000184b4b580 ESP=0x07cff9b0, EBP=0x, ESI=0x000184b4b580, EDI=0x0935 EIP=0x080e51c3, EFLAGS=0x00010206 Top of Stack: (sp=0x07cff9b0) 0x07cff9b0: 01e40330 0x07cff9c0: 000184b4dd88 0935 0x07cff9d0: 08464b08 01dbbdc0 0x07cff9e0: 01dbf190 8a65 0x07cff9f0: 2f5b4000 0002015f 0x07cffa00: 0002 01dbf2f0 0x07cffa10: 01e40330 01dbf430 0x07cffa20: 01dbf4f0 000201602d18 0x07cffa30: 07effa00 07cffb40 0x07cffa40: 0x07cffa50: 0830484d 0x07cffa60: 0002015f 0002 0x07cffa70: 0048 0001 0x07cffa80: 0001 00bb8501 0x07cffa90: 01dbf378 080ea807 0x07cffaa0: 07cffb40 07cffb40 Instructions: (pc=0x080e51c3) 0x080e51b3: 4c 8d 44 24 20 48 8b d6 48 8b 41 10 48 83 c1 10 0x080e51c3: ff 90 c0 01 00 00 44 8b 1d 08 f2 44 00 45 85 db Stack: 
[0x,0x], sp=0x07cff9b0, free space=127998k Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code) V [jvm.dll+0xe51c3] [error occurred during error reporting (printing native stack), id 0xc005] --- P R O C E S S --- Java Threads: ( = current thread ) 0x10286c00 JavaThread Thread-135 daemon [_thread_blocked, id=4892, stack(0x1169,0x1179)] 0x10285400 JavaThread http-8084-10 daemon [_thread_blocked, id=5108, stack(0x1201,0x1211)] 0x10287400 JavaThread http-8084-9 daemon [_thread_blocked, id=1772, stack(0x149a,0x14aa)] 0x1028a400 JavaThread http-8084-8 daemon [_thread_blocked, id=1656, stack(0x11f1,0x1201)] 0x01dc2c00 JavaThread http-8084-7 daemon [_thread_blocked, id=2056, stack(0x11e1,0x11f1)] 0x10288400 JavaThread http-8084-6 daemon [_thread_blocked, id=4792, stack(0x11d1,0x11e1)] 0x10286800 JavaThread MultiThreadedHttpConnectionManager cleanup daemon [_thread_blocked, id=3792, stack(0x1251,0x1261)] 0x0f6e8400 JavaThread http-8084-5 daemon [_thread_blocked, id=3540, stack(0x11c1,0x11d1)] 0x0f6e7800 JavaThread http-8084-4 daemon [_thread_blocked, id=4048, stack(0x11b1,0x11c1)] 0x0f6e8000 JavaThread http-8084-3 daemon [_thread_blocked, id=1932, stack(0x1159,0x1169)] 0x0f6e7000 JavaThread http-8084-2 daemon [_thread_blocked, id=996, stack(0x1149,0x1159)] 0x01dc6000 JavaThread http-8084-1 daemon [_thread_blocked, id=4924, stack(0x1139,0x1149)] 0x01dc5800 JavaThread TP-Monitor daemon [_thread_blocked, id=2288, stack(0x1121,0x1131)] 0x01dc5400 JavaThread TP-Processor4 daemon [_thread_in_native, id=4588, stack(0x,0x1121)] 0x01dc4c00 JavaThread TP-Processor3 daemon [_thread_blocked, id=652, stack(0x1101,0x)] 0x01dc4400 JavaThread TP-Processor2
Re: stop word search
Hi Erick, I have now commented out the query-time stopword analyzer and restarted the server. But now when I search for a stop word, I am getting results. We had earlier indexed the content with the stop word analyzer. I don't think we need to reindex after commenting out the query analyzer, right? This field is a text field with the default analyzer. Please let me know if I have missed something here. Regards Sujatha

On 3/17/09, Erick Erickson erickerick...@gmail.com wrote:
Well, by definition, using an analyzer that removes stopwords *should* do this at query time. This assumes that you used an analyzer that removed stopwords at index and query time. The stopwords are not in the index. You can get the behavior you expect by using an analyzer at query time that does NOT remove stopwords, and one at indexing time that *does* remove stopwords. But I'm having a hard time imagining that this would result in a good user experience. I mean, anytime that you had a stopword in the query where the stopword was required, no results would be returned. Which would be hard to explain to a user. What is it you're trying to accomplish? Best Erick

On Tue, Mar 17, 2009 at 7:40 AM, revas revas...@gmail.com wrote:
Hi, I have a query like this: content:the AND iuser_id:5, which means return all docs of user id 5 which have the word the in content. Since 'the' is a stop word, this query executes as just user_id:5 in spite of the AND clause, whereas the expected result here is that, since there is no result for the, no results should be returned. Am I missing anything here? Regards
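Erick's point about index-time versus query-time stopword removal can be illustrated with a toy inverted index. This is only a sketch in Python, not how Solr or Lucene implement analysis; the stopword list, documents and function names are invented for the example.

```python
STOPWORDS = {"the", "a", "an", "of"}   # toy list, not Solr's stopwords.txt

def analyze(text, remove_stopwords):
    """Lowercase, split on whitespace, optionally drop stopwords."""
    tokens = text.lower().split()
    if remove_stopwords:
        tokens = [t for t in tokens if t not in STOPWORDS]
    return tokens

def build_index(docs):
    """Map each term to the ids of the docs containing it; stopwords are dropped."""
    index = {}
    for doc_id, text in docs.items():
        for term in analyze(text, remove_stopwords=True):
            index.setdefault(term, set()).add(doc_id)
    return index

def search_all(index, all_ids, query, remove_stopwords):
    """AND together every term that survives query-time analysis."""
    result = set(all_ids)
    for term in analyze(query, remove_stopwords):
        result &= index.get(term, set())
    return result

docs = {1: "the quick fox", 2: "a lazy dog"}
index = build_index(docs)

# Stopwords removed at query time: the clause disappears, nothing is filtered.
same_analyzer = search_all(index, docs, "the", remove_stopwords=True)
# Stopwords kept at query time: "the" was never indexed, so nothing matches.
query_keeps_stopwords = search_all(index, docs, "the", remove_stopwords=False)
print(same_analyzer, query_keeps_stopwords)
```

The first call corresponds to the original report, where content:the vanishes and only the user id clause remains; the second is the configuration Erick describes, where a required stopword clause can never match.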
Re: Error in identifying the primary key
for all the fields mentioned in data-config.xml there should be a counterpart in schema.xml anyway that is relaxed in the latest nightly On Fri, Mar 20, 2009 at 6:26 PM, radha c radhas...@gmail.com wrote: Hi, I am new to Solr. I am trying to index SQL table rows. I am getting the below error. Can anyone help me in resolving this issue. Mar 20, 2009 6:03:38 PM org.apache.solr.handler.dataimport.DataImporter verifyWithSchema INFO: id is a required field in SolrSchema . But not found in DataConfig Mar 20, 2009 6:03:38 PM org.apache.solr.handler.dataimport.DataImportHandler inform SEVERE: Exception while loading DataImporter org.apache.solr.handler.dataimport.DataImportHandlerException: There are errors in the Schema The field :age present in DataConfig does not have a counterpart in Solr Schema The field :firstname present in DataConfig does not have a counterpart in Solr Schema The field :lastName present in DataConfig does not have a counterpart in Solr Schema at org.apache.solr.handler.dataimport.DataImporter.init(DataImporter.java:108) at org.apache.solr.handler.dataimport.DataImportHandler.inform(DataImportHandler.java:95) at org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:388) at org.apache.solr.core.SolrCore.init(SolrCore.java:571) at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:121) at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:69) at org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:221) at org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:302) at org.apache.catalina.core.ApplicationFilterConfig.init(ApplicationFilterConfig.java:78) at org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:3635) at org.apache.catalina.core.StandardContext.start(StandardContext.java:4222) at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:760) at 
org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:740) at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:544) at org.apache.catalina.startup.HostConfig.deployWAR(HostConfig.java:831) at org.apache.catalina.startup.HostConfig.deployWARs(HostConfig.java:720) at org.apache.catalina.startup.HostConfig.deployApps(HostConfig.java:490) at org.apache.catalina.startup.HostConfig.start(HostConfig.java:1150) at org.apache.catalina.startup.HostConfig.lifecycleEvent(HostConfig.java:311) at org.apache.catalina.util.LifecycleSupport.fireLifecycleEvent(LifecycleSupport.java:120) at org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1022) at org.apache.catalina.core.StandardHost.start(StandardHost.java:736) at org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1014) at org.apache.catalina.core.StandardEngine.start(StandardEngine.java:443) at org.apache.catalina.core.StandardService.start(StandardService.java:448) at org.apache.catalina.core.StandardServer.start(StandardServer.java:700) at org.apache.catalina.startup.Catalina.start(Catalina.java:552) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25) at java.lang.reflect.Method.invoke(Method.java:585) at org.apache.catalina.startup.Bootstrap.start(Bootstrap.java:295) at org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:433) Mar 20, 2009 6:03:38 PM org.apache.solr.servlet.SolrDispatchFilter init SEVERE: Could not start SOLR. Check solr/home property org.apache.solr.common.SolrException: FATAL: Could not create importer. 
DataImporter config invalid at org.apache.solr.handler.dataimport.DataImportHandler.inform(DataImportHandler.java:103) at org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:388) at org.apache.solr.core.SolrCore.init(SolrCore.java:571) at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:121) at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:69) Thanks -- --Noble Paul
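The check that fails in the log above can be approximated outside Solr. The Python sketch below compares the field names in a data-config against the fields declared in a schema and lists those without a counterpart. Both XML snippets are hypothetical, the fallback from name to column is an assumption about how DataImportHandler resolves field names, and dynamic fields are ignored, so treat it as an illustration of the rule rather than a reimplementation of verifyWithSchema.

```python
import xml.etree.ElementTree as ET

# Hypothetical schema.xml fragment
SCHEMA = """<schema name="example"><fields>
  <field name="id" type="string" required="true"/>
  <field name="first_name" type="text"/>
  <field name="last_name" type="text"/>
</fields></schema>"""

# Hypothetical data-config.xml fragment
DATA_CONFIG = """<dataConfig><document>
  <entity name="person" query="select * from person">
    <field column="ID" name="id"/>
    <field column="FNAME" name="firstname"/>
    <field column="LNAME" name="last_name"/>
  </entity>
</document></dataConfig>"""

def unmatched_fields(schema_xml, data_config_xml):
    """Return data-config field names that have no field of the same name in the schema."""
    schema_names = {f.get("name") for f in ET.fromstring(schema_xml).iter("field")}
    config_names = [f.get("name") or f.get("column")   # assumed fallback to column
                    for f in ET.fromstring(data_config_xml).iter("field")]
    # Comparison is exact, so firstname and first_name do not match
    return [name for name in config_names if name not in schema_names]

missing = unmatched_fields(SCHEMA, DATA_CONFIG)
print(missing)
```

For the log in this thread, the same comparison would flag age, firstname and lastName, and separately the required id field that the data-config never produces.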
Re: delta-import commit=false doesn't seems to work
just hit the DIH without any command and you may be able to see the status of the last import. It can tell you whether a commit/optimize was performed On Fri, Mar 20, 2009 at 7:07 PM, sunnyfr johanna...@gmail.com wrote: Thanks I gave more information there : http://www.nabble.com/Problem-for-replication-%3A-segment-optimized-automaticly-td22601442.html thanks a lot Paul Noble Paul നോബിള് नोब्ळ् wrote: sorry, the whole thing was commented . I did not notice that. I'll look into that 2009/3/20 Noble Paul നോബിള് नोब्ळ् noble.p...@gmail.com: you have set autoCommit every x minutes . it must have invoked commit automatically On Thu, Mar 19, 2009 at 4:17 PM, sunnyfr johanna...@gmail.com wrote: Hi, Even if I hit command=delta-importcommit=falseoptimize=false I still have commit set in my logs and sometimes even optimize=true, About optimize I wonder if it comes from commitment too close and one is not done, but still I don't know really. Any idea? Thanks a lot, -- View this message in context: http://www.nabble.com/delta-import-commit%3Dfalse-doesn%27t-seems-to-work-tp22597630p22597630.html Sent from the Solr - User mailing list archive at Nabble.com. -- --Noble Paul -- --Noble Paul -- View this message in context: http://www.nabble.com/Re%3A-delta-import-commit%3Dfalse-doesn%27t-seems-to-work-tp22614216p22620439.html Sent from the Solr - User mailing list archive at Nabble.com. -- --Noble Paul
DIH data-config loading
I'm trying to load or delete entities in data-config at runtime: changing the data-config.xml file, reloading, and then deleting or running a full-import as needed. My question is: does data-config get loaded into memory at runtime on reload only? That is, can I change the file while Solr is importing or deleting data? Another question: to delete documents, a different handler from import is used (update). Is it problematic to delete documents from a given entity while importing? Thanks in advance, Rui Pereira
Re: delta-import commit=false doesn't seems to work
Like you can see, I did that and I've no information in my DIH but you can notice in my logs and even my segments that and optimize is fired alone automaticly? Noble Paul നോബിള് नोब्ळ् wrote: just hit the DIH without any command and you may be able to see the status of the last import. It can tell you whether a commit/optimize was performed On Fri, Mar 20, 2009 at 7:07 PM, sunnyfr johanna...@gmail.com wrote: Thanks I gave more information there : http://www.nabble.com/Problem-for-replication-%3A-segment-optimized-automaticly-td22601442.html thanks a lot Paul Noble Paul നോബിള് नोब्ळ् wrote: sorry, the whole thing was commented . I did not notice that. I'll look into that 2009/3/20 Noble Paul നോബിള് नोब्ळ् noble.p...@gmail.com: you have set autoCommit every x minutes . it must have invoked commit automatically On Thu, Mar 19, 2009 at 4:17 PM, sunnyfr johanna...@gmail.com wrote: Hi, Even if I hit command=delta-importcommit=falseoptimize=false I still have commit set in my logs and sometimes even optimize=true, About optimize I wonder if it comes from commitment too close and one is not done, but still I don't know really. Any idea? Thanks a lot, -- View this message in context: http://www.nabble.com/delta-import-commit%3Dfalse-doesn%27t-seems-to-work-tp22597630p22597630.html Sent from the Solr - User mailing list archive at Nabble.com. -- --Noble Paul -- --Noble Paul -- View this message in context: http://www.nabble.com/Re%3A-delta-import-commit%3Dfalse-doesn%27t-seems-to-work-tp22614216p22620439.html Sent from the Solr - User mailing list archive at Nabble.com. -- --Noble Paul -- View this message in context: http://www.nabble.com/Re%3A-delta-import-commit%3Dfalse-doesn%27t-seems-to-work-tp22614216p22625149.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: solrj : probleme with utf-8 content
Do you know if your Java file is encoded as UTF-8? Sometimes it is saved in a different encoding, and that can cause funny problems.

On Mar 18, 2009, at 7:46 AM, Walid ABDELKABIR wrote:
When executing this code, the field "includes" in my index ends up with this value: ? ? ? :
---
String content = "eaiou with circumflexes: êâîôû";
SolrInputDocument doc = new SolrInputDocument();
doc.addField( "id", 123, 1.0f );
doc.addField( "includes", content, 1.0f );
server.add( doc );
---
but this code works fine:
---
String addContent = "<add><doc boost=\"1.0\">"
    + "<field name=\"id\">123</field><field name=\"includes\">eaiou with circumflexes:âîôû</field>"
    + "</doc></add>";
DirectXmlRequest up = new DirectXmlRequest( "/update", addContent );
server.request( up );
---
Thanks for the help
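One way to rule out the source-file encoding mentioned above is to write the accented characters as Unicode escapes, so the literal no longer depends on how the .java file was saved. The sketch below is only an illustration and does not use SolrJ: the class and method names are hypothetical, and it simply shows how text degrades to '?' when it passes through a charset that cannot represent it, which matches the symptom in the report.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper showing how accented text degrades under a lossy charset. */
final class EncodingCheck {
    // Unicode escapes for e, a, i, o, u with circumflexes; this literal is
    // independent of the encoding the source file happens to be saved in.
    static final String CONTENT = "\u00EA\u00E2\u00EE\u00F4\u00FB";

    /** Encodes with US-ASCII and decodes again; unmappable characters become '?'. */
    static String throughAscii(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.US_ASCII);
        return new String(bytes, StandardCharsets.US_ASCII);
    }

    /** Round-trips through UTF-8, which can represent every character. */
    static String throughUtf8(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(bytes, StandardCharsets.UTF_8);
    }
}
```

Here throughAscii(CONTENT) yields five question marks, while throughUtf8(CONTENT) returns the original text. If the escaped literal still arrives in the index as question marks, the source file is not the culprit and the transport layer is the next place to look.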
Re: DIH data-config loading
On Fri, Mar 20, 2009 at 10:57 PM, Rui Pereira ruipereira...@gmail.com wrote:
I'm trying to load or delete entities in data-config at runtime: I change the data-config.xml file, reload, and then delete or full-import as needed. My question is: does data-config get loaded into memory only at startup and on reload? That is, can I change the file while Solr is importing or deleting data?

It is safe to edit data-config.xml. The reload happens only if you issue command=reload-config.

Another question: to delete documents, a different handler from import is used (update). Is it problematic to delete documents from a particular entity while importing?

Solr does not have an issue with that, but be aware that the commit may happen after the import. If that is OK for your data, then it should be OK.

Thanks in advance, Rui Pereira

--Noble Paul
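The edit, reload, import cycle described in this thread comes down to a few HTTP requests against the DIH handler. As a rough sketch, the helper below only builds the request URLs and does not send them. The class name and the handler URL used in the usage note are assumptions for illustration; command=reload-config comes from the reply above, and full-import with an entity parameter is the DIH command the question refers to.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

/** Hypothetical helper that builds DataImportHandler command URLs. */
final class DihCommands {
    /** URL-encodes a single parameter value as UTF-8. */
    static String enc(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a conforming JVM.
            throw new IllegalStateException(e);
        }
    }

    /**
     * Builds handlerUrl?command=...[&entity=...].
     * Pass null as entity to apply the command to the whole configuration.
     */
    static String commandUrl(String handlerUrl, String command, String entity) {
        StringBuilder url = new StringBuilder(handlerUrl);
        url.append("?command=").append(enc(command));
        if (entity != null) {
            url.append("&entity=").append(enc(entity));
        }
        return url.toString();
    }
}
```

For example, assuming the handler is registered at http://localhost:8983/solr/dataimport, commandUrl(handler, "reload-config", null) would be requested after editing data-config.xml, followed by commandUrl(handler, "full-import", "item") for a hypothetical entity named item.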
Re: Problem with UTF-8 and Solr ISOLatin1AccentFilterFactory
Maybe there is an issue with the recent changes in SOLR-973. I have attached a new patch to SOLR-973. aerox, is it possible to confirm whether that is the problem?

On Fri, Mar 20, 2009 at 6:52 PM, Grant Ingersoll gsing...@apache.org wrote:
Usually, when I see characters like this, it means you aren't viewing/handling the UTF-8 correctly when bringing it into Java. I would first check that your DB or JDBC driver is getting the chars out right. It may even be the case that they did not go into the DB correctly in the first place.

On Mar 20, 2009, at 4:36 AM, aerox7 wrote:
== where are you seeing it as SolÃ¨ne as opposed to the correct way of solène?
I have SolÃ¨ne in my MySQL database, so I don't know if this is correct or not. I guess that SolÃ¨ne is solène in UTF-8? I've tried the analysis page at http://localhost:8983/solr/admin/analysis.jsp: when I try with solène everything is OK, but when I try with SolÃ¨ne (like what I have in the DB) the analysis converts Ã to A and deletes ¨, so I get SolAne. I think that ISOLatin1AccentFilterFactory only takes strings in charset ISO-8859-1. So is there any solution to transform my string to ISO-8859-1 before the indexing process? Maybe by creating a transformer in DataImportHandler? (I have never coded in Java :( ) Thank you all.

Koji Sekiguchi-2 wrote:
aerox7 wrote:
Hi, I have a MySQL database in UTF-8. I have a row with SolÃ¨ne (solène). I want to transform this to solene, so I use Solr's ISOLatin1AccentFilterFactory to perform this task, but it doesn't work. I guess that SolÃ¨ne is solène in UTF-8? I also set Tomcat to UTF-8, so normally ISOLatin1AccentFilterFactory should replace the accent ... any ideas? I use DataImportHandler.

If a mapping rule è to e is always true in your field, you can try to use MappingCharFilter instead of ISOLatin1AccentFilter. Add the following line to mapping-ISOLatin1Accent.txt:

"è" => "e"

and add the following fieldType:

<fieldType name="textCharNorm" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
    <tokenizer class="solr.CharStreamAwareWhitespaceTokenizerFactory"/>
  </analyzer>
</fieldType>

MappingCharFilter and mapping-ISOLatin1Accent.txt are in the nightly build.

Koji

--Noble Paul
Re: solrj : probleme with utf-8 content
SOLR-973 seems to have caused the problem.

On Fri, Mar 20, 2009 at 11:01 PM, Ryan McKinley ryan...@gmail.com wrote:
Do you know if your Java file is encoded as UTF-8? Sometimes it is saved in a different encoding, and that can cause funny problems.

On Mar 18, 2009, at 7:46 AM, Walid ABDELKABIR wrote:
When executing this code, the field "includes" in my index ends up with this value: ? ? ? :
---
String content = "eaiou with circumflexes: êâîôû";
SolrInputDocument doc = new SolrInputDocument();
doc.addField( "id", 123, 1.0f );
doc.addField( "includes", content, 1.0f );
server.add( doc );
---
but this code works fine:
---
String addContent = "<add><doc boost=\"1.0\">"
    + "<field name=\"id\">123</field><field name=\"includes\">eaiou with circumflexes:âîôû</field>"
    + "</doc></add>";
DirectXmlRequest up = new DirectXmlRequest( "/update", addContent );
server.request( up );
---
Thanks for the help

--Noble Paul
Re: Page-Rank algorithm
Victor,

Solr knows nothing about hyperlinks, web pages, and such. Solr doesn't even have a web crawler. You should ask on the nutch-u...@lucene... mailing list instead. The answer there will be positive.

Otis
-- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

- Original Message
From: Huang, Zijian(Victor) zijian.hu...@etrade.com
To: solr-user@lucene.apache.org
Sent: Thursday, March 19, 2009 5:55:36 PM
Subject: Page-Rank algorithm

Hi, Do you guys know if there is some version of the page-rank algorithm already implemented in Solr (Lucene)? If not, how hard is it to implement? I am trying to improve the ranking relevance for Solr. Thanks, Vic
Re: solrj : probleme with utf-8 content
Yes, now it works fine with the trunk sources. Thanks!

Noble Paul നോബിള് नोब्ळ् wrote:
SOLR-973 seems to have caused the problem.

On Fri, Mar 20, 2009 at 11:01 PM, Ryan McKinley ryan...@gmail.com wrote:
Do you know if your Java file is encoded as UTF-8? Sometimes it is saved in a different encoding, and that can cause funny problems.

On Mar 18, 2009, at 7:46 AM, Walid ABDELKABIR wrote:
When executing this code, the field "includes" in my index ends up with this value: ? ? ? :
---
String content = "eaiou with circumflexes: êâîôû";
SolrInputDocument doc = new SolrInputDocument();
doc.addField( "id", 123, 1.0f );
doc.addField( "includes", content, 1.0f );
server.add( doc );
---
but this code works fine:
---
String addContent = "<add><doc boost=\"1.0\">"
    + "<field name=\"id\">123</field><field name=\"includes\">eaiou with circumflexes:âîôû</field>"
    + "</doc></add>";
DirectXmlRequest up = new DirectXmlRequest( "/update", addContent );
server.request( up );
---
Thanks for the help

--Noble Paul
Re: stop word search
Yes, you do need to reindex after removing the stopword filter from the configuration. When you indexed the first time using the stopword filter, the words were NOT indexed, so they won't be found now that they're getting through the query analyzer. Best Erick

On Fri, Mar 20, 2009 at 1:02 PM, revas revas...@gmail.com wrote:
Hi Erik, I have now commented out the query-time stopword analyzer and restarted the server. But now when I search for a stop word, I am getting results. We had earlier indexed the content with the stop word analyzer. I don't think we need to reindex after commenting out the query analyzer, right? This field is a text field with the default analyzer. Please let me know if I have missed something here. Regards Sujatha

On 3/17/09, Erick Erickson erickerick...@gmail.com wrote:
Well, by definition, using an analyzer that removes stopwords *should* do this at query time. This assumes that you used an analyzer that removed stopwords at index and query time. The stopwords are not in the index. You can get the behavior you expect by using an analyzer at query time that does NOT remove stopwords, and one at indexing time that *does* remove stopwords. But I'm having a hard time imagining that this would result in a good user experience. I mean, any time that you had a stopword in the query where the stopword was required, no results would be returned. Which would be hard to explain to a user. What is it you're trying to accomplish? Best Erick

On Tue, Mar 17, 2009 at 7:40 AM, revas revas...@gmail.com wrote:
Hi, I have a query like this: content:the AND user_id:5, which means return all docs of user id 5 which have the word "the" in content. Since 'the' is a stop word, this query executes as just user_id:5 in spite of the AND clause, whereas the expected result here is that, since there is no result for "the", no results should be returned. Am I missing anything here? Regards
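The reason a reindex is needed can be seen with a toy inverted index. The sketch below is not Lucene or Solr code; the class, the stopword list, and the whitespace tokenizer are all simplifications invented for illustration. It shows that a term dropped at index time has no postings, so a query for it finds nothing even when the query side no longer filters stopwords.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Toy inverted index illustrating index-time stopword removal. */
final class StopwordIndexDemo {
    // A tiny, made-up stopword list for the demonstration.
    static final Set<String> STOPWORDS = new HashSet<>(Arrays.asList("the", "a", "of"));

    /** Maps each lowercased token to the ids (list positions) of the documents containing it. */
    static Map<String, Set<Integer>> index(List<String> docs, boolean removeStopwords) {
        Map<String, Set<Integer>> postings = new HashMap<>();
        for (int id = 0; id < docs.size(); id++) {
            for (String token : docs.get(id).toLowerCase().split("\\s+")) {
                if (token.isEmpty()) {
                    continue;
                }
                if (removeStopwords && STOPWORDS.contains(token)) {
                    continue; // the stopword never reaches the index
                }
                postings.computeIfAbsent(token, k -> new TreeSet<>()).add(id);
            }
        }
        return postings;
    }

    /** Looks up a single term with no query-time stopword filtering. */
    static Set<Integer> search(Map<String, Set<Integer>> postings, String term) {
        return postings.getOrDefault(term.toLowerCase(), Collections.emptySet());
    }
}
```

With documents indexed under removeStopwords=true, search(postings, "the") is empty no matter how the query is analyzed. Only rebuilding the postings with removeStopwords=false makes the term findable, which is the toy equivalent of reindexing.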
Re: Stemming in Solr
: Can someone please let me know how to implement stemming in solr. I am
: particularly looking for the changes I might need to make in the config files
: and also if I need to use some already supplied libraries/factories etc etc.

i would start by searching the wiki and email archives for stemming...

http://wiki.apache.org/solr/?action=fullsearch&context=180&value=stemming&fullsearch=Text

-Hoss
Re: Special Characters search in solr
: Yes, I did and below is my debugQuery result.

before you even look at the debug section, look at the params section in the responseHeader...

: <str name="q">Colo�</str>

the raw value Solr is getting from your servlet container doesn't match what you think you are sending...

: It is actually converting Coloèr to Colo� and hence not searching. It is

...i'm guessing that either your servlet container is misconfigured for dealing with UTF-8 characters, or your client code is doing something not quite right ... until you get the value you expect to see coming back in that responseHeader, there's no point in fiddling with your schema.

-Hoss
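The mismatch described above can be reproduced without Solr. As a small sketch, assuming the client sends the query as UTF-8 percent-encoded bytes, the snippet below decodes the same raw parameter with two charsets. The class name is made up for illustration; URLDecoder is the standard JDK class. A container that decodes with ISO-8859-1 hands the application two wrong characters in place of the accented one.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

/** Hypothetical demo of how the decoding charset changes a query parameter. */
final class QueryParamDecoding {
    /** Percent-decodes a raw query value using the named charset. */
    static String decode(String raw, String charset) {
        try {
            return URLDecoder.decode(raw, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("unknown charset: " + charset, e);
        }
    }
}
```

For the raw value Colo%C3%A8r, decoding as UTF-8 gives the intended six-character word with an e-grave, while decoding as ISO-8859-1 gives seven characters, with A-tilde and a diaeresis where the e-grave should be. Comparing the value echoed in the responseHeader against both results is a quick way to tell which side is doing the wrong thing.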
Re: Issue with Facet Query
: I am using this query only but I am getting the same results.
:
: facet=true&facet.field=productPrice_product_str_s&fq=productPrice_product_str_s:[1%20TO%20100]
...
: It still is not showing up the other values. Do I need to make any entry in
: schema or solrConfig xml files. Do I need to convert the string into numeric
: values etc etc.
...
: It is only returning results, which are having values started with 2, 3,
: 4 or some other integer instead of only 1. It is not returning records in
: which value is 10 and 100.

your fq param is saying you only want docs matching values between 1 and 100. you seem to be using a string type, so it's not going to match anything starting with a character other than a 1 ... if it doesn't match any docs with values like 23 then the facet counts for 23 are going to be 0 as well.

reading between the lines, i think you misunderstood Shalin about 10 messages ago ... fq is for providing a *filter* query, it restricts the results of your entire query. facet.query is for faceting on an arbitrary query (which can be a range query).

if you search for 'ipod' and you want to get back *all* the documents that match, but you also want to know how many of those have a price between $10 and $100, use a facet.query.

if you search for 'ipod' and you want to get back *only* the documents that have a price between $10 and $100, use an fq.

...but either way: yes, convert to a numeric field type so that your ranges will actually work properly.

-Hoss
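The string-versus-numeric point can be checked with plain Java comparisons. This is only a sketch of the ordering involved, not of how Solr evaluates range queries internally, and the class and method names are invented for illustration. A range over strings compares character by character, so "23" sorts after "100" and falls outside ["1", "100"].

```java
/** Hypothetical demo contrasting lexicographic and numeric range checks. */
final class RangeSemantics {
    /** Inclusive range test using String ordering, as a string-typed field would sort. */
    static boolean inStringRange(String value, String low, String high) {
        return value.compareTo(low) >= 0 && value.compareTo(high) <= 0;
    }

    /** Inclusive range test after parsing the value as an integer. */
    static boolean inNumericRange(String value, int low, int high) {
        int n = Integer.parseInt(value);
        return n >= low && n <= high;
    }
}
```

With these definitions, inStringRange("23", "1", "100") is false while inNumericRange("23", 1, 100) is true, and "10" and "100" pass both checks because they start with the character 1. That is the behavior a numeric field type avoids.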
Re: Problem with UTF-8 and Solr ISOLatin1AccentFilterFactory
Hi, I've checked the MySQL configuration in mysql with SHOW VARIABLES LIKE 'character_set%'; and all the character_set variables are UTF-8. I think that the data importer gets the data as ISO, so I just wrote a custom transformer to change the row's charset from ISO to UTF, and now it works.

Noble Paul: I use the Solr 1.4 nightly build from 2009-03-18. Do I have to download the latest one to apply your patch?

Noble Paul നോബിള് नोब्ळ् wrote:
Maybe there is an issue with the recent changes in SOLR-973. I have attached a new patch to SOLR-973. aerox, is it possible to confirm whether that is the problem?

On Fri, Mar 20, 2009 at 6:52 PM, Grant Ingersoll gsing...@apache.org wrote:
Usually, when I see characters like this, it means you aren't viewing/handling the UTF-8 correctly when bringing it into Java. I would first check that your DB or JDBC driver is getting the chars out right. It may even be the case that they did not go into the DB correctly in the first place.

On Mar 20, 2009, at 4:36 AM, aerox7 wrote:
== where are you seeing it as SolÃ¨ne as opposed to the correct way of solène?
I have SolÃ¨ne in my MySQL database, so I don't know if this is correct or not. I guess that SolÃ¨ne is solène in UTF-8? I've tried the analysis page at http://localhost:8983/solr/admin/analysis.jsp: when I try with solène everything is OK, but when I try with SolÃ¨ne (like what I have in the DB) the analysis converts Ã to A and deletes ¨, so I get SolAne. I think that ISOLatin1AccentFilterFactory only takes strings in charset ISO-8859-1. So is there any solution to transform my string to ISO-8859-1 before the indexing process? Maybe by creating a transformer in DataImportHandler? (I have never coded in Java :( ) Thank you all.

Koji Sekiguchi-2 wrote:
aerox7 wrote:
Hi, I have a MySQL database in UTF-8. I have a row with SolÃ¨ne (solène). I want to transform this to solene, so I use Solr's ISOLatin1AccentFilterFactory to perform this task, but it doesn't work. I guess that SolÃ¨ne is solène in UTF-8? I also set Tomcat to UTF-8, so normally ISOLatin1AccentFilterFactory should replace the accent ... any ideas? I use DataImportHandler.

If a mapping rule è to e is always true in your field, you can try to use MappingCharFilter instead of ISOLatin1AccentFilter. Add the following line to mapping-ISOLatin1Accent.txt:

"è" => "e"

and add the following fieldType:

<fieldType name="textCharNorm" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <charFilter class="solr.MappingCharFilterFactory" mapping="mapping-ISOLatin1Accent.txt"/>
    <tokenizer class="solr.CharStreamAwareWhitespaceTokenizerFactory"/>
  </analyzer>
</fieldType>

MappingCharFilter and mapping-ISOLatin1Accent.txt are in the nightly build.

Koji

--Noble Paul
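The custom transformer mentioned at the top of this message was not posted, so the following is only a guess at its core step, assuming the strings arrive as UTF-8 bytes that were wrongly decoded as ISO-8859-1. The class and method names are hypothetical, and the wiring into a DataImportHandler transformer is left out so the sketch stays free of Solr dependencies.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical core of a charset-repair step for mis-decoded column values. */
final class MojibakeRepair {
    /**
     * Recovers text whose UTF-8 bytes were decoded as ISO-8859-1:
     * re-encode as ISO-8859-1 to get the original bytes back, then decode as UTF-8.
     */
    static String latin1ToUtf8(String garbled) {
        byte[] raw = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }
}
```

Applied to the seven-character value with A-tilde and diaeresis, this returns the six-character name with an e-grave. Applying it to text that was already correct would damage it, so a step like this belongs only on columns known to be affected. Fixing the connection charset at the JDBC level, as suggested earlier in the thread, avoids the repair altogether.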