Re: Geo spatial clustering of points
Hi Chris, Jeroen, tonight I posted some tips on this subject on Solr's wiki: http://wiki.apache.org/solr/SpatialClustering ~ David

Chris Atkinson wrote: Did you get any resolution for this? I'm about to implement something identical.

On 3 Jul 2013 23:03, Jeroen Steggink jeroen@ wrote: Hi, I'm looking for a way to cluster (or should I call it group?) geospatial points on a map based on the current zoom level, and to get the median coordinate for each cluster. Let's say I'm at the world level and I want to cluster spatial points within a 1000 km radius. When I zoom in, I only want the clustered points for that boundary: say, all the points within the US, clustered within a 500 km radius. I'm using Solr 4.3.0 and looked into SpatialRecursivePrefixTreeFieldType with faceting. However, I'm not sure whether the geohashes are of any use for clustering points. Does anyone have experience with geospatial clustering in Solr? Regards, Jeroen - Author: http://www.packtpub.com/apache-solr-3-enterprise-search-server/book

-- View this message in context: http://lucene.472066.n3.nabble.com/Geo-spatial-clustering-of-points-tp4075315p4086243.html Sent from the Solr - User mailing list archive at Nabble.com.
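The zoom-dependent clustering Jeroen describes can be prototyped client-side. Below is a minimal sketch (not Solr code; the function name and the grid-cell approach are my own illustration) that buckets (lat, lon) points into square grid cells whose size shrinks as you zoom in, then returns the median coordinate per non-empty cell. This is roughly what faceting on geohash prefixes at different precisions gives you.

```python
from collections import defaultdict
from statistics import median

def cluster_points(points, cell_deg):
    """Group (lat, lon) points into square grid cells of cell_deg degrees
    and return the median coordinate of each non-empty cell."""
    buckets = defaultdict(list)
    for lat, lon in points:
        # coarser cell_deg (e.g. 5.0 at world zoom) -> fewer, bigger clusters
        key = (int(lat // cell_deg), int(lon // cell_deg))
        buckets[key].append((lat, lon))
    return [
        (median(p[0] for p in pts), median(p[1] for p in pts))
        for pts in buckets.values()
    ]

points = [(52.37, 4.89), (52.38, 4.90), (40.71, -74.01)]
clusters = cluster_points(points, cell_deg=5.0)  # two Amsterdam-ish points merge
print(sorted(clusters))
```

Zooming in corresponds to calling the function again with a smaller `cell_deg` over only the points inside the visible bounding box.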
how to integrate solr with HDFS HA
Hi all, I am trying to integrate Solr with HDFS HA. When I start the Solr server it throws the exception [1]. I know this is because the hadoop.conf.Configuration in HdfsDirectoryFactory.java does not include the HA configuration. So I want to know: in Solr, is there any way to include my Hadoop HA configuration?

[1]
Caused by: java.lang.IllegalArgumentException: java.net.UnknownHostException: lklcluster
    at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:418)
    at org.apache.hadoop.hdfs.NameNodeProxies.createNonHAProxy(NameNodeProxies.java:164)
    at org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:129)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:415)
    at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:382)
    at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:123)
    at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2277)
    at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:86)
    at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2311)
    at org.apache.hadoop.fs.FileSystem$Cache.getUnique(FileSystem.java:2299)
    at org.apache.hadoop.fs.FileSystem.newInstance(FileSystem.java:364)
    at org.apache.solr.store.hdfs.HdfsDirectory.<init>(HdfsDirectory.java:59)
    at org.apache.solr.core.HdfsDirectoryFactory.create(HdfsDirectoryFactory.java:154)
    at org.apache.solr.core.CachingDirectoryFactory.get(CachingDirectoryFactory.java:350)
    at org.apache.solr.core.SolrCore.getNewIndexDir(SolrCore.java:256)
    at org.apache.solr.core.SolrCore.initIndex(SolrCore.java:469)
    at org.apache.solr.core.SolrCore.<init>(SolrCore.java:759)
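For what it's worth, one approach that may help here: 4.x builds of HdfsDirectoryFactory accept a solr.hdfs.confdir parameter pointing at a directory containing your Hadoop client config (core-site.xml, hdfs-site.xml), which is where the HA nameservice (lklcluster) would be defined. A sketch for solrconfig.xml; the paths are examples, and the parameter name should be verified against your Solr version:

```xml
<directoryFactory name="DirectoryFactory" class="solr.HdfsDirectoryFactory">
  <str name="solr.hdfs.home">hdfs://lklcluster/solr</str>
  <!-- Directory holding core-site.xml / hdfs-site.xml with the HA settings
       (dfs.nameservices, dfs.ha.namenodes.*, failover proxy provider) -->
  <str name="solr.hdfs.confdir">/etc/hadoop/conf</str>
</directoryFactory>
```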
Reloading synonyms and stop words
Hello, is it possible to reload the synonyms and stopwords files without restarting Solr? Regards, Bruno Santos -- Bruno René Santos Lisboa - Portugal
Re: Reloading synonyms and stop words
Yes, you can use the Core RELOAD command: https://cwiki.apache.org/confluence/display/solr/CoreAdminHandler+Parameters+and+Usage#CoreAdminHandlerParametersandUsage-%7B%7BRELOAD%7D%7D On Fri, Aug 23, 2013 at 1:51 PM, Bruno René Santos brunor...@gmail.com wrote: Hello, Is it possible to reload the synonyms and stopwords files without rebooting solr? Regards Bruno Santos -- Bruno René Santos Lisboa - Portugal -- Regards, Shalin Shekhar Mangar.
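For scripting this, the RELOAD action is just an HTTP GET against the CoreAdmin handler. A small sketch (the helper function is my own; host, port and core name are placeholders) that builds the URL you would then fetch with curl or urllib:

```python
from urllib.parse import urlencode

def core_reload_url(base, core):
    """Build the CoreAdmin RELOAD URL; issuing an HTTP GET against it
    reloads the core (and re-reads synonyms/stopwords) in place."""
    return "%s/admin/cores?%s" % (
        base.rstrip("/"), urlencode({"action": "RELOAD", "core": core}))

url = core_reload_url("http://localhost:8983/solr", "collection1")
print(url)
```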
Re: Flushing cache without restarting everything?
On Thu, 2013-08-22 at 20:08 +0200, Walter Underwood wrote: We warm the file buffers before starting Solr to avoid spending time waiting for disk IO. The script is something like this:

for core in core1 core2 core3
do
  find /apps/solr/data/${core}/index -type f | xargs cat > /dev/null
done

On that subject, I will note that the reason for not just doing a simple cp of the index files to /dev/null is that the shell (sometimes? always?) is clever enough to skip the copying when it is done directly to /dev/null.

It makes a big difference in the first few minutes of service. Of course, it helps if you have enough RAM to hold the entire index. That is actually essential if you want to perform a reproducible test. If there is not enough free RAM for the full set of index files, the first files from find will not be cached at all, and if the index has changed since the last test, that makes it pretty arbitrary which parts are cached and which are not.

-- Ceterum censeo spinning drives esse delendam
Re: when does RAMBufferSize work when commit.
Hi Shawn, thanks a lot. I got it. Regards

2013/8/22 Shawn Heisey s...@elyograg.org: On 8/22/2013 2:25 AM, YouPeng Yang wrote: Hi all, about RAMBufferSize and commit I have read the doc http://comments.gmane.org/gmane.comp.jakarta.lucene.solr.user/60544 but I cannot figure out how they work together. Given the settings:

<ramBufferSizeMB>10</ramBufferSizeMB>
<autoCommit>
  <maxTime>${solr.autoCommit.maxDocs:1000}</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit>

If the indexed doc count reaches 1000 and the size of those docs is below 10 MB, it will trigger a commit. If the size of the indexed docs reaches 10 MB while the count is below 1000, it will not trigger a commit; the indexed docs will just be flushed to disk, and it will only commit when the count reaches 1000?

Your actual config seems to have its wires crossed a little bit. You have the autoCommit.maxDocs value being used in a maxTime tag, not a maxDocs tag. You may want to adjust the variable name or the tag. If that were a maxDocs tag instead of maxTime, your description would be pretty much right on the money. The space taken in the RAM buffer is typically larger than the actual document size, but the general idea is sound. The default for ramBufferSizeMB in recent Solr versions is 100. Unless you've got super small documents, or you are in a limited memory situation and have a lot of cores, I would not go smaller than that. Thanks, Shawn
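Putting Shawn's correction into config form, this is a sketch of what the original poster probably intended (the property name and the value 1000 are carried over from the thread; the RAM buffer is left at the 100 MB default he recommends):

```xml
<ramBufferSizeMB>100</ramBufferSizeMB>
<autoCommit>
  <!-- commit every N documents; the property was previously (and
       confusingly) wired into a maxTime tag instead of maxDocs -->
  <maxDocs>${solr.autoCommit.maxDocs:1000}</maxDocs>
  <openSearcher>false</openSearcher>
</autoCommit>
```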
Leader election
Hi, I am using Solr 4.4 for my search application. I was indexing some 1 million docs. At that time, I accidentally killed the leader node of that collection. Indexing failed with the exception:

org.apache.solr.client.solrj.SolrServerException: No live SolrServers available to handle this request:[http://localhost:9133/solr/collection7_shard1_replica3, http://localhost:8983/solr/collection7_shard1_replica1]
    at org.apache.solr.client.solrj.impl.LBHttpSolrServer.request(LBHttpSolrServer.java:333)
    at org.apache.solr.client.solrj.impl.CloudSolrServer.request(CloudSolrServer.java:318)
    at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:117)
    at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:116)
    at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:102)

After that I checked the Solr admin page; leader election didn't get triggered for that collection. http://lucene.472066.n3.nabble.com/file/n4086259/Screenshot.png I couldn't index to that collection, but I can still search it. Help me with this issue. Thanks in advance, Srivatsan

-- View this message in context: http://lucene.472066.n3.nabble.com/Leader-election-tp4086259.html
Re: Reloading synonyms and stop words
Great! What about doing it from inside a RequestHandler's Java source code? I want to create a RequestHandler that receives new synonyms, inserts them into the synonyms file and reloads the core. Regards, Bruno

On Fri, Aug 23, 2013 at 9:28 AM, Shalin Shekhar Mangar shalinman...@gmail.com wrote: Yes, you can use the Core RELOAD command: https://cwiki.apache.org/confluence/display/solr/CoreAdminHandler+Parameters+and+Usage#CoreAdminHandlerParametersandUsage-%7B%7BRELOAD%7D%7D

On Fri, Aug 23, 2013 at 1:51 PM, Bruno René Santos brunor...@gmail.com wrote: Hello, is it possible to reload the synonyms and stopwords files without restarting Solr? Regards, Bruno Santos -- Bruno René Santos Lisboa - Portugal

-- Regards, Shalin Shekhar Mangar. -- Bruno René Santos Lisboa - Portugal
Re: Solr Ref guide question
The version is 4.4. I did the download, unzipped it and ran the command in the example folder: java -jar start.jar. It is a fresh install, no modifications done. -- Yago Riveiro Sent with Sparrow (http://www.sparrowmailapp.com/?sig)

On Thursday, August 22, 2013 at 9:22 PM, Brendan Grainger wrote: What version of solr are you using? Have you copied a solr.xml from somewhere else? I can almost reproduce the error you're getting if I put a non-existent core in my solr.xml, e.g.:

<solr>
  <cores adminPath="/admin/cores">
    <core name="core0" instanceDir="a_non_existent_core"/>
  </cores>
  ...

On Thu, Aug 22, 2013 at 1:30 PM, yriveiro yago.rive...@gmail.com wrote: Hi all, I think there is some lack in Solr's ref doc. The section "Running Solr" says to run Solr using the command: $ java -jar start.jar. But if I do this with a fresh install, I get a stack trace like this: http://pastebin.com/5YRRccTx Is this behavior expected? - Best regards

-- View this message in context: http://lucene.472066.n3.nabble.com/Solr-Ref-guide-question-tp4086142.html -- Brendan Grainger www.kuripai.com
Indexing status when one tomcat goes down
Hi all, I'm running SolrCloud with Solr 4.4. I have 2 Tomcat instances with 4 shards (2 in each). What will happen if one of the Tomcats goes down during indexing? The other Tomcat reports "Leader not active" in the logs. Regards, Prasi
Re: Solr Ref guide question
I found the problem. The Java version had been overridden by a dependency and was 1.5. After reinstalling Java, Solr works as expected. Tal vez si fuese lanzado un error diciendo que la versión no era compatible ayudaba en estos casos. [Translation: maybe throwing an error saying that the version is not compatible would help in these cases.] -- Yago Riveiro Sent with Sparrow (http://www.sparrowmailapp.com/?sig)

On Friday, August 23, 2013 at 10:30 AM, Yago Riveiro wrote: The version is 4.4. I did the download, unzipped it and ran the command in the example folder: java -jar start.jar. It is a fresh install, no modifications done.

On Thursday, August 22, 2013 at 9:22 PM, Brendan Grainger wrote: What version of solr are you using? Have you copied a solr.xml from somewhere else? ...
Re: Solr Ref guide question
Oops, sorry for the last email, wrong language :P What I wanted to say was: maybe if an error were thrown saying that the version was not compatible, it would help in these cases. -- Yago Riveiro Sent with Sparrow (http://www.sparrowmailapp.com/?sig)

On Friday, August 23, 2013 at 10:53 AM, Yago Riveiro wrote: I found the problem. The Java version had been overridden by a dependency and was 1.5. After reinstalling Java, Solr works as expected. ...
Re: Reloading synonyms and stop words
I don't think that should be a problem. Your custom RequestHandler must call reload. Note that a new instance of your request handler will be created and inform will be called on it once reload happens i.e. you won't be able to keep any state in the request handler across core reloads. You can also do this at a level above RequestHandler i.e. via a custom CoreAdminHandler. See CoreAdminHandler.handleCustomAction() On Fri, Aug 23, 2013 at 2:57 PM, Bruno René Santos brunor...@gmail.com wrote: Great! What about inside a RequestHandler source code in Java? I want to create a requestHandler that receives new synonyms, insert them on the synonyms file and reload the core. Regards Bruno On Fri, Aug 23, 2013 at 9:28 AM, Shalin Shekhar Mangar shalinman...@gmail.com wrote: Yes, you can use the Core RELOAD command: https://cwiki.apache.org/confluence/display/solr/CoreAdminHandler+Parameters+and+Usage#CoreAdminHandlerParametersandUsage-%7B%7BRELOAD%7D%7D On Fri, Aug 23, 2013 at 1:51 PM, Bruno René Santos brunor...@gmail.com wrote: Hello, Is it possible to reload the synonyms and stopwords files without rebooting solr? Regards Bruno Santos -- Bruno René Santos Lisboa - Portugal -- Regards, Shalin Shekhar Mangar. -- Bruno René Santos Lisboa - Portugal -- Regards, Shalin Shekhar Mangar.
Re: Leader election
How long has the shard been without a leader? How many shards? How many replicas per shard? Which version of Solr? On Fri, Aug 23, 2013 at 2:51 PM, Srivatsan ranjith.venkate...@gmail.com wrote: Hi, I am using solr 4.4 for my search application. I was indexing some 1 million docs. At that time, i accidentally killed leader node of that collection. Indexing failed with the exception , /org.apache.solr.client.solrj.SolrServerException: No live SolrServers available to handle this request:[http://localhost:9133/solr/collection7_shard1_replica3, http://localhost:8983/solr/collection7_shard1_replica1] at org.apache.solr.client.solrj.impl.LBHttpSolrServer.request(LBHttpSolrServer.java:333) at org.apache.solr.client.solrj.impl.CloudSolrServer.request(CloudSolrServer.java:318) at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:117) at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:116) at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:102)/ after that i checked solr admin page, leader election didnt get triggered for that collection. http://lucene.472066.n3.nabble.com/file/n4086259/Screenshot.png I couldnt able to index for that collection but i can able to search from that collection. Help me in this issue Thanks in advance Srivatsan -- View this message in context: http://lucene.472066.n3.nabble.com/Leader-election-tp4086259.html Sent from the Solr - User mailing list archive at Nabble.com. -- Regards, Shalin Shekhar Mangar.
Re: Leader election
Almost 15 minutes. After that I restarted the entire cluster. I am using Solr 4.4 with 1 shard and 3 replicas.

-- View this message in context: http://lucene.472066.n3.nabble.com/Leader-election-tp4086259p4086287.html
Re: Leader election
Any exceptions in the logs of the other replicas? The default leaderVoteWait time is 3 minutes, after which a leader election should have been initiated automatically.

On Fri, Aug 23, 2013 at 4:01 PM, Srivatsan ranjith.venkate...@gmail.com wrote: Almost 15 minutes. After that I restarted the entire cluster. I am using Solr 4.4 with 1 shard and 3 replicas.

-- Regards, Shalin Shekhar Mangar.
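For reference, in the legacy 4.x solr.xml format leaderVoteWait is configured as an attribute on the cores element, in milliseconds. A sketch (the attribute placement is my understanding of the 4.x format; core names and other values are examples, so verify against your Solr version's docs):

```xml
<solr persistent="false">
  <!-- wait up to 3 minutes (the default) for known replicas to come up
       before a leader is elected -->
  <cores adminPath="/admin/cores" hostPort="${jetty.port:8983}"
         leaderVoteWait="180000">
    <core name="collection7_shard1_replica1"
          instanceDir="collection7_shard1_replica1"/>
  </cores>
</solr>
```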
Re: Leader election
No exceptions. And the leaderVoteWait value is used only during startup, right? A new leader should be elected once the leader node goes down. Am I right?

-- View this message in context: http://lucene.472066.n3.nabble.com/Leader-election-tp4086259p4086290.html
Caused by: java.net.SocketException: Connection reset by peer: socket write error solr querying
Hi, I am working with Solr 4.4 on Jetty, and have built an index of 3,350,128 records. Now I want to test query performance, so I ran a load test of 5 minutes with 600 virtual users issuing different Solr queries. After the test completed I got the errors below.

ERROR - 2013-08-23 09:49:43.867; org.apache.solr.common.SolrException; null:org.eclipse.jetty.io.EofException
    at org.eclipse.jetty.http.HttpGenerator.flushBuffer(HttpGenerator.java:914)
    at org.eclipse.jetty.http.AbstractGenerator.blockForOutput(AbstractGenerator.java:507)
    at org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:170)
    at org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:107)
    at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:202)
    at sun.nio.cs.StreamEncoder.implWrite(StreamEncoder.java:263)
    at sun.nio.cs.StreamEncoder.write(StreamEncoder.java:106)
    at java.io.OutputStreamWriter.write(OutputStreamWriter.java:190)
    at org.apache.solr.util.FastWriter.flush(FastWriter.java:141)
    at org.apache.solr.util.FastWriter.write(FastWriter.java:126)
    at java.io.Writer.write(Writer.java:140)
    at org.apache.solr.response.XMLWriter.startTag(XMLWriter.java:144)
    at org.apache.solr.response.XMLWriter.writePrim(XMLWriter.java:347)
    at org.apache.solr.response.XMLWriter.writeStr(XMLWriter.java:295)
    at org.apache.solr.schema.StrField.write(StrField.java:67)
    at org.apache.solr.response.TextResponseWriter.writeVal(TextResponseWriter.java:130)
    at org.apache.solr.response.XMLWriter.writeArray(XMLWriter.java:273)
    at org.apache.solr.response.TextResponseWriter.writeVal(TextResponseWriter.java:190)
    at org.apache.solr.response.XMLWriter.writeSolrDocument(XMLWriter.java:199)
    at org.apache.solr.response.TextResponseWriter.writeDocuments(TextResponseWriter.java:275)
    at org.apache.solr.response.TextResponseWriter.writeVal(TextResponseWriter.java:172)
    at org.apache.solr.response.XMLWriter.writeResponse(XMLWriter.java:111)
    at org.apache.solr.response.XMLResponseWriter.write(XMLResponseWriter.java:39)
    at org.apache.solr.servlet.SolrDispatchFilter.writeResponse(SolrDispatchFilter.java:647)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:375)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:158)
    at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
    at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
    at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557)
    at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)
    at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075)
    at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
    at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
    at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
    at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
    at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255)
    at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154)
    at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
    at org.eclipse.jetty.server.Server.handle(Server.java:368)
    at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
    at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53)
    at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:942)
    at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1004)
    at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:640)
    at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235)
    at org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72)
    at org.eclipse.jetty.server.bio.SocketConnector$ConnectorEndPoint.run(SocketConnector.java:264)
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
    at java.lang.Thread.run(Thread.java:662)
Caused by: java.net.SocketException: Connection reset by peer: socket write error
    at java.net.SocketOutputStream.socketWrite0(Native Method)
    at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
    at
Problem with importing tab-delimited csv file
I'm having trouble importing a tab-delimited file with the CSV update handler. My data file looks like this (tab-separated):

"id"	"question"	"answer"	"url"
"q99"	"Who?"	"You!"	"none"

When I send this data to Solr using curl:

curl 'http://localhost:8181/solr/development/update/csv?commit=true&separator=%09' --data @sample.tmp

all seems to be well:

<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">0</int><int name="QTime">221</int></lst>
</response>

But when I query the development core, there is no data. I must be overlooking something trivial. I would appreciate it if anyone could spot what! - Rob
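One thing worth checking when debugging an import like this: the CSV handler's separator and encapsulator parameters must be URL-encoded in the request, and with quoted field values it can help to pass the encapsulator explicitly. A small sketch (the helper name and host are illustrative, not Solr API) that builds such an update URL:

```python
from urllib.parse import quote

def csv_update_url(base, core, separator="\t", encapsulator='"'):
    """Build a /update/csv URL with commit, separator and encapsulator
    passed as URL-encoded query parameters."""
    return "%s/%s/update/csv?commit=true&separator=%s&encapsulator=%s" % (
        base, core, quote(separator, safe=""), quote(encapsulator, safe=""))

url = csv_update_url("http://localhost:8181/solr", "development")
print(url)  # tab becomes %09, the double quote becomes %22
```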
How to patch Solr4.2 for SolrEnityProcessor Sub-Enity issue
According to http://stackoverflow.com/questions/15734308/solrentityprocessor-is-called-only-once-for-sub-entities?lq=1 we can use the patched SolrEntityProcessor in https://issues.apache.org/jira/browse/SOLR-3336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel to solve the sub-entity problem. I tried renaming the jar file to .zip and then replacing the patched file, but as I only got a .java file I can't replace a .class file with it, so I dropped this idea.

Here is what I tried then. I decompiled the original solr-dataimporthandler-4.2.0.jar from the Solr 4.2 package, replaced the patched file, and tried to compile the files to make the jar again. But I started getting compilation errors:

.\org\apache\solr\handler\dataimport\XPathEntityProcessor.java:397: ')' expected
/* 432 */ if (XPathEntityProcessor.2.this.val$isEnd.get()) {
.\org\apache\solr\handler\dataimport\XPathEntityProcessor.java:397: <identifier> expected
/* 432 */ if (XPathEntityProcessor.2.this.val$isEnd.get()) {
.\org\apache\solr\handler\dataimport\XPathEntityProcessor.java:397: not a statement
/* 432 */ if (XPathEntityProcessor.2.this.val$isEnd.get()) {
.\org\apache\solr\handler\dataimport\XPathEntityProcessor.java:397: illegal start of expression
/* 432 */ if (XPathEntityProcessor.2.this.val$isEnd.get()) {
.\org\apache\solr\handler\dataimport\XPathEntityProcessor.java:397: ';' expected
/* 432 */ if (XPathEntityProcessor.2.this.val$isEnd.get()) {
.\org\apache\solr\handler\dataimport\XPathEntityProcessor.java:397: ';' expected
/* 432 */ if (XPathEntityProcessor.2.this.val$isEnd.get()) {
.\org\apache\solr\handler\dataimport\XPathEntityProcessor.java:398: not a statement
/* 433 */ XPathEntityProcessor.2.this.val$throwExp.set(false);
.\org\apache\solr\handler\dataimport\XPathEntityProcessor.java:398: ';' expected
/* 433 */ XPathEntityProcessor.2.this.val$throwExp.set(false);
.\org\apache\solr\handler\dataimport\XPathEntityProcessor.java:406: not a statement
/* 442 */ XPathEntityProcessor.2.this.val$isEnd.set(true);
.\org\apache\solr\handler\dataimport\XPathEntityProcessor.java:406: ';' expected
/* 442 */ XPathEntityProcessor.2.this.val$isEnd.set(true);
.\org\apache\solr\handler\dataimport\XPathEntityProcessor.java:409: not a statement
/* 445 */ XPathEntityProcessor.2.this.offer(row);
.\org\apache\solr\handler\dataimport\XPathEntityProcessor.java:409: ';' expected
/* 445 */ XPathEntityProcessor.2.this.offer(row);
12 errors

Any idea how to patch Solr 4.2 for this issue? I thought the fix must be in Solr 4.4, but it is not. Any help on this?

-- View this message in context: http://lucene.472066.n3.nabble.com/How-to-patch-Solr4-2-for-SolrEnityProcessor-Sub-Enity-issue-tp4086292.html
Re: SOLR Prevent solr of modifying fields when update doc
Well, not much in the way of help, because you can't do what you want AFAIK. I don't think UUID is suitable for your use-case. Why not use your uniqueId? Or generate something yourself... Best, Erick

On Thu, Aug 22, 2013 at 5:56 PM, Luís Portela Afonso meligalet...@gmail.com wrote: Hi, how can I prevent Solr from updating some fields when updating a doc? The problem is, I have a UUID in a field named uuid, but it is not a unique key. When an RSS source updates a feed, Solr will update the doc with the same link but it generates a new uuid. This is not desired, because I use this id to relate feeds with a user. Can someone help me? Many thanks
Re: Storing query results
No, there's nothing like that in Solr. The closest you could come would be to not do a hard commit (openSearcher=true) or a soft commit for a very long time. As long as neither of these things happens, the search results won't change. But that's a hackish solution.

In fact I question your basic assumption. You say you don't want the search results to change. But 1) the user probably wouldn't notice, and 2) this can mislead in completely different ways. What if most of the search results were deleted after the first query? What if the _exact_ document she was looking for got indexed after the first query? This is one of those features that at first blush sounds somewhat reasonable, but I don't think it stands up under inspection. It'd be some amount of work for, IMO, dubious utility.

If you _must_ do something like this, the app layer could do something like request rows=1000&fl=id and essentially re-implement the queryResultCache at the app layer. Subsequent pages would cause you to issue queries like id:(1 OR 54 OR 90 ...). Best, Erick

On Thu, Aug 22, 2013 at 6:00 PM, Ahmet Arslan iori...@yahoo.com wrote: Hi jfeist, your mail reminds me of this blog post; not sure about Solr though. http://blog.mikemccandless.com/2011/11/searcherlifetimemanager-prevents-broken.html

From: jfeist jfe...@llminc.com To: solr-user@lucene.apache.org Sent: Friday, August 23, 2013 12:09 AM Subject: Storing query results

I am in the process of setting up a search application that allows the user to view paginated query results. The documents are highly dynamic, but I want the search results to be static, i.e. I don't want the user to click the next-page button, the query to rerun, and now he has a different set of search results because the data changed while he was looking through it. I want the results stored somewhere else and the successive page queries to draw from that. I know Solr has query result caching, but I want to store the results entirely. Does Solr provide any functionality like this? I imagine it doesn't, because then you'd need to specify how long to store them, etc. I'm using Solr 4.4.0. I found someone asking something similar here http://lucene.472066.n3.nabble.com/storing-results-td476351.html but that was 6 years ago.

-- View this message in context: http://lucene.472066.n3.nabble.com/Storing-query-results-tp4086182.html
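Erick's app-layer idea can be sketched as: run the query once with rows=1000&fl=id, cache the returned ids as a snapshot, then serve each page from that snapshot by filtering on its ids. A minimal illustration (the function and sample ids are mine, not Solr API):

```python
def page_filter(cached_ids, page, rows):
    """Return the ids for one page of a cached result snapshot, plus a
    Solr filter query that pins the page to that snapshot."""
    ids = cached_ids[page * rows:(page + 1) * rows]
    fq = ("id:(%s)" % " OR ".join(ids)) if ids else None
    return ids, fq

# ids as returned by the initial rows=1000&fl=id query (example values)
snapshot = ["7", "42", "9", "13", "5"]
ids, fq = page_filter(snapshot, page=1, rows=2)
print(ids, fq)  # ['9', '13'] id:(9 OR 13)
```

Later page requests then query Solr with this fq (or fetch the documents by id), so deletions and new documents indexed after the first query cannot reshuffle the user's pages.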
Re: SOLR Prevent solr of modifying fields when update doc
Hi, thanks for the answer. The uniqueId is generated by me, but when Solr indexes and there is an update to a doc, it deletes the doc and creates a new one, so it generates a new UUID. That is not suitable for me, because the UUID is the key I use to map a doc to a user in my database. Right now I'm using information that comes from the source and never changes as my uniqueId, for example the guid that exists in some RSS feeds, or the link if there is no guid. I think there is no simple solution for me, because from what I have read, when an update to a doc happens, Solr deletes the old one and creates a new one, right?

On Aug 23, 2013, at 12:07 PM, Erick Erickson erickerick...@gmail.com wrote: Well, not much in the way of help, because you can't do what you want AFAIK. I don't think UUID is suitable for your use-case. Why not use your uniqueId? Or generate something yourself... Best, Erick

On Thu, Aug 22, 2013 at 5:56 PM, Luís Portela Afonso meligalet...@gmail.com wrote: Hi, how can I prevent Solr from updating some fields when updating a doc? The problem is, I have a UUID in a field named uuid, but it is not a unique key. When an RSS source updates a feed, Solr will update the doc with the same link but it generates a new uuid. This is not desired, because I use this id to relate feeds with a user. Can someone help me? Many thanks
Re: Measuring SOLR performance
Hi Roman, with adminPath="/admin" or adminPath="/admin/cores", no. Interestingly enough, though, I can access http://localhost:8983/solr/statements/admin/system. But I can access http://localhost:8983/solr/admin/cores only with adminPath="/admin/cores" (which suggests that this is the right value to use for cores), and not with adminPath="/admin". Bottom line, this core configuration is not self-evident. Dmitry

On Fri, Aug 23, 2013 at 4:18 AM, Roman Chyla roman.ch...@gmail.com wrote: Hi Dmitry, so it seems solrjmeter should not assume the adminPath; perhaps it needs to be passed as an argument. When you set the adminPath, are you able to access localhost:8983/solr/statements/admin/cores? roman

On Wed, Aug 21, 2013 at 7:36 AM, Dmitry Kan solrexp...@gmail.com wrote: Hi Roman, I have noticed a difference with different solr.xml config contents. It is probably legit, but thought to let you know (tests run on a fresh checkout as of today). As mentioned before, I have two cores configured in solr.xml. If the file is:

[code]
<solr persistent="false">
  <!-- adminPath: RequestHandler path to manage cores.
       If 'null' (or absent), cores will not be manageable via request handler -->
  <cores adminPath="/admin/cores" host="${host:}" hostPort="${jetty.port:8983}"
         hostContext="${hostContext:solr}">
    <core name="metadata" instanceDir="metadata"/>
    <core name="statements" instanceDir="statements"/>
  </cores>
</solr>
[/code]

then the instruction:

python solrjmeter.py -a -x ./jmx/SolrQueryTest.jmx -q ./queries/demo/demo.queries -s localhost -p 8983 -a --durationInSecs 60 -R cms -t /solr/statements -e statements -U 100

works just fine.
If however the solr.xml has adminPath set to /admin, solrjmeter produces an error:

[error]
**ERROR**
  File solrjmeter.py, line 1386, in <module>
    main(sys.argv)
  File solrjmeter.py, line 1278, in main
    check_prerequisities(options)
  File solrjmeter.py, line 375, in check_prerequisities
    error('Cannot find admin pages: %s, please report a bug' % apath)
  File solrjmeter.py, line 66, in error
    traceback.print_stack()
Cannot find admin pages: http://localhost:8983/solr/admin, please report a bug
[/error]

With both solr.xml configs the following url returns just fine: http://localhost:8983/solr/statements/admin/system?wt=json Regards, Dmitry

On Wed, Aug 14, 2013 at 2:03 PM, Dmitry Kan solrexp...@gmail.com wrote: Hi Roman, this looks much better, thanks! The ordinary non-comparison mode works. I'll post here if there are other findings. Thanks for the quick turnarounds, Dmitry

On Wed, Aug 14, 2013 at 1:32 AM, Roman Chyla roman.ch...@gmail.com wrote: Hi Dmitry, oh yes, late night fixes... :) The latest commit should make it work for you. Thanks! roman

On Tue, Aug 13, 2013 at 3:37 AM, Dmitry Kan solrexp...@gmail.com wrote: Hi Roman, something bad happened in a fresh checkout:

python solrjmeter.py -a -x ./jmx/SolrQueryTest.jmx -q ./queries/demo/demo.queries -s localhost -p 8983 -a --durationInSecs 60 -R cms -t /solr/statements -e statements -U 100

Traceback (most recent call last):
  File solrjmeter.py, line 1392, in <module>
    main(sys.argv)
  File solrjmeter.py, line 1347, in main
    save_into_file('before-test.json', simplejson.dumps(before_test))
  File /usr/lib/python2.7/dist-packages/simplejson/__init__.py, line 286, in dumps
    return _default_encoder.encode(obj)
  File /usr/lib/python2.7/dist-packages/simplejson/encoder.py, line 226, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File /usr/lib/python2.7/dist-packages/simplejson/encoder.py, line 296, in iterencode
    return _iterencode(o, 0)
  File /usr/lib/python2.7/dist-packages/simplejson/encoder.py, line 202, in default
    raise TypeError(repr(o) + " is not JSON serializable")
TypeError: <__main__.ForgivingValue object at 0x7fc6d4040fd0> is not JSON serializable

Regards, D.

On Tue, Aug 13, 2013 at 8:10 AM, Roman Chyla roman.ch...@gmail.com wrote: Hi Dmitry,

On Mon, Aug 12, 2013 at 9:36 AM, Dmitry Kan solrexp...@gmail.com wrote: Hi Roman, good point. I managed to run the command with -C and double quotes: python solrjmeter.py -a -C "g1,cms" -c hour -x ./jmx/SolrQueryTest.jmx As a result I got several files (html, css, js, csv) in the running directory (any way to specify where the output should be stored in this case?)

i know it is confusing, i plan to change it - but later, now it is too busy here...

When I look onto the comparison dashboard, I see this: http://pbrd.co/17IRI0b two
Re: Flushing cache without restarting everything?
by monitoring the original and changed systems over long enough periods, where long enough is a parameter (to compute). Or then going really low-level, if you know which component has been changed (like they do in Lucene [1]; not always possible in Solr..) [1] http://people.apache.org/~mikemccand/lucenebench/ On Thu, Aug 22, 2013 at 3:58 PM, Jean-Sebastien Vachon jean-sebastien.vac...@wantedanalytics.com wrote: How can you validate that the changes you just made had any impact on the performance of the cloud if you don't have the same starting conditions? What we do basically is running a batch of requests to warm up the index and then launch the benchmark itself. That way we can measure the impact of our change(s). Otherwise there is absolutely no way we can be sure who is responsible for the gain or loss of performance. Restarting a cloud is actually a real pain, I just want to know if there is a faster way to proceed. -Original Message- From: Dmitry Kan [mailto:solrexp...@gmail.com] Sent: August-22-13 7:26 AM To: solr-user@lucene.apache.org Subject: Re: Flushing cache without restarting everything? But is it really a good benchmarking, if you flush the cache? Wouldn't you want to benchmark against a system, that would be comparable to what is under real (=production) load? Dmitry On Tue, Aug 20, 2013 at 9:39 PM, Jean-Sebastien Vachon jean- sebastien.vac...@wantedanalytics.com wrote: I just want to run benchmarks and want to have the same starting conditions. -Original Message- From: Walter Underwood [mailto:wun...@wunderwood.org] Sent: August-20-13 2:06 PM To: solr-user@lucene.apache.org Subject: Re: Flushing cache without restarting everything? Why? What are you trying to acheive with this? --wunder On Aug 20, 2013, at 11:04 AM, Jean-Sebastien Vachon wrote: Hi All, Is there a way to flush the cache of all nodes in a Solr Cloud (by reloading all the cores, through the collection API, ...) without having to restart all nodes? 
Thanks - No virus found in this message. Checked by AVG - www.avg.fr Version: 2013.0.3392 / Virus database: 3209/6563 - Date: 09/08/2013 The virus database has expired.
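To make the suggested approach concrete: the Collections API RELOAD action reloads every core of a collection across all nodes, dropping the old searcher (and its caches) without restarting anything. A minimal sketch below builds that request URL; the host, port, and collection name are assumptions, not values from the thread.

```python
from urllib.parse import urlencode

def reload_collection_url(host="localhost", port=8983, collection="collection1"):
    # Collections API RELOAD: reloads all cores of the collection on every
    # node, which discards the current searcher and its caches in one call.
    params = urlencode({"action": "RELOAD", "name": collection, "wt": "json"})
    return f"http://{host}:{port}/solr/admin/collections?{params}"

# To actually send it against a running SolrCloud:
#   import urllib.request
#   urllib.request.urlopen(reload_collection_url())
print(reload_collection_url())
```

After the reload, a warm-up batch of queries would still be needed to reach comparable starting conditions, since the new searcher begins cold.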
Query term count over a result set
Hi all, I would like to get the total count of a query term over a result set. Is there a way to get this? I know there is a TermVectorComponent that does this per result (document), but it would be far too expensive to sum those per-document counts over all documents for a given term. The LukeRequestHandler and the terms component only present the term counts over the whole index. Thanks!
RE: How to set discountOverlaps=true in Solr 4x schema.xml
Yes, discountOverlaps is used in computeNorm, which runs at index time. You should see a change after reindexing. Cheers, Markus -Original message- From: Tom Burton-West tburt...@umich.edu Sent: Thursday 22nd August 2013 23:32 To: solr-user@lucene.apache.org Subject: Re: How to set discountOverlaps="true" in Solr 4x schema.xml I should have said that I have set it both to true and to false and restarted Solr each time, and the rankings and info in the debug query showed no change. Does this have to be set at index time? Tom
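For reference, a sketch of how this is typically declared in a Solr 4.x schema.xml as a global similarity (the factory class and parameter name follow the standard similarity factories; verify against your Solr version before relying on it):

```xml
<!-- Global similarity in schema.xml. DefaultSimilarityFactory accepts a
     discountOverlaps init parameter (it defaults to true). Because this
     feeds computeNorm, norms are baked in at index time: reindex after
     changing it, or query-time rankings will not move. -->
<similarity class="solr.DefaultSimilarityFactory">
  <bool name="discountOverlaps">true</bool>
</similarity>
```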
Re: dataimporter tika fields empty
ok but i'm not doing any path extraction, at least i don't think so. htmlMapper=identity isn't preserving html. it's reading the content of the pages, but the content only ends up in text_test, not in text - the copyField isn't working. data-config.xml:

<dataConfig>
  <dataSource type="BinFileDataSource" name="data"/>
  <dataSource type="BinURLDataSource" name="dataUrl"/>
  <dataSource type="URLDataSource" name="main"/>
  <document>
    <entity name="rec" processor="XPathEntityProcessor" url="http://127.0.0.1/tkb/internet/docImportUrl.xml" forEach="/docs/doc" dataSource="main">
      <field column="title" xpath="//title"/>
      <field column="id" xpath="//id"/>
      <field column="file" xpath="//file"/>
      <field column="path" xpath="//path"/>
      <field column="url" xpath="//url"/>
      <field column="Author" xpath="//author"/>
      <entity name="tika" processor="TikaEntityProcessor" url="${rec.path}${rec.file}" dataSource="dataUrl" onError="skip" htmlMapper="identity">
        <field column="text" name="text_test"/>
        <copyField source="text_test" dest="text"/>
        <!-- <field column="text_test" xpath="//div[@id='content']"/> -->
      </entity>
    </entity>
  </document>
</dataConfig>

On 22. Aug 2013, at 10:06 PM, Alexandre Rafalovitch wrote: Ah. That's because Tika processor does not support path extraction. You need to nest one more level. Regards, Alex On 22 Aug 2013 13:34, Andreas Owen a...@conx.ch wrote: i can do it like this but then the content isn't copied to text. it's just in text_test

<entity name="tika" processor="TikaEntityProcessor" url="${rec.path}${rec.file}" dataSource="dataUrl">
  <field column="text" name="text_test"/>
  <copyField source="text_test" dest="text"/>
</entity>

On 22. Aug 2013, at 6:12 PM, Andreas Owen wrote: i put it in the tika-entity as attribute, but it doesn't change anything. my bigger concern is why text_test isn't populated at all On 22. Aug 2013, at 5:27 PM, Alexandre Rafalovitch wrote: Can you try the SOLR-4530 switch: https://issues.apache.org/jira/browse/SOLR-4530 Specifically, setting htmlMapper="identity" on the entity definition. This will tell Tika to send full HTML rather than a seriously stripped one.
Regards, Alex. Personal website: http://www.outerthoughts.com/ LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch - Time is the quality of nature that keeps events from happening all at once. Lately, it doesn't seem to be working. (Anonymous - via GTD book) On Thu, Aug 22, 2013 at 11:02 AM, Andreas Owen a...@conx.ch wrote: i'm trying to index a html page and only user the div with the id=content. unfortunately nothing is working within the tika-entity, only the standard text (content) is populated. do i have to use copyField for test_text to get the data? or is there a problem with the entity-hirarchy? or is the xpath wrong, even though i've tried it without and just using text? or should i use the updateextractor? data-config.xml: dataConfig dataSource type=BinFileDataSource name=data/ dataSource type=BinURLDataSource name=dataUrl/ dataSource type=URLDataSource baseUrl= http://127.0.0.1/tkb/internet/; name=main/ document entity name=rec processor=XPathEntityProcessor url=docImportUrl.xml forEach=/docs/doc dataSource=main field column=title xpath=//title / field column=id xpath=//id / field column=file xpath=//file / field column=path xpath=//path / field column=url xpath=//url / field column=Author xpath=//author / entity name=tika processor=TikaEntityProcessor url=${rec.path}${rec.file} dataSource=dataUrl !-- copyField source=text dest=text_test / -- field column=text_test xpath=//div[@id='content'] / /entity /entity /document /dataConfig docImporterUrl.xml: ?xml version=1.0 encoding=utf-8? docs doc id5/id authortkb/author titleStartseite/title descriptionblabla .../description filehttp://localhost/tkb/internet/index.cfm/file urlhttp://localhost/tkb/internet/index.cfm/url/url path2http\specialConf/path2 /doc doc id6/id authortkb/author titleEigenheim/title descriptionMachen Sie sich erste Gedanken über den Erwerb von Wohneigentum? Oder haben Sie bereits konkrete Pläne oder gar ein spruchreifes Projekt? 
Wir beraten Sie gerne in allen Fragen rund um den Erwerb oder Bau von Wohneigentum, damit Ihr Vorhaben auch in finanzieller Hinsicht gelingt./description file http://127.0.0.1/tkb/internet/private/beratung/eigenheim.htm/file url
Re: dataimporter tika fields empty
i changed the following line (xpath): <field column="text" xpath="//div[@id='content']" name="text_test"/> On 22. Aug 2013, at 10:06 PM, Alexandre Rafalovitch wrote: Ah. That's because Tika processor does not support path extraction. You need to nest one more level. Regards, Alex On 22 Aug 2013 13:34, Andreas Owen a...@conx.ch wrote: i can do it like this but then the content isn't copied to text. it's just in text_test

<entity name="tika" processor="TikaEntityProcessor" url="${rec.path}${rec.file}" dataSource="dataUrl">
  <field column="text" name="text_test"/>
  <copyField source="text_test" dest="text"/>
</entity>

On 22. Aug 2013, at 6:12 PM, Andreas Owen wrote: i put it in the tika-entity as attribute, but it doesn't change anything. my bigger concern is why text_test isn't populated at all On 22. Aug 2013, at 5:27 PM, Alexandre Rafalovitch wrote: Can you try the SOLR-4530 switch: https://issues.apache.org/jira/browse/SOLR-4530 Specifically, setting htmlMapper="identity" on the entity definition. This will tell Tika to send full HTML rather than a seriously stripped one. Regards, Alex. Personal website: http://www.outerthoughts.com/ LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch - Time is the quality of nature that keeps events from happening all at once. Lately, it doesn't seem to be working. (Anonymous - via GTD book) On Thu, Aug 22, 2013 at 11:02 AM, Andreas Owen a...@conx.ch wrote: i'm trying to index a html page and only use the div with the id=content. unfortunately nothing is working within the tika-entity, only the standard text (content) is populated. do i have to use copyField for test_text to get the data? or is there a problem with the entity-hierarchy? or is the xpath wrong, even though i've tried it without and just using text? or should i use the updateextractor?
data-config.xml: dataConfig dataSource type=BinFileDataSource name=data/ dataSource type=BinURLDataSource name=dataUrl/ dataSource type=URLDataSource baseUrl= http://127.0.0.1/tkb/internet/; name=main/ document entity name=rec processor=XPathEntityProcessor url=docImportUrl.xml forEach=/docs/doc dataSource=main field column=title xpath=//title / field column=id xpath=//id / field column=file xpath=//file / field column=path xpath=//path / field column=url xpath=//url / field column=Author xpath=//author / entity name=tika processor=TikaEntityProcessor url=${rec.path}${rec.file} dataSource=dataUrl !-- copyField source=text dest=text_test / -- field column=text_test xpath=//div[@id='content'] / /entity /entity /document /dataConfig docImporterUrl.xml: ?xml version=1.0 encoding=utf-8? docs doc id5/id authortkb/author titleStartseite/title descriptionblabla .../description filehttp://localhost/tkb/internet/index.cfm/file urlhttp://localhost/tkb/internet/index.cfm/url/url path2http\specialConf/path2 /doc doc id6/id authortkb/author titleEigenheim/title descriptionMachen Sie sich erste Gedanken über den Erwerb von Wohneigentum? Oder haben Sie bereits konkrete Pläne oder gar ein spruchreifes Projekt? Wir beraten Sie gerne in allen Fragen rund um den Erwerb oder Bau von Wohneigentum, damit Ihr Vorhaben auch in finanzieller Hinsicht gelingt./description file http://127.0.0.1/tkb/internet/private/beratung/eigenheim.htm/file url http://127.0.0.1/tkb/internet/private/beratung/eigenheim.htm/url/url /doc /docs
Re: Query term count over a result set
You can get the term frequency (per document) for a term using the termfreq() function query in the fl parameter: fl=*,termfreq(field,'term') -- Jack Krupansky -Original Message- From: JZ Sent: Friday, August 23, 2013 7:43 AM To: solr-user@lucene.apache.org Subject: Query term count over a result set Hi all, I would like to get the total count of a query term of a result set. Is there a way to get this? I know there is a TermVectorComponent that does this per result (document), but it would be far too expensive to take the sum over all documents for a term given that term. The LukeRequestHandler and the terms component only present the term counts over the whole index. Thanks!
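Building on the termfreq() suggestion, one way to get a result-set total is to request the per-document frequency in fl and sum it client-side. A sketch below, assuming a core named collection1 and placeholder field/term names; note that this still requires paging through all matching documents, so it can be costly for large result sets.

```python
import json
from urllib.parse import urlencode

def termfreq_query_url(core, field, term, q="*:*", rows=100,
                       host="localhost", port=8983):
    # fl asks Solr to return the term frequency of `term` in `field`
    # for each matching document, aliased as "cnt".
    params = urlencode({
        "q": q,
        "fl": "id,cnt:termfreq(%s,'%s')" % (field, term),
        "rows": rows,
        "wt": "json",
    })
    return f"http://{host}:{port}/solr/{core}/select?{params}"

def total_term_count(response_body):
    # Sum the per-document frequencies from one page of results.
    docs = json.loads(response_body)["response"]["docs"]
    return sum(d.get("cnt", 0) for d in docs)

print(termfreq_query_url("collection1", "text", "solr"))
```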
Re: Query term count over a result set
Hi JZ, You can use the faceting component: http://localhost:8080/solr/core/select?q=ahmet&wt=xml&facet=on&facet.field=title&facet.prefix=queryTerm From: JZ zhangju...@gmail.com To: solr-user@lucene.apache.org Sent: Friday, August 23, 2013 2:43 PM Subject: Query term count over a result set Hi all, I would like to get the total count of a query term of a result set. Is there a way to get this? I know there is a TermVectorComponent that does this per result (document), but it would be far too expensive to take the sum over all documents for a term given that term. The LukeRequestHandler and the terms component only present the term counts over the whole index. Thanks!
Re: Reloading synonyms and stop words
Hi again, Thanx for the help :) I have this handler:

public class SynonymsHandler extends RequestHandlerBase implements SolrCoreAware {
  public SynonymsHandler() {}
  private static Logger log = LoggerFactory.getLogger(SynonymsHandler.class);
  @Override
  public void handleRequestBody(SolrQueryRequest req, SolrQueryResponse rsp) throws Exception {
    System.out.println(req.getContext());
    if (req.getContext().get("path").equals("/synonyms/update")) {}
    if (req.getContext().get("path").equals("/synonyms/get")) {}
    req.getCore().reload(req.getCore());
  }
  @Override
  public String getDescription() { return null; }
  @Override
  public String getSource() { return null; }
  @Override
  public void inform(SolrCore core) {}
}

and when i call the reload I get this error: 63748 T33 C6 oasc.SolrException.log ERROR java.lang.NullPointerException at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:181) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135) at org.apache.solr.core.SolrCore.execute(SolrCore.java:1904) at org.apache.solr.core.QuerySenderListener.newSearcher(QuerySenderListener.java:64) at org.apache.solr.core.SolrCore$5.call(SolrCore.java:1693) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) at java.util.concurrent.FutureTask.run(FutureTask.java:166) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:724) here ShardHandler shardHandler1 = shardHandlerFactory.getShardHandler(); the factory is null... how can I get it to initialize? I checked on the CoreAdminHandler that I have to do something on the inform like you said but I am not sure what... the inform is recursive right? Could I try to execute the CoreAdminHandler from within my Handler with the reload action?
I am not sure what is the best practice Regards Bruno On Fri, Aug 23, 2013 at 11:20 AM, Shalin Shekhar Mangar shalinman...@gmail.com wrote: I don't think that should be a problem. Your custom RequestHandler must call reload. Note that a new instance of your request handler will be created and inform will be called on it once reload happens i.e. you won't be able to keep any state in the request handler across core reloads. You can also do this at a level above RequestHandler i.e. via a custom CoreAdminHandler. See CoreAdminHandler.handleCustomAction() On Fri, Aug 23, 2013 at 2:57 PM, Bruno René Santos brunor...@gmail.com wrote: Great! What about inside a RequestHandler source code in Java? I want to create a requestHandler that receives new synonyms, insert them on the synonyms file and reload the core. Regards Bruno On Fri, Aug 23, 2013 at 9:28 AM, Shalin Shekhar Mangar shalinman...@gmail.com wrote: Yes, you can use the Core RELOAD command: https://cwiki.apache.org/confluence/display/solr/CoreAdminHandler+Parameters+and+Usage#CoreAdminHandlerParametersandUsage-%7B%7BRELOAD%7D%7D On Fri, Aug 23, 2013 at 1:51 PM, Bruno René Santos brunor...@gmail.com wrote: Hello, Is it possible to reload the synonyms and stopwords files without rebooting solr? Regards Bruno Santos -- Bruno René Santos Lisboa - Portugal -- Regards, Shalin Shekhar Mangar. -- Bruno René Santos Lisboa - Portugal -- Regards, Shalin Shekhar Mangar. -- Bruno René Santos Lisboa - Portugal
Re: Problem with importing tab-delimited csv file
Your data file appears to use spaces rather than tabs. -- Jack Krupansky From: Rob Koeling Ai Sent: Friday, August 23, 2013 6:38 AM To: solr-user@lucene.apache.org Subject: Problem with importing tab-delimited csv file I'm having trouble importing a tab-delimited file with the csv update handler. My data file looks like this: id question answer url q99 Who? You! none When I send this data to Solr using Curl: curl 'http://localhost:8181/solr/development/update/csv?commit=true&separator=%09' --data @sample.tmp All seems to be well: ?xml version=1.0 encoding=UTF-8? response lst name=responseHeaderint name=status0/intint name=QTime221/int/lst /response But when I query the development core, there is no data. I must be overlooking something trivial. I would appreciate it if anyone could spot what! - Rob
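A quick way to check Jack's diagnosis before posting the file: count the actual tab characters per line. A sketch (the sample strings are illustrative, mirroring the records in the message):

```python
def tab_counts(lines):
    # Number of tab characters on each line; a "tab-delimited" file whose
    # data lines show 0 tabs was really separated with spaces, so
    # separator=%09 leaves the whole record in a single column.
    return [(n, line.count("\t")) for n, line in enumerate(lines, 1)]

header = "id\tquestion\tanswer\turl\n"   # genuinely tab-delimited
record = "q99 Who? You! none\n"          # space-separated, as in the message
print(tab_counts([header, record]))
```

A line reporting zero tabs is the culprit: Solr happily accepts the upload (hence the status=0 response) but indexes nothing useful.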
Re: How to avoid underscore sign indexing problem?
Exactly - Solr does not define the punctuation, UAX#29 defines it, and I have deciphered the UAX#29 rules and included them in my book. Some punctuation is always punctuation and always removed, and some is conditional on context - I tried to lay out all the implied rules. -- Jack Krupansky -Original Message- From: Steve Rowe Sent: Friday, August 23, 2013 12:30 AM To: solr-user@lucene.apache.org Subject: Re: How to avoid underscore sign indexing problem? Dan, StandardTokenizer implements the word boundary rules from the Unicode Text Segmentation standard annex UAX#29: http://www.unicode.org/reports/tr29/#Word_Boundaries Every character sequence within UAX#29 boundaries that contains a numeric or an alphabetic character is emitted as a term, and nothing else is emitted. Punctuation can be included within a term, e.g. 1,248.99 or 192.168.1.1. To split on underscores, you can convert underscores to e.g. spaces by adding PatternReplaceCharFilterFactory to your analyzer: <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="_" replacement=" "/> This replacement will be performed prior to StandardTokenizer, which will then see token-splitting spaces instead of underscores. Steve On Aug 22, 2013, at 10:23 PM, Dan Davis dansm...@gmail.com wrote: Ah, but what is the definition of punctuation in Solr? On Wed, Aug 21, 2013 at 11:15 PM, Jack Krupansky j...@basetechnology.com wrote: I thought that the StandardTokenizer always split on punctuation, Proving that you haven't read my book! The section on the standard tokenizer details the rules that the tokenizer uses (in addition to extensive examples.) That's what I mean by deep dive. -- Jack Krupansky -Original Message- From: Shawn Heisey Sent: Wednesday, August 21, 2013 10:41 PM To: solr-user@lucene.apache.org Subject: Re: How to avoid underscore sign indexing problem?
On 8/21/2013 7:54 PM, Floyd Wu wrote: When using StandardAnalyzer to tokenize the string Pacific_Rim I get: ST text: pacific_rim, raw_bytes: [70 61 63 69 66 69 63 5f 72 69 6d], start: 0, end: 11, type: ALPHANUM, position: 1 How to make this string be tokenized into the two tokens Pacific, Rim? Set _ as a stopword? Please kindly help on this. Many thanks. Interesting. I thought that the StandardTokenizer always split on punctuation, but apparently that's not the case for the underscore character. You can always use the WordDelimiterFilter after the StandardTokenizer. http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.WordDelimiterFilterFactory Thanks, Shawn
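Pulling Steve's charFilter suggestion into context, a sketch of a complete fieldType that maps underscores to spaces before StandardTokenizer runs (the field type name and the trailing lowercase filter are illustrative additions, not from the thread):

```xml
<!-- Illustrative fieldType: underscores become spaces before tokenization,
     so "Pacific_Rim" is indexed as the two tokens "pacific" and "rim". -->
<fieldType name="text_split_underscore" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="_" replacement=" "/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

Because the charFilter rewrites the character stream before the tokenizer sees it, this avoids needing WordDelimiterFilter for this particular case.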
RE: how to integrate solr with HDFS HA
Finally something I can help with! I went through the same problems you're having a short while ago. Check out https://cwiki.apache.org/confluence/display/solr/Running+Solr+on+HDFS for most of the information you need and be sure to check the comments on the page as well. Here's an example from my working setup:

<directoryFactory name="DirectoryFactory" class="solr.HdfsDirectoryFactory">
  <bool name="solr.hdfs.blockcache.enabled">true</bool>
  <int name="solr.hdfs.blockcache.slab.count">1</int>
  <bool name="solr.hdfs.blockcache.direct.memory.allocation">true</bool>
  <int name="solr.hdfs.blockcache.blocksperbank">16384</int>
  <bool name="solr.hdfs.blockcache.read.enabled">true</bool>
  <bool name="solr.hdfs.blockcache.write.enabled">true</bool>
  <bool name="solr.hdfs.nrtcachingdirectory.enable">true</bool>
  <int name="solr.hdfs.nrtcachingdirectory.maxmergesizemb">16</int>
  <int name="solr.hdfs.nrtcachingdirectory.maxcachedmb">192</int>
  <str name="solr.hdfs.home">hdfs://nameservice1:8020/solr</str>
  <str name="solr.hdfs.confdir">/etc/hadoop/conf.cloudera.hdfs1</str>
</directoryFactory>

Thanks, Greg -Original Message- From: YouPeng Yang [mailto:yypvsxf19870...@gmail.com] Sent: Friday, August 23, 2013 1:16 AM To: solr-user@lucene.apache.org Subject: how to integrate solr with HDFS HA Hi all, I try to integrate Solr with HDFS HA. When I start the Solr server, it comes out with an exception[1]. And I do know this is because the hadoop.conf.Configuration in HdfsDirectoryFactory.java does not include the HA configuration. So I want to know, in Solr, is there any way to include my Hadoop HA configuration?
[1]--- Caused by: java.lang.IllegalArgumentException: java.net.UnknownHostException: lklcluster at org.apache.hadoop.security.SecurityUtil.buildTokenService(SecurityUtil.java:418) at org.apache.hadoop.hdfs.NameNodeProxies.createNonHAProxy(NameNodeProxies.java:164) at org.apache.hadoop.hdfs.NameNodeProxies.createProxy(NameNodeProxies.java:129) at org.apache.hadoop.hdfs.DFSClient.init(DFSClient.java:415) at org.apache.hadoop.hdfs.DFSClient.init(DFSClient.java:382) at org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:123) at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2277) at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:86) at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2311) at org.apache.hadoop.fs.FileSystem$Cache.getUnique(FileSystem.java:2299) at org.apache.hadoop.fs.FileSystem.newInstance(FileSystem.java:364) at org.apache.solr.store.hdfs.HdfsDirectory.init(HdfsDirectory.java:59) at org.apache.solr.core.HdfsDirectoryFactory.create(HdfsDirectoryFactory.java:154) at org.apache.solr.core.CachingDirectoryFactory.get(CachingDirectoryFactory.java:350) at org.apache.solr.core.SolrCore.getNewIndexDir(SolrCore.java:256) at org.apache.solr.core.SolrCore.initIndex(SolrCore.java:469) at org.apache.solr.core.SolrCore.init(SolrCore.java:759)
RE: Caused by: java.net.SocketException: Connection reset by peer: socket write error solr querying
If you're using the bundled jetty that comes with the download, check the etc/jetty.xml property for maxIdleTime and set it appropriately. I get that error when operations take longer than the property is set to and time out. Do note that the property is specified in milliseconds! Thanks, Greg -Original Message- From: aniljayanti [mailto:aniljaya...@yahoo.co.in] Sent: Thursday, August 22, 2013 11:44 PM To: solr-user@lucene.apache.org Subject: Caused by: java.net.SocketException: Connection reset by peer: socket write error solr querying Hi, I am working on solr 4.4 jetty, and generated the index on 3350128 records. Now i want to test the query performance. So applied load test with time of 5 minutes, and 600 virtual users for different solr queries. After test completion got below errors. ERROR - 2013-08-23 09:49:43.867; org.apache.solr.common.SolrException; null:org.eclipse.jetty.io.EofException at org.eclipse.jetty.http.HttpGenerator.flushBuffer(HttpGenerator.java:914) at org.eclipse.jetty.http.AbstractGenerator.blockForOutput(AbstractGenerator.java:507) at org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:170) at org.eclipse.jetty.server.HttpOutput.write(HttpOutput.java:107) at sun.nio.cs.StreamEncoder.writeBytes(StreamEncoder.java:202) at sun.nio.cs.StreamEncoder.implWrite(StreamEncoder.java:263) at sun.nio.cs.StreamEncoder.write(StreamEncoder.java:106) at java.io.OutputStreamWriter.write(OutputStreamWriter.java:190) at org.apache.solr.util.FastWriter.flush(FastWriter.java:141) at org.apache.solr.util.FastWriter.write(FastWriter.java:126) at java.io.Writer.write(Writer.java:140) at org.apache.solr.response.XMLWriter.startTag(XMLWriter.java:144) at org.apache.solr.response.XMLWriter.writePrim(XMLWriter.java:347) at org.apache.solr.response.XMLWriter.writeStr(XMLWriter.java:295) at org.apache.solr.schema.StrField.write(StrField.java:67) at org.apache.solr.response.TextResponseWriter.writeVal(TextResponseWriter.java:130) at 
org.apache.solr.response.XMLWriter.writeArray(XMLWriter.java:273) at org.apache.solr.response.TextResponseWriter.writeVal(TextResponseWriter.java:190) at org.apache.solr.response.XMLWriter.writeSolrDocument(XMLWriter.java:199) at org.apache.solr.response.TextResponseWriter.writeDocuments(TextResponseWriter.java:275) at org.apache.solr.response.TextResponseWriter.writeVal(TextResponseWriter.java:172) at org.apache.solr.response.XMLWriter.writeResponse(XMLWriter.java:111) at org.apache.solr.response.XMLResponseWriter.write(XMLResponseWriter.java:39) at org.apache.solr.servlet.SolrDispatchFilter.writeResponse(SolrDispatchFilter.java:647) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:375) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:158) at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419) at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455) at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137) at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:557) at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231) at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1075) at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384) at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193) at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009) at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135) at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:255) at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:154) at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116) at org.eclipse.jetty.server.Server.handle(Server.java:368) at 
org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489) at org.eclipse.jetty.server.BlockingHttpConnection.handleRequest(BlockingHttpConnection.java:53) at org.eclipse.jetty.server.AbstractHttpConnection.headerComplete(AbstractHttpConnection.java:942) at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.headerComplete(AbstractHttpConnection.java:1004) at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:640) at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:235) at org.eclipse.jetty.server.BlockingHttpConnection.handle(BlockingHttpConnection.java:72) at
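To make Greg's earlier maxIdleTime advice concrete, a sketch of where the setting lives in the bundled etc/jetty.xml. This is illustrative only: the connector class and surrounding structure vary between Solr releases, so match it against your own jetty.xml; the value is in milliseconds.

```xml
<!-- etc/jetty.xml (illustrative): raise the connector idle timeout so
     responses that take longer under heavy load are not cut off
     mid-write with "Connection reset by peer" / EofException.
     maxIdleTime is specified in milliseconds. -->
<Call name="addConnector">
  <Arg>
    <New class="org.eclipse.jetty.server.bio.SocketConnector">
      <Set name="host"><SystemProperty name="jetty.host"/></Set>
      <Set name="port"><SystemProperty name="jetty.port" default="8983"/></Set>
      <Set name="maxIdleTime">120000</Set>
    </New>
  </Arg>
</Call>
```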
Re: Reloading synonyms and stop words
Actually I was suggesting that you execute the CoreAdminHandler from within your handler or you can try calling CoreContainer.reload directly. On Fri, Aug 23, 2013 at 6:13 PM, Bruno René Santos brunor...@gmail.com wrote: Hi again, Thanx for the help :) I have this handler: public class SynonymsHandler extends RequestHandlerBase implements SolrCoreAware { public SynonymsHandler() {} private static Logger log = LoggerFactory.getLogger(SynonymsHandler.class); @Override public void handleRequestBody(SolrQueryRequest req, SolrQueryResponse rsp) throws Exception { System.out.println(req.getContext()); if (req.getContext().get(path).equals(/synonyms/update)) {} if (req.getContext().get(path).equals(/synonyms/get)) {} req.getCore().reload(req.getCore()); } @Override public String getDescription() { return null; } @Override public String getSource() { return null; } @Override public void inform(SolrCore core) {} } and when i call the reload I get this error: 63748 T33 C6 oasc.SolrException.log ERROR java.lang.NullPointerException at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:181) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135) at org.apache.solr.core.SolrCore.execute(SolrCore.java:1904) at org.apache.solr.core.QuerySenderListener.newSearcher(QuerySenderListener.java:64) at org.apache.solr.core.SolrCore$5.call(SolrCore.java:1693) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) at java.util.concurrent.FutureTask.run(FutureTask.java:166) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) at java.lang.Thread.run(Thread.java:724) here ShardHandler shardHandler1 = shardHandlerFactory.getShardHandler(); the factory is null... how can I get it to initialize? 
I checked on the CoreAdminHandler that I have to do something on the inform like you said but I am not sure what... the inform is recursive right? Could I try to execute the CoreAdminHandler from within my Handler with the reload action? I am not sure what is the best practice Regards Bruno On Fri, Aug 23, 2013 at 11:20 AM, Shalin Shekhar Mangar shalinman...@gmail.com wrote: I don't think that should be a problem. Your custom RequestHandler must call reload. Note that a new instance of your request handler will be created and inform will be called on it once reload happens i.e. you won't be able to keep any state in the request handler across core reloads. You can also do this at a level above RequestHandler i.e. via a custom CoreAdminHandler. See CoreAdminHandler.handleCustomAction() On Fri, Aug 23, 2013 at 2:57 PM, Bruno René Santos brunor...@gmail.com wrote: Great! What about inside a RequestHandler source code in Java? I want to create a requestHandler that receives new synonyms, insert them on the synonyms file and reload the core. Regards Bruno On Fri, Aug 23, 2013 at 9:28 AM, Shalin Shekhar Mangar shalinman...@gmail.com wrote: Yes, you can use the Core RELOAD command: https://cwiki.apache.org/confluence/display/solr/CoreAdminHandler+Parameters+and+Usage#CoreAdminHandlerParametersandUsage-%7B%7BRELOAD%7D%7D On Fri, Aug 23, 2013 at 1:51 PM, Bruno René Santos brunor...@gmail.com wrote: Hello, Is it possible to reload the synonyms and stopwords files without rebooting solr? Regards Bruno Santos -- Bruno René Santos Lisboa - Portugal -- Regards, Shalin Shekhar Mangar. -- Bruno René Santos Lisboa - Portugal -- Regards, Shalin Shekhar Mangar. -- Bruno René Santos Lisboa - Portugal -- Regards, Shalin Shekhar Mangar.
Re: Reloading synonyms and stop words
req.getCore().getCoreDescriptor().getCoreContainer().reload(req.getCore().getName());

works like a charm :) Thanks a lot

Bruno

On Fri, Aug 23, 2013 at 2:48 PM, Shalin Shekhar Mangar shalinman...@gmail.com wrote: Actually I was suggesting that you execute the CoreAdminHandler from within your handler, or you can try calling CoreContainer.reload directly.

-- Bruno René Santos Lisboa - Portugal
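Besides calling CoreContainer.reload from Java, the thread also mentions the CoreAdmin RELOAD action over HTTP. A minimal sketch of building that request URL; the host, port, and core name are assumptions to adjust for your deployment, and no HTTP call is actually made here:

```python
from urllib.parse import urlencode

def reload_core_url(base="http://localhost:8983/solr", core="collection1"):
    """Build the CoreAdmin RELOAD URL for the given core."""
    params = urlencode({"action": "RELOAD", "core": core})
    return "%s/admin/cores?%s" % (base, params)

# The resulting URL can be fetched with any HTTP client (or curl).
print(reload_core_url())
```

Issuing a GET against this URL has the same effect as the Java reload, without needing custom handler code.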
Re: Query term count over a result set
Hello, Ahmet, using the faceting component gives me the document count for a term, while I am interested in the term counts within a document for a query term. Jack, the function query termfreq does indeed return the term frequency per document, but not over a result set. How do I do this over a result set? I do not think there is something ready-made, but a pointer to a plugin or some code (or an explanation of why this does not exist yet) would be great! Thanks

On Fri, Aug 23, 2013 at 2:42 PM, Ahmet Arslan iori...@yahoo.com wrote: Hi JZ, You can use the faceting component. http://localhost:8080/solr/core/select?q=ahmet&wt=xml&facet=on&facet.field=title&facet.prefix=queryTerm

From: JZ zhangju...@gmail.com To: solr-user@lucene.apache.org Sent: Friday, August 23, 2013 2:43 PM Subject: Query term count over a result set

Hi all, I would like to get the total count of a query term over a result set. Is there a way to get this? I know there is a TermVectorComponent that does this per result (document), but it would be far too expensive to take the sum over all documents for a given term. The LukeRequestHandler and the terms component only present the term counts over the whole index. Thanks!
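One workaround, since nothing ready-made exists: request the termfreq function query as a pseudo-field (e.g. fl=id,tf:termfreq(text,'solr')), page through the full result set, and sum client-side. A sketch of the aggregation step; the response docs below are mocked for illustration, not real Solr output:

```python
def total_term_count(docs, tf_field="tf"):
    """Sum the termfreq pseudo-field over every document in the result set."""
    return sum(doc.get(tf_field, 0) for doc in docs)

# Mocked documents as they might arrive from paged /select responses,
# each carrying the termfreq pseudo-field requested via fl.
docs = [
    {"id": "1", "tf": 3},
    {"id": "2", "tf": 0},
    {"id": "3", "tf": 5},
]
print(total_term_count(docs))  # 8
```

For large result sets this still transfers one number per matching document, but it avoids the per-document TermVectorComponent cost mentioned above.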
Solrconfig.xml
Is there any way inside a handleRequestBody on a RequestHandler to know the directory where the core configuration is? (schema.xml, solrconfig.xml, synonyms, etc) Regards Bruno -- Bruno René Santos Lisboa - Portugal
Index a database table?
Hello there, I just got something to index a MySQL database table: http://wiki.apache.org/solr/DIHQuickStart

Pasted the following in the config tag of the solrconfig.xml file (solr-4.4.0/example/solr/collection1/conf/solrconfig.xml):

<requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">data-config.xml</str>
  </lst>
</requestHandler>

Altering the next code, making a data-config.xml file, I have written the following, where I am not sure if the driver, url and entity name are correct or not. How do I know if they are wrong? The model name (table name), i.e. tcc_userprofile, and its attributes are written in the query, and I know they are right. New is the MySQL database name.

<dataConfig>
  <dataSource type="JdbcDataSource" driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost/New" user="root" password="password"/>
  <document>
    <entity name="id" query="select id,first_name from tcc_userprofile">
    </entity>
  </document>
</dataConfig>

-- View this message in context: http://lucene.472066.n3.nabble.com/Index-a-database-table-tp4086334.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Solrconfig.xml
Yes, if your RequestHandler implements SolrCoreAware you will get a SolrCore reference in the inform(...) method. In SolrCore you have all you need (specifically, SolrResourceLoader is what you want). Note that if your request handler is a SearchHandler you don't need to implement that interface, because SearchHandler already does. Best, Andrea

On 08/23/2013 04:23 PM, Bruno René Santos wrote: Is there any way inside a handleRequestBody on a RequestHandler to know the directory where the core configuration is? (schema.xml, solrconfig.xml, synonyms, etc) Regards Bruno
Re: Index a database table?
Seems OK, assuming that:
- you have the MySQL driver jar in your $SOLR_HOME/lib
- New is the database name
- user root / password is valid
- the table exists
- Solr has a schema with the id and first_name fields declared

About "How do I know if they are wrong?" -- why don't you try?

On 08/23/2013 04:31 PM, Kamaljeet Kaur wrote: Hello there, I just got something to index a MySQL database table: http://wiki.apache.org/solr/DIHQuickStart

Pasted the following in the config tag of the solrconfig.xml file (solr-4.4.0/example/solr/collection1/conf/solrconfig.xml):

<requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">data-config.xml</str>
  </lst>
</requestHandler>

Altering the next code, making a data-config.xml file, I have written the following, where I am not sure if the driver, url and entity name are correct or not. How do I know if they are wrong? The model name (table name), i.e. tcc_userprofile, and its attributes are written in the query, and I know they are right. New is the MySQL database name.

<dataConfig>
  <dataSource type="JdbcDataSource" driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost/New" user="root" password="password"/>
  <document>
    <entity name="id" query="select id,first_name from tcc_userprofile">
    </entity>
  </document>
</dataConfig>

-- View this message in context: http://lucene.472066.n3.nabble.com/Index-a-database-table-tp4086334.html Sent from the Solr - User mailing list archive at Nabble.com.
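A side note on "how do I know if they are wrong?": generating data-config.xml programmatically makes malformed XML fail fast at build time instead of failing silently during an import. A sketch using Python's ElementTree, with the (unverified) connection details from the thread:

```python
import xml.etree.ElementTree as ET

# Build the DIH config tree; attribute values are the thread's own examples.
root = ET.Element("dataConfig")
ET.SubElement(root, "dataSource", {
    "type": "JdbcDataSource",
    "driver": "com.mysql.jdbc.Driver",
    "url": "jdbc:mysql://localhost/New",
    "user": "root",
    "password": "password",
})
doc = ET.SubElement(root, "document")
ET.SubElement(doc, "entity", {
    "name": "id",
    "query": "select id,first_name from tcc_userprofile",
})

xml = ET.tostring(root, encoding="unicode")
# Round-trip to confirm the generated config is well-formed XML.
parsed = ET.fromstring(xml)
print(parsed.find("document/entity").get("query"))
```

Whether the driver class, URL, and credentials are actually valid can only be confirmed by running the import, as suggested above.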
Re: Solrconfig.xml
That is what I needed req.getCore().getResourceLoader().getConfigDir() Thanx Bruno On Fri, Aug 23, 2013 at 3:37 PM, Andrea Gazzarini andrea.gazzar...@gmail.com wrote: Yes, if your RequestHandler implements SolrCoreAware you will get a SolrCore reference in inform(...) method. In SolrCore you have all what you need (specifically SolrResourceLoader is what you need) Note that if your request handler is a SearchHandler you don't need to implement that interface because it already does. Best, Andrea On 08/23/2013 04:23 PM, Bruno René Santos wrote: Is there any way inside a handleRequestBody on a RequestHandler to know the directory where the core configuration is? (schema.xml, solrconfig.xml, synonyms, etc) Regards Bruno -- Bruno René Santos Lisboa - Portugal
Problem with importing tab-delimited csv file
Thanks for the reply, Jack. It only looks like spaces, because I did a cut-and-paste. The file in question does contain tabs instead of spaces, i.e.:

id question answer url
q99 Who? You! none

Another question I meant to ask is whether this sort of activity is logged anywhere. I mean, after adding or deleting data, is there somewhere a record of that action? The 'logging' tab on the Dashboard page only reports errors as far as I can see. Thanks, - Rob

Your data file appears to use spaces rather than tabs. -- Jack Krupansky

From: Rob Koeling Ai Sent: Friday, August 23, 2013 6:38 AM To: solr-user@lucene.apache.org Subject: Problem with importing tab-delimited csv file

I'm having trouble importing a tab-delimited file with the csv update handler. My data file looks like this:

id question answer url
q99 Who? You! none

When I send this data to Solr using curl:

curl 'http://localhost:8181/solr/development/update/csv?commit=true&separator=%09' --data @sample.tmp

All seems to be well:

<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">0</int><int name="QTime">221</int></lst>
</response>

But when I query the development core, there is no data. I must be overlooking something trivial. I would appreciate it if anyone could spot what! - Rob
Re: Storing query results
I completely agree. I would prefer to just rerun the search each time. However, we are going to be replacing our RDB-based search with something like Solr, and the application currently behaves this way. Our users understand that the search is essentially a snapshot (and I would guess many prefer this over changing results) and we don't want to change existing behavior and confuse anyone. Also, my boss told me it unequivocally has to be this way :p Thanks for your input though; looks like I'm going to have to do something like you've suggested within our application. -- View this message in context: http://lucene.472066.n3.nabble.com/Storing-query-results-tp4086182p4086349.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: SOLR Prevent solr of modifying fields when update doc
Perhaps an atomic update that only changes the fields you want to change? -Greg

On Fri, Aug 23, 2013 at 4:16 AM, Luís Portela Afonso meligalet...@gmail.com wrote: Hi, thanks for the answer, but the uniqueId is generated by me. When Solr indexes and there is an update to a doc, it deletes the doc and creates a new one, so it generates a new UUID. That is not suitable for me, because I want Solr to update just some fields, because the UUID is the key that I use to map it to a user in my database. Right now I'm using information that comes from the source and never changes as my uniqueId, like for example the guid that exists in some RSS feeds, or if it doesn't exist I use the link. I think there isn't any simple solution for me, because from what I have read, when an update to a doc happens, Solr deletes the old one and creates a new one, right?

On Aug 23, 2013, at 12:07 PM, Erick Erickson erickerick...@gmail.com wrote: Well, not much in the way of help, because you can't do what you want AFAIK. I don't think UUID is suitable for your use-case. Why not use your uniqueId? Or generate something yourself... Best, Erick

On Thu, Aug 22, 2013 at 5:56 PM, Luís Portela Afonso meligalet...@gmail.com wrote: Hi, How can I prevent Solr from updating some fields when updating a doc? The problem is, I have a UUID with the field name uuid, but it is not a unique key. When an RSS source updates a feed, Solr will update the doc with the same link but it generates a new uuid. This is not desired, because this id is used by me to relate feeds to a user. Can someone help me? Many thanks
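For reference, an atomic update in Solr 4.x is a JSON update document where each field to change carries a {"set": ...} modifier; fields not listed keep their stored values (this requires all fields to be stored in the schema). A sketch building such a payload; the id and field names are illustrative, not from a real index:

```python
import json

def atomic_update(doc_id, changes):
    """Build a JSON atomic-update body that sets only the given fields."""
    payload = {"id": doc_id}
    payload.update({field: {"set": value} for field, value in changes.items()})
    # Solr's /update handler accepts a JSON array of documents.
    return json.dumps([payload])

body = atomic_update("feed-item-42", {"title": "Updated title"})
print(body)
```

The body would be POSTed to /update with Content-Type application/json; the uuid field, being unmentioned, would survive the update.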
Re: Problem with importing tab-delimited csv file
You need the CSV content type header and --data-binary. I tried this with Solr 4.4:

curl 'http://localhost:8983/solr/update?commit=true&separator=%09' -H 'Content-type:application/csv' --data-binary @sample.tmp

Otherwise, Solr just ignores the request. -- Jack Krupansky

-----Original Message----- From: Rob Koeling Ai Sent: Friday, August 23, 2013 9:41 AM To: solr-user@lucene.apache.org Subject: Problem with importing tab-delimited csv file

Thanks for the reply, Jack. It only looks like spaces, because I did a cut-and-paste. The file in question does contain tabs instead of spaces, i.e.:

id question answer url
q99 Who? You! none

Another question I meant to ask is whether this sort of activity is logged anywhere. I mean, after adding or deleting data, is there somewhere a record of that action? The 'logging' tab on the Dashboard page only reports errors as far as I can see. Thanks, - Rob

Your data file appears to use spaces rather than tabs. -- Jack Krupansky

From: Rob Koeling Ai Sent: Friday, August 23, 2013 6:38 AM To: solr-user@lucene.apache.org Subject: Problem with importing tab-delimited csv file

I'm having trouble importing a tab-delimited file with the csv update handler. My data file looks like this:

id question answer url
q99 Who? You! none

When I send this data to Solr using curl:

curl 'http://localhost:8181/solr/development/update/csv?commit=true&separator=%09' --data @sample.tmp

All seems to be well:

<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">0</int><int name="QTime">221</int></lst>
</response>

But when I query the development core, there is no data. I must be overlooking something trivial. I would appreciate it if anyone could spot what! - Rob
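A quick demonstration of why the URL needs a literal & between parameters and %09 for the separator (this only shows the encoding; it does not contact Solr):

```python
from urllib.parse import urlencode

# A tab character must be percent-encoded as %09 in the query string,
# and parameters joined with '&' -- exactly what the curl examples use.
params = urlencode({"commit": "true", "separator": "\t"})
print(params)  # commit=true&separator=%09
```

If the & is dropped (as in a shell where the URL was not quoted), Solr sees a single mangled parameter name and ignores the separator setting.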
solr built with maven
Hello all, I am building Solr's source code through Maven in order to develop on top of it in Netbeans (as no ant task was made for Netbeans... not cool!). Three doubts about that: 1. How can I execute the Solr server? 2. How can I debug the Solr server? 3. If I create new packages (RequestHandlers, Tokenizers, etc.), where can I put them so that the compilation process will see the new files? Regards Bruno Santos -- Bruno René Santos Lisboa - Portugal
Re: solr built with maven
Do you want to change the Solr source code itself, or do you want to create your own Tokenizers and things? If the latter, why not just set up Solr as a dependency in your pom.xml, like so:

<dependency>
  <groupId>org.apache.lucene</groupId>
  <artifactId>lucene-test-framework</artifactId>
  <scope>test</scope>
  <version>${solr.version}</version>
</dependency>
<dependency>
  <groupId>org.apache.solr</groupId>
  <artifactId>solr-test-framework</artifactId>
  <scope>test</scope>
  <version>${solr.version}</version>
</dependency>
<dependency>
  <groupId>org.apache.lucene</groupId>
  <artifactId>lucene-core</artifactId>
  <version>${solr.version}</version>
</dependency>
<dependency>
  <groupId>org.apache.lucene</groupId>
  <artifactId>lucene-facet</artifactId>
  <version>${solr.version}</version>
</dependency>
<dependency>
  <groupId>org.apache.solr</groupId>
  <artifactId>solr</artifactId>
  <version>${solr.version}</version>
  <type>war</type>
</dependency>
<dependency>
  <groupId>org.apache.solr</groupId>
  <artifactId>solr-core</artifactId>
  <version>${solr.version}</version>
</dependency>
<dependency>
  <groupId>org.apache.solr</groupId>
  <artifactId>solr-solrj</artifactId>
  <version>${solr.version}</version>
</dependency>
<dependency>
  <groupId>org.apache.solr</groupId>
  <artifactId>solr-langid</artifactId>
  <version>${solr.version}</version>
</dependency>
<dependency>
  <groupId>log4j</groupId>
  <artifactId>log4j</artifactId>
  <version>1.2.16</version>
</dependency>
<dependency>
  <groupId>commons-cli</groupId>
  <artifactId>commons-cli</artifactId>
  <version>1.2</version>
</dependency>
<dependency>
  <groupId>javax.servlet</groupId>
  <artifactId>servlet-api</artifactId>
  <version>2.5</version>
</dependency>

On Fri, Aug 23, 2013 at 12:24 PM, Bruno René Santos brunor...@gmail.com wrote: Hello all, I am building Solr's source code through Maven in order to develop on top of it in Netbeans (as no ant task was made for Netbeans... not cool!). Three doubts about that: 1. How can I execute the Solr server? 2. How can I debug the Solr server? 3. If I create new packages (RequestHandlers, Tokenizers, etc.), where can I put them so that the compilation process will see the new files? Regards Bruno Santos -- Bruno René Santos Lisboa - Portugal

-- Brendan Grainger www.kuripai.com
Re: solr built with maven
I don't want to change Solr, just extend it, but it would be nice to have the source code in the project so that I can debug it in Netbeans. Do I need to include jetty too? By the way (this is a little off-topic, sorry), do you know any site that explains how Maven works in a straight-forward way? All this magic is a little confusing sometimes... Regards Bruno

On Fri, Aug 23, 2013 at 5:46 PM, Brendan Grainger brendan.grain...@gmail.com wrote: Do you want to change the Solr source code itself, or do you want to create your own Tokenizers and things? If the latter, why not just set up Solr as a dependency in your pom.xml?

-- Brendan Grainger www.kuripai.com

-- Bruno René Santos Lisboa - Portugal
Re: solr built with maven
Hi Bruno, IntelliJ IDEA has a one-click way of downloading the source jars of dependencies into your project. I'd look for something similar in Netbeans rather than trying to hack together a Maven build of Solr yourself.

Michael Della Bitta Applications Developer o: +1 646 532 3062 | c: +1 917 477 7906 appinions inc. "The Science of Influence Marketing" 18 East 41st Street New York, NY 10017 t: @appinions https://twitter.com/Appinions | g+: plus.google.com/appinions w: appinions.com http://www.appinions.com/

On Fri, Aug 23, 2013 at 12:51 PM, Bruno René Santos brunor...@gmail.com wrote: I don't want to change Solr, just extend it, but it would be nice to have the source code in the project so that I can debug it in Netbeans. Do I need to include jetty too? By the way (this is a little off-topic, sorry), do you know any site that explains how Maven works in a straight-forward way? All this magic is a little confusing sometimes... Regards Bruno

-- Bruno René Santos Lisboa - Portugal
Re: Problem with importing tab-delimited csv file
Hi Rob, I think the wrong Content-type header is getting passed. Try one of these instead:

curl 'http://localhost:8983/solr/update/csv?commit=true&separator=%09&stream.file=/tmp/sample.tmp'

OR

curl 'http://localhost:8983/solr/update/csv?commit=true&separator=%09' -H 'Content-type:application/csv; charset=utf-8' --data-binary @sample.tmp

Regards, Aloke

On Fri, Aug 23, 2013 at 6:15 PM, Jack Krupansky j...@basetechnology.com wrote: Your data file appears to use spaces rather than tabs. -- Jack Krupansky

From: Rob Koeling Ai rob.koel...@ai-applied.com Sent: Friday, August 23, 2013 6:38 AM To: solr-user@lucene.apache.org Subject: Problem with importing tab-delimited csv file

I'm having trouble importing a tab-delimited file with the csv update handler. My data file looks like this:

id question answer url
q99 Who? You! none

When I send this data to Solr using curl:

curl 'http://localhost:8181/solr/development/update/csv?commit=true&separator=%09' --data @sample.tmp

All seems to be well:

<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">0</int><int name="QTime">221</int></lst>
</response>

But when I query the development core, there is no data. I must be overlooking something trivial. I would appreciate it if anyone could spot what! - Rob
Schema.xml definition problem
Hello, I want to index the XML below with multivalued fields. What is the best way to set up the schema.xml, given that there is nested data? Thank you.

<documento>
  <id/>          <!-- String -->
  <descricao/>   <!-- String -->
  <data/>        <!-- Date -->
  <conteudo/>    <!-- String -->
  <assentamentos>              <!-- multivalued -->
    <assentamento>             <!-- first record -->
      <id/>          <!-- String -->
      <nome/>        <!-- String -->
      <matricula/>   <!-- String -->
      <classificacoes>           <!-- multivalued -->
        <classificacao>          <!-- first record -->
          <id/>          <!-- String -->
          <descricao/>   <!-- String -->
          <agrupadores>          <!-- multivalued -->
            <agrupador>          <!-- first record -->
              <valor/>   <!-- String -->
            </agrupador>
          </agrupadores>
        </classificacao>
      </classificacoes>
    </assentamento>
  </assentamentos>
</documento>

-- *Everton Rodrigues Garcia*
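One common approach for a flat schema.xml (assuming Solr 4.x, where fields have no nested structure) is to flatten each repeating group into parallel multivalued fields. The dotted field names below are illustrative, not a Solr convention, and the sample values are invented:

```python
import xml.etree.ElementTree as ET

# A tiny instance of the nested document above, with made-up values.
xml = """
<documento>
  <id>doc-1</id>
  <assentamentos>
    <assentamento><id>a1</id><nome>Ana</nome></assentamento>
    <assentamento><id>a2</id><nome>Rui</nome></assentamento>
  </assentamentos>
</documento>
"""

root = ET.fromstring(xml)
solr_doc = {"id": root.findtext("id")}
# Flatten each repeating <assentamento> into parallel multivalued fields;
# values at the same list index belong to the same original record.
for a in root.iter("assentamento"):
    solr_doc.setdefault("assentamento.id", []).append(a.findtext("id"))
    solr_doc.setdefault("assentamento.nome", []).append(a.findtext("nome"))

print(solr_doc)
```

The trade-off: queries cannot correlate values across multivalued fields, so if per-record matching matters, indexing each assentamento as its own Solr document (with a reference back to the parent id) may fit better.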
Re: SOLR search by external fields
Did you look here? https://cwiki.apache.org/confluence/display/solr/Working+with+External+Files+and+Processes -- View this message in context: http://lucene.472066.n3.nabble.com/SOLR-search-by-external-fields-tp4086197p4086408.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Boost by numFounds
Any help..? Is it possible to add this pagerank-like behaviour?
Re: Grouping
I'm getting the same error...Is there any workaround to this? -- View this message in context: http://lucene.472066.n3.nabble.com/Grouping-tp2820116p4086425.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: SOLR Prevent solr of modifying fields when update doc
Solr does not by default generate unique IDs. It uses what you give as your unique field, usually called 'id'. What software do you use to index data from your RSS feeds? Maybe that is creating a new 'id' field? There is no partial update; Solr (Lucene) always rewrites the complete document.

On 08/23/2013 09:03 AM, Greg Preston wrote: Perhaps an atomic update that only changes the fields you want to change? -Greg
Re: Index a database table?
On Fri, Aug 23, 2013 at 11:17 PM, Andrea Gazzarini-3 [via Lucene] ml-node+s472066n4086384...@n3.nabble.com wrote: Why don't you try?

Actually I wanted every single step to be clear, that's why I asked. Now it says: "Ensure that your solr schema (schema.xml) has the fields 'id', 'name', 'desc'. Change the appropriate details in the data-config.xml." My schema.xml does not have these fields. That means I have to declare them. Can you tell me where? Where to declare them in that file? Isn't there the same option as in Solr 3.5.0, where using a command, the schema was built and we placed that output in the schema.xml file? Also it's written: "Drop your JDBC driver jar file into the solr-home/lib directory." It's a Java application to interact with the database. Where is it? It must be a .jar file; the rest I don't know yet. My solr/example/lib directory has an ext directory, jetty drivers and a servlet-api-3.0.jar driver. Is that fine? Then which one is the JDBC driver?

-- Kamaljeet Kaur kamalkaur188.wordpress.com facebook.com/kaur.188 -- View this message in context: http://lucene.472066.n3.nabble.com/Index-a-database-table-tp4086334p4086437.html Sent from the Solr - User mailing list archive at Nabble.com.