Re: Solr Cloud Stress Test
Hello! You'll need a data set that you can index. This is quite simple - if you don't know what data you want to index or you don't have test data, just take a Wikipedia dump and index it using the Data Import Handler. You can find an example of using a Wikipedia dump with DIH at http://wiki.apache.org/solr/DataImportHandler#Example:_Indexing_wikipedia. On the other hand, you can also extract documents with some simple code and use a tool like JMeter to prepare an indexing stress test, to see how far you can push your SolrCloud deployment when it comes to indexing. You will also need queries, so that you can run query performance tests. You can build some example queries using common phrases found in Wikipedia - however, I'm not sure if there is something already extracted, so you will either have to extract them yourself or use common English phrases. You take that list of phrases (it would be nice to have a lot of them, so they don't repeat) and run them against Solr. Again, you can use a tool like JMeter: create a test plan and run tests with a different number of threads until you find the point where Solr just can't handle more. -- Regards, Rafał Kuć Performance Monitoring * Log Analytics * Search Analytics Solr Elasticsearch Support * http://sematext.com/ Hi, I am a student, planning to learn and do a features and functionality test of SolrCloud as one of my projects. I would like to do stress and performance tests of SolrCloud on my local machine (a machine with 16 GB RAM, a 250 GB SSD and a 2.2 GHz Intel Core i7), covering multiple features of the cloud. What is the recommended way to get started with it? Thanks. David
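The query side of such a test boils down to turning each phrase into a Solr /select request and firing them from many threads. A minimal sketch of the URL-building part, assuming a hypothetical base URL (http://localhost:8983/solr/collection1) and phrase list - the actual load generation would be done by JMeter or a thread pool:

```python
from urllib.parse import urlencode

# Hypothetical base URL and phrase list -- adjust to your deployment.
BASE = "http://localhost:8983/solr/collection1"

def build_query_url(base_url, phrase, rows=10):
    """Build a Solr /select URL that runs a quoted phrase query."""
    params = urlencode({"q": '"%s"' % phrase, "rows": rows, "wt": "json"})
    return "%s/select?%s" % (base_url, params)

phrases = ["open source search", "distributed indexing"]
urls = [build_query_url(BASE, p) for p in phrases]
```

Each URL can then be fetched in a loop per worker thread while you ramp up the thread count and watch latency.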
Replicating external field files under Windows
Hello! I have a slight problem with replication, and maybe someone has had the same experience and knows if there is a solution. I have a Windows-based Solr installation where I use an external field type - two fields, two external files containing data. The deployment is a standard master - slave. The problem is the replication. As we know, replication can copy the index files and the configuration files. On a Linux system I used to use relative paths to the conf/ dir and I was able to copy the external field files as well. On Windows I can't, even though the paths seem to be OK. Does anyone have similar experience with Windows, replication and external field type files? -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - ElasticSearch
Re: Query about vacuum full
Hello! Please refer to the PostgreSQL mailing list with this question. This question is purely about that database, and this mailing list is about Solr. -- Regards, Rafał Kuć Performance Monitoring * Log Analytics * Search Analytics Solr Elasticsearch Support * http://sematext.com/ Hi, I am seeing a considerable decrease in the speed of indexing of documents. I am using PostgreSQL. So is this the right time to do a vacuum on PostgreSQL, given that I have been using it for a week? Also, to invoke vacuum full, do I just need to go to the PostgreSQL command prompt and invoke the VACUUM FULL command? Thanks, Ameya
Re: Java heap Space error
Hello! Yes, just edit your Jetty configuration file and add the -Xmx and -Xms parameters. For example, the file you may be looking for is /etc/default/jetty. -- Regards, Rafał Kuć Performance Monitoring * Log Analytics * Search Analytics Solr Elasticsearch Support * http://sematext.com/ So can I get past this exception by increasing the heap size somewhere? Thanks, Ameya On Tue, Jul 22, 2014 at 2:00 PM, Shawn Heisey s...@elyograg.org wrote: On 7/22/2014 11:37 AM, Ameya Aware wrote: I am running into a Java heap space issue. Please see below log. All we have here is an out of memory exception. It is impossible to know *why* you are out of memory from the exception. With enough investigation, we could determine the area of code where the error occurred, but that doesn't say anything at all about what allocated all the memory; it's simply the final allocation that pushed it over the edge. Here is some generic information about Solr and high heap usage: http://wiki.apache.org/solr/SolrPerformanceProblems#Java_Heap Thanks, Shawn
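If Solr runs under a packaged Jetty, the heap flags typically go into the JAVA_OPTIONS variable of that defaults file. A minimal sketch, assuming /etc/default/jetty is the file your init script sources and that a 2 GB maximum heap suits your machine:

```shell
# /etc/default/jetty -- assumed location; adjust paths and sizes to your setup
JAVA_OPTIONS="-Xms512m -Xmx2g $JAVA_OPTIONS"
```

Restart Jetty afterwards and verify the flags show up in the java process arguments (e.g. with ps).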
Re: [ANNOUNCE] Apache Solr 4.8.0 released
Hello! You don't need the fields and types sections anymore; you can just include type or field definitions anywhere in the schema.xml file. You can find more in https://issues.apache.org/jira/browse/SOLR-5228 -- Regards, Rafał Kuć Performance Monitoring * Log Analytics * Search Analytics Solr Elasticsearch Support * http://sematext.com/ In what sense are fields and types now deprecated in schema.xml? Where can I find any pointers about this? On Mon, Apr 28, 2014 at 6:54 PM, Uwe Schindler uschind...@apache.org wrote: 28 April 2014, Apache Solr™ 4.8.0 available The Lucene PMC is pleased to announce the release of Apache Solr 4.8.0 Solr is the popular, blazing fast, open source NoSQL search platform from the Apache Lucene project. Its major features include powerful full-text search, hit highlighting, faceted search, dynamic clustering, database integration, rich document (e.g., Word, PDF) handling, and geospatial search. Solr is highly scalable, providing fault tolerant distributed search and indexing, and powers the search and navigation features of many of the world's largest internet sites. Solr 4.8.0 is available for immediate download at: http://lucene.apache.org/solr/mirrors-solr-latest-redir.html See the CHANGES.txt file included with the release for a full list of details. Solr 4.8.0 Release Highlights: * Apache Solr now requires Java 7 or greater (recommended is Oracle Java 7 or OpenJDK 7, minimum update 55; earlier versions have known JVM bugs affecting Solr). * Apache Solr is fully compatible with Java 8. * fields and types tags have been deprecated from schema.xml. There is no longer any reason to keep them in the schema file, they may be safely removed. This allows intermixing of fieldType, field and copyField definitions if desired. * The new {!complexphrase} query parser supports wildcards, ORs etc. inside Phrase Queries.
* New Collections API CLUSTERSTATUS action reports the status of collections, shards, and replicas, and also lists collection aliases and cluster properties. * Added managed synonym and stopword filter factories, which enable synonym and stopword lists to be dynamically managed via REST API. * JSON updates now support nested child documents, enabling {!child} and {!parent} block join queries. * Added ExpandComponent to expand results collapsed by the CollapsingQParserPlugin, as well as the parent/child relationship of nested child documents. * Long-running Collections API tasks can now be executed asynchronously; the new REQUESTSTATUS action provides status. * Added a hl.qparser parameter to allow you to define a query parser for hl.q highlight queries. * In Solr single-node mode, cores can now be created using named configsets. * New DocExpirationUpdateProcessorFactory supports computing an expiration date for documents from the TTL expression, as well as automatically deleting expired documents on a periodic basis. Solr 4.8.0 also includes many other new features as well as numerous optimizations and bugfixes of the corresponding Apache Lucene release. Please report any feedback to the mailing lists (http://lucene.apache.org/solr/discussion.html) Note: The Apache Software Foundation uses an extensive mirroring network for distributing releases. It is possible that the mirror you are using may not have replicated the release yet. If that is the case, please try another mirror. This also goes for Maven access. - Uwe Schindler uschind...@apache.org Apache Lucene PMC Chair / Committer Bremen, Germany http://lucene.apache.org/
Re: DIH issues with 4.7.1
Hello! Look at the javadocs for both. The granularity of System.currentTimeMillis() depends on the operating system, so it may happen that calls to that method that are 1 millisecond apart still return the same value. This is not the case with System.nanoTime() - http://docs.oracle.com/javase/7/docs/api/java/lang/System.html -- Regards, Rafał Kuć Performance Monitoring * Log Analytics * Search Analytics Solr Elasticsearch Support * http://sematext.com/ Hi, I have just compared the difference between version 4.6.0 and 4.7.1. I noticed that the time in the getConnection function is measured with System.nanoTime() in 4.7.1, while 4.6.0 used System.currentTimeMillis(). I am curious about the reason for the change and the benefit of it. Is it necessary? I have read SOLR-5734, https://issues.apache.org/jira/browse/SOLR-5734, and did some googling about the difference between currentTimeMillis and nanoTime, but I still can not figure it out. 2014-04-26 2:24 GMT+08:00 Shawn Heisey s...@elyograg.org: On 4/25/2014 11:56 AM, Hutchins, Jonathan wrote: I recently upgraded from 4.6.1 to 4.7.1 and have found that the DIH process that we are using takes 4x as long to complete. The only odd thing I notice is when I enable debug logging for the dataimporthandler process, it appears that in the new version each sql query is resulting in a new connection opened through jdbcdatasource (log: http://pastebin.com/JKh4gpmu). Were there any changes that would affect the speed of running a full import? This is most likely the problem you are experiencing: https://issues.apache.org/jira/browse/SOLR-5954 The fix will be in the new 4.8 version. The release process for 4.8 is underway right now. A second release candidate was required yesterday. If no further problems are encountered, the release should be made around the middle of next week. If problems are encountered, the release will be delayed. Here's something very important that has been mentioned before: Solr 4.8 will require Java 7.
Previously, Java 6 was required. Java 7u55 (the current release from Oracle as I write this) is recommended as a minimum. If a 4.7.3 version is built, this is a fix that we should backport. Thanks, Shawn
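The wall-clock vs. high-resolution-counter distinction discussed above for Java exists in most runtimes. A sketch of the same idea using Python's analogous clocks (time.time() plays the role of System.currentTimeMillis(), time.perf_counter_ns() the role of System.nanoTime()); the 10 ms sleep is just a stand-in for the work being timed:

```python
import time

# Wall-clock time, analogous to System.currentTimeMillis(): its granularity
# is platform dependent, and it can jump if the system clock is adjusted.
wall = time.time()

# Monotonic high-resolution counter, analogous to System.nanoTime():
# meaningful only for measuring elapsed intervals; it never goes backwards.
start = time.perf_counter_ns()
time.sleep(0.01)  # stand-in for the work being timed
elapsed_ns = time.perf_counter_ns() - start
```

This is why interval measurements (like timing getConnection) are better served by the monotonic counter, while the wall clock is for timestamps.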
Re: Error on startup
Hello! What are the start-up parameters for Tomcat? -- Regards, Rafał Kuć Performance Monitoring * Log Analytics * Search Analytics Solr Elasticsearch Support * http://sematext.com/ Running Solr 4.6.1, Tomcat 7.0.29, Zookeeper 3.4.6, Java 6 I have 3 Tomcats running, each with their own Solr war, all on the same box, along with 5 ZK nodes. It's a dev box. I can get the SolrCloud up and running, then use the Collections API to get everything going. It's all fine until I stop Tomcat and restart it. Then I get this error: ERROR - 2014-04-24 14:48:08.677; org.apache.solr.servlet.SolrDispatchFilter; Could not start Solr. Check solr/home property and the logs ERROR - 2014-04-24 14:48:08.705; org.apache.solr.common.SolrException; null:java.lang.IllegalArgumentException: Illegal directory: /data/solr1/test_shard1_replica3/conf at org.apache.solr.cloud.ZkController.uploadToZK(ZkController.java:1331) at org.apache.solr.cloud.ZkController.uploadConfigDir(ZkController.java:1373) at org.apache.solr.cloud.ZkController.bootstrapConf(ZkController.java:1558) at org.apache.solr.core.ZkContainer.initZooKeeper(ZkContainer.java:193) at org.apache.solr.core.ZkContainer.initZooKeeper(ZkContainer.java:73) at org.apache.solr.core.CoreContainer.load(CoreContainer.java:208) at org.apache.solr.servlet.SolrDispatchFilter.createCoreContainer(SolrDispatchFilter.java:183) at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:133) at org.apache.catalina.core.ApplicationFilterConfig.initFilter(ApplicationFilterConfig.java:269) at org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:258) at org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:382) at org.apache.catalina.core.ApplicationFilterConfig.init(ApplicationFilterConfig.java:103) at org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:4650) at org.apache.catalina.core.StandardContext.startInternal(StandardContext.java:5306) at 
org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:150) at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:901) at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:877) at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:618) at org.apache.catalina.startup.HostConfig.deployDescriptor(HostConfig.java:650) at org.apache.catalina.startup.HostConfig$DeployDescriptor.run(HostConfig.java:1582) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303) at java.util.concurrent.FutureTask.run(FutureTask.java:138) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) From looking at the source, it's trying to upload the conf directories. When I use the API to pull the configs from ZK, it doesn't download a conf directory, and I think that's why the error is caused. Is there any way around this? I'm using this dev box to get ready to move my production Solr instances up to the newer versions, but I'm worried that I'd run into this issue there. Any advice would be helpful, let me know if there's anything else I can provide to help debug this. Thanks! -- Chris
Re: Error on startup
Hello! You have bootstrap_conf=true. It uploads a configuration set for each core in the solr.xml file. I would suggest dropping that parameter and uploading the configuration once, using the scripts provided with Solr. -- Regards, Rafał Kuć Performance Monitoring * Log Analytics * Search Analytics Solr Elasticsearch Support * http://sematext.com/ These get added to the startup of Tomcat: -DhostPort=8181 -Djetty.port=8181 -DzkHost=localhost:2181,localhost:2182,localhost:2183,localhost:2184,localhost:2185 -Dbootstrap_conf=true -Dport=8181 -DhostContext=solr -DzkClientTimeout=2 -- Chris On Thu, Apr 24, 2014 at 11:41 AM, Rafał Kuć r@solr.pl wrote: Hello! What are the start-up parameters for Tomcat?
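Uploading the configuration once can be done with the zkcli script shipped with Solr 4.x. A sketch, where the ZooKeeper address, config directory path, collection name, and config set name are all illustrative:

```shell
# All hosts, paths, and names below are illustrative.
cd example/scripts/cloud-scripts        # location in Solr 4.x distributions
./zkcli.sh -zkhost localhost:2181 -cmd upconfig \
    -confdir /path/to/your/conf -confname myconf
# Point an existing collection at the uploaded config set:
./zkcli.sh -zkhost localhost:2181 -cmd linkconfig \
    -collection test -confname myconf
```

With the config set in ZooKeeper, Tomcat can be restarted without any bootstrap_conf parameter.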
Re: Transformation on a numeric field
Hello! You can achieve that using an update processor; for example, look here: http://wiki.apache.org/solr/ScriptUpdateProcessor What you would have to do, in general, is create a script that takes the value of the field, divides it by 1000 and puts it in another field - the target numeric field. -- Regards, Rafał Kuć Performance Monitoring * Log Analytics * Search Analytics Solr Elasticsearch Support * http://sematext.com/ Hi All, I am looking for a way to index a numeric field and its value divided by 1000 into another numeric field. I thought about using a copyField with a PatternReplaceFilterFactory to keep only the first few digits (cutting the last three). Solr complains that I can not have an analysis chain on a numeric field: Core: org.apache.solr.common.SolrException:org.apache.solr.common.SolrException: Plugin init failure for [schema.xml] fieldType truncated_salary: FieldType: TrieIntField (truncated_salary) does not support specifying an analyzer. Schema file is /data/solr/solr-no-cloud/Core1/schema.xml Is there a way to accomplish this? Thanks
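ScriptUpdateProcessor scripts are usually written in JavaScript, but the per-document transformation it would perform can be sketched in Python; the field names (salary, truncated_salary) are taken from the error message and otherwise hypothetical:

```python
def add_truncated_field(doc, source="salary", target="truncated_salary"):
    """Copy `source` into `target`, divided by 1000 (integer division).

    Mirrors what a ScriptUpdateProcessor script would do to each document
    before indexing; the field names are illustrative.
    """
    if source in doc:
        doc[target] = doc[source] // 1000  # drops the last three digits
    return doc

doc = add_truncated_field({"id": "1", "salary": 123456})
```

In Solr itself, the equivalent script would be registered in an updateRequestProcessorChain and run on every incoming document.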
Re: Solr doesn't load index at startup: out of memory
Hello! Do you have warming queries defined? -- Regards, Rafał Kuć Performance Monitoring * Log Analytics * Search Analytics Solr Elasticsearch Support * http://sematext.com/ Hi, my Solr (v. 4.5), after months of work, suddenly stopped indexing: it responded to queries but didn't index any new data anymore. Here is the error message: ERROR - 2014-04-11 15:52:30.317; org.apache.solr.common.SolrException; java.lang.IllegalStateException: this writer hit an OutOfMemoryError; cannot commit at org.apache.lucene.index.IndexWriter.prepareCommitInternal(IndexWriter.java:2788) So, I restarted Solr using more RAM (from 4 GB to 8 GB) but now Solr can't load the cores. Here is the error message: ERROR - 2014-04-11 16:32:50.509; org.apache.solr.core.CoreContainer; Unable to create core: posts org.apache.solr.common.SolrException: Error Instantiating Update Handler, solr.DirectUpdateHandler2 failed to instantiate org.apache.solr.update.UpdateHandler ... Caused by: java.lang.OutOfMemoryError: Java heap space Can anyone help me? -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-dosn-t-load-index-at-startup-out-of-memory-tp4130665.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: SolrCloud 4.6.1 hanging
Hello! I've created SOLR-5935, please let me know if some more information is needed. I'll be glad to help on the issue. -- Regards, Rafał Kuć Performance Monitoring * Log Analytics * Search Analytics Solr Elasticsearch Support * http://sematext.com/ I'm looking into a hang as well - not sure if it involves searching as well, but it may. Can you file a JIRA issue - let's track it down. - Mark On Mar 28, 2014, at 8:07 PM, Rafał Kuć r@solr.pl wrote: Hello! I have an issue with one of the SolrCloud deployments and I wanted to ask whether someone has had a similar issue. Six machines, a collection with 6 shards with a replication factor of 3. It all runs on 6 physical servers, each with 24 cores. We've indexed about 32 million documents and everything was fine until that point. Now, during performance tests, we ran into an issue - SolrCloud hangs when querying and indexing is run at the same time. First we see a normal load on the machines, then the load starts to drop and the thread dump shows numerous threads like this: Thread 12624: (state = BLOCKED) - sun.misc.Unsafe.park(boolean, long) @bci=0 (Compiled frame; information may be imprecise) - java.util.concurrent.locks.LockSupport.park(java.lang.Object) @bci=14, line=186 (Compiled frame) - java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await() @bci=42, line=2043 (Compiled frame) - org.apache.http.pool.PoolEntryFuture.await(java.util.Date) @bci=50, line=131 (Compiled frame) - org.apache.http.pool.AbstractConnPool.getPoolEntryBlocking(java.lang.Object, java.lang.Object, long, java.util.concurrent.TimeUnit, org.apache.http.pool.PoolEntryFuture) @bci=431, line=281 (Compiled frame) - org.apache.http.pool.AbstractConnPool.access$000(org.apache.http.pool.AbstractConnPool, java.lang.Object, java.lang.Object, long, java.util.concurrent.TimeUnit, org.apache.http.pool.PoolEntryFuture) @bci=8, line=62 (Compiled frame) - org.apache.http.pool.AbstractConnPool$2.getPoolEntry(long, java.util.concurrent.TimeUnit)
@bci=15, line=176 (Compiled frame) - org.apache.http.pool.AbstractConnPool$2.getPoolEntry(long, java.util.concurrent.TimeUnit) @bci=3, line=169 (Compiled frame) - org.apache.http.pool.PoolEntryFuture.get(long, java.util.concurrent.TimeUnit) @bci=38, line=100 (Compiled frame) - org.apache.http.impl.conn.PoolingClientConnectionManager.leaseConnection(java.util.concurrent.Future, long, java.util.concurrent.TimeUnit) @bci=4, line=212 (Compiled frame) - org.apache.http.impl.conn.PoolingClientConnectionManager$1.getConnection(long, java.util.concurrent.TimeUnit) @bci=10, line=199 (Compiled frame) - org.apache.http.impl.client.DefaultRequestDirector.execute(org.apache.http.HttpHost, org.apache.http.HttpRequest, org.apache.http.protocol.HttpContext) @bci=259, line=456 (Compiled frame) - org.apache.http.impl.client.AbstractHttpClient.execute(org.apache.http.HttpHost, org.apache.http.HttpRequest, org.apache.http.protocol.HttpContext) @bci=344, line=906 (Compiled frame) - org.apache.http.impl.client.AbstractHttpClient.execute(org.apache.http.client.methods.HttpUriRequest, org.apache.http.protocol.HttpContext) @bci=21, line=805 (Compiled frame) - org.apache.http.impl.client.AbstractHttpClient.execute(org.apache.http.client.methods.HttpUriRequest) @bci=6, line=784 (Compiled frame) - org.apache.solr.client.solrj.impl.HttpSolrServer.request(org.apache.solr.client.solrj.SolrRequest, org.apache.solr.client.solrj.ResponseParser) @bci=1175, line=395 (Interpreted frame) - org.apache.solr.client.solrj.impl.HttpSolrServer.request(org.apache.solr.client.solrj.SolrRequest) @bci=17, line=199 (Compiled frame) - org.apache.solr.client.solrj.impl.LBHttpSolrServer.request(org.apache.solr.client.solrj.impl.LBHttpSolrServer$Req) @bci=132, line=285 (Interpreted frame) - org.apache.solr.handler.component.HttpShardHandlerFactory.makeLoadBalancedRequest(org.apache.solr.client.solrj.request.QueryRequest, java.util.List) @bci=13, line=214 (Compiled frame) - 
org.apache.solr.handler.component.HttpShardHandler$1.call() @bci=246, line=161 (Compiled frame) - org.apache.solr.handler.component.HttpShardHandler$1.call() @bci=1, line=118 (Interpreted frame) - java.util.concurrent.FutureTask$Sync.innerRun() @bci=29, line=334 (Interpreted frame) - java.util.concurrent.FutureTask.run() @bci=4, line=166 (Compiled frame) - java.util.concurrent.Executors$RunnableAdapter.call() @bci=4, line=471 (Interpreted frame) - java.util.concurrent.FutureTask$Sync.innerRun() @bci=29, line=334 (Interpreted frame) - java.util.concurrent.FutureTask.run() @bci=4, line=166 (Compiled frame) - java.util.concurrent.ThreadPoolExecutor.runWorker(java.util.concurrent.ThreadPoolExecutor$Worker) @bci=95, line=1145 (Compiled frame) - java.util.concurrent.ThreadPoolExecutor$Worker.run() @bci=5, line=615 (Interpreted frame) - java.lang.Thread.run() @bci=11, line=724 (Interpreted
SolrCloud 4.6.1 hanging
Hello! I have an issue with one of the SolrCloud deployments and I wanted to ask whether someone has had a similar issue. Six machines, a collection with 6 shards with a replication factor of 3. It all runs on 6 physical servers, each with 24 cores. We've indexed about 32 million documents and everything was fine until that point. Now, during performance tests, we ran into an issue - SolrCloud hangs when querying and indexing is run at the same time. First we see a normal load on the machines, then the load starts to drop and the thread dump shows numerous threads like this: Thread 12624: (state = BLOCKED) - sun.misc.Unsafe.park(boolean, long) @bci=0 (Compiled frame; information may be imprecise) - java.util.concurrent.locks.LockSupport.park(java.lang.Object) @bci=14, line=186 (Compiled frame) - java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await() @bci=42, line=2043 (Compiled frame) - org.apache.http.pool.PoolEntryFuture.await(java.util.Date) @bci=50, line=131 (Compiled frame) - org.apache.http.pool.AbstractConnPool.getPoolEntryBlocking(java.lang.Object, java.lang.Object, long, java.util.concurrent.TimeUnit, org.apache.http.pool.PoolEntryFuture) @bci=431, line=281 (Compiled frame) - org.apache.http.pool.AbstractConnPool.access$000(org.apache.http.pool.AbstractConnPool, java.lang.Object, java.lang.Object, long, java.util.concurrent.TimeUnit, org.apache.http.pool.PoolEntryFuture) @bci=8, line=62 (Compiled frame) - org.apache.http.pool.AbstractConnPool$2.getPoolEntry(long, java.util.concurrent.TimeUnit) @bci=15, line=176 (Compiled frame) - org.apache.http.pool.AbstractConnPool$2.getPoolEntry(long, java.util.concurrent.TimeUnit) @bci=3, line=169 (Compiled frame) - org.apache.http.pool.PoolEntryFuture.get(long, java.util.concurrent.TimeUnit) @bci=38, line=100 (Compiled frame) - org.apache.http.impl.conn.PoolingClientConnectionManager.leaseConnection(java.util.concurrent.Future, long, java.util.concurrent.TimeUnit) @bci=4, line=212 (Compiled frame) -
org.apache.http.impl.conn.PoolingClientConnectionManager$1.getConnection(long, java.util.concurrent.TimeUnit) @bci=10, line=199 (Compiled frame) - org.apache.http.impl.client.DefaultRequestDirector.execute(org.apache.http.HttpHost, org.apache.http.HttpRequest, org.apache.http.protocol.HttpContext) @bci=259, line=456 (Compiled frame) - org.apache.http.impl.client.AbstractHttpClient.execute(org.apache.http.HttpHost, org.apache.http.HttpRequest, org.apache.http.protocol.HttpContext) @bci=344, line=906 (Compiled frame) - org.apache.http.impl.client.AbstractHttpClient.execute(org.apache.http.client.methods.HttpUriRequest, org.apache.http.protocol.HttpContext) @bci=21, line=805 (Compiled frame) - org.apache.http.impl.client.AbstractHttpClient.execute(org.apache.http.client.methods.HttpUriRequest) @bci=6, line=784 (Compiled frame) - org.apache.solr.client.solrj.impl.HttpSolrServer.request(org.apache.solr.client.solrj.SolrRequest, org.apache.solr.client.solrj.ResponseParser) @bci=1175, line=395 (Interpreted frame) - org.apache.solr.client.solrj.impl.HttpSolrServer.request(org.apache.solr.client.solrj.SolrRequest) @bci=17, line=199 (Compiled frame) - org.apache.solr.client.solrj.impl.LBHttpSolrServer.request(org.apache.solr.client.solrj.impl.LBHttpSolrServer$Req) @bci=132, line=285 (Interpreted frame) - org.apache.solr.handler.component.HttpShardHandlerFactory.makeLoadBalancedRequest(org.apache.solr.client.solrj.request.QueryRequest, java.util.List) @bci=13, line=214 (Compiled frame) - org.apache.solr.handler.component.HttpShardHandler$1.call() @bci=246, line=161 (Compiled frame) - org.apache.solr.handler.component.HttpShardHandler$1.call() @bci=1, line=118 (Interpreted frame) - java.util.concurrent.FutureTask$Sync.innerRun() @bci=29, line=334 (Interpreted frame) - java.util.concurrent.FutureTask.run() @bci=4, line=166 (Compiled frame) - java.util.concurrent.Executors$RunnableAdapter.call() @bci=4, line=471 (Interpreted frame) - 
java.util.concurrent.FutureTask$Sync.innerRun() @bci=29, line=334 (Interpreted frame) - java.util.concurrent.FutureTask.run() @bci=4, line=166 (Compiled frame) - java.util.concurrent.ThreadPoolExecutor.runWorker(java.util.concurrent.ThreadPoolExecutor$Worker) @bci=95, line=1145 (Compiled frame) - java.util.concurrent.ThreadPoolExecutor$Worker.run() @bci=5, line=615 (Interpreted frame) - java.lang.Thread.run() @bci=11, line=724 (Interpreted frame) I've checked I/O statistics, GC activity, memory usage, networking and all of that - those resources are not exhausted during the test. Hard autocommit is set to 15 seconds with openSearcher=false and soft autocommit to 4 hours. We have a fairly high query rate, but until we start indexing everything runs smoothly. Has anyone encountered similar behavior? -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - ElasticSearch
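The commit settings described above would map to solrconfig.xml roughly as follows; this is a sketch reconstructed from the description (15 s hard commit without opening a searcher, 4 h soft commit), not the poster's actual config:

```xml
<updateHandler class="solr.DirectUpdateHandler2">
  <autoCommit>
    <maxTime>15000</maxTime>          <!-- 15 seconds -->
    <openSearcher>false</openSearcher>
  </autoCommit>
  <autoSoftCommit>
    <maxTime>14400000</maxTime>       <!-- 4 hours -->
  </autoSoftCommit>
</updateHandler>
```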
Re: stored=true vs stored=false, in terms of storage
Hello! Stored fields, the ones with stored=true, mean that the original value of the field is stored and can be retrieved with the document in search results. The original value of fields marked as stored=false can't be retrieved during searching. That is the difference. -- Regards, Rafał Kuć Performance Monitoring * Log Analytics * Search Analytics Solr Elasticsearch Support * http://sematext.com/ Hi, I am using Solr and I have one doubt. If a field has stored=false, does it mean that this field is stored on disk and not in main memory, and will be loaded whenever asked for? The scenario I would like to handle: in my case there is a lot of information which I need to show when debugQuery=true, and I can take the latency hit on debugQuery=true. Can I save all the information in a field with indexed=false and stored=true? And how is debug information normally saved? Regards, Pramod Negi
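The indexed/stored combinations can be illustrated with schema.xml field definitions (the field names here are made up for the example):

```xml
<!-- searchable, and the original value is returnable in results -->
<field name="title" type="text_general" indexed="true" stored="true"/>
<!-- searchable only; the original text cannot be returned -->
<field name="body" type="text_general" indexed="true" stored="false"/>
<!-- returnable only; cannot be searched on directly -->
<field name="debug_info" type="string" indexed="false" stored="true"/>
```

The last combination matches the debug-information use case: a field that is never queried, only fetched with matching documents.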
Re: organize folder inside Solr
Hello! There are multiple tutorials on how to do this, for example: 1. http://solr.pl/en/2011/03/21/solr-and-tika-integration-part-1-basics/ 2. http://wiki.apache.org/solr/ExtractingRequestHandler -- Regards, Rafał Kuć Performance Monitoring * Log Analytics * Search Analytics Solr Elasticsearch Support * http://sematext.com/ Thanks, I followed it carefully. The example in the tutorial indexes only XML files, and that is my problem: I want my search engine to look for other formats like pictures, music, PDF, and so on, and I'm working just on Collection1. Med. -- View this message in context: http://lucene.472066.n3.nabble.com/organize-folder-inside-Solr-tp4122207p4122242.html Sent from the Solr - User mailing list archive at Nabble.com.
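For non-XML formats, the ExtractingRequestHandler from the second link accepts raw files directly. A sketch using curl, where the URL, core name, document id, and file path are all illustrative, and the handler must be enabled in solrconfig.xml:

```shell
# Post a PDF to Solr Cell for text extraction and indexing.
curl "http://localhost:8983/solr/collection1/update/extract?literal.id=doc1&commit=true" \
    -F "myfile=@/path/to/document.pdf"
```

The same call works for other Tika-supported formats (Word, images with metadata, audio tags, and so on).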
Re: Using numeric ranges in Solr query
Hello! Just specify the left boundary as well, like: price:[900 TO 1000] -- Regards, Rafał Kuć Performance Monitoring * Log Analytics * Search Analytics Solr Elasticsearch Support * http://sematext.com/ When a user enters a price in the price field, e.g. 1000 USD, I want to fetch all items with a price around 1000 USD. I found in the documentation that I can use price:[* TO 1000]. It will get all items with a price up to 1000 USD. But I want to get results where the price is between 900 and 1000 USD. Any help is appreciated. Thank You. -- View this message in context: http://lucene.472066.n3.nabble.com/Using-numeric-ranges-in-Solr-query-tp4116941.html Sent from the Solr - User mailing list archive at Nabble.com.
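When such range clauses are built programmatically, a small helper keeps the syntax straight; a sketch (the field name price is from the question, everything else is illustrative):

```python
def range_query(field, lower=None, upper=None, inclusive=True):
    """Build a Solr range query clause such as price:[900 TO 1000].

    None on either side becomes the open-ended wildcard '*';
    inclusive=False produces the exclusive {... TO ...} form.
    """
    lo = "*" if lower is None else str(lower)
    hi = "*" if upper is None else str(upper)
    left, right = ("[", "]") if inclusive else ("{", "}")
    return "%s:%s%s TO %s%s" % (field, left, lo, hi, right)

q = range_query("price", 900, 1000)
```

Note that the TO keyword must be uppercase in Solr's query syntax.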
Re: Realtimeget SolrCloud
Hello! Do you have the realtime get handler defined in your solrconfig.xml? This part should be present: <requestHandler name="/get" class="solr.RealTimeGetHandler"> <lst name="defaults"> <str name="omitHeader">true</str> <str name="wt">json</str> <str name="indent">true</str> </lst> </requestHandler> -- Regards, Rafał Kuć Performance Monitoring * Log Analytics * Search Analytics Solr Elasticsearch Support * http://sematext.com/ Hello, I am currently experimenting with moving our Solr instance into a SolrCloud setup. I am getting an error trying to access the realtime get handler: HTTP ERROR 404 Problem accessing /solr/collection1/get. Reason: Not Found It works fine in the normal Solr setup. Do I need some changes in configuration? Am I missing something? Or is it not supported in the cloud environment? -- View this message in context: http://lucene.472066.n3.nabble.com/Realtimeget-SolrCloud-tp4114595.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Realtimeget SolrCloud
Hello! No problem. Also remember that you need the _version_ field to be present in your schema. -- Regards, Rafał Kuć Performance Monitoring * Log Analytics * Search Analytics Solr Elasticsearch Support * http://sematext.com/ That seemed to be the issue. I had several other request handlers, as I wasn't using the simple /get, but apparently in SolrCloud this handler must be present in order to use the RealTimeGetHandler class at all. Thank you! -- View this message in context: http://lucene.472066.n3.nabble.com/Realtimeget-SolrCloud-tp4114595p4114598.html Sent from the Solr - User mailing list archive at Nabble.com.
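For reference, the _version_ field the reply mentions is defined like this in the stock Solr 4 example schema.xml:

```xml
<!-- required for SolrCloud update log / realtime get -->
<field name="_version_" type="long" indexed="true" stored="true"/>
```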
Re: How to get phrase recipe working?
Hello, Phrase search will work on any analyzed field if you use " to surround your phrase. Are you looking for something specific or a standard phrase search? Also - if you are looking for exact phrase search, you may want to remove the Snowball filter. Rafał Kuć Original message From: eShard zim...@yahoo.com Date: 21.01.2014 18:11 (GMT+01:00) To: solr-user@lucene.apache.org Subject: How to get phrase recipe working? Good morning, In the Apache Solr 4 Cookbook, p. 112, there is a recipe for setting up phrase searches, like so:

<fieldType name="text_phrase" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="English"/>
  </analyzer>
</fieldType>

I ran a sample query q=text_ph:a-z index and it didn't work very well at all. Is there a better way to do phrase searches? I need a specific configuration to follow/use. Thanks, -- View this message in context: http://lucene.472066.n3.nabble.com/How-to-get-phrase-recipe-working-tp4112484.html Sent from the Solr - User mailing list archive at Nabble.com.
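A sketch of what the recipe's field type might look like with the Snowball stemmer removed, as the reply suggests for exact phrase matching (the type name is illustrative):

```xml
<fieldType name="text_phrase" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- no stemming filter: tokens are indexed as typed,
         so quoted phrases must match exactly -->
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

The phrase itself must then be quoted in the query, e.g. q=text_ph:"a-z index"; without the quotes the terms are searched independently.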
Re: Problem with starting solr
Hello! If you want to use Morfologik you need to include its jar files, because they are not included in the standard distribution. Look at the downloaded Solr distribution; you should find the contrib/analysis-extras/lib directory there. Get those libraries, put them in a directory (e.g. /var/solr/ext) and add a section like this to your solrconfig.xml (you can find it in the conf directory of Solr):

<lib dir="/var/solr/ext" regex=".*\.jar" />

In addition to that you'll need the lucene-analyzers-morfologik-4.6.0.jar from contrib/analysis-extras/lucene-lib. Just put it in the same directory. After that restart Solr and you should be OK. Btw - you can find the instructions (although a bit old) at: http://solr.pl/2012/04/02/solr-4-0-i-mozliwosci-analizy-jezyka-polskiego/ -- Regards, Rafał Kuć Performance Monitoring * Log Analytics * Search Analytics Solr Elasticsearch Support * http://sematext.com/ Hi. I'm a total beginner to Solr. When I try to start it I get the following errors: lukasz@lukasz-VirtualBox:~/PycharmProjects/solr/solr-4.6.0/example$ java -jar start.jar 0[main] INFO org.eclipse.jetty.server.Server – jetty-8.1.10.v20130312 104 [main] INFO org.eclipse.jetty.deploy.providers.ScanningAppProvider – Deployment monitor /home/lukasz/PycharmProjects/solr/solr-4.6.0/example/contexts at interval 0 214 [main] INFO org.eclipse.jetty.deploy.DeploymentManager – Deployable added: /home/lukasz/PycharmProjects/solr/solr-4.6.0/example/contexts/solr-jetty-context.xml 9566 [main] INFO org.eclipse.jetty.webapp.StandardDescriptorProcessor – NO JSP Support for /solr, did not find org.apache.jasper.servlet.JspServlet 9746 [main] INFO org.apache.solr.servlet.SolrDispatchFilter – SolrDispatchFilter.init() 9805 [main] INFO org.apache.solr.core.SolrResourceLoader – JNDI not configured for solr (NoInitialContextEx) 9808 [main] INFO org.apache.solr.core.SolrResourceLoader – solr home defaulted to 'solr/' (could not find system property or JNDI) 9809 [main] INFO org.apache.solr.core.SolrResourceLoader – 
new SolrResourceLoader for directory: 'solr/' 10116 [main] INFO org.apache.solr.core.ConfigSolr – Loading container configuration from /home/lukasz/PycharmProjects/solr/solr-4.6.0/example/solr/solr.xml 10406 [main] INFO org.apache.solr.core.CoreContainer – New CoreContainer 30060911 10407 [main] INFO org.apache.solr.core.CoreContainer – Loading cores into CoreContainer [instanceDir=solr/] 10479 [main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory – Setting socketTimeout to: 0 10479 [main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory – Setting urlScheme to: http:// 10480 [main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory – Setting connTimeout to: 0 10481 [main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory – Setting maxConnectionsPerHost to: 20 10483 [main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory – Setting corePoolSize to: 0 10485 [main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory – Setting maximumPoolSize to: 2147483647 10486 [main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory – Setting maxThreadIdleTime to: 5 10487 [main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory – Setting sizeOfQueue to: -1 10488 [main] INFO org.apache.solr.handler.component.HttpShardHandlerFactory – Setting fairnessPolicy to: false 10702 [main] INFO org.apache.solr.logging.LogWatcher – SLF4J impl is org.slf4j.impl.Log4jLoggerFactory 10704 [main] INFO org.apache.solr.logging.LogWatcher – Registering Log Listener [Log4j (org.slf4j.impl.Log4jLoggerFactory)] 10839 [coreLoadExecutor-3-thread-1] INFO org.apache.solr.core.CoreContainer – Creating SolrCore 'content' using instanceDir: solr/content 10839 [coreLoadExecutor-3-thread-1] INFO org.apache.solr.core.SolrResourceLoader – new SolrResourceLoader for directory: 'solr/content/' 10850 [coreLoadExecutor-3-thread-1] INFO org.apache.solr.core.SolrResourceLoader – Adding 
'file:/home/lukasz/PycharmProjects/solr/solr-4.6.0/example/solr/content/lib/lucene-analyzers-morfologik-4.6.0.jar' to classloader 10855 [coreLoadExecutor-3-thread-1] INFO org.apache.solr.core.SolrResourceLoader – Adding 'file:/home/lukasz/PycharmProjects/solr/solr-4.6.0/example/solr/content/lib/morfologik-fsa-1.8.1.jar' to classloader 10857 [coreLoadExecutor-3-thread-1] INFO org.apache.solr.core.SolrResourceLoader – Adding 'file:/home/lukasz/PycharmProjects/solr/solr-4.6.0/example/solr/content/lib/morfologik-polish-1.8.1.jar' to classloader 10859 [coreLoadExecutor-3-thread-1] INFO org.apache.solr.core.SolrResourceLoader – Adding 'file:/home/lukasz/PycharmProjects/solr/solr-4.6.0/example/solr/content/lib/morfologik-stemming-1.8.1.jar' to classloader 11005 [coreLoadExecutor-3-thread-1] INFO org.apache.solr.core.SolrConfig – Adding specified lib dirs
Re: Solr Query Slowness
Hello! Could you tell us more about your scripts? What do they do? Are the queries the same? How many results do you fetch with your scripts, and so on. -- Regards, Rafał Kuć Performance Monitoring * Log Analytics * Search Analytics Solr Elasticsearch Support * http://sematext.com/ Hi all, I have multiple Python scripts querying Solr with the sunburnt module. Solr was hosted on an Amazon ec2 m1.large (2 vCPU with 4 ECU, 7.5 GB memory, 840 GB storage) and contained several cores for different usage. When I manually executed a query through the Solr Admin (a query containing 10~15 terms, with some of them having boosts over one field, and limited to one result without any sorting or faceting etc.) it takes around 700 ms, and the core contained 7 million documents. When the scripts are executed things get slower; my query takes 7~10s. Then what I did is to turn to SolrCloud expecting a huge performance increase. I installed it on a cluster of 5 Amazon ec2 c3.2xlarge instances (8 vCPU with 28 ECU, 15 GB memory, 160 GB SSD storage), then I created one collection to contain the core I was querying. I sharded it to 25 shards (each node containing 5 shards without replication); each shard took 54 MB of storage. Tested my query on the new SolrCloud: it takes 70 ms! A huge increase, which is very good! Tested my scripts again (I have 30 scripts running at the same time), and as a surprise, things run fast for 5 seconds then it turns really slow again (query time). I updated the solrconfig.xml to remove the query caches (I don't need them since queries are very different and one-time only) and changed the index memory to 1 GB, but only got a small increase (3~4s for each query?!) Any ideas? PS: My index size will not stay at 7m documents, it will grow to +100m and that may make things worse
Re: Solr Query Slowness
Hello! Different queries can have different execution times, that's why I asked about the details. When running the scripts, is the Solr CPU fully utilized? To tell more I would like to see what queries are run against Solr from the scripts. Do you have any information on the network throughput between the server you are running the scripts on and the Solr cluster? You wrote that the scripts are fine for 5 seconds and then they get slow. If your Solr cluster is not fully utilized I would take a look at the queries and what they return (i.e. using faceting with facet.limit=-1) and see if the network is able to process those. -- Regards, Rafał Kuć Performance Monitoring * Log Analytics * Search Analytics Solr Elasticsearch Support * http://sematext.com/ Thanks Rafal for your reply, My scripts are running on other independent machines so they do not affect Solr. I did mention that the queries are not the same (that is why I removed the query cache from solrconfig.xml), and I only get 1 result from Solr (which is the top scored one, so no sorting since it is by default ordered by score) 2013/12/26 Rafał Kuć r@solr.pl Hello! Could you tell us more about your scripts? What do they do? Are the queries the same? How many results do you fetch with your scripts, and so on. -- Regards, Rafał Kuć Performance Monitoring * Log Analytics * Search Analytics Solr Elasticsearch Support * http://sematext.com/ Hi all, I have multiple Python scripts querying Solr with the sunburnt module. Solr was hosted on an Amazon ec2 m1.large (2 vCPU with 4 ECU, 7.5 GB memory, 840 GB storage) and contained several cores for different usage. When I manually executed a query through the Solr Admin (a query containing 10~15 terms, with some of them having boosts over one field, and limited to one result without any sorting or faceting etc.) it takes around 700 ms, and the core contained 7 million documents. When the scripts are executed things get slower; my query takes 7~10s. 
Then what I did is to turn to SolrCloud expecting a huge performance increase. I installed it on a cluster of 5 Amazon ec2 c3.2xlarge instances (8 vCPU with 28 ECU, 15 GB memory, 160 GB SSD storage), then I created one collection to contain the core I was querying. I sharded it to 25 shards (each node containing 5 shards without replication); each shard took 54 MB of storage. Tested my query on the new SolrCloud: it takes 70 ms! A huge increase, which is very good! Tested my scripts again (I have 30 scripts running at the same time), and as a surprise, things run fast for 5 seconds then it turns really slow again (query time). I updated the solrconfig.xml to remove the query caches (I don't need them since queries are very different and one-time only) and changed the index memory to 1 GB, but only got a small increase (3~4s for each query?!) Any ideas? PS: My index size will not stay at 7m documents, it will grow to +100m and that may make things worse
Re: Solr Query Slowness
Hello! It seems that the number of queries per second generated by your scripts may be too much for your Solr cluster to handle with the latency you want. Try launching your scripts one by one and see what the bottleneck is with your instance. I assume that for some number of scripts running at the same time you will have good performance, and it will start to degrade after you add even more. If you don't have a high commit rate and you don't need NRT, disabling the caches shouldn't be needed and they can help with query performance. Also there are tools out there that can help you diagnose what the actual problem is, for example (http://sematext.com/spm/index.html). -- Regards, Rafał Kuć Performance Monitoring * Log Analytics * Search Analytics Solr Elasticsearch Support * http://sematext.com/ This is an example of a query:

http://myip:8080/solr/TestCatMatch_shard12_replica1/select?q=Royal+Cashmere+RC+106+CS+Silk+Cashmere+V+Neck+Moss+Green+Men^10+s+Sweater+Cashmere^3+Men^3+Sweaters^3+Clothing^3&rows=1&wt=json&indent=true

In return:

{
  "responseHeader": {
    "status": 0,
    "QTime": 191
  },
  "response": {
    "numFound": 4539784,
    "start": 0,
    "maxScore": 2.0123534,
    "docs": [
      {
        "Sections": "fashion",
        "IdsCategories": "11101911",
        "IdProduct": "ef6b8d7cf8340d0c8935727a07baebab",
        "Id": "11101911-ef6b8d7cf8340d0c8935727a07baebab",
        "Name": "Uniqlo Men Cashmere V Neck Sweater Men Clothing Sweaters Cashmere",
        "_version_": 1455419757424541696
      }
    ]
  }
}

This query was executed when no script was running, so the QTime is only 191 ms (it may take up to 3s when they are). Of course the query can be smaller or bigger, and of course that affects the execution time (the execution times I spoke of are the internal ones returned by Solr, not calculated by me). And yes, the CPU is fully used. 2013/12/26 Rafał Kuć r@solr.pl Hello! Different queries can have different execution time, that's why I asked about the details. When running the scripts, is Solr CPU fully utilized? To tell more I would like to see what queries are run against Solr from scripts. 
Do you have any information on network throughput between the server you are running scripts on and the Solr cluster? You wrote that the scripts are fine for 5 seconds and then they get slow. If your Solr cluster is not fully utilized I would take a look at the queries and what they return (i.e. using faceting with facet.limit=-1) and see if the network is able to process those. -- Regards, Rafał Kuć Performance Monitoring * Log Analytics * Search Analytics Solr Elasticsearch Support * http://sematext.com/ Thanks Rafal for your reply, My scripts are running on other independent machines so they do not affect Solr. I did mention that the queries are not the same (that is why I removed the query cache from solrconfig.xml), and I only get 1 result from Solr (which is the top scored one, so no sorting since it is by default ordered by score) 2013/12/26 Rafał Kuć r@solr.pl Hello! Could you tell us more about your scripts? What do they do? Are the queries the same? How many results do you fetch with your scripts, and so on. -- Regards, Rafał Kuć Performance Monitoring * Log Analytics * Search Analytics Solr Elasticsearch Support * http://sematext.com/ Hi all, I have multiple Python scripts querying Solr with the sunburnt module. Solr was hosted on an Amazon ec2 m1.large (2 vCPU with 4 ECU, 7.5 GB memory, 840 GB storage) and contained several cores for different usage. When I manually executed a query through the Solr Admin (a query containing 10~15 terms, with some of them having boosts over one field, and limited to one result without any sorting or faceting etc.) it takes around 700 ms, and the core contained 7 million documents. When the scripts are executed things get slower; my query takes 7~10s. Then what I did is to turn to SolrCloud expecting a huge performance increase. I installed it on a cluster of 5 Amazon ec2 c3.2xlarge instances (8 vCPU with 28 ECU, 15 GB memory, 160 GB SSD storage), then I created one collection to contain the core I was querying. I sharded it to 25 shards (each node containing 5 shards without replication); each shard took 54 MB of storage. Tested my query on the new SolrCloud: it takes 70 ms! A huge increase, which is very good! Tested my scripts again (I have 30 scripts running at the same time), and as a surprise, things run fast for 5 seconds then it turns really slow again (query time). I updated the solrconfig.xml to remove the query caches (I don't need them since queries are very different and one-time only) and changed the index memory to 1 GB, but only got a small increase (3~4s for each query?!) Any ideas? PS: My index size will not stay at 7m documents, it will grow to +100m and that may make things worse
Re: subscribe for this maillist
Hello! To subscribe please send a mail to solr-user-subscr...@lucene.apache.org -- Regards, Rafał Kuć Performance Monitoring * Log Analytics * Search Analytics Solr Elasticsearch Support * http://sematext.com/ I want to subscribe for this solr mailing list. Thanks and Best Regards, Gabriel Zhang
Re: What is the difference between attorney:(Roger Miller) and attorney:Roger Miller
Hello! In the first one, the two terms 'Roger' and 'Miller' are run against the attorney field. In the second the 'Roger' term is run against the attorney field and the 'Miller' term is run against the default search field. -- Regards, Rafał Kuć Performance Monitoring * Log Analytics * Search Analytics Solr Elasticsearch Support * http://sematext.com/ We got different results for these two queries. The first one returned 115 records and the second returns 179 records. Thanks, Fudong
Re: What is the difference between attorney:(Roger Miller) and attorney:Roger Miller
Hello! Terms surrounded by " characters will be treated as a phrase query. So, if your default query operator is OR, attorney:(Roger Miller) will return documents with the first or second (or both) terms in the attorney field. attorney:"Roger Miller" will return only documents that have the phrase "Roger Miller" in the attorney field. You may want to look at the Lucene query syntax to understand all the differences: http://lucene.apache.org/core/4_5_0/queryparser/org/apache/lucene/queryparser/classic/package-summary.html#package_description -- Regards, Rafał Kuć Performance Monitoring * Log Analytics * Search Analytics Solr Elasticsearch Support * http://sematext.com/ Also, attorney:(Roger Miller) is the same as attorney:"Roger Miller" right? Or is the term "Roger Miller" run against attorney? Thanks, -Utkarsh On Tue, Nov 19, 2013 at 12:42 PM, Rafał Kuć r@solr.pl wrote: Hello! In the first one, the two terms 'Roger' and 'Miller' are run against the attorney field. In the second the 'Roger' term is run against the attorney field and the 'Miller' term is run against the default search field. -- Regards, Rafał Kuć Performance Monitoring * Log Analytics * Search Analytics Solr Elasticsearch Support * http://sematext.com/ We got different results for these two queries. The first one returned 115 records and the second returns 179 records. Thanks, Fudong
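To summarize how the three query forms in this thread are parsed (assuming the default operator is OR and some default search field is configured):

```text
attorney:(Roger Miller)   ->  attorney:Roger OR attorney:Miller
attorney:Roger Miller     ->  attorney:Roger OR <defaultField>:Miller
attorney:"Roger Miller"   ->  phrase query: "Roger Miller" as adjacent
                              terms in the attorney field
```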
Re: Solr Synonym issue
Hello! Could you please describe the issue you are having? -- Regards, Rafał Kuć Performance Monitoring * Log Analytics * Search Analytics Solr Elasticsearch Support * http://sematext.com/ Hi Team, I have implemented Solr with my Magento Enterprise Edition. I am trying to implement synonyms in Solr but it's not working. Please find attached the synonyms.txt, schema.xml and solrconfig.xml files. For 2 days I have been debugging the issue and have not yet found a solution. Please help me as soon as possible. Hope I will get your reply as soon as possible. Thanks & Regards Jyoti Kadam
Re: Stop solr service
Hello! Could you please write more about what you want to do? Do you need to stop a running Solr process? If so, what you need to do is stop the container (Jetty/Tomcat) that Solr runs in. You can also kill the JVM running Solr; however, it will usually be enough to just stop the container. -- Regards, Rafał Kuć Performance Monitoring * Log Analytics * Search Analytics Solr Elasticsearch Support * http://sematext.com/ Hi Team, Please stop the solr service.
Re: Stop solr service
Hello! Please look at the Solr user list section of http://lucene.apache.org/solr/discussion.html on how to unsubscribe from the list. -- Regards, Rafał Kuć Performance Monitoring * Log Analytics * Search Analytics Solr Elasticsearch Support * http://sematext.com/ I want to stop the mail On Sun, Oct 27, 2013 at 4:37 PM, Rafał Kuć r@solr.pl wrote: Hello! Could you please write more about what you want to do? Do you need to stop a running Solr process? If so, what you need to do is stop the container (Jetty/Tomcat) that Solr runs in. You can also kill the JVM running Solr; however, it will usually be enough to just stop the container. -- Regards, Rafał Kuć Performance Monitoring * Log Analytics * Search Analytics Solr Elasticsearch Support * http://sematext.com/ Hi Team, Please stop the solr service.
Re: How to configure solr to our java project in eclipse
Hello! If you really want to go the embedded way, please look at http://wiki.apache.org/solr/EmbeddedSolr -- Regards, Rafał Kuć Performance Monitoring * Log Analytics * Search Analytics Solr Elasticsearch Support * http://sematext.com/ Hi friends, I am Giridhar. Please clarify my doubt. We are using Solr for our project. The problem is that Solr is outside of our project (in another folder); we have to manually type java -jar start.jar to start Solr and use its services. But what we need is: when we run the project, Solr should start automatically. Our project is a Java project with Tomcat in Eclipse. How can I achieve this? Please help me. Thank you. Giridhar -- View this message in context: http://lucene.472066.n3.nabble.com/How-to-configure-solr-to-our-java-project-in-eclipse-tp4097954.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Solr 4.6.0 latest build
Hello! The current development version of Solr can be found in the SVN repository - https://lucene.apache.org/solr/versioncontrol.html You need to download the source code and build Solr yourself. -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - ElasticSearch Hi! I'm using Solr 4.4.0 currently and I'm having quite some trouble with * SOLR-5216: Document updates to SolrCloud can cause a distributed deadlock. (Mark Miller) which should be fixed for 4.6.0. Where could I get Solr 4.6.0 from? I want to make some tests regarding this fix. Thank you! - Thanks, Michael -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-4-6-0-latest-build-tp4096960.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Solr 4.6.0 latest build
Hello! Thanks for correction :) Did you report that the issue still persists? -- Regards, Rafał Kuć Performance Monitoring * Log Analytics * Search Analytics Solr Elasticsearch Support * http://sematext.com/ That's not quite right. You don't have to download the source or build it yourself. You can grab a 4.6 snapshot build from here: https://builds.apache.org/job/Solr-Artifacts-4.x/lastSuccessfulBuild/artifact/solr/package/ I have an issue with updates no longer being processed, and the snapshots have not resolved the issue for me. Cheers, Chris On Tuesday, October 22, 2013, Rafał Kuć wrote: Hello! The current development version of Solr can be found in the SVN repository - https://lucene.apache.org/solr/versioncontrol.html You need to download the source code and build Solr yourself. -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - ElasticSearch Hi! I'm using Solr 4.4.0 currently and I'm having quite some trouble with * SOLR-5216: Document updates to SolrCloud can cause a distributed deadlock. (Mark Miller) which should be fixed for 4.6.0. Where could I get Solr 4.6.0 from? I want to make some tests regarding this fix. Thank you! - Thanks, Michael -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-4-6-0-latest-build-tp4096960.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Seeking New Moderators for solr-user@lucene
Hello! I can help with moderation. -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - ElasticSearch It looks like it's time to inject some fresh blood into the solr-user@lucene moderation team. If you'd like to volunteer to be a moderator, please reply back to this thread and specify which email address you'd like to use as a moderator (if different from the one you use when sending the email). Being a moderator is really easy: you'll get some extra emails in your inbox with MODERATE in the subject, which you skim to see if they are spam -- if they are you delete them, if not you reply all to let them get sent to the list, and authorize that person to send future messages w/o moderation. Occasionally, you'll see an explicit email to solr-user-owner@lucene from a user asking for help related to their subscription (usually unsubscribing problems) and you and the other moderators chime in with assistance when possible. More details can be found here... https://wiki.apache.org/solr/MailingListModeratorInfo (I'll wait ~72+ hours to see who responds, and then file the appropriate jira with INFRA) -Hoss
Re: faceting from multiValued field
Hello! Your field needs to be indexed in order for faceting to work. -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - ElasticSearch Hi, I am having a problem with a multiValued field and faceting. This is the schema:

<field name="series" type="string" indexed="false" stored="true" required="false" omitTermFreqAndPositions="true" multiValued="true" />

all I get is:

<lst name="series" />

Note: the data is correctly placed in the field, as the query results show. However, the facet is not working. Could anyone tell me how to achieve it? Thanks a lot.
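Following the reply, a sketch of the corrected field definition with indexing enabled (existing documents must be re-indexed before faceting on it will return values):

```xml
<field name="series" type="string" indexed="true" stored="true"
       required="false" omitTermFreqAndPositions="true" multiValued="true" />
```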
Re: change schema.xml without bringing down the solr
Hello! You can upload a new schema.xml (along with other configuration files) and reload the collection using the collections API (http://wiki.apache.org/solr/SolrCloud#Managing_collections_via_the_Collections_API). However, you have to remember that in order for the new field to be usable, documents need to be re-indexed. You can't just add a new shard (you can add replicas though), but you can re-shard your current collection. Look at the collections API and the SPLITSHARD command; I don't know which Solr version you are using, but SPLITSHARD is supported since Solr 4.3. -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - ElasticSearch Hi, I have a couple of use cases that need to be implemented. 1. Our application is 24x7, so we require the search feature to be available 24x7. How to add a new field to a live Solr node without bringing down the Solr instance? 2. How to add an additional shard to an existing collection and redistribute the index? Thanks, ~sathish -- View this message in context: http://lucene.472066.n3.nabble.com/change-schema-xml-without-bringing-down-the-solr-tp4087184.html Sent from the Solr - User mailing list archive at Nabble.com.
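For illustration, the two collections API calls the reply refers to might look like this (host, port, collection and shard names are placeholders):

```text
http://localhost:8983/solr/admin/collections?action=RELOAD&name=collection1
http://localhost:8983/solr/admin/collections?action=SPLITSHARD&collection=collection1&shard=shard1
```

RELOAD picks up the configuration newly uploaded to ZooKeeper without downtime; SPLITSHARD splits one existing shard into two sub-shards covering the same hash range.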
Re: Concat 2 fields in another field
Hello! You don't have to write a custom component - you can use the ScriptUpdateProcessor - http://wiki.apache.org/solr/ScriptUpdateProcessor -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - ElasticSearch Hello all, I am using Solr 4.x. I have a requirement where I need to have a field which holds data from 2 fields concatenated using _. So for example I have 2 fields, firstName and lastName, and I want a third field which should hold firstName_lastName. Is there any existing concatenating component available, or do I need to write a custom updateProcessor which does this task? By the way, the need for having this third field is that I want to group on firstName and lastName, but as grouping does not support multiple fields to form a single group I am using this trick. Hope I am clear. Thanks. -- View this message in context: http://lucene.472066.n3.nabble.com/Concat-2-fields-in-another-field-tp4086786.html Sent from the Solr - User mailing list archive at Nabble.com.
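A sketch of wiring a script into the update chain in solrconfig.xml (the chain and script names are illustrative; the referenced concat.js would implement a processAdd function that reads firstName and lastName from the incoming document and sets the combined field):

```xml
<updateRequestProcessorChain name="concat-names">
  <!-- runs the script's processAdd() for every added document -->
  <processor class="solr.StatelessScriptUpdateProcessorFactory">
    <str name="script">concat.js</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory" />
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>
```

The chain can then be selected per request with update.chain=concat-names, or made the default on the /update handler.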
Re: JSON Update create different copies of the same document
Hello! Do you have the unique identifier specified in the schema.xml ? -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - ElasticSearch Hello, I thought that by adding a new document with the same id on Solr the document already on Solr would be updated with the new info. But both documents appear on the search results... How can I update a document? Regards Bruno Santos
Re: Problems installing Solr4 in Jetty9
Hello! Could you look at the logs and post what you find there? -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - ElasticSearch I've been unable to install Solr into Jetty. Jetty seems to be running fine, and these are the steps I took to install Solr:

# SOLR
cd /opt
wget -O - $SOLR_URL | tar -xzf -
cp solr-4.4.0/dist/solr-4.4.0.war /opt/jetty/webapps/solr.war
cp -R solr-4.4.0/example/solr /opt/
cp -R /opt/solr-4.4.0/dist/ /opt/solr/
cp -R /opt/solr-4.4.0/contrib/ /opt/solr/
rm -rf /opt/solr-4.4.0
echo 'JAVA_OPTIONS=-Dsolr.solr.home=/opt/solr $JAVA_OPTIONS' >> /etc/default/jetty
chown -R jetty:jetty /opt/solr/
/etc/init.d/jetty restart

When I browse to the directory it says: HTTP ERROR: 503 Problem accessing /solr. Reason: Service Unavailable Powered by Jetty:// Is anyone able to tell me of any steps that I have missed for installing it, or how to diagnose it? -- View this message in context: http://lucene.472066.n3.nabble.com/Problems-installing-Solr4-in-Jetty9-tp4083209.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Enabling DIH breaks Solr4.4
Hello! Try changing the lib directives to absolute paths; by looking at the exception: Caused by: java.lang.ClassNotFoundException: org.apache.solr.handler.dataimport.DataImportHandler it seems that the DataImportHandler class is not seen by the Solr class loader. -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - ElasticSearch Hi, I'm a bit stuck here. I had Solr 4.4 working without too many issues. I wanted to enable the DIH so I first added these lines to the solrconfig.xml:

<lib dir="../contrib/dataimporthandler/lib" regex=".*\.jar" />
<lib dir="../dist/" regex="apache-solr-dataimporthandler-.*\.jar" />

Restarted and everything was fine. Then I added these lines to the config as well, which broke Solr:

<requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">/opt/solr/collection1/conf/data-config.xml</str>
  </lst>
</requestHandler>

This is the error it gives me: HTTP ERROR 500 Problem accessing /solr/. Reason: {msg=SolrCore 'collection1' is not available due to init failure: RequestHandler init failure,trace=org.apache.solr.common.SolrException: SolrCore 'collection1' is not available due to init failure: RequestHandler init failure at org.apache.solr.core.CoreContainer.getCore(CoreContainer.java:860) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:251) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:158) at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1486) at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:503) at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:138) at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:540) at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:213) at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1094) at 
org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:432) at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:175) at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1028) at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:136) at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:258) at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:109) at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97) at org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:317) at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:97) at org.eclipse.jetty.server.Server.handle(Server.java:445) at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:267) at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:224) at org.eclipse.jetty.io.AbstractConnection$ReadCallback.run(AbstractConnection.java:358) at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:601) at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:532) at java.lang.Thread.run(Thread.java:724) Caused by: org.apache.solr.common.SolrException: RequestHandler init failure at org.apache.solr.core.SolrCore.init(SolrCore.java:835) at org.apache.solr.core.SolrCore.init(SolrCore.java:629) at org.apache.solr.core.CoreContainer.createFromLocal(CoreContainer.java:622) at org.apache.solr.core.CoreContainer.create(CoreContainer.java:657) at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:364) at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:356) at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) at java.util.concurrent.FutureTask.run(FutureTask.java:166) at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471) at 
java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334) at java.util.concurrent.FutureTask.run(FutureTask.java:166) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) ... 1 more Caused by: org.apache.solr.common.SolrException: RequestHandler init failure at org.apache.solr.core.RequestHandlers.initHandlersFromConfig(RequestHandlers.java:167) at org.apache.solr.core.SolrCore.init(SolrCore.java:772) ... 13 more Caused by: org.apache.solr.common.SolrException: Error loading class
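The mail archive stripped the XML markup from the configuration snippets above; reconstructed, the directives under discussion (paths exactly as given in the thread) would read:

```xml
<!-- lib directives in solrconfig.xml; dir paths are resolved relative to the core's instance dir -->
<lib dir="../contrib/dataimporthandler/lib" regex=".*\.jar" />
<lib dir="../dist/" regex="apache-solr-dataimporthandler-.*\.jar" />

<!-- the request handler that failed to initialize -->
<requestHandler name="/dataimport"
                class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">/opt/solr/collection1/conf/data-config.xml</str>
  </lst>
</requestHandler>
```

As the reply suggests, if the relative dir values do not resolve from the core's instance directory, switching them to absolute paths is the first thing to try - the ClassNotFoundException means the DIH jars were never picked up by the class loader.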
Re: Solr search on a large text field is very slow
Hello! In general, wildcard queries can be expensive. If you need wildcard queries, you can try the EdgeNGram approach - http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.EdgeNGramFilterFactory -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - ElasticSearch The index size is around 150 GB and there are around 6.5 million documents in the index. Search on a specific text field is very slow - it takes 1 to 2 minutes for wildcard queries like *test*, with no highlighting and no facets. This field contributes 90% of the index size. This is my schema.xml: <fieldType name="text_pl" class="solr.TextField"> <analyzer> <tokenizer class="solr.StandardTokenizerFactory"/> <filter class="solr.LowerCaseFilterFactory"/> <filter class="solr.StopFilterFactory" words="stopwords.txt" ignoreCase="true"/> <filter class="solr.WordDelimiterFilterFactory" generateWordParts="0" generateNumberParts="0" catenateWords="0" catenateNumbers="0" catenateAll="0" splitOnCaseChange="0"/> </analyzer> </fieldType> </types> <fields> <field name="syndrome" type="text_pl" indexed="true" stored="true" required="true" multiValued="false" omitNorms="false"/> <field name="test_file_result_id" type="long" indexed="true" stored="true" omitNorms="true" multiValued="false"/> <field name="start_date" type="date" indexed="true" stored="true" omitNorms="true" multiValued="false"/> <field name="job_id" type="long" indexed="true" stored="true" omitNorms="true" multiValued="false"/> <field name="test_run_id" type="long" indexed="true" stored="true" omitNorms="true" multiValued="false"/> <field name="cluster" type="string" indexed="true" stored="true" omitNorms="true" multiValued="false"/> <field name="logfile" type="text_ws" indexed="true" stored="true" omitNorms="true" multiValued="false"/> I am using DIH for indexing data from a database. The largest syndrome field is 5 MB; it ranges from 1 KB to 5 MB. I tried using a whitespace tokenizer with not much luck. I am using Solr 3.6.0 and indexing takes 1.5 hours. Please let me know if I need to add anything to improve the search speed. 
Thanks Meena -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-search-on-a-large-text-field-is-very-slow-tp4083310.html Sent from the Solr - User mailing list archive at Nabble.com.
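As a sketch of the EdgeNGram approach suggested in the reply (the gram sizes here are illustrative assumptions, not values from the thread), an analysis chain in schema.xml could look like:

```xml
<!-- index-time edge n-grams turn prefix-style matches into cheap exact term lookups -->
<fieldType name="text_edge" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="2" maxGramSize="15"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

Note that edge n-grams only speed up prefix queries (test*); a double-wildcard query like *test* would need full n-grams (solr.NGramFilterFactory), which inflates the index considerably - a real trade-off given a field that is already 90% of the index.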
Re: SolrCloud replication server and Master Server
Hello! There is no master - slave in SolrCloud. You can create a collection that has only a single shard and one replica. When you send an indexing request it will be forwarded to the shard leader (in your case you want it to be the instance running on 8080), however both Solr instances will accept indexing requests. -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - ElasticSearch I have a requirement to set up SolrCloud with 2 instances of Solr (one on port 8080 and the other on 9090) and a ZooKeeper ensemble (3 nodes). Can I have one Solr instance as a master and the other as a replica of the master? Because when I set up SolrCloud and index to the Solr running on 8080, it automatically routes to the 9090 Solr as well. I need indexing only on 8080, and the 9090 Solr should be a replica of the 8080 Solr. Is this possible in Cloud? Please guide me.
Re: java.lang.OutOfMemoryError: Requested array size exceeds VM limit
Hello! The exception you've shown tells you that Solr tried to allocate an array that exceeded the VM's array size limit. Do you use some custom sorts? Did you send large bulks during the time the exception occurred? -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - ElasticSearch Today I found in the Solr logs the exception: java.lang.OutOfMemoryError: Requested array size exceeds VM limit. At that time memory usage was ~200MB / Xmx3g. The environment looks like this: 3x standalone ZooKeeper (Java 7), 3x Solr 4.2.1 on Tomcat (Java 7), Linux 3.2.0-4-amd64 #1 SMP Debian 3.2.46-1 x86_64 GNU/Linux. One Solr and one ZK on each host: lmsiprse01, lmsiprse02, lmsiprse03. Before the exception I started restarting Solr: lmsiprse01: [2013-08-01 05:23:43]: /etc/init.d/tomcat6-1 stop [2013-08-01 05:25:09]: /etc/init.d/tomcat6-1 start lmsiprse02 (leader): [2013-08-01 05:27:21]: /etc/init.d/tomcat6-1 stop [2013-08-01 05:29:31]: /etc/init.d/tomcat6-1 start lmsiprse03: [2013-08-01 05:25:48]: /etc/init.d/tomcat6-1 stop [2013-08-01 05:26:42]: /etc/init.d/tomcat6-1 start and the error showed up at 2013-08-01 5:27:26 on lmsiprse01. Does anybody know what happened? 
fragment of log looks: sie 01, 2013 5:27:26 AM org.apache.solr.core.SolrCore execute INFO: [products] webapp=/solr path=/select params={facet=truestart=0q=facet.limit=-1facet.field=attribute_u-typfacet.field=attribute_u-gama-kolorystycznafacet.field=brand_namewt=javabinfq=node_id:1056version=2rows=0} hits=1241 status=0 QTime=33 sie 01, 2013 5:27:26 AM org.apache.solr.common.SolrException log SEVERE: null:java.lang.RuntimeException: java.lang.OutOfMemoryError: Requested array size exceeds VM limit at org.apache.solr.servlet.SolrDispatchFilter.sendError(SolrDispatchFilter.java:653) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:366) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:141) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206) at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233) at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191) at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127) at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102) at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109) at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293) at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:859) at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:602) at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489) at java.lang.Thread.run(Thread.java:724) Caused by: java.lang.OutOfMemoryError: Requested array size exceeds VM limit at org.apache.lucene.util.PriorityQueue.init(PriorityQueue.java:64) at org.apache.lucene.util.PriorityQueue.init(PriorityQueue.java:37) at 
org.apache.solr.handler.component.ShardFieldSortedHitQueue.init(ShardDoc.java:113) at org.apache.solr.handler.component.QueryComponent.mergeIds(QueryComponent.java:766) at org.apache.solr.handler.component.QueryComponent.handleRegularResponses(QueryComponent.java:625) at org.apache.solr.handler.component.QueryComponent.handleResponses(QueryComponent.java:604) at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:311) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135) at org.apache.solr.core.SolrCore.execute(SolrCore.java:1817) at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:639) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:345) ... 13 more
Re: SolrCloud replication server and Master Server
Hello! Take a look at http://wiki.apache.org/solr/SolrCloud - when creating a collection you can specify the maxShardsPerNode parameter, which allows you to control how many shards of a given collection are permitted to be placed on a single node. By default it is set to 1; try setting it to 2 when creating your collection. -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - ElasticSearch I am rephrasing my question. Currently what I have is: Solr 8080 - Shard1, Shard2-Replica; Solr 9090 - Shard2, Shard1-Replica. Can we have something like: Solr 8080 - Shard1, Shard2; Solr 9090 - Shard2-Replica, Shard1-Replica? On Thu, Aug 1, 2013 at 6:40 PM, Prasi S prasi1...@gmail.com wrote: Here, when I create a single shard and a replica, then my shard will be on one server and the replica on the other, isn't it? On Thu, Aug 1, 2013 at 6:29 PM, Rafał Kuć r@solr.pl wrote: Hello! There is no master - slave in SolrCloud. You can create a collection that has only a single shard and one replica. When you send an indexing request it will be forwarded to the shard leader (in your case you want it to be the instance running on 8080), however both Solr instances will accept indexing requests. -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - ElasticSearch I have a requirement to set up SolrCloud with 2 instances of Solr (one on port 8080 and the other on 9090) and a ZooKeeper ensemble (3 nodes). Can I have one Solr instance as a master and the other as a replica of the master? Because when I set up SolrCloud and index to the Solr running on 8080, it automatically routes to the 9090 Solr as well. I need indexing only on 8080, and the 9090 Solr should be a replica of the 8080 Solr. Is this possible in Cloud? Please guide me.
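A sketch of the suggestion via the Collections API, with placeholder host and collection names (note that which node ends up leading which shard is still decided by Solr, not by this call):

```shell
# permit two shards of the same collection on one node, so each of the two
# nodes can hold one shard plus a replica of the other shard
curl 'http://localhost:8080/solr/admin/collections?action=CREATE&name=mycollection&numShards=2&replicationFactor=2&maxShardsPerNode=2'
```

With 2 shards, 2 copies of each and maxShardsPerNode=2, the four cores spread across the two nodes as in the layout the question asks for.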
Re: AND Queries
Hello! Try turning on debugQuery and see what is happening. From what I see, you are searching for the en term in the lang field, the book term in the url field, and the pencil and cat terms in the default search field - but from your second query I see that you would like to find the last two terms in the url field as well. -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - ElasticSearch I am searching for a keyword like this: lang:en AND url:book pencil cat It returns results, however none of them include all of the book, pencil and cat keywords. How should I rewrite my query? I tried this: lang:en AND url:(book AND pencil AND cat) and it looks OK. However this does not: lang:en AND url:book AND pencil AND cat - why?
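The behavior described above can be written out explicitly - a sketch, which debugQuery=true will confirm by showing the parsed form of each variant:

```text
lang:en AND url:(book AND pencil AND cat)   all three terms required in the url field
lang:en AND url:(+book +pencil +cat)        equivalent, with explicit required flags
lang:en AND url:book AND pencil AND cat     pencil and cat hit the default field, not url
```

The field qualifier `url:` only binds to the term (or parenthesized group) immediately after it, which is why only the grouped form behaves as intended.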
Re: Changing the number of shards?
Hello! Just to be perfectly clear here - Solr has an advantage over ElasticSearch because it can split a live index. However, this sentence from Jack's mail is not true: In other words, even though you can easily “add shards” on ES, those are really just replicas of existing primary shards and cannot be new primary shards. The replicas can become new primary shards if something happens to a primary shard - this is similar to what SolrCloud supports. -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - ElasticSearch Correct. ES currently does not let you change the number of shards after you've created an index (a collection in SolrCloud). It does not let you split shards either. SolrCloud has an advantage over ES around this at this point. Otis -- Solr & ElasticSearch Support -- http://sematext.com/ Performance Monitoring -- http://sematext.com/spm On Fri, Jul 5, 2013 at 6:51 PM, Jack Krupansky j...@basetechnology.com wrote: According to the ElasticSearch glossary, “You cannot change the number of primary shards in an index, once the index is created.” Really? Is that true? (A “primary shard” is what Solr calls a shard, or slice.) In other words, even though you can easily “add shards” on ES, those are really just replicas of existing primary shards and cannot be new primary shards - if I understand that statement correctly. Besides the simple fact that if a primary shard goes down, a non-primary shard can be promoted to being a primary shard (“a replica shard can be promoted to a primary shard if the primary fails.”) See: http://www.elasticsearch.org/guide/reference/glossary/ My understanding is that with Solr you can do things like split shards and change the number of shards per node. Is this an advantage that ES does not offer? Any ES experts want to comment? -- Jack Krupansky
Re: change solr core schema and config via http
Hello! In 4.3.1 you can only read schema.xml, or portions of it, using the Schema API (https://issues.apache.org/jira/browse/SOLR-4658). It is a start towards allowing schema.xml modifications over the HTTP API, which will be a feature of the next release of Solr - https://issues.apache.org/jira/browse/SOLR-3251 -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - ElasticSearch Hi, I am trying to figure out how to change the schema/config of an existing core, or of a core to be created, via HTTP calls to Solr. After spending hours searching online, I still could not find any documents showing me how to do it. The only way I know is to log on to the Solr host and then create/modify the schema/config files on the host. It seems surprising to me that we can create a new core via the HTTP interface but are not able to change the schema/config in the same way. Can anyone give me some hints? Thanks. Regards, James
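The read-only Schema API in 4.3.1 can be exercised roughly like this (the core name collection1 and the field name city are assumptions for illustration):

```shell
# list all declared fields
curl 'http://localhost:8983/solr/collection1/schema/fields?wt=json'

# fetch a single field definition
curl 'http://localhost:8983/solr/collection1/schema/fields/city?wt=json'
```

Similar endpoints exist for dynamic fields, field types and copy fields; in 4.3.1 they are all read-only.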
Re: dynamic field
Hello! A dynamic field is just a regular field from the Lucene point of view, so its content will be treated just like the content of other fields. The difference is on the Solr level. -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - ElasticSearch How is a dynamic field implemented in Solr? Does it get saved into the same document as other regular fields in the Lucene index? Ming-
Re: Solr 3.5 Optimization takes index file size almost double
Hello! The optimize command needs to rewrite the segments, so while it is still working you may see the index size double. However, after it has finished, the index size will usually be lower compared to the size before the optimize. -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - ElasticSearch Hi, I have a Solr 1.4.1 server with an index file size of 428GB. When I upgrade Solr 1.4.1 to Solr 3.5.0 by the replication method, the size remains the same. But when I optimize the index for the Solr 3.5.0 instance, its size reaches 791GB. So what is the solution for the size to remain the same or get smaller? I optimize Solr 3.5 with the query: <solr3.5 url>/update?optimize=true&commit=true Thanks and regards, Viresh Modi
Re: Solr 3.5 Optimization takes index file size almost double
Hello! Do you have some backup-after-commit in your configuration? It would also be good to see what your index directory looks like - can you list it? -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - ElasticSearch Thanks Rafal for the reply... I agree with you, but actually after optimization it does not reduce the size - it remains doubled. So is there anything we missed or need to do to achieve the index size reduction? Is there any special setting we need to configure for replication? On 13 June 2013 16:53, Rafał Kuć r@solr.pl wrote: Hello! The optimize command needs to rewrite the segments, so while it is still working you may see the index size double. However, after it has finished, the index size will usually be lower compared to the size before the optimize. -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - ElasticSearch Hi, I have a Solr 1.4.1 server with an index file size of 428GB. When I upgrade Solr 1.4.1 to Solr 3.5.0 by the replication method, the size remains the same. But when I optimize the index for the Solr 3.5.0 instance, its size reaches 791GB. So what is the solution for the size to remain the same or get smaller? I optimize Solr 3.5 with the query: <solr3.5 url>/update?optimize=true&commit=true Thanks and regards, Viresh Modi
Re: FW: Solr and Lucene
Hello! Solr 4.2.1 uses Lucene 4.2.1. Basically, Solr and Lucene use the same version numbers now that their development has been merged. As for Luke, I think the last version uses a beta or alpha release of Lucene 4.0. I would try replacing the Lucene jars and see if it works, although I didn't try it myself. -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch Hi, Which Lucene version is used with Solr 4.2.1? And is it possible to open its index with Luke? If not, with any other tool? Thanks
Re: /non/existent/dir/yields/warning
Hello! You should remove that entry from your solrconfig.xml file. It is something like this: <lib dir="/non/existent/dir/yields/warning" /> -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch Hi, I am constantly getting this error in my Solr log: Can't find (or read) directory to add to classloader: /non/existent/dir/yields/warning (resolved as: E:\Projects\apache_solr\solr-4.3.0\example\solr\genesis_experimental\non\existent\dir\yields\warning). Anyone got any idea on how to solve this?
Re: /non/existent/dir/yields/warning
Hello! That's a good question. I suppose it's there to show users how to set up a custom path to libraries. -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch OK, thanks :) But why was it there anyway? I mean, it says in the comments: If a 'dir' option (with or without a regex) is used and nothing is found that matches, a warning will be logged. So it looks like a kind of exception handling or logging for libs not found... so shouldn't this folder actually exist? On Mon, Jun 3, 2013 at 2:06 PM, Rafał Kuć r@solr.pl wrote: Hello! You should remove that entry from your solrconfig.xml file. It is something like this: <lib dir="/non/existent/dir/yields/warning" /> -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch Hi, I am constantly getting this error in my Solr log: Can't find (or read) directory to add to classloader: /non/existent/dir/yields/warning (resolved as: E:\Projects\apache_solr\solr-4.3.0\example\solr\genesis_experimental\non\existent\dir\yields\warning). Anyone got any idea on how to solve this?
Re: Solr/Lucene Analayzer That Writes To File
Hello! Take a look at custom postings formats. For example, here is a nice post showing what you can do with the Lucene SimpleText codec: http://blog.mikemccandless.com/2010/10/lucenes-simpletext-codec.html However, please remember that it is not advised to use that codec in a production environment. -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch Hi; I want to use Solr for academic research. One step toward my goal is that I want to store tokens in a file (I will store them in a database later) and I don't want to index them. For such purposes, should I use core Lucene or Solr? Is there an example of writing a custom analyzer that just stores tokens in a file?
Re: search filter
Hello! You can try sending a filter like this: fq=Salary:[5+TO+10]+OR+Salary:0 It should work. -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch Dear All, Can I write a search filter for a field having a value in a range or a specific value? Say I want a filter like: select profiles with salary 5 to 10, or salary 0 - so I expect profiles having a salary of 0, 5, 6, 7, 8, 9, 10, etc. It should be possible; can somebody help me with the syntax of the 'fq' filter please? Best Regards, kamal
Re: Mandatory words search in SOLR
Hello! Change the default query operator. For example, add q.op=AND to your query. -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch Hi SOLR Experts, When I search documents with the keyword *java, mysql* then I get the documents containing either *java* or *mysql* or both. Is it possible to get only the documents that contain both *java* and *mysql*? In that case, what would the query look like? Thanks a lot, Kamal
Re: Indexing Point Number
Hello! Use a float field type in your schema.xml file, for example like this: <fieldType name="float" class="solr.TrieFloatField" precisionStep="0" positionIncrementGap="0"/> Define a field using this type: <field name="price" type="float" indexed="true" stored="true"/> You'll then be able to index data like this: <field name="price">19.95</field> Note that Solr expects a dot as the decimal separator, so values like 5,50 need to be converted to 5.50 before indexing. -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch Hi, how can I index numbers with a decimal point? For example: 5,50 109,90 I want to sort the numbers. Thanks
Re: Prediction About Index Sizes of Solr
Hello! Let me answer the first part of your question. Please have a look at https://svn.apache.org/repos/asf/lucene/dev/trunk/dev-tools/size-estimator-lucene-solr.xls It should help you make an estimate of your index size. -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch This may not be a well-detailed question, but I will try to make it clear. I am crawling web pages and will index them in SolrCloud 4.2. What I want to predict is the index size. I will have approximately 2 billion web pages, and I assume each of them will be 100 KB. I know that it depends on stored documents, stop words, etc. If you want to ask about details of my question I can give more explanation; however, there should be some analysis to help me, because I need to predict the index size. On the other hand, my other important question is how SolrCloud makes replicas of indexes - can I change how many replicas there will be? Because I should multiply the total index size by the number of replicas. Here I found an article related to my analysis: http://juanggrande.wordpress.com/2010/12/20/solr-index-size-analysis/ I know this question may not be detailed, but any ideas about it are welcome.
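To get a feel for the scale involved, here is a back-of-the-envelope calculation on the numbers from the question. The index-to-raw ratio and the replica count below are pure assumptions for illustration - the spreadsheet linked in the reply models this far more carefully, and the only reliable approach is measuring on a sample of real data:

```python
# Rough index size estimate: 2 billion pages at ~100 KB each.
pages = 2_000_000_000
page_size_kb = 100

raw_tb = pages * page_size_kb / 1024 ** 3   # raw crawl size, KB -> TB
index_ratio = 0.8   # assumed total index-to-raw ratio (postings + stored fields)
copies = 2          # leader plus one replica of every shard

one_copy_tb = raw_tb * index_ratio
total_tb = one_copy_tb * copies

print(f"raw crawl:     {raw_tb:.1f} TB")
print(f"one copy:      {one_copy_tb:.1f} TB")
print(f"with replicas: {total_tb:.1f} TB")
```

As for the second question: in SolrCloud the number of copies is the replicationFactor chosen when the collection is created, so the `copies` multiplier above is under your control.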
Re: Don't cache filter queries
Hello! Just add {!cache=false} to the filter in your query (http://wiki.apache.org/solr/SolrCaching#filterCache). -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch I need to use the filter query feature to filter my results, but I don't want the results cached, as documents are added to the index several times per second and the results will be stale immediately. Is there any way to disable filter query caching? This is on Solr 4.1 running in Jetty on Ubuntu Server. Thanks.
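A sketch of what that looks like in practice, with a hypothetical category filter:

```text
fq={!cache=false}category:books

# a cost can also be set so the uncached filter is evaluated after cheaper ones:
fq={!cache=false cost=100}category:books
```

The local-params prefix applies per filter, so other fq clauses in the same request continue to use the filter cache as usual.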
Re: Search data who does not have x field
Hello! You can for example add a filter that will remove the documents that don't have a value in a field, like: fq=!category:[*+TO+*] -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch Hi all, I am facing a problem. I have indexed 250 documents in Solr, and some of them have a category field and some don't. For example: { id:321, name:anurag, category:x }, { id:3, name:john } Now I want to search for the documents that do not have that field. What should the query look like? Please reply - it is very urgent, I have to complete this task today. Thanks in advance. -- View this message in context: http://lucene.472066.n3.nabble.com/Search-data-who-does-not-have-x-field-tp4046959.html Sent from the Solr - User mailing list archive at Nabble.com.
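Written out, with the category field from the question (the leading `-` is the more common spelling of the same negation):

```text
fq=!category:[* TO *]     documents with no value in category
fq=-category:[* TO *]     equivalent
fq=category:[* TO *]      the opposite: documents that do have a value
```

The `[* TO *]` range matches any document that has some value in the field, so negating it selects exactly the documents where the field is absent.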
Re: Returning to Solr 4.0 from 4.1
Hello! I suppose the only way to make this work will be reindexing the data. Solr 4.1 uses Lucene 4.1, as you know, which introduced a new default codec with stored fields compression, and this is one of the reasons you can't read that index with 4.0. -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch Solr 4.1 has been giving us much trouble, rejecting documents being indexed. While I try to work my way through this, I would like to move our application back to Solr 4.0. However, now when I try to start Solr with the same index that was created with Solr 4.0 but has been running on 4.1 for a few days, I get this error chain: org.apache.solr.common.SolrException: Error opening new searcher Caused by: org.apache.solr.common.SolrException: Error opening new searcher Caused by: java.lang.IllegalArgumentException: A SPI class of type org.apache.lucene.codecs.Codec with name 'Lucene41' does not exist. You need to add the corresponding JAR file supporting this SPI to your classpath. The current classpath supports the following names: [Lucene40, Lucene3x] Obviously I'll not be installing Lucene41 in Solr 4.0, but is there any way to work around this? Note that neither solrconfig.xml nor schema.xml have changed. Thanks.
Re: Returning to Solr 4.0 from 4.1
Hello! As far as I know you have to re-index using an external tool. -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch On Fri, Mar 1, 2013 at 11:59 AM, Rafał Kuć r@solr.pl wrote: Hello! I assumed that re-indexing could be painful in your case - if it weren't, you probably would have re-indexed by now :) I guess (I didn't test it myself) that you could create another collection inside your cluster, use the old Lucene 4.0 codec (setting the version in solrconfig.xml should be enough) and re-index, but re-indexing will still have to be done. Or maybe someone knows a better way? Will I have to reindex via an external script bridge, such as a Python script which requests N documents at a time, indexes them into Solr 4.1, then requests another N documents to index? Or is there an internal Solr / Lucene facility for this? I've actually looked for such a facility, but as I am unable to find one, I ask.
Re: solr 4.1 - trying to create 2 collection with 2 different sets of configurations
Hello! You can try doing the following: 1. Run Solr with no collection and no cores, just an empty solr.xml 2. If you don't have a ZooKeeper, run Solr with -DzkRun 3. Upload your configurations to ZooKeeper by running cloud-scripts/zkcli.sh -cmd upconfig -zkhost localhost:9983 -confdir CONFIGURATION_1_DIR -confname COLLECTION_1_NAME and cloud-scripts/zkcli.sh -cmd upconfig -zkhost localhost:9983 -confdir CONFIGURATION_2_DIR -confname COLLECTION_2_NAME 4. Create those two collections: curl 'http://localhost:8983/solr/admin/collections?action=CREATE&name=COLLECTION_1_NAME&numShards=2&replicationFactor=0' and curl 'http://localhost:8983/solr/admin/collections?action=CREATE&name=COLLECTION_2_NAME&numShards=2&replicationFactor=0' Of course CONFIGURATION_1_DIR and CONFIGURATION_2_DIR are the directories where your configurations are stored, and COLLECTION_1_NAME and COLLECTION_2_NAME are your collection names. Also adjust numShards and replicationFactor to your needs. -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch Solr 4.1 - trying to create 2 collections with 2 different sets of configurations. Has anyone accomplished this? If I run bootstrap twice on different conf dirs, I get both of them in ZooKeeper, but using the Collections API to create a collection with collection.configName=seconfConf doesn't work. Any idea? Thanks. -- View this message in context: http://lucene.472066.n3.nabble.com/solr-4-1-trying-to-create-2-collection-with-2-different-sets-of-configurations-tp4043609.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: solr 4.1 - trying to create 2 collection with 2 different sets of configurations
Hello! Yes I did :) -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch Thanks I'm going to try this. Have you tried it yourself? -- View this message in context: http://lucene.472066.n3.nabble.com/solr-4-1-trying-to-create-2-collection-with-2-different-sets-of-configurations-tp4043609p4043617.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: id field doesn't match
Hello! If what you need is an exact match, try using the simple string type. -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch Hi all, I have an id field which always contains a string with the schema vw-200130315- Which field type and settings should I use to get exactly this id as a result? Actually I always get more than one result. Kind regards, Benjamin
Re: Is solr reindex all documents for each commit?
Hello! You don't need to re-index all the documents, you only need to index the new ones. -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch Hi, We have a requirement to add 10 or 100 new documents to Solr on a daily basis. Our Solr application has 3 million documents. So whenever I add new documents, do I need to reindex all documents (3 million)? -- View this message in context: http://lucene.472066.n3.nabble.com/Is-solr-reindex-all-documents-for-each-commit-tp4041470.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Indexed And Stored
Hello! The simplest way will be updating your schema.xml file: make the change that needs to be done and fully re-index your data. Solr won't be able to automatically change a non-indexed field into an indexed one. You could also use the partial document update API of Solr if you don't have your original data, however there are a few limitations: in order to fully reconstruct your documents you would have to have all the fields stored in your index. -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch Hello, in my schema I have <field name="city_name" type="text_general" indexed="false" stored="true"/> and I have indexed 18 documents. Now I need indexed=true for all the old data. I need a solution - please someone help me out. Please reply, urgent!! Thanks -- View this message in context: http://lucene.472066.n3.nabble.com/Indexed-And-Stored-tp4039893.html Sent from the Solr - User mailing list archive at Nabble.com.
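For illustration, a partial (atomic) document update in Solr 4.x looks roughly like this - it requires the update log (<updateLog/>) to be enabled, and since Solr rebuilds the rest of the document from stored fields, all other fields must be stored; the id and the value here are made up:

```shell
# set a new value for city_name on an existing document; Solr reconstructs
# the remaining fields from their stored values and re-indexes the document
curl 'http://localhost:8983/solr/update?commit=true' \
  -H 'Content-Type: application/json' \
  -d '[{"id":"123","city_name":{"set":"Mumbai"}}]'
```

After the schema has been changed to indexed="true", re-adding a document this way makes the field searchable for that document.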
Re: Maximum Number of Records In Index
Hello! Right, my bad - ids are still using int32. However, that still gives us about 2.1 billion possible identifiers per single index, which is nowhere near the 13.5 million mentioned in the first mail. -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch Rafal, what about docnums - aren't they limited by int32? On 07.02.2013 at 15:33, Rafał Kuć r@solr.pl wrote: Hello! Practically there is no limit on how many documents can be stored in a single index. In your case, as you are using Solr from 2011, there is a limitation regarding the number of unique terms per Lucene segment (http://lucene.apache.org/core/old_versioned_docs/versions/3_0_0/fileformats.html#Limitations). However, I don't think you've hit that. Solr by itself doesn't remove documents unless told to do so. It's hard to guess what the reason could be and, as you said, you see updates coming to your handler. Maybe new documents have the same identifiers as the ones that are already indexed? As I said, this is only a guess and we would need more information. Are there any exceptions in the logs? Do you run delete commands? Are your index files changed? How do you run commit? -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch I have searched this forum but not yet found a definitive answer - I think the answer is "there is no limit, it depends on the server specification". But nevertheless I will say what I have seen and then ask the questions. From scratch (November 2011) I set up our SOLR, which contains data from various sources. Since March 2012 the number of indexed records (unique IDs) reached 13.5 million, which was to be expected. However, for the last 8 months the number of records in the index has not gone above 13.5 million, yet looking at the request handler outputs I can safely say anywhere from 50 thousand to 100 thousand records are being indexed daily. 
So I am assuming that earlier records are being removed, and I do not want that. Question: if there is a limit to the number of records the index can store, where do I find this and change it? Question: if there is no limit, does anyone have any idea why for the last 8 months the number has not gone beyond 13.5 million? I can safely say that at least 90% are new records. Thanks, macroman -- View this message in context: http://lucene.472066.n3.nabble.com/Maximum-Number-of-Records-In-Index-tp4038961.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Error setting up SOLR with Tomcat on Windows
Hello! What Gora says is valid. Just point SimplePostTool to the correct host, core name and handler and it should work just fine - provided, of course, that Solr is up and running. For example, the following works just fine with Solr running on port 8080: java -Durl=http://localhost:8080/solr/collection1/update -jar post.jar *.xml -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch (a) I am not using Jetty, I am using Tomcat as mentioned in the subject line (b) I am using some other port number and that I have included in the URL On Monday, January 28, 2013, Gora Mohanty g...@mimirtech.com wrote: On 28 January 2013 16:11, Neha Jatav neha.ja...@gmail.com wrote: Dear Gora Mohanty, I am not using solr-url literally. I am using the localhost URL slash solr. Again, please read the documentation. As mentioned earlier: (a) you do not need -Durl=... if using the built-in Jetty, (b) and the URL should include the 8983 port number. Regards, Gora
Re: Does solr 4.1 support field compression?
Hello! It should be turned on by default - stored field compression is part of the default Lucene 4.1 codec's behavior. -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch Hi everyone, I didn't see any mention of field compression in the release notes for Solr 4.1. Did the ability to automatically compress fields end up getting added to this release? Thanks! Ken
Re: ResultSet Solr
Hello! Maybe you are looking to get the results in plain text if you want to remove all the XML tags ? If so, you can try adding wt=csv to get the response as CSV instead of XML. -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch Thanks, half the problem solved. Is there any way I can get rid of the rest: response, numFound, start, doc? -- View this message in context: http://lucene.472066.n3.nabble.com/ResultSet-Solr-tp4035729p4035733.html
Re: ResultSet Solr
Hello! As far as I know you can't remove response, numFound, start and docs. This is how the response is prepared by Solr and, apart from removing the header, you can't do anything about it. -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch No, I wanted it in JSON. I want it to start from where the square bracket starts: [. I want to remove everything before that. I can get it in JSON by including wt=json; I just want to remove response, numFound, start and docs. -- View this message in context: http://lucene.472066.n3.nabble.com/ResultSet-Solr-tp4035729p4035748.html
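Solr itself won't strip that wrapper (beyond omitHeader=true, which removes only the responseHeader), so the usual approach is to trim it client-side. A hedged Python sketch against a made-up wt=json response:

```python
import json

# A trimmed-down, illustrative example of a wt=json Solr response.
raw = """
{"responseHeader": {"status": 0, "QTime": 1},
 "response": {"numFound": 2, "start": 0,
              "docs": [{"id": "1", "name": "first"},
                       {"id": "2", "name": "second"}]}}
"""

# Keep only the part that starts at the '[' - the docs array itself.
docs = json.loads(raw)["response"]["docs"]
print(json.dumps(docs))
```

The same one-liner works whatever the query was, since the wrapper structure is fixed.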
Re: curl with dynamic url not working
Hello! Try surrounding your URL with ' characters, so the whole command looks like this: curl 'http://127.0.0.1:8080/solr/newsRSS_DIH?command=full-import&clean=false&url=http://www.example.com/news' -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch While running the following curl command, the URL's parameters were never sent when invoking it: curl http://127.0.0.1:8080/solr/newsRSS_DIH?command=full-import&clean=false&url=http://www.example.com/news LOG details: INFO: [collection1] webapp=/solr path=/newsRSS_DIH params={command=full-import} status=0 QTime=11 Jan 21, 2013 10:19:14 AM org.apache.solr.handler.dataimport.DataImporter doFullImport If I run the command in a browser without curl, it works as expected and indexes properly. Please let me know of any pointers to resolve this issue. -- View this message in context: http://lucene.472066.n3.nabble.com/curl-with-dynamic-url-not-working-tp4035092.html
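Quoting fixes the shell problem (an unquoted & would otherwise end the command, which is why only command=full-import reached Solr), but the nested url parameter is also worth percent-encoding so its characters aren't confused with separators. A small Python sketch of building such a URL; host and handler names are taken from the thread:

```python
from urllib.parse import urlencode

base = "http://127.0.0.1:8080/solr/newsRSS_DIH"
params = {
    "command": "full-import",
    "clean": "false",
    # The nested URL is a parameter *value*, so urlencode escapes
    # its ':' and '/' characters for us.
    "url": "http://www.example.com/news",
}
full = base + "?" + urlencode(params)
print(full)
```

The resulting URL can be passed to curl (still quoted, to be safe) or to any HTTP client.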
Re: Priorities on fields
Hello! What do you mean by priority ? You can define index- or query-time boosts, which allow you to specify the importance of a field. A good page to look at is: http://wiki.apache.org/solr/SolrRelevancyCookbook -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch Hi, is it possible to define priorities on fields? Let's say I have a product table with the following fields: - id - title - description - code_name An entry could look like this: id: 42 title: shiny new shoes description: Shiny new shoes made in Italy code_name: shiny-new-shoes-42-2013 Now I would like to prioritize the fields for search. I would like to do it as follows: id: 0.0 title: 0.8 description: 0.5 code_name: 0.1 Is this possible in SOLR 3.6.1? Dariusz
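For query-time weights like the ones asked about, the edismax parser's qf parameter is a common fit (it is available in Solr 3.6.1). A sketch of how this could look in solrconfig.xml - the handler name /products is made up for illustration; the weights come from the question:

```xml
<requestHandler name="/products" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">edismax</str>
    <!-- per-field weights: title matches count most, code_name least -->
    <str name="qf">title^0.8 description^0.5 code_name^0.1</str>
  </lst>
</requestHandler>
```

A field with weight 0.0 (like id above) is simply left out of qf so it does not participate in scoring.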
Re: Calculate a sum.
Hello! Fetching all the documents, especially for a query that returns many of them, can be a pain. There is, however, a StatsComponent (http://wiki.apache.org/solr/StatsComponent) in Solr, although your field would have to be numeric and indexed. -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch Hello. My problem is that I need to calculate a sum of amounts. This amount is in my index (stored=true). My PHP script gets all the values with paging, but if a request takes too long, Jetty kills the process and I get a broken pipe. What is the best/fastest way to get the values of many fields from the index? Does a ResponseHandler for exports exist? Or which is the fastest? -- View this message in context: http://lucene.472066.n3.nabble.com/Calculate-a-sum-tp4033091.html
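With stats=true&stats.field=amount (the field name amount is an assumption here) Solr computes the sum server-side, so no paging through documents is needed. A Python sketch of reading the sum out of a sample response:

```python
import json

# Trimmed, illustrative sample of a stats=true&stats.field=amount&rows=0
# response; figures are made up.
raw = """
{"stats": {"stats_fields": {"amount": {"sum": 12345.67, "count": 42}}}}
"""

amount_stats = json.loads(raw)["stats"]["stats_fields"]["amount"]
total = amount_stats["sum"]  # the whole sum in one request
print(total)
```

Adding rows=0 to the query keeps the response small, since only the stats block is needed.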
Re: defaultOperator in schema.xml
Hello! You should set the q.op parameter in your request handler configuration in solrconfig.xml instead of using the default operator from schema.xml. -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch I'm testing out Solr 4.0. I got the sample schema working, so now I'm converting my existing schema (from a Solr 3.4 instance), but I'm confused as to what to do for the defaultOperator setting of solrQueryParser. For my existing Solr, I have: <solrQueryParser defaultOperator="AND"/> The 4.0 schema says that defaultOperator is deprecated and seems to suggest that I just pass the operator along in my queries. Is there no way I can set it to AND by default somewhere else? I don't control all the applications that use my Solr indexes, and I want to ensure that they operate with AND and not OR. Thanks! -- Chris
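Setting q.op in the defaults of a search handler in solrconfig.xml could look like this (the handler name /select is the stock one; adjust to the handlers your applications actually hit):

```xml
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <!-- AND between terms unless a client explicitly sends q.op -->
    <str name="q.op">AND</str>
  </lst>
</requestHandler>
```

Because it lives in defaults, clients that never send q.op get AND, which matches the "I don't control all the applications" requirement.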
Re: No live SolrServers Solr 4 exceptions on trying to create a collection
Hello! Can you share the command you use to start all four Solr servers ? -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch Any clue as to why this is happening will be greatly appreciated; this has become a blocker for me. I can use the HttpSolrServer to create a core/make requests etc., but then it behaves like Solr 3.6 (http://host:port/solr/admin/cores and not http://host:port/solr/admin/collections). With my setup (4 servers running at localhost 8983, 8900, 7574 and 7500), when I manually do a http://127.0.0.1:7500/solr/admin/cores?action=CREATE&name=myColl1&instanceDir=default&dataDir=myColl1Data&collection=myColl1&numShards=2 it creates the collection only at the 7500 server. This is similar to when I use HttpSolrServer (Solr 3.6 behavior). And of course when I initiate a http://127.0.0.1:7500/solr/admin/collections?action=CREATE&name=myColl2&instanceDir=default&dataDir=myColl2Data&collection=myColl2&numShards=2 it, as expected, creates the collection spread over 2 servers. I am failing to achieve the same with SolrJ. As in the code at the bottom of this mail, I use CloudSolrServer and get the No live SolrServers exception. Any help or direction on how to create collections (using the collections API) with SolrJ will be highly appreciated. Regards Jay -Original Message- From: Jay Parashar [mailto:jparas...@itscape.com] Sent: Sunday, January 06, 2013 7:42 PM To: solr-user@lucene.apache.org Subject: RE: Solr 4 exceptions on trying to create a collection The exception No live SolrServers is being thrown when trying to create a new collection (code at the end of this mail). In the CloudSolrServer request method we have this line: ClientUtils.appendMap(coll, slices, clusterState.getSlices(coll)); where coll is the new collection I am trying to create, and hence clusterState.getSlices(coll) is returning null.
And then the loop over the slices which adds to the urlList never happens, and hence the LBHttpSolrServer created in the CloudSolrServer has a null URL list in the constructor. This is giving the No live SolrServers exception. What am I missing? Instead of passing the CloudSolrServer to create.process, if I pass the LBHttpSolrServer (server.getLbServer()), the collection gets created but only on one server. My code to create a new cloud server and new collection:- String[] urls = {"http://127.0.0.1:8983/solr/", "http://127.0.0.1:8900/solr/", "http://127.0.0.1:7500/solr/", "http://127.0.0.1:7574/solr/"}; CloudSolrServer server = new CloudSolrServer("127.0.0.1:2181", new LBHttpSolrServer(urls)); server.getLbServer().getHttpClient().getParams().setParameter(CoreConnectionPNames.CONNECTION_TIMEOUT, 5000); server.getLbServer().getHttpClient().getParams().setParameter(CoreConnectionPNames.SO_TIMEOUT, 2); server.setDefaultCollection(collectionName); server.connect(); CoreAdminRequest.Create create = new CoreAdminRequest.Create(); create.setCoreName("myColl"); create.setCollection("myColl"); create.setInstanceDir("default"); create.setDataDir("myCollData"); create.setNumShards(2); create.process(server); // Exception No live SolrServers is thrown here Thanks Jay -Original Message- From: Alexandre Rafalovitch [mailto:arafa...@gmail.com] Sent: Friday, January 04, 2013 6:08 PM To: solr-user@lucene.apache.org Subject: Re: Solr 4 exceptions on trying to create a collection Tried Wireshark yet to see what host/port it is trying to connect to and why it fails? It is a complex tool, but well worth learning. Regards, Alex. Personal blog: http://blog.outerthoughts.com/ LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch - Time is the quality of nature that keeps events from happening all at once. Lately, it doesn't seem to be working. (Anonymous - via GTD book) On Fri, Jan 4, 2013 at 6:58 PM, Jay Parashar jparas...@itscape.com wrote: Thanks!
I had a different version of httpclient on the classpath. With that fixed, the second exception is gone, but now I am back to the first one: org.apache.solr.client.solrj.SolrServerException: No live SolrServers available to handle this request -Original Message- From: Alexandre Rafalovitch [mailto:arafa...@gmail.com] Sent: Friday, January 04, 2013 4:21 PM To: solr-user@lucene.apache.org Subject: Re: Solr 4 exceptions on trying to create a collection For the second one: a wrong version of a library on the classpath, or multiple versions of the library on the classpath, which causes wrong classes with missing fields/variables? Or the library interface got baked in and the implementation is newer. Some sort of mismatch, basically - most probably in the Apache HTTP library. Regards, Alex. Personal blog: http://blog.outerthoughts.com/ LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch - Time is the quality of nature that keeps events from happening all at once. Lately, it doesn't seem to be working. (Anonymous - via GTD book)
Re: Is there any C API for Solr??
Hello! Although not C but C++, there is a project aiming at this - http://code.google.com/p/solcpp/ However, I don't know how usable it is; alternatively, you can just make plain HTTP calls and process the response yourself. -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch Hi All, is there any C API for Solr? Thanks and regards, Romita
Re: What do I need to research to solve the problem of returning good results for a generic term?
Hello! Ahmet, I think it is available here: http://archive.apachecon.com/eu2012/presentations/07-Wednesday/L1R-Lucene/ -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch Hi David, Mikhail tackled a similar problem: http://www.apachecon.eu/schedule/presentation/18/ Where can we find the full presentation? --- On Fri, 12/28/12, David Parks davidpark...@yahoo.com wrote: From: David Parks davidpark...@yahoo.com Subject: What do I need to research to solve the problem of returning good results for a generic term? To: solr-user@lucene.apache.org Date: Friday, December 28, 2012, 11:40 AM I'm sure this is a complex problem requiring many iterations of work, so I'm just looking for pointers in the right direction of research here. I have a base term, such as, let's say, black dress, that I might search for. Someone searching for this term is most likely looking for black dresses. In my dataset I have black dresses, but I also have many CDs with the term black dress in them (it's not so uncommon a song title). I would want the CDs to show up if I search for a more specific term like black dress CD, but I would want the black dresses to show up for the less specific term black dress. Google image search is excellent at handling this example; a pretty vanilla installation of Solr isn't yet great at it. So, I'm just looking for a nudge in the right direction here. What should I go read up on first to start learning how to improve these results?
Re: facet query
Hello! Try facet.mincount=1 - it makes Solr skip facet values with a count of zero. -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch Hi, is there a way with facets to say: return only facet values that are not 0? I have facet=true&facet.field=office&facet.field=name as my facet parameters, and with some of my queries it brings back people that have a value of 0. Thanks
Re: Getting error. org.apache.solr.common.SolrException: Error loading class 'org.apache.solr.handler.dataimport.DataImportHandler
Hello! It seems that Tomcat can't see the jars with the DataImportHandler. Did you try adding a lib tag to your solrconfig.xml, i.e. something like this: <lib dir="/usr/share/solr/lib/" regex="apache-solr-dataimporthandler-\d.*\.jar" /> And putting apache-solr-dataimporthandler-3.6.1.jar and apache-solr-dataimporthandler-extras-3.6.1.jar into /usr/share/solr/lib ? -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch I am new to Solr. I have installed Apache Tomcat 7.0 on my server and I have Solr 3.6.1 on the server. I have a solr-home folder set up by the network guys on my D:\ drive. The folders in it are: bin, etc, logs, multicore, webapps. In the multicore folder there are: core0, core1, exampledocs, README.txt and solr.xml. In the webapps folder I have the solr.war file, nothing else. Now I put one more core folder in the multicore folder, named ConfigUserTextUpdate, which has a conf folder in it, restart the Tomcat service, and I can see the new core on localhost/solr. Now I add db-config.xml to the ConfigUserTextUpdate core. Below is the content: <dataConfig> <dataSource type="JdbcDataSource" driver="com.microsoft.sqlserver.jdbc.SQLServerDriver" url="jdbc:sqlserver://dbname\servername;databaseName=dbname" user="username" password="password"/> <document> <entity name="ConfigUserTextUpdate" query="UserTextUpdate_Index"> </entity> </document> </dataConfig> Up to here everything is fine and all three cores are shown on localhost/solr. Now, in the solrconfig.xml of the core ConfigUserTextUpdate, I add the lines <requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler"> <lst name="defaults"> <str name="config">db-data-config.xml</str> </lst> </requestHandler> And it starts giving the error shown below: HTTP Status 500 - Severe errors in solr configuration. Check your log files for more detailed information on what may be wrong.
If you want solr to continue after configuration errors, change: <abortOnConfigurationError>false</abortOnConfigurationError> in solr.xml - org.apache.solr.common.SolrException: Error loading class 'org.apache.solr.handler.dataimport.DataImportHandler' at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:394) at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:419) at org.apache.solr.core.SolrCore.createRequestHandler(SolrCore.java:455) at org.apache.solr.core.RequestHandlers.initHandlersFromConfig(RequestHandlers.java:159) at org.apache.solr.core.SolrCore.&lt;init&gt;(SolrCore.java:563) at org.apache.solr.core.CoreContainer.create(CoreContainer.java:480) at org.apache.solr.core.CoreContainer.load(CoreContainer.java:332) at org.apache.solr.core.CoreContainer.load(CoreContainer.java:216) at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:161) at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:96) at org.apache.catalina.core.ApplicationFilterConfig.initFilter(ApplicationFilterConfig.java:277) at org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:258) at org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:382) at org.apache.catalina.core.ApplicationFilterConfig.&lt;init&gt;(ApplicationFilterConfig.java:103) at org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:4650) at org.apache.catalina.core.StandardContext.startInternal(StandardContext.java:5306) at org.apache.catalina.util.LifecycleBase.start(LifecycleBase.java:150) at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:901) at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:877) at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:633) at org.apache.catalina.startup.HostConfig.deployWAR(HostConfig.java:977) at org.apache.catalina.startup.HostConfig$DeployWar.run(HostConfig.java:1655) at java.util.concurrent.Executors$RunnableAdapter.call(Unknown
Source) at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source) at java.util.concurrent.FutureTask.run(Unknown Source) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source) at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) at java.lang.Thread.run(Unknown Source) Caused by: java.lang.ClassNotFoundException: org.apache.solr.handler.dataimport.DataImportHandler at java.net.URLClassLoader$1.run(Unknown Source) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(Unknown Source) at java.lang.ClassLoader.loadClass(Unknown Source) at java.net.FactoryURLClassLoader.loadClass(Unknown Source) at java.lang.ClassLoader.loadClass(Unknown Source) at java.lang.Class.forName0(Native Method) at java.lang.Class.forName(Unknown Source) at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:378) ... 27 more I have
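For reference, the two fragments suggested in the reply above would look like this once formatted - the /usr/share/solr/lib path is the one proposed in the reply and will of course differ on a Windows install:

```xml
<!-- solrconfig.xml, inside <config>: make the DIH jars visible -->
<lib dir="/usr/share/solr/lib/" regex="apache-solr-dataimporthandler-\d.*\.jar" />

<requestHandler name="/dataimport"
                class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">db-data-config.xml</str>
  </lst>
</requestHandler>
```

After copying the jars into the lib directory and restarting Tomcat, the ClassNotFoundException should go away.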
Re: user session id / cookie to record search query
Hello! You want it to be written into the logs ? If that is the case, you can just add an additional parameter that is not recognized by Solr, for example 'userId', and send a query like this: http://localhost:8983/solr/select?q=*:*&userId=user1 In the logs you should see something like this: INFO: [collection1] webapp=/solr path=/select params={q=*:*&userId=user1} hits=0 status=0 QTime=1 -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch Hi All, does anyone have an idea how to use a user session id / cookie to record the search queries from that particular user? Thanks and regards, Romita
Re: user session id / cookie to record search query
Hello! Which Solr version are you using ? If not 4.0, information on logging can be found on the wiki - http://wiki.apache.org/solr/SolrLogging -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch Hello Rafał Kuć, thanks a lot for your guidance. I am not quite sure how to collect the logs. Could you please help? Romita From: Rafał Kuć r@solr.pl To: solr-user@lucene.apache.org, Date: 11/21/2012 04:57 PM Subject: Re: user session id / cookie to record search query Hello! You want it to be written into the logs ? If that is the case, you can just add an additional parameter that is not recognized by Solr, for example 'userId', and send a query like this: http://localhost:8983/solr/select?q=*:*&userId=user1 In the logs you should see something like this: INFO: [collection1] webapp=/solr path=/select params={q=*:*&userId=user1} hits=0 status=0 QTime=1
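Once the extra userId parameter is in the logs, collecting the queries per user is mostly log parsing. A hedged Python sketch for log lines of the format shown earlier in the thread (assuming the parameters inside params={...} are &-separated):

```python
import re

line = ('INFO: [collection1] webapp=/solr path=/select '
        'params={q=*:*&userId=user1} hits=0 status=0 QTime=1')

# Grab the params={...} blob, then split it into key/value pairs.
m = re.search(r'params=\{([^}]*)\}', line)
params = dict(p.split('=', 1) for p in m.group(1).split('&'))
print(params['userId'], params['q'])
```

Running this over every request line and grouping by userId gives the per-user query history.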
Re: Pls help: Very long query - what to do?
Hello! If you really need a query that long, then one option is to increase the allowed header length in Jetty. Add the following to your Jetty connector configuration: <Set name="headerBufferSize">16384</Set> -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch My query is like the one below. I already use a POST request. I got a Solr exception: org.apache.solr.client.solrj.SolrServerException: Server at http://server:7056/solr returned non ok status:400, message:Bad Request Is there a way to prevent this? id:(ModuleImpl@20117 OR ModuleImpl@37886 OR ModuleImpl@9379 OR ModuleImpl@37906 OR ModuleImpl@19969 OR ModuleImpl@37936 OR ModuleImpl@115568 OR ModuleImpl@19901 OR ModuleImpl@115472 OR ModuleImpl@20044 OR ModuleImpl@25168 OR ModuleImpl@38026 OR ModuleImpl@115647 OR ModuleImpl@115648 OR ModuleImpl@115649 OR ModuleImpl@20045 OR ModuleImpl@25169 OR ModuleImpl@38031 OR ModuleImpl@115650 OR ModuleImpl@21090 OR ModuleImpl@38037 OR ModuleImpl@117097 OR ModuleImpl@21091 OR ModuleImpl@38038 OR ModuleImpl@117098 OR ModuleImpl@117099 OR ModuleImpl@19973 OR ModuleImpl@38040 OR ModuleImpl@115571 OR ModuleImpl@115572 OR ModuleImpl@115573 OR ModuleImpl@21092 OR ModuleImpl@38135 OR ModuleImpl@117100 OR ModuleImpl@21093 OR ModuleImpl@38136 OR ModuleImpl@117101 OR ModuleImpl@117102 OR ModuleImpl@19979 OR ModuleImpl@38140 OR ModuleImpl@115581 OR ModuleImpl@19980 OR ModuleImpl@38143 OR ModuleImpl@115582 OR ModuleImpl@115583 OR ModuleImpl@21094 OR ModuleImpl@38223 OR ModuleImpl@117104 OR ModuleImpl@117105 OR ModuleImpl@117106 OR ModuleImpl@117107 OR ModuleImpl@117108 OR ModuleImpl@21095 OR ModuleImpl@38224 OR ModuleImpl@117109 OR ModuleImpl@19920 OR ModuleImpl@25157 OR ModuleImpl@38240 OR ModuleImpl@115493 OR ModuleImpl@20139 OR ModuleImpl@38286 OR ModuleImpl@115752 OR ModuleImpl@21096 OR ModuleImpl@38327 OR ModuleImpl@117111 OR ModuleImpl@117112 OR ModuleImpl@117113 OR ModuleImpl@21097 OR ModuleImpl@38328 OR ModuleImpl@117114 OR
ModuleImpl@19989 OR ModuleImpl@25166 OR ModuleImpl@38332 OR ModuleImpl@115585 OR ModuleImpl@115586 OR ModuleImpl@19990 OR ModuleImpl@38339 OR ModuleImpl@115587 OR ModuleImpl@115588 OR ModuleImpl@115589 OR ModuleImpl@115590 OR ModuleImpl@115591 OR ModuleImpl@115592 OR ModuleImpl@115593 OR ModuleImpl@115594 OR ModuleImpl@115595 OR ModuleImpl@19807 OR ModuleImpl@38365 OR ModuleImpl@115365 OR ModuleImpl@115366 OR ModuleImpl@19808 OR ModuleImpl@38373 OR ModuleImpl@115367 OR ModuleImpl@115368 OR ModuleImpl@115369 OR ModuleImpl@115370 OR ModuleImpl@115371 OR ModuleImpl@21121 OR ModuleImpl@38418 OR ModuleImpl@117132 OR ModuleImpl@117133 OR ModuleImpl@117134 OR ModuleImpl@732 OR ModuleImpl@38438 OR ModuleImpl@117115 OR ModuleImpl@21099 OR ModuleImpl@38440 OR ModuleImpl@117116 OR ModuleImpl@19929 OR ModuleImpl@38450 OR ModuleImpl@115501 OR ModuleImpl@115502 OR ModuleImpl@19810 OR ModuleImpl@38471 OR ModuleImpl@115372 OR ModuleImpl@115373 OR ModuleImpl@21124 OR ModuleImpl@38529 OR ModuleImpl@117135 OR ModuleImpl@117136 OR ModuleImpl@117137 OR ModuleImpl@117138 OR ModuleImpl@19931 OR ModuleImpl@115505 OR ModuleImpl@21074 OR ModuleImpl@38546 OR ModuleImpl@117077 OR ModuleImpl@19934 OR ModuleImpl@38548 OR ModuleImpl@115507 OR ModuleImpl@115508 OR ModuleImpl@115509 OR ModuleImpl@115510 OR ModuleImpl@20550 OR ModuleImpl@38607 OR ModuleImpl@115885 OR ModuleImpl@21127 OR ModuleImpl@38638 OR ModuleImpl@117139 OR ModuleImpl@21077 OR ModuleImpl@25182 OR ModuleImpl@38657 OR ModuleImpl@117078 OR ModuleImpl@117079 OR ModuleImpl@117080 OR ModuleImpl@19938 OR ModuleImpl@38658 OR ModuleImpl@115516 OR ModuleImpl@115517 OR ModuleImpl@115518 OR ModuleImpl@115519 OR ModuleImpl@19864 OR ModuleImpl@115432 OR ModuleImpl@19769 OR ModuleImpl@38695 OR ModuleImpl@115320 OR ModuleImpl@20556 OR ModuleImpl@38720 OR ModuleImpl@20494 OR ModuleImpl@38736 OR ModuleImpl@19871 OR ModuleImpl@115438 OR ModuleImpl@21056 OR ModuleImpl@38771 OR ModuleImpl@19775 OR ModuleImpl@19776 OR ModuleImpl@38802 OR 
ModuleImpl@115330 OR ModuleImpl@115331 OR ModuleImpl@115332 OR ModuleImpl@20566 OR ModuleImpl@38835 OR ModuleImpl@115889 OR ModuleImpl@115890 OR ModuleImpl@20501 OR ModuleImpl@38846 OR ModuleImpl@115869 OR ModuleImpl@115870 OR ModuleImpl@21107 OR ModuleImpl@38859 OR ModuleImpl@117118 OR ModuleImpl@19879 OR ModuleImpl@38871 OR ModuleImpl@115444 OR ModuleImpl@115445 OR ModuleImpl@21058 OR ModuleImpl@38873 OR ModuleImpl@19823 OR ModuleImpl@25153 OR ModuleImpl@38896 OR ModuleImpl@115396 OR ModuleImpl@115397 OR ModuleImpl@19779 OR ModuleImpl@38904 OR ModuleImpl@115334 OR ModuleImpl@115335 OR ModuleImpl@115336 OR ModuleImpl@20574 OR ModuleImpl@38932 OR ModuleImpl@115892 OR ModuleImpl@115893 OR ModuleImpl@20504 OR ModuleImpl@38941 OR ModuleImpl@115871 OR ModuleImpl@115872 OR ModuleImpl@21083 OR ModuleImpl@38962 OR ModuleImpl@117081 OR ModuleImpl@117082 OR ModuleImpl@19884 OR ModuleImpl@38969 OR ModuleImpl@115449
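For context, the headerBufferSize setting sits on the connector in jetty.xml. A sketch for the Jetty 6 (org.mortbay) connector bundled with Solr examples of that era - the class name and port are assumptions, so adjust them to your Jetty version and deployment:

```xml
<!-- jetty.xml: raise the request header limit on the connector -->
<Call name="addConnector">
  <Arg>
    <New class="org.mortbay.jetty.nio.SelectChannelConnector">
      <Set name="port">7056</Set>
      <Set name="headerBufferSize">16384</Set>
    </New>
  </Arg>
</Call>
```

Note that a true POST body is not subject to the header limit; the 400 usually means the parameters still ended up on the request line, so it is worth double-checking how the client sends them.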
Re: SolrCloud and external Zookeeper ensemble
Hello! As I said, I wouldn't use the ZooKeeper that is embedded into Solr, but rather set up a standalone one. -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch First of all: thank you for your answers. Yes, I meant a side-by-side configuration. I think the worst case for the ZKs here is to lose two of them. However, I'm going to use 4 availability zones in the same region, so at least this will reduce the risk of losing both of them at the same time. Regards. On 21 November 2012 17:06, Rafał Kuć r@solr.pl wrote: Hello! Zookeeper by itself is not demanding, but if something happens to the nodes that have Solr on them, you'll lose ZooKeeper too if they are installed side by side. However, if you have 4 Solr nodes and 3 ZK instances, you can get them running side by side. -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch Separate is generally nice because then you can restart Solr nodes without consideration for ZooKeeper. Performance-wise, I doubt it's a big deal either way. - Mark On Nov 21, 2012, at 8:54 AM, Marcin Rzewucki mrzewu...@gmail.com wrote: Hi, I have 4 Solr collections, 2-3mn documents per collection, up to 100K updates per collection daily (roughly). I'm going to create a 4-node SolrCloud on Amazon's m1.large instances (7GB mem, 2x2.4GHz CPU each). The question is: what about ZooKeeper? It's going to be an external ensemble, but is it better to use the same nodes as Solr or dedicated micro instances? ZooKeeper does not seem to be a resource-demanding process, but what would be better in this case? To keep it inside of the SolrCloud nodes or separately (micro instances seem to be enough here)? Thanks in advance. Regards.
Re: Additional field informations?
Hello! I think that's not possible out of the box in Solr. -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch Hello all, we import XML documents into Solr with SolrJ. We use XSL to process the objects into fields, and we have the language information in our objects. After XSL, our documents look like this: <add> <doc> ... <field name="title">german title</field> <field name="title">english title</field> <field name="title">french title</field> ... </doc> </add> Our schema.xml looks like this (we use it as a filter too): ... <field name="title" type="string" indexed="true" stored="true" multiValued="true" /> ... Our results look like this (we want to transform them directly to HTML with XSL): <arr name="title"> <str>german title</str> <str>english title</str> <str>french title</str> </arr> Is there any possibility to get a result like this: <arr name="title"> <str lang="de">german title</str> <str lang="en">english title</str> <str lang="fr">french title</str> </arr>
Re: SOLR USING 100% percent CPU and not responding after a while
Hello! How about some information on how the garbage collector behaves during the time when Solr is unresponsive ? -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch Hello, we were doing comparatively fine with memory. This was in a one-hour span of load testing we did, and no out-of-memory exceptions were seen in the log. I have copied the memory chart here; it shows that most of the memory was being used for cache. Thank you -Original Message- From: Rafał Kuć [mailto:r@solr.pl] Sent: Tuesday, November 20, 2012 11:25 AM To: solr-user@lucene.apache.org Subject: Re: SOLR USING 100% percent CPU and not responding after a while Hello! The first thing one should ask is what the cause of the 100% CPU utilization is. It can be an issue with memory, or your queries may be quite resource-intensive. Can you provide some more information about your issues ? Do you see any OutOfMemory exceptions in your logs ? How is the JVM garbage collector behaving when you are experiencing Solr being unresponsive ? -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch Hello all, we have Solr 3.6 in a Tomcat server with 16 GB of memory allocated to it on a Linux server. It is a multicore Solr and one of the cores has over 100 million records. We have a web application that is fed by this Solr. We don't use sharding; the different cores are for different purposes. There are all kinds of queries involved: simple queries, queries with filters, queries with facets, queries with pagination (deep pagination as well) and combinations of them. When there are 100 concurrent users and quite some queries are thrown at Solr, the CPU utilization of Solr exceeds 100% and it becomes unresponsive. Has anyone from this group had this problem and found a solution? We did look at the queries from the logs and there is some optimization to be done there, but is there anything we can do in the configuration?
I know that there is a parameter called mergeFactor which can be set low in solrconfig.xml to enable faster searching, but I don't see anything about CPU utilization related to that setting. So my question, in short, is: are there any CPU-utilization-related configuration settings for Solr? Any help will be appreciated a lot. Thank you a bunch, Biva
Re: Handle Queries which return 1000s of records
Hello! By pieces do you mean paging the results ? If yes, please look at http://wiki.apache.org/solr/CommonQueryParameters - the start and rows parameters. -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch Hi, I am integrating Solr search into my website. I have some keywords that would return thousands of records. Is there any way in Solr by which I can get the results in pieces? Pratyul
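The start and rows parameters can be driven by a tiny paging helper. A Python sketch (the page size of 100 is arbitrary):

```python
def pages(num_found, rows=100):
    """Yield (start, rows) pairs covering num_found results."""
    for start in range(0, num_found, rows):
        yield start, rows

# e.g. 250 matching documents fetched 100 at a time:
# &start=0&rows=100, then &start=100&rows=100, then &start=200&rows=100
print(list(pages(250)))
```

numFound from the first response tells the client how far to page; each (start, rows) pair becomes one Solr request.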
Re: How to insert documents into differenet indexes
Hello! Just use the update handler that is specific to a given core. For example, if you have two cores named core1 and core2, you should use the following addresses (if you didn't change the default configuration): /solr/core1/update/ and /solr/core2/update/ -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch Hi, I've set up a Solr instance with multiple cores to be able to use different indexes for different applications. The point I'm struggling with is: how do I insert documents into the index running on a specific core? Any clue appreciated. best
Re: Sorting by boolean function
Hello! From looking at your sort parameter it seems that you are missing the order of sorting. The function query will return a value on which Solr will sort, and Solr needs to know if the documents should be sorted in descending or ascending order. Try something like this: sort=if(exists(start_time:[* TO 1721] AND end_time:[1721 TO *]),100,0)+desc -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch Hi All, I am trying to sort by a boolean function to sort the places that are open and not open, as given below: sort=if(exists(start_time:[* TO 1721] AND end_time:[1721 TO *]),100,0) However, I keep getting an error message saying Can't determine a Sort Order (asc or desc) in sort spec 'if(exists(start_time:[* TO 1721] AND end_time:[1721 TO *]),100,0) asc', pos=23 I assume that boolean functions can be used to sort. Is there anything missing from the sort function ? Thanks, Indika
Re: Sorting by boolean function
Hello! Ahhh, sorry. I see now. The function query you are specifying for the sort is not proper. Solr doesn't understand this part: exists(start_time:[* TO 1721] AND end_time:[1721 TO *]). Try something like this: sort=if(exists(query($innerQuery)),100,0)+desc&innerQuery={!edismax}(start_time:[*+TO+1721] AND end_time:[1721+TO+*]) However, remember that the exists function is only available in Solr 4.0. I don't have any example data with me right now, so I can't say if it will work for sure, but a clean Solr 4.0 doesn't complain about errors now :) -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch Thanks for the response, I did append the sort order but the result was the same: Can't determine a Sort Order (asc or desc) in sort spec 'if(exists(start_time:[* TO 1721] AND end_time:[1721 TO *]),100,0) asc', pos=23 Thanks, Indika On 8 November 2012 04:02, Rafał Kuć r@solr.pl wrote: Hello! From looking at your sort parameter it seems that you are missing the order of sorting. The function query will return a value on which Solr will sort, and Solr needs to know if the documents should be sorted in descending or ascending order. Try something like this: sort=if(exists(start_time:[* TO 1721] AND end_time:[1721 TO *]),100,0)+desc -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch Hi All, I am trying to sort by a boolean function to sort the places that are open and not open, as given below: sort=if(exists(start_time:[* TO 1721] AND end_time:[1721 TO *]),100,0) However I keep getting an error message saying Can't determine a Sort Order (asc or desc) in sort spec 'if(exists(start_time:[* TO 1721] AND end_time:[1721 TO *]),100,0) asc', pos=23 I assume that boolean functions can be used to sort. Is there anything missing from the sort function? Thanks, Indika
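The approach here passes the range query through a separate parameter and references it from the sort function. A sketch of assembling those request parameters; the helper name and the q value are assumptions, only the sort/innerQuery shape comes from the thread:

```python
from urllib.parse import urlencode

def open_now_sort_params(time_code):
    """Build a query that sorts open places first, following the thread:
    the boolean condition lives in `innerQuery`, and the sort function
    checks it via exists(query($innerQuery)) (Solr 4.0+)."""
    inner = "{{!edismax}}(start_time:[* TO {0}] AND end_time:[{0} TO *])".format(time_code)
    return urlencode({
        "q": "*:*",  # assumption: match everything, sort does the ranking
        "innerQuery": inner,
        "sort": "if(exists(query($innerQuery)),100,0) desc",
    })

print(open_now_sort_params(1721))
```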
Re: Retrieve unique documents on a non id field
Hello! Look at the field collapsing functionality - http://wiki.apache.org/solr/FieldCollapsing It allows you to group documents based on a field value, a query or a function query. -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch Hi All, Currently I am using Solr for searching and filtering restaurants based on certain criteria. For example I use Solr to obtain the list of restaurants open in the day. A restaurant can have sessions when it's open, e.g. Breakfast, Lunch and Dinner, and the time information related to these sessions is stored in three different documents with a field to identify the restaurant (restaurant_id). When I query for restaurants that are open at 11:00 AM using (start_time:[* TO 1100] AND end_time:[1100 TO *]) and sessions overlap (say Breakfast and Lunch), I would get both the Breakfast document and the Lunch document. These are in fact two different documents and would have the same restaurant_id. My question is: is there a way to retrieve only one document if the restaurant_id is repeated in the response? Thanks, Indika
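For this concrete case, field collapsing boils down to a few request parameters. A sketch, assuming the field names from the question; `group.limit=1` is my choice for keeping one session document per restaurant:

```python
from urllib.parse import urlencode

# Group the open-at-11:00 results so each restaurant_id appears only once.
params = urlencode({
    "q": "start_time:[* TO 1100] AND end_time:[1100 TO *]",
    "group": "true",
    "group.field": "restaurant_id",
    "group.limit": 1,  # return at most one session document per restaurant
})
print(params)
```

When sessions overlap, Solr then returns one group per restaurant instead of one document per session.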
Re: SOLR 3.5 sometimes throws java.lang.NumberFormatException: For input string: java.math.BigDecimal:1848.66
Hello! Look at what Solr returns in the error - you send the following value java.math.BigDecimal:1848.66 - remove the java.math.BigDecimal: and your problem should be gone. -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch Welcome all, We have a very strange problem with SOLR 3.5. It SOMETIMES throws exceptions: 2012-10-31 10:20:06,408 SEVERE [org.apache.solr.core.SolrCore:185] (http-10.205.49.74-8080-155) org.apache.solr.common.SolrException: ERROR: [doc=MyDoc # d3mo1351674222122-1 # 2012-10-31 08:03:42.122] Error adding field 'AMOUNT'='java.math.BigDecimal:1848.66' at org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:324) at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:60) at org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:53) at my.package.solr.dihutils.DateCheckUpdateProcessorFactory$DateCheckUpdateProcessor.processAdd(DateCheckUpdateProcessorFactory.java:91) at org.apache.solr.handler.BinaryUpdateRequestHandler$2.document(BinaryUpdateRequestHandler.java:79) at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$2.readOuterMostDocIterator(JavaBinUpdateRequestCodec.java:139) at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$2.readIterator(JavaBinUpdateRequestCodec.java:129) at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:211) at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$2.readNamedList(JavaBinUpdateRequestCodec.java:114) at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:176) at org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCodec.java:102) at org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec.unmarshal(JavaBinUpdateRequestCodec.java:144) at org.apache.solr.handler.BinaryUpdateRequestHandler.parseAndLoadDocs(BinaryUpdateRequestHandler.java:69) at 
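The fix suggested above (strip the leaked type name before the value reaches Solr) can be done defensively on the client side. A sketch; the helper name is mine, and the `java.` prefix check is an assumption about how the bad values look in this thread:

```python
def clean_numeric(value):
    """Strip a leaked Java type prefix such as 'java.math.BigDecimal:'
    before indexing. The NumberFormatException in the thread occurs
    because the object's toString() ('java.math.BigDecimal:1848.66')
    is sent instead of the plain number."""
    text = str(value)
    if ":" in text and text.split(":", 1)[0].startswith("java."):
        text = text.split(":", 1)[1]
    return float(text)

print(clean_numeric("java.math.BigDecimal:1848.66"))  # -> 1848.66
```

The real fix, of course, is to send the numeric value itself (e.g. `bigDecimal.toPlainString()` in Java) rather than sanitizing after the fact.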
org.apache.solr.handler.BinaryUpdateRequestHandler.access$000(BinaryUpdateRequestHandler.java:45) at org.apache.solr.handler.BinaryUpdateRequestHandler$1.load(BinaryUpdateRequestHandler.java:56) at org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:58) at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129) at org.apache.solr.core.SolrCore.execute(SolrCore.java:1372) at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:356) at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:252) at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235) at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206) at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:235) at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191) at org.jboss.web.tomcat.security.SecurityAssociationValve.invoke(SecurityAssociationValve.java:190) at org.jboss.web.tomcat.security.JaccContextValve.invoke(JaccContextValve.java:92) at org.jboss.web.tomcat.security.SecurityContextEstablishmentValve.process(SecurityContextEstablishmentValve.java:126) at org.jboss.web.tomcat.security.SecurityContextEstablishmentValve.invoke(SecurityContextEstablishmentValve.java:70) at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127) at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102) at org.jboss.web.tomcat.service.jca.CachedConnectionValve.invoke(CachedConnectionValve.java:158) at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109) at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:330) at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:829) at 
org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:598) at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:447) at java.lang.Thread.run(Thread.java:662) Caused by: java.lang.NumberFormatException: For input string: java.math.BigDecimal:1848.66 at sun.misc.FloatingDecimal.readJavaFormatString(FloatingDecimal.java:1222) at java.lang.Double.parseDouble(Double.java:510) at org.apache.solr.schema.TrieField.createField(TrieField.java:418) at org.apache.solr.schema.SchemaField.createField(SchemaField.java:104) at org.apache.solr.update.DocumentBuilder.addField(DocumentBuilder.java:203
Re: facet prefix with tokenized fields
Hello! Do you have to use faceting for prefixing? Maybe it would be better to use an ngram based field and return the stored value? -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch Hi. Is there any solution to facet documents with a specified prefix on some tokenized field, but get the original value of the field in the result? e.g.: <field name="category_ac" type="text_autocomplete" indexed="true" stored="true" multiValued="true"/> <fieldType name="text_autocomplete" class="solr.TextField"><analyzer><tokenizer class="solr.StandardTokenizerFactory"/><filter class="solr.LowerCaseFilterFactory"/></analyzer></fieldType> Indexed value: toys for children query: q=&start=0&rows=0&facet.limit=-1&facet.mincount=1&f.category_ac.facet.prefix=chi&facet.field=category_ac&facet=true I'd like to get exactly toys for children, not children -- Grzegorz Sobczyk
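The ngram based field suggested in the answer could look roughly like this. A sketch only, under the assumption that edge n-grams per token are wanted so that a prefix like "chi" matches "children" and the stored value ("toys for children") is returned; all field and type names here are illustrative, not from the thread:

```xml
<!-- Sketch: edge n-gram autocomplete field; names and sizes are illustrative. -->
<field name="category_suggest" type="text_edge_ngram"
       indexed="true" stored="true" multiValued="true"/>
<fieldType name="text_edge_ngram" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- index "c", "ch", "chi", ... for every token -->
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="25"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

A plain query against category_suggest:chi then matches the document, and the response returns the full stored value instead of a single facet token.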
Re: select one document with a given value for an attribute in a single page
Hello! Try the grouping feature of Solr - http://wiki.apache.org/solr/FieldCollapsing . You can collapse documents based on the field value, which would be a shop identifier in your case. -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch Hi all, I have the following schema offer: - offer_id - offer_title - offer_description - related_shop_id - related shop_name - offer_price Each offer is related to a shop. In one shop, we have many offers. I would like to show in one page (26 offers) only one offer from a shop. I need help to implement this feature. Thanks -- View this message in context: http://lucene.472066.n3.nabble.com/select-one-document-with-a-given-value-for-an-attribute-in-a-single-page-tp4016679.html Sent from the Solr - User mailing list archive at Nabble.com.
Re: Having problem runing apache-solr on my linux server
Hello! What you are experiencing is normal - when you run an application in the foreground it will close when you close the terminal. You can use a command like screen or you can use nohup. However, I would advise installing a standalone Jetty or Tomcat and just deploying Solr as any other web application. -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch hello dear, I try running apache-solr 3.6.1 on my linux host using java -jar start.jar, and everything runs fine, but Java stops running as soon as the command terminal closes. I have been searching for a way to make it run continuously, but I did not find one. Please, I need help making apache-solr run without stopping when the terminal closes. thanks. Zakari M
Re: Is it possible to use something like sum() in a solr-query?
Hello! Take a look at StatsComponent - http://wiki.apache.org/solr/StatsComponent -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch I have for example an integer field and want to sum all these values for all the matching documents. Similar to this in sql: SELECT SUM(/expression/ ) FROM tables WHERE predicates; Regards, Markus On 29.10.2012 22:25, Jack Krupansky wrote: Unfortunately, neither the subject nor your message says it all. Be specific - what exactly do you want to sum? All matching docs? Just the returned docs? By group? Or... what? You can of course develop your own search component that does whatever it wants with the search results. -- Jack Krupansky -Original Message- From: Markus.Mirsberger Sent: Monday, October 29, 2012 11:08 AM To: solr-user@lucene.apache.org Subject: Is it possible to use something like sum() in a solr-query? Hi, the subject says it all :) Is there something like sum() available in a solr query to sum all values of a field ? Regards, Markus Mirsberger
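The StatsComponent request for a SUM-style aggregate is just a few parameters. A sketch; the field name `price` and the match-all query are assumptions for illustration:

```python
from urllib.parse import urlencode

# Ask StatsComponent for statistics (including the sum) of an integer
# field over all matching documents; rows=0 because we only want stats.
params = urlencode({
    "q": "*:*",          # assumption: aggregate over every document
    "rows": 0,
    "stats": "true",
    "stats.field": "price",  # hypothetical field name
})
print(params)
```

The response then carries a stats block for the field with `sum`, alongside min, max, count and other statistics.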
Re: Filtering HTML content in Solr 4.0.0
Hello! You are trying to put the HTML into the XML sent to Solr, right? You should use the proper UTF-8 encoding and escape the markup to do that. For example, look at the utf8-example.xml file from the exampledocs directory that comes with Solr and you'll see something like this: <field name="features">tag with escaped chars: &lt;nicetag/&gt;</field> As you can see, the < and > are properly encoded as &lt; and &gt; -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch Hi, I am using Solr 4.0.0. I have HTML content as the description of a product. If I index it without any filtering, it gives errors on search. How can I filter HTML content? Pratyul
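The escaping can be automated instead of done by hand. A sketch using the standard library's XML escaping; the helper name is mine:

```python
from xml.sax.saxutils import escape

def field_xml(name, value):
    """Render a Solr <field> element with HTML content safely escaped
    (& < > become &amp; &lt; &gt;), so the update XML stays well-formed."""
    return '<field name="{}">{}</field>'.format(name, escape(value))

print(field_xml("description", "tag with escaped chars: <nicetag/>"))
# -> <field name="description">tag with escaped chars: &lt;nicetag/&gt;</field>
```

Client libraries such as SolrJ do this escaping for you, which is another reason to prefer them over hand-built XML.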
Re: Filtering HTML content in Solr 4.0.0
Hello! You don't need a custom update request processor - there is a char filter dedicated to stripping HTML tags from your content and indexing only the relevant parts of it - http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.HTMLStripCharFilterFactory However, you first need to properly send it to Solr for indexing. -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch I think you will have to write an UpdateProcessor to strip out html tags. http://wiki.apache.org/solr/UpdateRequestProcessor As of Solr 4.0 you can also use scripting languages like Python, Ruby and JavaScript to write scripts for use as update processors too. -----Original Message----- From: Pratyul Kapoor Sent: Friday, October 26, 2012 3:56 AM To: solr-user@lucene.apache.org Subject: Filtering HTML content in Solr 4.0.0 Hi, I am using Solr 4.0.0. I have HTML content as the description of a product. If I index it without any filtering, it gives errors on search. How can I filter HTML content? Pratyul
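Wiring HTMLStripCharFilterFactory into a field type looks roughly like this. A sketch only; the type name and the surrounding analyzer chain are illustrative, not from the thread:

```xml
<!-- Sketch: strip HTML tags at index time before tokenizing;
     the type name and filter chain are illustrative. -->
<fieldType name="text_html" class="solr.TextField">
  <analyzer>
    <charFilter class="solr.HTMLStripCharFilterFactory"/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

The char filter runs before the tokenizer, so only the text content of the HTML reaches the index, while the stored value keeps whatever was sent.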
Re: Occasional Solr performance issues
Hello! You can check if the long warming is causing the overlapping searchers. Check the Solr admin panel and look at the cache statistics; there should be a warmupTime property. Lowering the autowarmCount should lower the time needed to warm up; however, you can also look at your warming queries (if you have such) and see how long they take. -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch When Solr is slow, I'm seeing these in the logs: [collection1] Error opening new searcher. exceeded limit of maxWarmingSearchers=2, try again later. [collection1] PERFORMANCE WARNING: Overlapping onDeckSearchers=2 Googling, I found this in the FAQ: Typically the way to avoid this error is to either reduce the frequency of commits, or reduce the amount of warming a searcher does while it's on deck (by reducing the work in newSearcher listeners, and/or reducing the autowarmCount on your caches) http://wiki.apache.org/solr/FAQ#What_does_.22PERFORMANCE_WARNING:_Overlapping_onDeckSearchers.3DX.22_mean_in_my_logs.3F I happen to know that the script will try to commit once every 60 seconds. How does one reduce the work in newSearcher listeners? What effect will this have? What effect will reducing the autowarmCount on caches have? Thanks.
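Lowering autowarmCount is done per cache in solrconfig.xml. A sketch only; the cache classes follow the stock example config, but the sizes and counts below are illustrative values, not recommendations from the thread:

```xml
<!-- Sketch (solrconfig.xml): smaller autowarmCount means fewer entries are
     re-populated in the new searcher, so warm-up finishes sooner.
     All numbers here are illustrative. -->
<filterCache class="solr.FastLRUCache" size="512" initialSize="512" autowarmCount="32"/>
<queryResultCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="16"/>
```

The trade-off: less warming means faster commits and fewer overlapping searchers, but the first queries against the new searcher run against colder caches.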
Re: Error: _version_field must exist in schema
Hello! Look at your solrconfig.xml file, you should see something like this: <updateLog><str name="dir">${solr.data.dir:}</str></updateLog> Just remove it and Solr shouldn't bother you with the version field information. However, remember that some features won't work (like real time get or partial document updates). You can also add the _version_ field to your schema and forget about it. You don't need to do anything with it, as it is used internally by Solr. -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch On Thu, Oct 18, 2012 at 12:25 AM, Rafał Kuć r@solr.pl wrote: Hello! You can find some information about the requirements of SolrCloud at http://wiki.apache.org/solr/SolrCloud . I don't know if _version_ is mentioned elsewhere. As for Websolr - I'm afraid I can't say anything about the cause of those errors without seeing the exception. I see, thanks. I don't think that I'm using the SolrCloud feature. Is it enabled because there exist solr/collection1 and also multicore/core0?
Re: What does _version_ field used for?
Hello! It is used internally by Solr, for example by features like the partial update functionality and the update log. -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch I am moving to Solr 4.0 from the beta version. This exception was thrown: Caused by: org.apache.solr.common.SolrException: _version_ field must exist in schema, using indexed=true stored=true and multiValued=false (_version_ does not exist) at org.apache.solr.update.VersionInfo.getAndCheckVersionField(VersionInfo.java:57) at org.apache.solr.core.SolrCore.init(SolrCore.java:606) ... 26 more It seems that there needs to be a field like <field name="_version_" type="long" indexed="true" stored="true"/> in schema.xml. I wonder what this is used for?