Re: Testing Solr4 - first impressions and problems

2012-10-14 Thread Shawn Heisey
On 10/14/2012 5:45 PM, Erick Erickson wrote: About your second point. Try committing more often with openSearcher set to false. There's a bit here: http://wiki.apache.org/solr/SolrConfigXml 1 15000 false That should keep the size of the transaction log d

Missing document on indexing with Solr 4.0

2012-10-14 Thread Shotaro Kamio
Hi, I've tried the Solr 4.0 release with the SolrCloud feature, but some documents are not properly indexed. Is this a bug in Solr? Steps to reproduce: 1) Start two Solr instances (almost the same as Example A in http://wiki.apache.org/solr/SolrCloud). --- cp -r example example2

Re: Sum of scores for documents from a query.

2012-10-14 Thread Amit Nithian
Are you looking for the sum of the scores of each document in the result? In other words, if there were 1000 documents in the numFound but you only of course show 10 (or 0 depending on rows parameter) you want the sum of all the scores of 1000 documents in a separate section of the results? If so,
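If that is the goal, one client-side option is to request the score field for every matching document and add the values up. The following is a minimal Python sketch under stated assumptions: the response uses Solr's standard JSON layout, the request asked for the score (for example `fl=id,score`), and `rows` was large enough to return every matching document. `sum_scores` is a hypothetical helper, not a Solr API, and the sample data is hand-written.

```python
def sum_scores(solr_response):
    """Add up the 'score' value of every document in a Solr JSON response.

    Assumes the request asked for the score field (e.g. fl=id,score) and that
    rows was large enough to return all numFound documents.
    """
    docs = solr_response["response"]["docs"]
    return sum(doc["score"] for doc in docs)


# Hand-written sample in the shape of a Solr JSON response (not real output).
sample = {
    "response": {
        "numFound": 3,
        "start": 0,
        "docs": [
            {"id": "a", "score": 1.5},
            {"id": "b", "score": 1.0},
            {"id": "c", "score": 0.5},
        ],
    }
}

total = sum_scores(sample)
print(total)
```

Fetching every document just to sum scores gets expensive for large result sets, so this only suits modest values of numFound.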

Re: Testing Solr4 - first impressions and problems

2012-10-14 Thread Alexandre Rafalovitch
Do these settings apply to DIH? The example linked seems to refer to updateHandler, but I am not sure how/whether that affects DIH. Regards, Alex. P.S. I was also having OOMs on large DIH imports.

Re: core.SolrCore - java.io.FileNotFoundException

2012-10-14 Thread Jun Wang
PS, I have found that there are lots of segments in the index directory, and most of them are empty, like the ones listed below. The total number of files in the index directory is 35314. -rw-rw-r-- 1 admin systems 0 Oct 14 11:37 _k3n.fdx -rw-rw-r-- 1 admin systems 0 Oct 14 11:37 _k3o.fdt -rw-rw-r-- 1 admin systems 0 Oct 14 11

Re: Solr Cloud and Hadoop

2012-10-14 Thread Otis Gospodnetic
Hi Rui, You don't need to merge the resulting indices (1 index per Reducer is what I assume you are asking about). Each could be copied to a different Solr server and you could then use regular old Solr distributed search to search across them. You don't want to search indices while they are in
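Searching across separately hosted indices with Solr's distributed search means passing a `shards` parameter that lists each server. A minimal Python sketch of building such a request URL follows; the host names, port, and query are placeholders chosen for illustration, and `distributed_query_url` is a hypothetical helper.

```python
from urllib.parse import urlencode


def distributed_query_url(base_url, shard_addresses, query):
    """Build a select URL that asks one Solr server to search several shards."""
    params = {
        "q": query,
        # Comma-separated host:port/path entries, one per copied index.
        "shards": ",".join(shard_addresses),
    }
    return base_url + "/select?" + urlencode(params)


# Placeholder hosts: one Solr server per index produced by a reducer.
url = distributed_query_url(
    "http://solr1.example.com:8983/solr",
    ["solr1.example.com:8983/solr", "solr2.example.com:8983/solr"],
    "title:hadoop",
)
print(url)
```

The server that receives the request fans the query out to every address in `shards` and merges the results, so the indices never need to be physically merged.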

Re: SolrJ, optimize, maxSegments

2012-10-14 Thread Gopal Patwa
Did you try the options below? 0.0 10.0 This is from the Javadoc: /** When forceMergeDeletes is called, we only merge away a * segment if its delete percentage is over this * threshold. Default is 10%. */ public TieredMergePolicy setForceMergeDeletesPctAllowed(double
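The rule in the quoted Javadoc can be illustrated with a few lines of Python. This sketch only mirrors the sentence "merge away a segment if its delete percentage is over this threshold"; it is not Lucene's actual selection code, and `segments_to_merge` and the segment tuples are made up for illustration. The two thresholds tried, 10.0 and 0.0, are the two numbers that appear in the message.

```python
def segments_to_merge(segments, pct_allowed=10.0):
    """Return names of segments whose delete percentage exceeds pct_allowed.

    segments: iterable of (name, max_doc, deleted_docs) tuples.
    """
    selected = []
    for name, max_doc, deleted_docs in segments:
        if max_doc == 0:
            continue  # empty segment: no percentage to compute
        pct_deleted = 100.0 * deleted_docs / max_doc
        if pct_deleted > pct_allowed:
            selected.append(name)
    return selected


# Made-up segments: 5%, 20% and 0% deleted documents respectively.
segments = [("_a", 1000, 50), ("_b", 1000, 200), ("_c", 500, 0)]

default_pick = segments_to_merge(segments)     # threshold 10%
zero_pick = segments_to_merge(segments, 0.0)   # threshold 0%
print(default_pick, zero_pick)
```

With the default threshold only the 20% segment qualifies; lowering the threshold to 0.0 makes every segment that has at least one deletion a candidate.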

Re: Solr4 - no examples of postingsFormat in schema.xml

2012-10-14 Thread Shawn Heisey
On 10/14/2012 3:21 PM, Rafał Kuć wrote: Hello! Try adding the following to solrconfig.xml: I did this and got a little further, but still no go. From what it's saying now, I don't think it will be possible in the current state of branch_4x to use anything but the default. SEVERE: null:j

Re: Sum of scores for documents from a query.

2012-10-14 Thread Erick Erickson
bq: is there any way to get a sum of all the scores for a query? Not that I know of. I'm not sure what value this would have anyway; what do you want to use it for? This seems like an XY problem... Best, Erick On Sun, Oct 14, 2012 at 4:39 PM, Gilles Comeau wrote: > Hi all, > > Very quick question

Re: Testing Solr4 - first impressions and problems

2012-10-14 Thread Erick Erickson
About your second point. Try committing more often with openSearcher set to false. There's a bit here: http://wiki.apache.org/solr/SolrConfigXml 1 15000 false That should keep the size of the transaction log down to reasonable levels... Best Erick On Sun, Oct
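The three values in the message (1, 15000, false) have lost their surrounding XML tags. The Python sketch below rebuilds a plausible `autoCommit` block for the `updateHandler` section of solrconfig.xml. The element names `autoCommit`, `maxDocs`, `maxTime` and `openSearcher` are standard solrconfig.xml elements, but mapping the three values onto them in this order is an assumption, and `auto_commit_element` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET


def auto_commit_element(max_docs, max_time_ms, open_searcher):
    """Build an <autoCommit> element for the updateHandler section."""
    auto_commit = ET.Element("autoCommit")
    ET.SubElement(auto_commit, "maxDocs").text = str(max_docs)
    ET.SubElement(auto_commit, "maxTime").text = str(max_time_ms)
    # openSearcher=false makes the hard commit flush to disk without
    # opening a new searcher, so it stays cheap.
    ET.SubElement(auto_commit, "openSearcher").text = (
        "true" if open_searcher else "false"
    )
    return auto_commit


# ASSUMPTION: 1 -> maxDocs, 15000 -> maxTime (ms), false -> openSearcher.
snippet = ET.tostring(auto_commit_element(1, 15000, False), encoding="unicode")
print(snippet)
```

Frequent hard commits of this kind are what keep the transaction log from growing during a long import.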

Re: Solr4 - no examples of postingsFormat in schema.xml

2012-10-14 Thread Rafał Kuć
Hello! Try adding the following to solrconfig.xml: -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch > On 10/14/2012 12:19 AM, Walter Underwood wrote: >> There is a bit more info in this post, look for "alternative codecs": >> >> http://searchhu

Re: Solr4 - no examples of postingsFormat in schema.xml

2012-10-14 Thread Shawn Heisey
On 10/14/2012 12:19 AM, Walter Underwood wrote: There is a bit more info in this post, look for "alternative codecs": http://searchhub.org/dev/2012/10/12/apache-solr-and-lucene-4-0-0-released/ I'm running on branch_4x checked out yesterday at 13:59 MDT. I tried postingsFormat="Block" and "Bl

Re: Multicore setup is ignored when deploying solr.war on Tomcat 5/6/7

2012-10-14 Thread Rogerio Pereira
I'll try to be more specific, Jack. I just downloaded apache-solr-4.0.0.zip. From this archive I took the core1 and core2 folders from the multicore example and renamed them to collection1 and collection2. I also made all the necessary changes to solr.xml, solrconfig.xml and schema.xml on these two corr

Testing Solr4 - reference thread

2012-10-14 Thread Shawn Heisey
This thread will serve as a reference on my config layout for at least one other thread that I will be creating for discussion. I have an existing infrastructure for Solr 3.5.0, using the included Jetty6. I am working on a new infrastructure for Solr 4, using its included Jetty8. My solrcon

Testing Solr4 - first impressions and problems

2012-10-14 Thread Shawn Heisey
Please see my other thread called "Testing Solr4 - reference thread" for general information about my config layout. If more specific information is required, please let me know. So far I cannot get a solr.war built without slf4j bindings to work right. There does not seem to be any centrally

Re: Any filter to map multiple tokens into one ?

2012-10-14 Thread Jack Krupansky
There's a miscommunication here somewhere. Is Solr 4.0 still passing "*:*" to the analyzer? Show us the parsed query for "*:*", as well as the debugQuery "explain" for the score. I mean, "*:*" (MatchAllDocsQuery) has a "constant score", so there isn't any way for it to be "suboptimal". -- Ja

Re: Any filter to map multiple tokens into one ?

2012-10-14 Thread T. Kuro Kurosaka
Jack, I don't think SOLR-3261 describes this issue. I ran the same experiment with Solr-3.6 and the score for all the matches was 0.1626374. The newly released Solr 4.0.0 also returns a suboptimal score of 0.14764866. Kuro On 10/12/12 2:03 PM, Jack Krupansky wrote: I don't have a Solr 3.5 to c

Re: SolrCloud - unable to get leader props after ZK timeout

2012-10-14 Thread Mark Miller
Can you file an issue and attach your logs? You might also try the 4.0 release to see if the problem was fixed after the beta. - Mark On 10/14/2012 08:48 AM, Jam Luo wrote: Yes, I have the same problem. 2012/10/5 Kyryl Bilokurov Hi, I have a functional/performance test SolrCloud cluster

Re: SolrCloud - distributed architecture considerations

2012-10-14 Thread Shawn Heisey
On 10/14/2012 11:16 AM, Erick Erickson wrote: No, that's not what I'm thinking at all. There would be _no_ replication configured. You'd just have two completely independent installations, one in each of your separate locations. The only communication path would be that somehow the original docum

Re: java.lang.NoSuchFieldError: severeErrors

2012-10-14 Thread Leif Neve
Thanks! The configuration left to me by my predecessor expects those JARs to be in a "lib" subdirectory of the instance. Now that I've copied the new JARs into the instance, things are working much better.

RE: Building solr with maven

2012-10-14 Thread Michael Ryan
We have a maven project to build a war containing everything from the Solr war, plus some of our own code. Here's the relevant stuff from our pom.xml: war org.apache.solr solr-core org.apache.solr solr

Re: SolrCloud - distributed architecture considerations

2012-10-14 Thread Erick Erickson
No, that's not what I'm thinking at all. There would be _no_ replication configured. You'd just have two completely independent installations, one in each of your separate locations. The only communication path would be that somehow the original documents would need to get to both locations for ind

Re: java.lang.NoSuchFieldError: severeErrors

2012-10-14 Thread Erick Erickson
Well, the lines below don't look right: notice that it's finding dataimporthandler...3.1.0.jar. Looks like you have a bunch of old jars hanging around. How did you install 4.0 anyway? ** INFO: Adding 'file:/opt/solr/multicore/lpf/lib/apache-solr-dataimporthandler-extras-3.1.0.jar' to

Re: Solr4 - no examples of postingsFormat in schema.xml

2012-10-14 Thread Shawn Heisey
On 10/14/2012 12:19 AM, Walter Underwood wrote: There is a bit more info in this post, look for "alternative codecs": http://searchhub.org/dev/2012/10/12/apache-solr-and-lucene-4-0-0-released/ If I were to add this to the Solr wiki as potential options under postingsFormat, would it be correc

Re: SolrCloud - distributed architecture considerations

2012-10-14 Thread AlexeyK
In other words, I would have to apply a mixture of modes: SolrCloud for each location + old-style replication for mirroring. BTW, I've seen a notion of 'role' in the node cloud state. Is it in use, or is it there for future extensions? Having 'indexer' and 'searcher' roles backed by the infrastructure wou

Re: java.lang.NoSuchFieldError: severeErrors

2012-10-14 Thread Leif Neve
It's finding all the config files just fine. Here is the full traceback: 2012-10-14 11:56:00.501:INFO:oejs.Server:jetty-8.1.2.v20120308 2012-10-14 11:56:00.516:INFO:oejs.NCSARequestLog:Opened /opt/solr/logs/request.2012_10_14.log 2012-10-14 11:56:00.519:INFO:oejdp.ScanningAppProvider:Deployment mo

Re: java.lang.NoSuchFieldError: severeErrors

2012-10-14 Thread Erick Erickson
The directory structure changed a bit between 3.6 and 4.x: there's an additional level. Be sure you're not being caught by that; you'll see "collection1" in there by default. Try specifying -Dsolr.conf.dir=, although it's just a guess. Best Erick On Sun, Oct 14, 2012 at 11:37 AM, Alexandre

Re: java.lang.NoSuchFieldError: severeErrors

2012-10-14 Thread Alexandre Rafalovitch
My guess is that it is having trouble finding the directory where the lpf core lives. You seem to be on a *nix system; have you tried running truss/strace to see which directories Solr is looking for the lpf core in? Maybe the definition of the 'home' directory is not being picked up? Regards, Alex. Personal

java.lang.NoSuchFieldError: severeErrors

2012-10-14 Thread Leif Neve
Just upgraded from Solr 3.6 to Solr 4 and am getting this error. Can anyone shed any light on this? Is there a way to turn on more debugging? Here is the traceback: Oct 13, 2012 1:34:13 PM org.apache.solr.core.CoreContainer create SEVERE: Unable to create core: lpf org.apache.solr.common.SolrExcept

Re: Multicore setup is ignored when deploying solr.war on Tomcat 5/6/7

2012-10-14 Thread Jack Krupansky
I can't quite parse "the same multicore deployment as we have on apache solr 4.0 distribution archive". Could you rephrase and be more specific. What "archive"? Were you already using 4.0-ALPHA or BETA (or some snapshot of 4.0) or are you moving from pre-4.0 to 4.0? The directory structure did

Re: How to import a part of index from main Solr server(based on a query) to another Solr server and then do incremental import at intervals later(the updated index)?

2012-10-14 Thread Erick Erickson
Right. Then it seems like you have to index the doc to both solr1 and solr2. This assumes that you do NOT intend the two indexes to be identical when all is said and done, right? They'll contain different documents. Because if you _do_ intend the indexes to be identical, replication is your a

Re: SolrCloud - distributed architecture considerations

2012-10-14 Thread Erick Erickson
First, remember that SolrCloud is relatively new, operational issues like this will doubtless accrue "folk wisdom" as we all gain experience... But my current thinking is that the remote installations are essentially completely separate installations with no knowledge of each other. Your indexing

Re: Solr - db-data-config.xml general asking to entity

2012-10-14 Thread Marvin
Thanks for the response! It's bad news that it isn't as simple as I had hoped. I certainly need names and a timestamp for each comment. Are there any problems if I add a timestamp to the one long string? Apart from this, can I add this one long string to the index? Example: table blog: id, autho

Re: SolrCloud - unable to get leader props after ZK timeout

2012-10-14 Thread Jam Luo
Yes, I have the same problem. 2012/10/5 Kyryl Bilokurov > Hi, > > I have a functional/performance test SolrCloud cluster (using Solr > 4.0-BETA) with the following setup: 4 servers, each server hosts 1/4th of > the collection (no replicas, so there are only leaders for each shard). > Current ZK

Re: How to import a part of index from main Solr server(based on a query) to another Solr server and then do incremental import at intervals later(the updated index)?

2012-10-14 Thread Yury Kats
You can merge indexes. You cannot split them. jefferyyuan wrote: >Thanks for the reply, but I think SolrReplication may not help in this case, >as we don't want to replicate all indexes to solr2, just a part of >the index (index of docs created by me). Seems SolrReplication doesn't support >replicate

Re: How to import a part of index from main Solr server(based on a query) to another Solr server and then do incremental import at intervals later(the updated index)?

2012-10-14 Thread Lance Norskog
Solr's Java Replication feature downloads changes to an index. It does not need to pull the entire index. I think what you need to do with the SolrEntityProcessor is this: do a Solr sorted query on your "last modified" field and fetch the timestamp from the first row. This would go in an outer 'en
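The first step described here, a sorted query that returns only the newest timestamp, can be sketched in Python as below. The field name `last_modified` is a placeholder for whatever "last modified" field the schema defines, sorting in descending order (newest first) is an assumption, and both helper functions are hypothetical. The sample response is hand-written.

```python
from urllib.parse import urlencode


def newest_timestamp_params(timestamp_field):
    """Query parameters that return only the most recently modified document."""
    return {
        "q": "*:*",
        "sort": timestamp_field + " desc",  # assumption: newest first
        "rows": 1,
        "fl": timestamp_field,
    }


def newest_timestamp(solr_response, timestamp_field):
    """Read the timestamp from the first row, or None if the index is empty."""
    docs = solr_response["response"]["docs"]
    return docs[0][timestamp_field] if docs else None


query_string = urlencode(newest_timestamp_params("last_modified"))

# Hand-written sample in the shape of a Solr JSON response (not real output).
sample = {"response": {"numFound": 42, "start": 0,
                       "docs": [{"last_modified": "2012-10-14T11:37:00Z"}]}}
latest = newest_timestamp(sample, "last_modified")
print(query_string, latest)
```

The timestamp obtained this way can then bound the next incremental fetch, so only documents modified after it are pulled.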

Re: Solr - db-data-config.xml general asking to entity

2012-10-14 Thread Lance Norskog
Two answers: 1) Do you have maybe user names or timestamps for the comments? Usually people want those also. 2) You can store the comments as one long string, or as multiple entries in a field. Your database should have a concatenate function that will take field X from multiple documents in a join

SolrCloud - distributed architecture considerations

2012-10-14 Thread AlexeyK
Hi, As far as I understand, SolrCloud eliminates the master-slave specifics, and automates both update and search seamlessly. What should I take into account configuring SolrCloud for a large customer with multiple physical locations? I mean, for older Solr I would define master 'close to the data'