RE: Internal shard communication - performance?

2013-08-11 Thread Alexey Kozhemiakin
Hi Tim, Torsten, Please review the following threads, which cover chatty shard-shard and shard-replica conversations; since you index large volumes of data, this can be a potential bottleneck in your case. http://lucene.472066.n3.nabble.com/Sharding-and-Replication-td4071614.html http://lucene.4

Re: very simple boolean query not working

2013-08-11 Thread S L
Jack Krupansky-2 wrote > What query parser and release of Solr are you using? > > There was a bug at one point where a fielded term immediately after a left > parenthesis was not handled properly. > > If I recall, just insert a space after the left parenthesis. > > Also, the dismax query parser

Re: Internal shard communication - performance?

2013-08-11 Thread Tim Vaillancourt
For me the biggest deal with increased chatter between SolrCloud is object creation and GCs. The resulting CPU load from the increased GCing seems to affect performance for me in some load tests, but I'm still trying to gather hard numbers on it. Cheers, Tim On 07/08/13 04:05 PM, Shawn Heis

Re: very simple boolean query not working

2013-08-11 Thread Jack Krupansky
What query parser and release of Solr are you using? There was a bug at one point where a fielded term immediately after a left parenthesis was not handled properly. If I recall, just insert a space after the left parenthesis. Also, the dismax query parser does not support parentheses. -- Ja
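The workaround mentioned above (a space after the left parenthesis) can be illustrated with a short Python sketch. The helper `build_select_params` is hypothetical, the two clauses are taken from the separate queries in the original question and are assumed to be the ones being combined, and whether the space is needed at all depends on the Solr release and query parser in use.

```python
from urllib.parse import urlencode


def build_select_params(clauses, pad_parenthesis=True):
    """Join fielded clauses with AND inside parentheses.

    With pad_parenthesis=True a space follows the left parenthesis, which is
    the workaround suggested above for affected releases.
    """
    opening = "( " if pad_parenthesis else "("
    query = opening + " AND ".join(clauses) + ")"
    # urlencode takes care of percent-encoding the parentheses and colons.
    return urlencode({"q": query})


params = build_select_params(["catcode:CC001", "start_url_title:cooper"])
print(params)
```

Comparing the results with `pad_parenthesis=True` and `pad_parenthesis=False` against the same index is one way to check whether a given installation is affected.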

Re: Adding Postgres and Mysql JDBC drivers to Solr

2013-08-11 Thread Tim Vaillancourt
Another option is defining the location of these jars in your solrconfig.xml and storing the libraries external to jetty, which has some advantages. Eg: MySQL connector is located at '/opt/mysql_connector' and adding this to your solrconfig.xml alongside the other lib entities: Cheers,

very simple boolean query not working

2013-08-11 Thread S L
When I do this query: q=catcode:CC001 I get a bunch of results. One of them looks like this: CC001 Cooper, John If I then do this query: q=start_url_title:cooper I also match the record above, as expected. But, if I do this: q=(catcode:CC001 AND start_u

Re: Problem running Solr indexing in Amazon EMR

2013-08-11 Thread Dmitriy Shvadskiy
Erick, It is actually supposed to be just one version of Solr, the one bundled with our map/reduce jar. To be clear: the Map/Reduce job is generating a new index, not reading an existing one. But it fails even before that, when an instance of EmbeddedSolrServer is created at the first line of the following code.

Multipoint date ranges with spatial - Invalid Longitude Exception?

2013-08-11 Thread zonski
Hi, I'm trying to implement date range searching using spatial features as per: http://lucene.472066.n3.nabble.com/Modeling-openinghours-using-multipoints-td4025336.html I've followed the steps and read through the linked articles but I can't get past an exception: InvalidShapeExceptio

Re: Problem running Solr indexing in Amazon EMR

2013-08-11 Thread Erick Erickson
Have you checked the luceneMatchVersion in all your solrconfig.xml files? I'm guessing it's set to 40 somewhere in the process as evidenced by the line: org.apache.lucene.codecs.lucene40.Lucene40FieldInfosFormat.( Lucene40FieldInfosFormat.java:99) so it looks like somehow a Lucene 4.0 codec is bein

Re: What do you use for solr's logging analysis?

2013-08-11 Thread William Bell
Loggly cannot accept our SOLR queries as fast as we get them in production. We get 2.5M lines of queries in the log file every 10 minutes, and sending them to Loggly takes literally 1.5 hours, even with 20 Hadoop servers sending them. What we really need from Loggly is a way to point Loggl

Re: Problem running Solr indexing in Amazon EMR

2013-08-11 Thread Dmitriy Shvadskiy
Erick, Thank you for the reply. Cloudera image includes Solr 4.3. I'm not sure what version Amazon EMR includes. We are not directly referencing or using their version of Solr but instead build our jar against Solr 4.4 and include all dependencies in our jar file. Also error occurs not while read

Re: Shard splitting failure, with and without composite hashing

2013-08-11 Thread Greg Preston
Oops, I somehow forgot to mention that. The errors I'm seeing are with the release version of Solr 4.4.0. I mentioned 4.1.0 as that's what we currently have in prod, and we want to upgrade to 4.4.0 so we can do shard splitting. Towards that end, I'm testing shard splitting in 4.4.0 and seeing th

Re: commit vs soft-commit

2013-08-11 Thread Erick Erickson
Take a look at solrconfig.xml. You configure filterCache, documentCache, queryResultCache. These (and some others I believe, but certainly these) are _not_ per-segment caches, so are invalidated on soft commit. Any autowarming you've specified also gets executed if applicable. On the other hand,

Re: Spellchecker suggests Tokens

2013-08-11 Thread tamanjit.bin...@yahoo.co.in
I think the issue lies in the analysis of the field you use for spellchecking. It also contains NGramFilterFactory. So copy your data to another field with a fieldType that does not do NGramFilterFactory analysis, and then try this out. -- View this message in context: http://lu

Re: commit vs soft-commit

2013-08-11 Thread tamanjit.bin...@yahoo.co.in
Erik- /It does invalidate the "top level" caches, including the caches you configure in solrconfig.xml. / Could you elucidate? -- View this message in context: http://lucene.472066.n3.nabble.com/commit-vs-soft-commit-tp4083817p4083844.html Sent from the Solr - User mailing list archive at Nabb

Re: Configuring SpellCehckComponent

2013-08-11 Thread tamanjit.bin...@yahoo.co.in
There are two portions here: 1. To build a dictionary. Since you are using IndexBasedSpellChecker, you would have to tell Solr what field from your index to build the dictionary from. 2. To actually be able to search for your corrected spellings. For this you would need a new requestHandler, to

Re: commit vs soft-commit

2013-08-11 Thread Erick Erickson
Soft commits also do not rebuild certain per-segment caches etc. It does invalidate the "top level" caches, including the caches you configure in solrconfig.xml. So no, it's not free at all. Your soft commit interval should still be as long as makes sense in your app. But they're still much fa

Re: commit vs soft-commit

2013-08-11 Thread Shreejay Nair
Yes, a new searcher is opened with every soft commit. It's still considered faster because it does not write to disk, which is a slow IO operation and might take a lot more time. On Sunday, August 11, 2013, tamanjit.bin...@yahoo.co.in wrote: > Hi, > Some confusion in my head. > http:// > http:/

Re: Could not load config for solrconfig.xml

2013-08-11 Thread Erick Erickson
bq: I have no idea what to do First thing to do is look at the full stack trace in the log. The offending bits are usually farther down the stack. Best Erick On Sat, Aug 10, 2013 at 2:10 PM, shuargan wrote: > Do you remember what was your mistake? > Im having the same issue > > I have this so

Re: What do you use for solr's logging analysis?

2013-08-11 Thread Shreejay Nair
There are a lot of tools out there with varying degrees of functionality (and ease of setup). We also have multiple Solr servers in production (both cloud and single nodes), and we have decided to use http://loggly. We will probably be setting it up for all our servers in the

Re: Purging unused segments.

2013-08-11 Thread Erick Erickson
Robert: Thanks a million, that'll teach me to grep for the obvious ... It's not even clear (I'm working twice-removed) that there _are_ unused files. I'm grasping at straws here Thanks again, Erick On Fri, Aug 9, 2013 at 9:32 PM, Robert Muir wrote: > On Fri, Aug 9, 2013 at 7:48 PM, Erick

Re: Shard splitting failure, with and without composite hashing

2013-08-11 Thread Erick Erickson
The very first thing I'd do is go to Solr 4.4. There have been a lot of improvements in this code in the intervening 3 versions. If the problem still occurs in 4.4, it'll get a lot more attention than 4.1 FWIW, Erick On Fri, Aug 9, 2013 at 7:32 PM, Greg Preston wrote: > Howdy, > > I'm tryi

Re: Problem running Solr indexing in Amazon EMR

2013-08-11 Thread Erick Erickson
What version of Solr is Cloudera's CDH built on? Looks to me like the Solr you're using to read the M/R produced index is different than the one used to build it. Or the version specified in the Solr configs, evidenced by the LUCENE40 in the error message. See luceneMatchVersion in solrconfig.xml. But probably a be

Re: Edismax vs Dismax

2013-08-11 Thread Jack Krupansky
Just escape the special characters of the URL with backslash or put the entire URL in quotes. The slash is particularly problematic since it introduces a regular expression. Dismax has a less-sophisticated syntax and automatically escapes more special characters. -- Jack Krupansky -Origin
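The two options described above (backslash-escaping or quoting the whole value) can be sketched in a few lines of Python. This is an illustration only, not the parser's own logic: the helper names `escape_query_chars` and `quote_phrase` are hypothetical, and the character set is an assumption modeled on the special characters of the Lucene query syntax, including the slash mentioned above.

```python
# Hypothetical helpers; the set of characters is an assumption based on the
# special characters of the Lucene/edismax query syntax.
SPECIAL_CHARS = set('\\+-!():^[]"{}~*?|&;/')


def escape_query_chars(text):
    """Prefix every special character (and whitespace) with a backslash."""
    return "".join(
        "\\" + ch if ch in SPECIAL_CHARS or ch.isspace() else ch
        for ch in text
    )


def quote_phrase(text):
    """Alternative: wrap the whole value in double quotes.

    Only backslashes and embedded double quotes are escaped inside the quotes.
    """
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


url = "facebook.com/profile.php?id=123456789"
print(escape_query_chars(url))
print(quote_phrase(url))
```

Either form would still need URL-encoding before being sent as a request parameter; which of the two matches best depends on how the field is analyzed.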

commit vs soft-commit

2013-08-11 Thread tamanjit.bin...@yahoo.co.in
Hi, Some confusion in my head. http://wiki.apache.org/solr/UpdateXmlMessages#A.22commit.22_and_.22optimize.22 says that /A soft commit is much faster since it only makes index changes visible and does

Re: Spelling suggestions.

2013-08-11 Thread tamanjit.bin...@yahoo.co.in
I think the issue is that you are trying to use WordBreakSolrSpellChecker (which was introduced in Solr 4.x version) in your Solr App of 3.5 version. You need to correct that. -- View this message in context: http://lucene.472066.n3.nabble.com/Spelling-suggestions-tp4083519p4083816.html Sent fr

Re: Configuring SpellCehckComponent

2013-08-11 Thread tamanjit.bin...@yahoo.co.in
The searchComponent would be placed in your solrconfig.xml. There is no specific place for it. This is what the comment in your solrconfig.xml says: Search Components Search components are registered to SolrCore and used by instances of SearchHandler (which can access them by name)

Edismax vs Dismax

2013-08-11 Thread heaven
Hi, the application I am working on switched to edismax parser and I found some weird behavior. I have this field: The string that is indexed is: facebook.com/profile.php?id=123456789 When I do use

What do you use for solr's logging analysis?

2013-08-11 Thread adfel70
Hi, I'm looking for a tool that could help me perform Solr logging analysis. I use SolrCloud on multiple servers, so the tool should be able to collect logs from multiple servers. Is there any tool you use and can recommend? Thanks -- View this message in context: http://lucene.472066.n3.nabble.com/What

Re: additional requests sent to solr

2013-08-11 Thread alxsss
Hi, Could someone please confirm whether this is expected behavior or a bug in SOLR? In short, I see three log entries in SOLR for one request http://server1:8983/solr/mycollection/select?q=alex&wt=xml&defType=edismax&facet.field=school&facet.field=company&facet=true&facet.limit=10&facet.mincount=1&qf=schoo