Re: Exception while loading 2 Billion + Documents in Solr 4.8.0

2015-02-10 Thread Erick Erickson
I guess my $0.02 is that you'd have to have strong evidence that extending Lucene to 64 bit is even useful. Or more generally, useful enough to pay the penalty. All the structures that allocate maxDoc id arrays would suddenly require twice the memory for instance, plus all the coding effort that co
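A rough sketch of the arithmetic behind the memory point above, assuming one array entry per document and nothing else. The 2,000,000,000 figure is taken from the thread subject; the function names are made up for illustration and are not Lucene APIs.

```python
# Back-of-the-envelope only: memory for ONE array that holds an id per document.
INT_BYTES = 4    # 32-bit doc id (a Java int, which Lucene uses)
LONG_BYTES = 8   # hypothetical 64-bit doc id


def id_array_bytes(max_doc: int, bytes_per_id: int) -> int:
    """Bytes needed for an array with one id slot for each of max_doc documents."""
    return max_doc * bytes_per_id


def to_gib(num_bytes: int) -> float:
    """Convert a byte count to GiB."""
    return num_bytes / 2**30


max_doc = 2_000_000_000  # roughly the index size discussed in this thread
print(f"32-bit ids: {to_gib(id_array_bytes(max_doc, INT_BYTES)):.2f} GiB per array")
print(f"64-bit ids: {to_gib(id_array_bytes(max_doc, LONG_BYTES)):.2f} GiB per array")
```

Every structure sized by maxDoc would pay this doubling separately, which is the penalty being weighed.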

Re: Solrcloud (to HDFS) poor indexing performance

2015-02-10 Thread Otis Gospodnetic
Hi Tim, Although I doubt Kafka is the problem, I'd look at that first and eliminate that. What about those Flume agents? How are they behaving in terms of CPU/GC, and such? You have 18 Solr nodes. What happens if you increase the number of Flume sinks? Are you seeing anything specific that

Re: Exception while loading 2 Billion + Documents in Solr 4.8.0

2015-02-10 Thread Shawn Heisey
On 2/4/2015 3:31 PM, Arumugam, Suresh wrote: > We are trying to do a POC for searching our log files with a single node > Solr(396 GB RAM with 14 TB Space). > Since the server is powerful, added 2 Billion records successfully & search > is working fine without much issues. > > Due to the restrict

Re: Upgrading Solr 4.7.2 to 4.10.3

2015-02-10 Thread Shawn Heisey
On 2/10/2015 2:29 PM, Elan Palani wrote: > Planning to upgrade Solr from 4.7.2 to 4.10.3. I just went through the > documentation; > it seems like a straightforward download/install. > > Are there any specific issues I should look for? Chances are that replacing the war and any extra jars you're u

RE: Upgrading Solr 4.7.2 to 4.10.3

2015-02-10 Thread Markus Jelsma
Well, the CHANGES.txt is filled with just the right information you need :) -Original message- > From:Elan Palani > Sent: Tuesday 10th February 2015 22:30 > To: solr-user@lucene.apache.org > Subject: Upgrading Solr 4.7.2 to 4.10.3 > > Team.. > > Planning to Upgrade solr from 4.7.2 to

codec factory versus posting format versus documentation

2015-02-10 Thread Benson Margulies
I think perhaps there is a minor doc drought, or perhaps just I'm having an SEO bad hair day. I'm trying to understand the relationship of codecFactory and postingFormat. Experiment 1: I just want to use my own codec. So, I make a CodecFactory, declare it in solrconfig.xml, and stand back? If so,

Re: Solr on Tomcat

2015-02-10 Thread Chris Hostetter
: I think until Solr becomes completely standalone, it could be major task for Solr 5.0 is already completely standalone. Running bin/solr (or bin/solr.cmd) as a standalone daemon is the only documented & supported way to run Solr 5. *Internally* Solr is using jetty -- but that is 100% an impl

RE: alternativeTermCount and WordBreakSolrSpellChecker combination not working

2015-02-10 Thread Dyer, James
I opened LUCENE-6237 for this. I can't promise when I or someone else will actually complete this, but it wouldn't be very difficult to do either. Seeing your use-case, I think this would be a nice little improvement. James Dyer Ingram Content Group -Original Message- From: O. Klein

Upgrading Solr 4.7.2 to 4.10.3

2015-02-10 Thread Elan Palani
Team, I am planning to upgrade Solr from 4.7.2 to 4.10.3. I just went through the documentation; it seems like a straightforward download/install. Are there any specific issues I should look for? Any help will be appreciated. Thanks, Elan

RE: alternativeTermCount and WordBreakSolrSpellChecker combination not working

2015-02-10 Thread O. Klein
Yeah that should work. Is this something you will change in the code? -- View this message in context: http://lucene.472066.n3.nabble.com/alternativeTermCount-and-WordBreakSolrSpellChecker-combination-not-working-tp4185352p4185489.html Sent from the Solr - User mailing list archive at Nabble.co

RE: alternativeTermCount and WordBreakSolrSpellChecker combination not working

2015-02-10 Thread Dyer, James
Got it. Took a quick look at the code and I see it uses the maximum frequency of the terms. And in your case, one of these terms ("holy" and "wood"), occurs 71,000 times. It wouldn't be too difficult to change this to use the average frequency of the terms or the minimum. But currently the o
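To make the difference between the three options concrete, here is a small sketch. It is not the Solr code; `combined_frequency` is a hypothetical helper, and the 1,200 count (and which of the two terms carries the 71,000) is an invented value for illustration.

```python
def combined_frequency(term_freqs, mode="max"):
    """Reduce the per-term frequencies of a multi-word suggestion to one number."""
    values = list(term_freqs.values())
    if mode == "max":
        return max(values)
    if mode == "min":
        return min(values)
    if mode == "avg":
        return sum(values) / len(values)
    raise ValueError(f"unknown mode: {mode}")


# Assumption: one term is very common (71,000, from the message above),
# the other is rare (1,200 is a made-up number).
freqs = {"holy": 71_000, "wood": 1_200}

for mode in ("max", "avg", "min"):
    print(mode, combined_frequency(freqs, mode))
```

With the maximum, a single very common term makes the whole split suggestion look frequent; the minimum or average ranks it lower.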

Re: Solr 4.10.x on Oracle Java 1.8.x ?

2015-02-10 Thread Shawn Heisey
On 2/10/2015 1:03 PM, Jakov Sosic wrote: > at the end of April Java 1.7 will be obsoleted, and Oracle will stop > updating it. > > Is it safe to run Tomcat7 / Solr 4.10 on Java 1.8? Has anyone tried it > already? Yes, we know that Java 8 works just fine with newer Solr 4.x releases, and the unrele

Re: Adding new core to solr cloud?

2015-02-10 Thread Shawn Heisey
On 2/10/2015 1:01 PM, Jakov Sosic wrote: > Hi guys > > I need to add a new core to existing solr cloud of 4 nodes (2 replicas > and 2 shards), this is the procedure I have in mind: > > 1) stop node01 > 2) change solr.xml to include new core (included in tomcat configuration) > 3) add "-Dbootstrap

Solr 4.10.x on Oracle Java 1.8.x ?

2015-02-10 Thread Jakov Sosic
Hi guys, at the end of April Java 1.7 will be obsoleted, and Oracle will stop updating it. Is it safe to run Tomcat7 / Solr 4.10 on Java 1.8? Has anyone tried it already?

Adding new core to solr cloud?

2015-02-10 Thread Jakov Sosic
Hi guys, I need to add a new core to an existing solr cloud of 4 nodes (2 replicas and 2 shards), this is the procedure I have in mind: 1) stop node01 2) change solr.xml to include new core (included in tomcat configuration) 3) add "-Dbootstrap_conf=true" to JAVA_OPTS 4) start tomcat on node01 No

Re: Solr on Tomcat

2015-02-10 Thread Jakov Sosic
On 02/10/2015 07:55 PM, Dan Davis wrote: As an application developer, I have to agree with this direction. I ran ManifoldCF and Solr together in the same Tomcat, and the slf4j configurations of the two conflicted with strange results. From a systems administrator/operations perspective, a sepa

RE: alternativeTermCount and WordBreakSolrSpellChecker combination not working

2015-02-10 Thread O. Klein
I did some testing and the order of dictionaries doesn't seem to have an effect. They are sorted by frequency. So if mm was applied "holy wood" would have a lower frequency and solve this problem. "suggestions":[ "holywood",{ "numFound":4, "startOffset":0, "endOffse

Re: Solr on Tomcat

2015-02-10 Thread Dan Davis
As an application developer, I have to agree with this direction. I ran ManifoldCF and Solr together in the same Tomcat, and the slf4j configurations of the two conflicted with strange results. From a systems administrator/operations perspective, a separate install allows better packaging, e.g.

RE: alternativeTermCount and WordBreakSolrSpellChecker combination not working

2015-02-10 Thread Dyer, James
I think the problem is when it combines suggestions from DirectSolrSpellChecker and WordBreakSolrSpellChecker, it gets two lists of possibilities in edit distance order. And when it combines these lists, all it does is interleave the 2 lists: 1 from the first list, then 1 from the 2nd list, then
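The interleaving described above can be sketched as follows. This is an illustration of the idea only, not the Solr implementation; `interleave` and the sample suggestions other than "hollywood" and "holy wood" are hypothetical.

```python
from itertools import zip_longest

_MISSING = object()  # sentinel so that shorter lists do not inject None values


def interleave(first, second):
    """Take one item from the first list, then one from the second, and repeat.

    Each input list keeps its own order; nothing is re-ranked across lists.
    """
    merged = []
    for a, b in zip_longest(first, second, fillvalue=_MISSING):
        if a is not _MISSING:
            merged.append(a)
        if b is not _MISSING:
            merged.append(b)
    return merged


direct = ["hollywood", "holywell"]   # assumed output of one spellchecker
word_break = ["holy wood"]           # assumed output of the other

print(interleave(direct, word_break))
```

Because the merge never compares items across the two lists, a word-break suggestion can land ahead of a better single-word correction from the other list.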

Re: Solr on Tomcat

2015-02-10 Thread Gopal Patwa
I think until Solr becomes completely standalone, it could be a major task for all folks who run Solr as a war or repackage the Solr war maven release to adopt the 5.0 release, since they need to remove Tomcat or any other container they have in production for running Solr. Not to mention there will be tools bui

RE: Solr on Tomcat

2015-02-10 Thread Matt Kuiper
Thanks for all the responses. I am planning a new project, and considering deployment options at this time. It's helpful to see where Solr is headed. Thanks, Matt Kuiper  -Original Message- From: Shawn Heisey [mailto:apa...@elyograg.org] Sent: Tuesday, February 10, 2015 10:05 AM To:

RE: alternativeTermCount and WordBreakSolrSpellChecker combination not working

2015-02-10 Thread O. Klein
James, That is very useful information. I tested it and can confirm that disabling spellcheck in the warmer solves the core reload problem. Now with my use case I'm not trying to spellcheck and correct a whitespace. If "holy wood" was queried with a mm of 100% it would have fewer hits than hollywood and

Re: Solr on Tomcat

2015-02-10 Thread Shawn Heisey
On 2/10/2015 9:48 AM, Matt Kuiper wrote: > I am starting to look in to Solr 5.0. I have been running Solr 4.* on > Tomcat. I was surprised to find the following notice on > https://cwiki.apache.org/confluence/display/solr/Running+Solr+on+Tomcat > (Marked as Unreleased) > > Beginning wi

Re: log location when using bin/start

2015-02-10 Thread Timothy Potter
The bin/solr script in 4 didn't do a good job at allowing you to control the location of the redirected console log or gc log, so you'll probably have to hack that script a bit. The location of the main Solr log can be configured in example/resources/log4j.properties. This has been improved in

Re: Solr pattern tokenizer

2015-02-10 Thread Erick Erickson
Please do not do this. By having such different tokenizers in your index and query time fieldType definition, I pretty much guarantee that you will have endless problems and spend forever chasing your tail trying to solve them. Please do yourself a favor and take the time to get to know the admin/

Re: Solr on Tomcat

2015-02-10 Thread Erik Hatcher
Matt - That is true about the recommendation; use bin/solr to start and stop Solr and consider it a black box service in that manner. We’re getting out of the business of supporting other containers and reining it in like this. Underneath there currently is still a .war web app (which may cha

Re: Solr on Tomcat

2015-02-10 Thread Timothy Potter
Correct. Solr 5.0 is not a Web application; any WAR or Web app'ish things in Solr 5 are implementation details that may change in the future. The ref guide will include some content about how to migrate to Solr 5 from 4. On Tue, Feb 10, 2015 at 9:48 AM, Matt Kuiper wrote: > I am starting to look

Re: indexed and stored fields don't appear in the response

2015-02-10 Thread Erick Erickson
You're confusing indexing and storing. You can _search_ on anything where indexed="true", thus your q=bizid: 2380505101 query works (it returns documents, I'm assuming). That has nothing to do with what's returned in your documents. For seeing a field in your documents for which you've set stored="tr

Solr on Tomcat

2015-02-10 Thread Matt Kuiper
I am starting to look in to Solr 5.0. I have been running Solr 4.* on Tomcat. I was surprised to find the following notice on https://cwiki.apache.org/confluence/display/solr/Running+Solr+on+Tomcat (Marked as Unreleased) Beginning with Solr 5.0, Support for deploying Solr as a WAR in

RE: alternativeTermCount and WordBreakSolrSpellChecker combination not working

2015-02-10 Thread Dyer, James
Okke, There is no way to have it correct both spelling and whitespace in the same correction. So unfortunately there is no easy fix for your use-case. The old shingle method of correcting whitespace might work for this, but it might also introduce other problems. I saw your comments on SOLR-

RE: 1 Solr many Shards?

2015-02-10 Thread Matt Kuiper
Thanks Anshum! Very helpful. Matt Kuiper - Software Engineer Intelligent Software Solutions p. 719.452.7721 | matt.kui...@issinc.com www.issinc.com | LinkedIn: intelligent-software-solutions -Original Message- From: Anshum Gupta [mailto:ans...@anshumgupta.net] Sent: Monday, February 09

Re: Index string returned by 'splitby' by further splitting instead of multivalue

2015-02-10 Thread Alexandre Rafalovitch
Can you do it after DIH in the UpdateRequestProcessor chain? Might be cleaner than trying to hack the multi-path processing in DIH. Regards, Alex. Sign up for my Solr resources newsletter at http://www.solr-start.com/ On 10 February 2015 at 04:29, Pankaj Sonawane wrote: > Hi, > > I am usin

RE: alternativeTermCount and WordBreakSolrSpellChecker combination not working

2015-02-10 Thread O. Klein
Thank you for that answer James. Increasing spellcheck.count did the trick. Funny result: for query "holywood" the suggestion is "holy wood" instead of "hollywood", even though I have a mm of 100%. Any way to fix that? BTW when using maxCollationTries Solr hangs on core reload. Apparently an old

Re: Collection API calls on SSL sometimes hang

2015-02-10 Thread Shawn Heisey
On 2/9/2015 8:49 PM, Avanish Raju wrote: > I'm using self-signed certificates between my SolrCloud (4.10.3) > instance and Curl/Solr http client with needClientAuth=true on jetty.xml. > I'm able to load the Sol

RE: alternativeTermCount and WordBreakSolrSpellChecker combination not working

2015-02-10 Thread Dyer, James
Okke, My first guess is that the additional results from the word break spellchecker are causing additional per-term results and the correct answer is not making the list. So you might need to increase "spellcheck.count" and/or "spellcheck.alternativeTermCount". My second guess is that the co

RE: Collations are not working fine.

2015-02-10 Thread Dyer, James
Nitin, I have not tested using shingles with collations but my guess here is the collation feature is not going to work as expected with a shingled index. So try re-indexing without the shingles and see if it gives you more intuitive results. If that helps, and if you want to still correct wh

alternativeTermCount and WordBreakSolrSpellChecker combination not working

2015-02-10 Thread O. Klein
Because of a lot of misspellings in content I am using alternativeTermCount and maxResultsForSuggest to get suggestions even if terms are in index. However when adding wordbreak dictionary the collation that was given before is now empty. Is there a way to make this work? -- View this message i

Re: Sort on multivalued attributes

2015-02-10 Thread Flavio Pompermaier
I could but I think that this could be handled natively in solr :) On Mon, Feb 9, 2015 at 4:58 PM, Alexandre Rafalovitch wrote: > Could you inject an UpdateRequestProcessor into the processing chain? Then > you could copy the field to a sort specific field and choose only one > value. And use d

Index string returned by 'splitby' by further splitting instead of multivalue

2015-02-10 Thread Pankaj Sonawane
Hi, I am using the Solr DataImportHandler to index data from a database table (Oracle). One of the columns contains a string of '='s and ','s (please see Column3 in the example below), like Column1 = "F" Column2 = "ASDF" *Column3 = "A=1,B=2,C=3,D=4..Z=26"* I want Solr to index each 'alphabet' against it
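A minimal sketch of the transformation being asked for, shown in plain Python rather than as DIH or UpdateRequestProcessor configuration. `split_pairs` is a hypothetical helper, and the shortened sample string stands in for the Column3 value quoted above.

```python
def split_pairs(raw: str) -> dict:
    """Turn a string such as 'A=1,B=2,C=3' into {'A': '1', 'B': '2', 'C': '3'}.

    The string is split on ',' first and each piece is then split on the
    first '=' it contains.
    """
    pairs = {}
    for item in raw.split(","):
        key, _, value = item.partition("=")
        pairs[key.strip()] = value.strip()
    return pairs


column3 = "A=1,B=2,C=3"  # shortened sample, not the full value from the table
for name, value in split_pairs(column3).items():
    # Each pair would become its own field/value in the indexed document.
    print(name, value)
```

How the resulting pairs map to Solr fields (for example dynamic fields) is left open here, since the message does not say.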

Re: Solr pattern tokenizer

2015-02-10 Thread Nivedita
I tried solving issue like It works for query like CHQ PAID-INWARD TRANHDFC LTD 00036529 But if HDFC LTD is preceding with underscore(-) or any digi