Need Debug Direction on Performance Problem

2015-01-16 Thread Naresh Yadav
Hi all, We have a single Solr index with 3 fixed fields (one of the fields is tokenized on spaces) and the rest dynamic fields (string fields, in the range of 10-20). The current size of the index is 2 GB with around 12 lakh docs, and the Solr nodes are 4-core, 16 GB RAM Linux machines. Write performance is good then we t

Solr Cloud Stress Test

2015-01-16 Thread david mitche
Hi, I am a student, planning to learn and do a feature and functionality test of solr-cloud as one of my projects. I would like to do a stress and performance test of solr-cloud on my local machine (16 GB RAM, 250 GB SSD and a 2.2 GHz Intel Core i7). Multiple features of cloud. What is

Re: Solr numFound > 0 but doc list empty in Solr Cloud setup

2015-01-16 Thread Jaikit Savla
Anshuman, You are right about the @shards param not being required. One of my shards was down, and hence when I added &shards.tolerant=true, it worked without the shards param. However, the document list is still empty. Content of solrconfig.xml: http://pastebin.com/CJxD22t1 On Friday, January 16, 2015 1:24

Re: Solr numFound > 0 but doc list empty in Solr Cloud setup

2015-01-16 Thread Jaikit Savla
I followed all the steps listed here: http://wiki.apache.org/solr/SolrCloud#Example_A:_Simple_two_shard_cluster I have not updated solrconfig.xml and it is same as what comes default with 4.10. The only thing I added extra was list of my fields in example/solr/collection1/conf/schema.xml @sha

Re: Solr numFound > 0 but doc list empty in Solr Cloud setup

2015-01-16 Thread Anshum Gupta
Looks like a config issue to me more than anything else. Can you share your solrconfig? You will not be able to attach a file here but you could share it via pastebin or something similar. Also, why are you adding the "shards=http://localhost:8983/solr/collection1" part to your request? You don't

Re: Solr example for Solr 4.10.2 gives warning about Multiple request handlers with same name

2015-01-16 Thread Michael Sokolov
I've seen the same thing, poked around a bit and eventually decided to ignore it. I think there may be a ticket related to that saying it's a logging bug (ie not a real issue), but I couldn't swear to it. -Mike On 01/16/2015 12:36 PM, Tom Burton-West wrote: Hello, I'm running Solr 4.10.2 ou

Re: Solr numFound > 0 but doc list empty in Solr Cloud setup

2015-01-16 Thread Jaikit Savla
One more point: In cloud mode: If I submit a request with fl=id, it returns doc list. But when I add any other field, I get an empty doc list. http://localhost:/solr/select?q=domain:ebay&wt=json&shards=http://localhost:/solr/&fl=id&rows=1 { responseHeader: { status: 0, QTime: 7, params

Re: Solr numFound > 0 but doc list empty in Solr Cloud setup

2015-01-16 Thread Jaikit Savla
As I said earlier - single core set up works fine with same solrconfig.xml and schema.xml cd example java -Djetty.port= -Dsolr.data.dir=/index/path -jar start.jar I am running Solr-4.10. Do I need to change any other configuration for running in solr cloud mode ? On Friday, January 16,

Re: Solr numFound > 0 but doc list empty in Solr Cloud setup

2015-01-16 Thread Jaikit Savla
Verified that all my fields are stored and marked as indexed. --> http://localhost:/solr/collection1/query?q=body%3A%22from%22&wt=json&indent=true&shards=http://localhost:/solr/collection1&start=1&rows=10&shards.info=true { responseHeader: { status: 0, QTime: 19, params: { shards: "ht

Re: Solr numFound > 0 but doc list empty in Solr Cloud setup

2015-01-16 Thread Erick Erickson
Any chance that you've defined &rows=0 in your handler? Or is it possible that you have not set stored="true" for any of your fields? Best, Erick On Fri, Jan 16, 2015 at 9:46 AM, Jaikit Savla wrote: > I am using below tutorial for Solr Cloud setup with 2 shards > http://wiki.apache.org/solr/Solr

Re: Query ReRanking question

2015-01-16 Thread Erick Erickson
Ravi: Yep, this is the standard way to have recency influence the rank rather than take over absolute ordering via a sort=date_time or similar. Of course how strongly the rank is influenced is "more an art than a science" as far as figuring out what actual constants to put in Best, Erick On

Solr numFound > 0 but doc list empty in Solr Cloud setup

2015-01-16 Thread Jaikit Savla
I am using below tutorial for Solr Cloud setup with 2 shards http://wiki.apache.org/solr/SolrCloud#Example_A:_Simple_two_shard_cluster I am able to get the default set up working. However, I have a requirement where my index is not in default location (data/index) and hence when I start jvm for

Solr example for Solr 4.10.2 gives warning about Multiple request handlers with same name

2015-01-16 Thread Tom Burton-West
Hello, I'm running Solr 4.10.2 out of the box with the Solr example. i.e. ant example cd solr/example java -jar start.jar in /example/log At start-up the example gives this message in the log: WARN - 2015-01-16 12:31:40.895; org.apache.solr.core.RequestHandlers; Multiple requestHandler regist

Re: Apache Solr quickstart tutorial - error while loading main class SimplePostTool

2015-01-16 Thread Shubhanshu Gupta
Thanks a lot. It did work. A last favor - can you please explain to me why the old command didn't work and why this one worked? Although, I do know that the command you have given assumes that I did not set the environment through: "export CLASSPATH =dist/solr-core-4.10.2.jar" . But I already s

Re: Query ReRanking question

2015-01-16 Thread Ravi Solr
As per Erick's suggestion reposting my response to the group. Joel and Erick Thank you very much for helping me out with the ReRanking question a while ago. I have an alternative which seems to be working better for me than ReRanking, can you kindly let me know of any pitfalls that you guys can th

Re: OutOfMemoryError for PDF document upload into Solr

2015-01-16 Thread Erick Erickson
Here's an example of using Tika in a stand-alone Java program. https://lucidworks.com/blog/indexing-with-solrj/ Best, Erick On Fri, Jan 16, 2015 at 7:42 AM, Jack Krupansky wrote: > It would be nice to have a SolrJ-level implementation as well as a > command-line implementation of the extraction

Re: OutOfMemoryError for PDF document upload into Solr

2015-01-16 Thread Jack Krupansky
It would be nice to have a SolrJ-level implementation as well as a command-line implementation of the extraction request handler so that app ingestion code could do the extraction outside of Solr at the app level and even as a separate process to stream to the app or Solr. That would permit the to

Re: OutOfMemoryError for PDF document upload into Solr

2015-01-16 Thread Markus Jelsma
Tika 1.6 has PDFBox 1.8.4, which has memory issues, eating excessive RAM! Either upgrade to Tika 1.7 (out now) or manually use the PDFBox 1.8.8 dependency. M. On Friday 16 January 2015 15:21:55 Charlie Hull wrote: > On 16/01/2015 04:02, Dan Davis wrote: > > Why re-write all the document convers

Re: OutOfMemoryError for PDF document upload into Solr

2015-01-16 Thread Charlie Hull
On 16/01/2015 04:02, Dan Davis wrote: Why re-write all the document conversion in Java ;) Tika is very slow. 5 GB PDF is very big. Or you can run Tika in a separate process, or even on a separate machine, wrapped with something to cope if it dies due to some horrible input...we generally a

Re: Apache Solr quickstart tutorial - error while loading main class SimplePostTool

2015-01-16 Thread Ahmet Arslan
Hi Shubhanshu, How about this one? java -classpath dist/solr-core-*jar -Dauto -Drecursive org.apache.solr.util.SimplePostTool docs/ Ahmet On Friday, January 16, 2015 3:13 PM, Shubhanshu Gupta wrote: I am following Apache Solr quickstart tutorial

Apache Solr quickstart tutorial - error while loading main class SimplePostTool

2015-01-16 Thread Shubhanshu Gupta
I am following the Apache Solr quickstart tutorial. The tutorial covers indexing a directory of rich files, which requires running java -Dauto -Drecursive org.apache.solr.util.SimplePostTool docs/ . I am getting an error which says: Could not f

Re: How to select the correct number of Shards in SolrCloud

2015-01-16 Thread Manohar Sripada
Thanks Daniel and Shawn for your valuable suggestions, Daniel, If you have a query and it needs to get results from 64 cores, if 63 return in 100ms but the last core is in GC pause and takes 500ms, your query will take just over 500ms. > There is only single JVM running per machine. I will get the
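
The point quoted above (63 cores answer in 100ms, one takes 500ms, so the query takes just over 500ms) can be sketched as a tiny model. This is only an illustration, not how Solr measures anything: the function name and the `merge_overhead_ms` parameter are hypothetical, and the 20ms merge cost is an assumed figure.

```python
def fanout_latency_ms(shard_latencies_ms, merge_overhead_ms=0):
    """Model a distributed query: the coordinator waits for the slowest
    shard, then spends some (assumed) time merging the partial results."""
    if not shard_latencies_ms:
        raise ValueError("need at least one shard latency")
    return max(shard_latencies_ms) + merge_overhead_ms


# 63 cores answer in 100 ms, one is stuck in a GC pause for 500 ms.
latencies = [100] * 63 + [500]
print(fanout_latency_ms(latencies))                        # slowest shard dominates
print(fanout_latency_ms(latencies, merge_overhead_ms=20))  # plus assumed merge cost
```

Under this model, adding more shards never lowers the result below the slowest shard's time, which is why a single paused core sets the floor for the whole query.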

Re: Solr groups not matching with terms in a field

2015-01-16 Thread Naresh Yadav
Thanks Ahmet, my problem is solved. The reason for the slow facet query was not calling setRows(0); once I did that, it came back in seconds, like the terms query. On Fri, Jan 16, 2015 at 3:25 PM, Ahmet Arslan wrote: > Hi, > > Thats a different problem : speed-up faceting. > Faceting used all

Re: Easiest way to embed solr in a desktop application

2015-01-16 Thread Ramkumar R. Aiyengar
That's correct, even though it should still be possible to embed Jetty, that could change in the future, and that's why support for pluggable containers is being taken away. If you need to deal with the index at a lower level, there's always Lucene you can use as a library instead of Solr. But I

Re: Solr groups not matching with terms in a field

2015-01-16 Thread Ahmet Arslan
Hi, That's a different problem: speeding up faceting. Faceting is used all over the place and it is fast. I suggest you look for faceting improvements. Ahmet On Friday, January 16, 2015 11:17 AM, Naresh Yadav wrote: I tried facetting also but not worked smoothly for me. Case i had mentioned in em

Re: How to select the correct number of Shards in SolrCloud

2015-01-16 Thread Shawn Heisey
On 1/15/2015 10:58 PM, Manohar Sripada wrote: > The reason I have created 64 Shards is there are 4 CPU cores on each VM; > while querying I can make use of all the CPU cores. On an average, Solr > QTime is around 500ms here. > > Last time to my other discussion, Erick suggested that I might be ove

Re: Solr groups not matching with terms in a field

2015-01-16 Thread Naresh Yadav
I tried faceting as well, but it did not work smoothly for me. The case I mentioned in the email is a dummy one; my actual index has 12 lakh docs and is 2 GB in size on a single machine. Each tenant_pool field value has 20-30 tokens. Getting all terms in tenant_pool is fast, in seconds, but when i go with facet

Re: How to select the correct number of Shards in SolrCloud

2015-01-16 Thread Daniel Collins
Sharding a query lets you parallelize the actual index-querying part of the search. But remember that as soon as you spread the query out more, you also need to bring all 64 result sets back together and consolidate them into a single result set for the end user. At some point, the gain of being

Re: OutOfMemoryError for PDF document upload into Solr

2015-01-16 Thread Siegfried Goeschl
Hi Dan, neat idea - made a mental note :-) That brings us back to the point that in complex setups you should not do the document pre-processing directly in SOLR but have an import process which can safely crash when processing a 4GB PDF file Cheers, Siegfried Goeschl On 16.01.15 05:02, Da
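
The idea of an import process that can safely crash might look roughly like the sketch below: run the extractor as a child process with a timeout, and treat a crash or hang as a skipped document. The function name and the `command` argument are hypothetical; the actual extractor invocation (for example, a Tika command line) is left to the caller and is not shown here.

```python
import subprocess


def extract_in_subprocess(command, timeout_s=60):
    """Run an extractor command in a child process.

    Returns the extracted text, or None if the child crashes, exits with
    a non-zero status, cannot be started, or exceeds the timeout. The
    indexing process itself stays alive either way.
    """
    try:
        result = subprocess.run(
            command, capture_output=True, text=True, timeout=timeout_s
        )
    except (subprocess.TimeoutExpired, OSError):
        return None
    if result.returncode != 0:
        return None
    return result.stdout
```

A caller would pass the extractor's argument list as `command` and only send the returned text on to Solr when the result is not `None`.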

Re: Solr groups not matching with terms in a field

2015-01-16 Thread Ahmet Arslan
Hi Naresh, Yup, the terms component does not respect the q or fq parameters. Luckily, that's easy with the facet component. Example : facet=true&facet.field=tenant_pool&q=type:1 Please see more here : https://cwiki.apache.org/confluence/display/solr/Faceting happy faceting, ahmet On Friday, January 16, 20
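
As a rough sketch, the facet request from the example above can be assembled like this. The parameters `facet`, `facet.field`, `q` and `rows` are standard Solr request parameters; `rows=0` follows another message in this thread, which reports that skipping the document list made the facet query fast. The base URL and the function name are assumptions for illustration only.

```python
from urllib.parse import urlencode


def build_facet_url(base_url, query, facet_field, rows=0):
    """Build a Solr select URL that returns facet counts for one field,
    restricted to the documents matching `query`."""
    params = [
        ("q", query),
        ("facet", "true"),
        ("facet.field", facet_field),
        ("rows", rows),  # 0 = return facet counts only, no documents
        ("wt", "json"),
    ]
    return base_url + "?" + urlencode(params)


# Hypothetical host, port and collection name.
url = build_facet_url(
    "http://localhost:8983/solr/collection1/select", "type:1", "tenant_pool"
)
print(url)
```

The unique tenant_pool terms for the filtered documents would then come back in the facet counts section of the response rather than in the document list.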

Re: Solr groups not matching with terms in a field

2015-01-16 Thread Naresh Yadav
Hi ahmet, Thanks, now i understand better, i will not try my usecase with grouping. Actually i am interested in unique terms in a field i.e tenant_pool. That i get perfectly with http://www.imagesup.net/?di=614212438580 But i am not able to get terms after applying some filter say "type":"1". Tha