Re: Informal poll on running Solr 4 on Java 7 with G1GC

2013-06-20 Thread Bernd Fehling
Hi Shawn, some good information about G1 tuning, maybe something for the wiki about GC tuning. On 20.06.2013 18:01, Shawn Heisey wrote: > On 6/20/2013 8:02 AM, John Nielsen wrote: > ... > ... When you take a look at the overall stats and the memory graph over time, > G1 looks way better. > U

Re: Queuing for Solr Updates?

2013-06-20 Thread Gora Mohanty
On 21 June 2013 11:12, William Bell wrote: > Is there a simpler way to kick off a DIH handler update when it is running? > > Scenario: > > 1. Doing an update using DIH > 2. We need to kick off another update. Cannot since DIH is already running. > So the program inserts into a table (ID=55) > 3. S

Varnish

2013-06-20 Thread William Bell
Who is using varnish in front of SOLR? Anyone have any configs that work with the cache control headers of SOLR? -- Bill Bell billnb...@gmail.com cell 720-256-8076

Queuing for Solr Updates?

2013-06-20 Thread William Bell
Is there a simpler way to kick off a DIH handler update when it is running? Scenario: 1. Doing an update using DIH 2. We need to kick off another update. Cannot since DIH is already running. So the program inserts into a table (ID=55) 3. Since the DIH is still running old update, we cannot fire a

Re: SolrCloud replication issues

2013-06-20 Thread Sven Stark
I think you're onto it. Our schema.xml had it I'll change and test it. Will probably not happen before Monday though. Many thanks already, Sven On Fri, Jun 21, 2013 at 2:18 PM, Shalin Shekhar Mangar < shalinman...@gmail.com> wrote: > Okay so from the same thread, have you made sure the _ver

Re: Informal poll on running Solr 4 on Java 7 with G1GC

2013-06-20 Thread William Bell
It would be good to see some CMS configs too... Can you send your java params? On Wed, Jun 19, 2013 at 8:46 PM, Shawn Heisey wrote: > On 6/19/2013 4:18 PM, Timothy Potter wrote: > > I'm sure there's some site to do this but wanted to get a feel for > > who's running Solr 4 on Java 7 with G1 gc

Re: SolrCloud replication issues

2013-06-20 Thread Shalin Shekhar Mangar
Okay so from the same thread, have you made sure the _version_ field is a long in schema? On Fri, Jun 21, 2013 at 7:44 AM, Sven Stark wrote: > Actually this looks very much like > > http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201304.mbox/%3ccacbkj07ob4kjxwe_ogzfuqg5qg99qwpovbzkdot

Re: SolrCloud replication issues

2013-06-20 Thread Sven Stark
Actually this looks very much like http://mail-archives.apache.org/mod_mbox/lucene-solr-user/201304.mbox/%3ccacbkj07ob4kjxwe_ogzfuqg5qg99qwpovbzkdota8bihcis...@mail.gmail.com%3E Sven On Fri, Jun 21, 2013 at 11:54 AM, Sven Stark wrote: > Thanks for the super quick reply. > > The logs are pretty

Re: SolrCloud replication issues

2013-06-20 Thread Sven Stark
Thanks for the super quick reply. The logs are pretty big, but one thing comes up over and over again: Leader side: ERROR - 2013-06-21 01:44:24.014; org.apache.solr.common.SolrException; shard update error StdNode: http://xxx:xxx:xx:xx:8983/solr/collection1/:org.apache.solr.client.solrj.impl.Htt

Re: Sharding and Replication clarification

2013-06-20 Thread Shalin Shekhar Mangar
On Wed, Jun 19, 2013 at 11:12 PM, Asif wrote: > Hi, > > I had questions on implementation of Sharding and Replication features of > Solr/Cloud. > > 1. I noticed that when sharding is enabled for a collection - individual > requests are sent to each node serving as a shard. Yes, search requests ar

Re: SolrCloud replication issues

2013-06-20 Thread Shalin Shekhar Mangar
This doesn't seem right. A leader will ask a replica to recover only when an update request could not be forwarded to it. Can you check your leader logs to see why updates are not being sent through to replicas? On Fri, Jun 21, 2013 at 7:03 AM, Sven Stark wrote: > Hello, > > first: I am pretty mu

SolrCloud replication issues

2013-06-20 Thread Sven Stark
Hello, first: I am pretty much a Solr newcomer, so don't necessarily assume basic solr knowledge. My problem is that in my setup SolrCloud seems to create way too much network traffic for replication. I hope I'm just missing some proper config options. Here's the setup first: * I am running a fi

Re: Multivalued facet with 0 unexpected results

2013-06-20 Thread Shalin Shekhar Mangar
It sounds suspiciously similar to https://issues.apache.org/jira/browse/SOLR-3793 which was fixed in Solr 4.0 You should upgrade to a more recent Solr version (4.3.1 is the latest) and see if it's still a problem for you. On Fri, Jun 21, 2013 at 3:19 AM, Samuel García Martínez wrote: > just to c

Re: Partial update using solr 4.3 with csv input

2013-06-20 Thread Shalin Shekhar Mangar
Note that even though partial updates sounds like what you should do (because only part of your data has changed), unless you are dealing with lots of data, just re-adding everything (if possible) can be plenty fast. So before you write complex code to construct partial updates from your csv files,

Re: Multivalued facet with 0 unexpected results

2013-06-20 Thread Samuel García Martínez
Just to clarify, we manually send the "commit and optimize" that we use to fix this problem. The index process sends its own commit, which makes the new facet values searchable. But it seems that this process is not deleting any of the previous values held by the UnInvertedField. On Thu, Jun 20, 2013 at 11

Multivalued facet with 0 unexpected results

2013-06-20 Thread Samuel García Martínez
Hi all, we are getting some facet values (faceting on a multivalued field) with 0 results using a *:* query. I think this is really strange: since we are using MatchAllQuery, there should be no way to get 0 results for any value. Those 0-result values were present in the index before the reindex we made. We

Re: Partial update using solr 4.3 with csv input

2013-06-20 Thread Jack Krupansky
I'd have to see the whole scenario... What's an example of the original input, and then some examples of the kind of updates. Generally, CSV is most useful simply to bulk import (or export) data. It wasn't really designed for incremental update of existing documents. -- Jack Krupansky

Re: Solr, ICUTokenizer with Latin-break-only-on-whitespace

2013-06-20 Thread Jonathan Rochkind
Thank you... I started out writing an email with screenshots proving that it wasn't working for me in 4.3.0... and of course, having to confirm every single detail in order to say I confirmed it... I realized it was a mistake on my part, not testing what I thought I was testing. Does indeed ap

Re: solr rpm

2013-06-20 Thread Shawn Heisey
On 6/20/2013 1:44 PM, Alexandre Rafalovitch wrote: On Thu, Jun 20, 2013 at 12:48 PM, Shawn Heisey wrote: They've done a very different job than you would see if we were to make an installer, because they are integrating Lucene as a separate dependency for both Solr and for other search packages

Re: solr rpm

2013-06-20 Thread Alexandre Rafalovitch
On Thu, Jun 20, 2013 at 12:48 PM, Shawn Heisey wrote: > They've done a very different job than you would see if we were to make an > installer, because they are integrating Lucene as a separate dependency for > both Solr and for other search packages. Is that the only thing that's different? Does

RE: Informal poll on running Solr 4 on Java 7 with G1GC

2013-06-20 Thread Petersen, Robert
I've been trying it out on solr 3.6.1 with a 32GB heap and G1GC seems to be more prone to OOMEs than CMS. I have been running it on one slave box in our farm and the rest of the slaves are still on CMS and three times now it has gone OOM on me whereas the rest of our slaves kept chugging along

Re: Solr, ICUTokenizer with Latin-break-only-on-whitespace

2013-06-20 Thread Shawn Heisey
On 6/20/2013 1:26 PM, Jonathan Rochkind wrote: I want, for instance, "C++ Language" to be tokenized into "C++", "Language". But the ICUTokenizer, even with the rulefiles="Latn:Latin-break-only-on-whitespace.rbbi", with the rbbi file from the Solr 4.3 source [1]. But the ICUTokenizer, even wi

Re: Need help on Solr

2013-06-20 Thread Raymond Wiker
On Jun 20, 2013, at 18:26 , Abhishek Bansal wrote: > Yeah I know, out of the box there is one id field. I removed it from > schema.xml > > I have also added below code to automatically generate an ID. > > multiValued="false"/> > > Is that a valid configuration for an id field (assuming t

Re: Partial update using solr 4.3 with csv input

2013-06-20 Thread smanad
Thanks for confirming. So if my input is a csv file, I will need a script to read the "delta" changes one by one, convert it to json and then use 'update' handler with that piece of json data. Makes sense? Jack Krupansky-2 wrote > Correct, no atomic update for CSV format. There just isn't any
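A minimal sketch of the script described above, assuming the delta CSV has a header row, an `id` column, and only the columns that changed. The function name `csv_to_atomic_updates` and the sample data are hypothetical. It builds Solr's JSON atomic-update form, where each changed field is wrapped as `{"set": value}`; the resulting payload would then be POSTed to the `/update` handler with `Content-Type: application/json`. As far as I know, atomic updates also expect the update log to be enabled and the other fields to be stored, so check that against your own solrconfig.xml and schema.xml first.

```python
import csv
import io
import json


def csv_to_atomic_updates(csv_text, id_field="id"):
    """Turn each CSV row into a Solr atomic-update document.

    Every non-id column becomes {"set": value}, which asks Solr to
    replace that one field and leave the rest of the document alone.
    Empty cells are skipped so they do not overwrite existing values.
    """
    updates = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        doc = {id_field: row[id_field]}
        for name, value in row.items():
            if name != id_field and value not in ("", None):
                doc[name] = {"set": value}
        updates.append(doc)
    return updates


# Hypothetical delta file: only the changed column (price) is present.
delta_csv = "id,price\n55,19.99\n56,4.50\n"
payload = json.dumps(csv_to_atomic_updates(delta_csv))
print(payload)
```

Sending all rows in one JSON array keeps the number of HTTP requests down compared with posting the changes one by one.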

Solr, ICUTokenizer with Latin-break-only-on-whitespace

2013-06-20 Thread Jonathan Rochkind
(to solr-user, CC'ing author I'm responding to) I found the solr-user listserv contribution at: https://mail-archives.apache.org/mod_mbox/lucene-solr-user/201305.mbox/%3c51965e70.6070...@elyograg.org%3E which explains a way you can supply custom rulefiles to ICUTokenizer, in this case to tell i

Re: update solr.xml dynamically to add new cores

2013-06-20 Thread smanad
Thanks Michael, both the reasons make sense. Currently I am not planning on using SolrCloud so as you suggested if I can use http://wiki.apache.org/solr/CoreAdmin api. While doing that did you mean running a curl command similar to this, http://localhost:8983/solr/admin/cores?action=CREATE&name=

RE: solr rpm

2013-06-20 Thread Boogie Shafer
There is an RPM build framework for building a Jetty-powered Solr RPM here, if you are interested: https://github.com/boogieshafer/jetty-solr-rpm It's currently set for Solr 4.3.0 + built-in Jetty example + Jetty start script and configs + JMX + logging via the logback framework. Edit the build script

Re: solr rpm

2013-06-20 Thread adamc
Thanks Shawn for explaining so fully. -- View this message in context: http://lucene.472066.n3.nabble.com/solr-rpm-tp4071905p4071941.html Sent from the Solr - User mailing list archive at Nabble.com.

RE: DataImportHandler: Problems with delta-import and CachedSqlEntityProcessor

2013-06-20 Thread Dyer, James
Instead of specifying CachedSqlEntityProcessor, you can specify SqlEntityProcessor with "cacheImpl='SortedMapBackedCache'". If you parameterize this to have "SortedMapBackedCache" for full updates but blank for deltas, I think it will cache only on the full import. Another option is to parame

Re: solr rpm

2013-06-20 Thread Shawn Heisey
On 6/20/2013 9:30 AM, adamc wrote: I am wondering why there is no official Solr RPM. I wish Solr released RPMs like Sphinx does: http://sphinxsearch.com/downloads/release/ I agree with you that Solr should be much easier to get running than it is. There are some roadblocks, though. Solr isn'

Re: Need help on Solr

2013-06-20 Thread Abhishek Bansal
As I am running Solr on windows + tomcat I am using below command to index pdf. I hope this command is not faulty. Please check java -jar -Durl=" http://localhost:8080/solr-4.3.0/update/extract?literal.id=1&commit=true"; post.jar sample.pdf with regards, Abhishek Bansal On 20 June 2013 21:56, A

Re: Need help on Solr

2013-06-20 Thread Abhishek Bansal
Yeah I know, out of the box there is one id field. I removed it from schema.xml I have also added below code to automatically generate an ID. with regards, Abhishek Bansal On 20 June 2013 21:49, Shreejay wrote: > org.apache.solr.common.SolrException: [schema.xml] Duplicate field > d

solr rpm

2013-06-20 Thread adamc
I am wondering why there is no official Solr RPM. I wish Solr released RPMs like Sphinx does: http://sphinxsearch.com/downloads/release/ -- View this message in context: http://lucene.472066.n3.nabble.com/solr-rpm-tp4071905.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Informal poll on running Solr 4 on Java 7 with G1GC

2013-06-20 Thread Timothy Potter
Awesome info, thanks Shawn! I'll post back my results with G1 after we've had some time to analyze it in production. On Thu, Jun 20, 2013 at 11:01 AM, Shawn Heisey wrote: > On 6/20/2013 8:02 AM, John Nielsen wrote: >> >> We used to use G1, but recently went back to CMS. >> >> G1 gave us too long

Re: Need help on Solr

2013-06-20 Thread Shreejay
org.apache.solr.common.SolrException: [schema.xml] Duplicate field definition for 'id' You might have defined an id field in the schema file. The out of box schema file already contains an id field . -- Shreejay On Thursday, June 20, 2013 at 9:16, Abhishek Bansal wrote: > Hello, > > I am

Need help on Solr

2013-06-20 Thread Abhishek Bansal
Hello, I am trying to index a PDF file in Solr. I am currently running Solr on Apache Tomcat 6. When I try to index it I get the error below. Please help. I was not able to fix this error with the help of the internet. ERROR - 2013-06-20 20:43:41.549; org.apache.solr.core.CoreContainer; Unable to cr

Re: DataImportHandler: Problems with delta-import and CachedSqlEntityProcessor

2013-06-20 Thread Noble Paul നോബിള്‍ नोब्ळ्
Yes, that's right. On Thu, Jun 20, 2013 at 8:16 PM, Constantin Wolber < constantin.wol...@medicalcolumbus.de> wrote: > Hi, > > I may have been a little too fast with my response. > > After reading a bit more I imagine you meant running the full-import with > the entity param for the root entity fo

Re: Informal poll on running Solr 4 on Java 7 with G1GC

2013-06-20 Thread Shawn Heisey
On 6/20/2013 8:02 AM, John Nielsen wrote: We used to use G1, but recently went back to CMS. G1 gave us stop-the-world events that were too long. CMS uses more resources for the same work, but it is more predictable and we get better worst-case performance out of it. This is exactly the behavior I saw.

Re: Get all values from a field

2013-06-20 Thread Erik Hatcher
David - This is effectively faceting. If you want to see all cat values across all documents, do /select?q=*:*&rows=0&facet=on&facet.field=cat and you'll get what you're looking for. Erik On Jun 20, 2013, at 11:35 , It-forum wrote: > Hello, > > I'm looking to retrieve all distinct v
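The faceting request above can be sketched from a client as follows. This is only an illustration: the helper names `facet_url` and `distinct_values`, the base URL, and the sample response are assumptions, and `wt=json` and `facet.limit=-1` are my additions to the quoted query (by default Solr returns only the top 100 facet values, and -1 lifts that cap). The parsing assumes Solr's default JSON layout for facet fields, a flat list that alternates value and count.

```python
from urllib.parse import urlencode


def facet_url(base_url, field, limit=-1):
    """Build a select URL that returns no documents, only facet counts."""
    params = [
        ("q", "*:*"),            # match every document
        ("rows", 0),             # we only want the facet section
        ("facet", "on"),
        ("facet.field", field),
        ("facet.limit", limit),  # assumption: -1 asks for all values
        ("wt", "json"),
    ]
    return base_url + "/select?" + urlencode(params)


def distinct_values(response, field):
    """Convert the flat [value, count, value, count, ...] list to a dict."""
    flat = response["facet_counts"]["facet_fields"][field]
    return dict(zip(flat[0::2], flat[1::2]))


# Hypothetical host and a hand-written response in the expected shape.
print(facet_url("http://localhost:8983/solr", "cat"))
sample = {"facet_counts": {"facet_fields": {"cat": ["book", 3, "music", 1]}}}
print(distinct_values(sample, "cat"))
```

The keys of the returned dict are the distinct `cat` values; the counts say how many documents carry each one.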

Get all values from a field

2013-06-20 Thread It-forum
Hello, I'm looking to retrieve all distinct values of a specific field. My documents have fields like: id, name, cat, ref, model, brand. I wish to be able to retrieve all distinct cat values. How could I do that with Solr? I'm totally blocked. Please help. Regards, David

Re: SolrCloud - Score calculation

2013-06-20 Thread Jack Krupansky
Even if shards are exactly the same size, the distribution of terms may not be equal in each shard. But, yes, if shard size and term distribution are equal, then IDF should be comparable across shards, sort of. -- Jack Krupansky -Original Message- From: Learner Sent: Thursday, June 2
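To make the point about term distribution concrete, here is a small worked example. It assumes the classic Lucene TF-IDF formula, idf = 1 + ln(numDocs / (docFreq + 1)), and assumes each shard scores with its own local statistics; the shard sizes and document frequencies are made-up numbers, and `classic_idf` is a hypothetical helper name.

```python
import math


def classic_idf(doc_freq, num_docs):
    """Classic Lucene-style inverse document frequency for one index."""
    return 1.0 + math.log(num_docs / (doc_freq + 1))


# Two shards of identical size, but the term is rare on A and common on B.
shard_a = classic_idf(doc_freq=9, num_docs=1000)    # 1 + ln(100)
shard_b = classic_idf(doc_freq=99, num_docs=1000)   # 1 + ln(10)

print(round(shard_a, 3), round(shard_b, 3))
```

With these numbers the same term gets a noticeably higher IDF on shard A than on shard B, even though both shards hold 1000 documents, so scores for that term are not directly comparable between the two.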

Re: SolrCloud - Score calculation

2013-06-20 Thread Learner
Thanks for your response. So in the case of SolrCloud, Solr/ZooKeeper takes care of managing the indexing / searching. So in that case I assume most of the shards will be of equal size (I am just going to push the data to a leader). I assume IDF won't be a big issue then since the shard sizes are almo

Re: update solr.xml dynamically to add new cores

2013-06-20 Thread Michael Della Bitta
Hi, I wouldn't edit solr.xml directly for two reasons. One being that an already running Solr installation won't update with changes to that file, and might actually overwrite the changes that you make to it. And two, it's going away in a future release of Solr. Instead, I'd make the package that

AW: DataImportHandler: Problems with delta-import and CachedSqlEntityProcessor

2013-06-20 Thread Constantin Wolber
Hi, I may have been a little too fast with my response. After reading a bit more I imagine you meant running the full-import with the entity param for the root entity for full import. And running the delta import with the entity param for the delta entity. Is that correct? Regards Constantin

AW: DataImportHandler: Problems with delta-import and CachedSqlEntityProcessor

2013-06-20 Thread Constantin Wolber
Hi, and thanks for the answer. But I'm a little bit confused about what you are suggesting. I did not really use the rootEntity attribute before. But from what I read in the documentation as far as I can tell that would result in two documents (maybe with the same id which would probably resul

RE: Steps for creating a custom query parser and search component

2013-06-20 Thread Swati Swoboda
Hi Juha, If it's just a matter of format, have you considered adding another layer between Solr where you've got a class that just takes in your queries in the proprietary format and then converts them to what Solr needs? Similarly, if you need your results in a format, just convert them again?

Re: Steps for creating a custom query parser and search component

2013-06-20 Thread Jack Krupansky
First, my standard admonition: DON'T DO IT!!! Try harder to use the features Solr provides before trying to shoehorn even more code into Solr. And... think again about whether this code needs to be inside of Solr as opposed to simply doing multiple requests in a clean, RESTful application laye

Re: Getting the String which matched in the document as response

2013-06-20 Thread Jack Krupansky
Take a look at the explain section when you add the debugQuery=true parameter. You can additionally set debug.explain.structured=true to get the scoring explanation in XML if parsing the text is a problem for you. -- Jack Krupansky -Original Message- From: Prathik Puthran Sent: Thurs

Re: Informal poll on running Solr 4 on Java 7 with G1GC

2013-06-20 Thread John Nielsen
We used to use G1, but recently went back to CMS. G1 gave us stop-the-world events that were too long. CMS uses more resources for the same work, but it is more predictable and we get better worst-case performance out of it. Med venlig h

Getting the String which matched in the document as response

2013-06-20 Thread Prathik Puthran
Hi, Is it possible to get the exact matched string in the index in the select response of Solr. For eg : If the search query is "Hello World" and the query parser is "OR" solr would return all documents which matched both "Hello World", only "Hello" or only "World". Now I want to know which of th

Steps for creating a custom query parser and search component

2013-06-20 Thread Juha Haaga
Hello list followers, I need to write a custom Solr query parser and a search component. The requirements for the component are that the raw query that may need to be split into separate Solr queries is in a proprietary format encoded in JSON, and the output is also going to be in a similar pro

Re: solr performance problem from 4.3.0 with sorting

2013-06-20 Thread Shane Perry
Ariel, I just went up against a similar issue with upgrading from 3.6.1 to 4.3.0. In my case, my solrconfig.xml for 4.3.0 (which was based on my 3.6.1 file) did not provide a newSearcher or firstSearcher warming query. After adding a query to each listener, my query speeds drastically increased.

Re: DataImportHandler: Problems with delta-import and CachedSqlEntityProcessor

2013-06-20 Thread Noble Paul നോബിള്‍ नोब्ळ्
It is possible to create two separate root entities: one for full-import and another for delta. For the delta-import you can skip the cache that way. On Thu, Jun 20, 2013 at 1:50 PM, Constantin Wolber < constantin.wol...@medicalcolumbus.de> wrote: > Hi, > > i searched for a solution for quite some

Re: yet another optimize question

2013-06-20 Thread Jack Krupansky
Take a look at using DocValues for facets that are problematic. It not only moves the memory off-heap, but stores values in a much more optimal manner. -- Jack Krupansky -Original Message- From: Toke Eskildsen Sent: Thursday, June 20, 2013 3:26 AM To: solr-user@lucene.apache.org Subje

Re: Informal poll on running Solr 4 on Java 7 with G1GC

2013-06-20 Thread Bernd Fehling
On 20.06.2013 00:18, Timothy Potter wrote: > I'm sure there's some site to do this but wanted to get a feel for > who's running Solr 4 on Java 7 with G1 gc enabled? > > Cheers, > Tim > Currently using Solr 4.2.1 in production with Oracle Java(TM) SE Runtime Environment (build 1.7.0_07-b10) an

Re: UnInverted multi-valued field

2013-06-20 Thread Bernd Fehling
Hello, On 20.06.2013 09:34, Jochen Lienhard wrote: > Hello, > > well ... we have 5 multi-valued facet fields, so you had to wait sometimes up > to one minute. > > The old searcher blocks during this time. Maybe this is related to the already fixed issue SOLR-4589? Generally there is no blocking by

solr performance problem from 4.3.0 with sorting

2013-06-20 Thread Ariel Zerbib
Hi, We updated to version 4.3.0 from 4.2.1 and we have a performance problem with sorting. A query that returns 1 hit has a query time of more than 100ms (it can be more than 1s), against less than 10ms for the same query without the sort parameter: query with sorting option: q=level_4_id:5310

DataImportHandler: Problems with delta-import and CachedSqlEntityProcessor

2013-06-20 Thread Constantin Wolber
Hi, I searched for a solution for quite some time but did not manage to find any real hints on how to fix it. I'm using solr 4.3.0 1477023 - simonw - 2013-04-29 15:10:12 running in a tomcat 6 container. My data import setup is basically the following: Data-config.xml:

Re: UnInverted multi-valued field

2013-06-20 Thread Jochen Lienhard
Hello, well ... we have 5 multi-valued facet fields, so you had to wait sometimes up to one minute. The old searcher blocks during this time. @Toke Eskildsen: the example I posted was a very small update, usually there are more terms. We are using Solr 3.6. I don't know if it will be faste

RE: yet another optimize question

2013-06-20 Thread Toke Eskildsen
Petersen, Robert [robert.peter...@mail.rakuten.com] wrote: > We actually have hundreds of facet-able fields, but most are specialized > and are only faceted upon if the user has drilled into the particular category > to which they are applicable and so they are only indexed for products > in those