Hi there,
we are running a 3-server cloud serving a dozen
single-shard/replicate-everywhere collections. The 2 biggest collections
are ~15M docs, at about 13GiB and 2.5GiB in size. Solr is 4.10.2, ZK 3.4.5,
Tomcat 7.0.56, Oracle Java 1.7.0_72-b14.
10 of the 12 collections (the small ones) get
Hi all,
I need to use Solr for a multi-tenant application. What is the best way I
could achieve multi-tenancy with Solr?
One possibility is to have a separate core for each tenant domain.
1. Is it recommended to do it?
2. Are there any issues with having a large number of Solr cores?
Please
On 01/06/2015 07:54 PM, Erick Erickson wrote:
Have you considered pre-supposing SolrCloud and using the SPLITSHARD
API command?
I think that's the direction we'll probably be going. Index size (at
least for us) can be unpredictable in some cases. Some clients start out
small and then grow
One possibility is to have a separate core for each tenant domain.
You could do that, and it's probably the way to go if you have a lot of
data.
However, if you don't have much data, you can achieve multi-tenancy by
adding a filter to all your queries, for instance:
query = userQuery
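The filter approach above can be sketched roughly as follows; the field name `tenant_id` and the helper function are assumptions for illustration, not part of the original post:

```python
def tenant_query(user_query, tenant_id):
    """Build Solr request params that restrict results to one tenant.

    The field name 'tenant_id' is hypothetical; use whatever field your
    schema stores the tenant identifier in. Putting the restriction in
    fq (a filter query) keeps it out of relevance scoring and lets Solr
    cache the filter per tenant.
    """
    return {
        "q": user_query,
        "fq": "tenant_id:%s" % tenant_id,
    }

params = tenant_query("title:solr", "acme")
```

Using `fq` rather than folding the tenant clause into `q` means the same cached filter is reused across all of a tenant's queries.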
Hi Thomas,
I did not get these split-brains (probably our use case is simpler) but we
did hit the spammed ZK phenomenon.
The easiest way to fix it is to:
1. Shut down all the Solr servers in the failing cluster
2. Connect to zk using its CLI
3. rmr /overseer/queue
4. Restart Solr
This is way faster
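If doing step 3 from code rather than the ZK CLI, any client with a recursive delete works (kazoo's `delete(path, recursive=True)`, for instance); this helper is a hedged sketch, not from the original post:

```python
OVERSEER_QUEUE = "/overseer/queue"

def clear_overseer_queue(zk_client):
    """Recursively delete Solr's overseer queue znode.

    Equivalent to 'rmr /overseer/queue' in the ZooKeeper CLI.
    zk_client is any connected ZooKeeper client exposing
    delete(path, recursive=...), e.g. a kazoo KazooClient.
    Only do this with all Solr nodes in the cluster stopped.
    """
    zk_client.delete(OVERSEER_QUEUE, recursive=True)
```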
I believe export is streaming and it avoids building various caches,
so it will not blow up Solr's memory on large datasets.
You can read a lot more details in the JIRA that introduced it:
https://issues.apache.org/jira/browse/SOLR-5244
I am not sure how it compares with deep-paging though.
On 1/7/2015 2:26 PM, Joseph Obernberger wrote:
Thank you Toke - yes - the data is indexed throughout the day. We are
handling very few searches - probably 50 a day; this is an R&D system.
Our HDFS cache, I believe, is too small at 10GBytes per shard. This
comes out to 20GBytes of HDFS cache
On 1/7/2015 3:29 PM, Nishanth S wrote:
I am working on coming up with a Solr architecture layout for my use
case. We are a very write-heavy application with no downtime tolerance and
have low SLAs on reads when compared with writes. I am looking at around
12K tps with average index size of
Is there a problem with multi-valued fields and distributed queries?
No. But there are some components that don't do the right thing in
distributed mode, joins for instance. The list is actually quite small and
is getting smaller all the time.
Yes, joins is the main one. There used to be
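For illustration, the join referred to above is Solr's `{!join}` query parser; a minimal sketch of building such a query (the field names are hypothetical). The point is that this kind of query works on a single core but misbehaves across shards:

```python
def join_query(from_field, to_field, inner_query):
    """Build a Solr join query string ({!join} query parser syntax).

    Joins require the 'from'-side documents to live on the same core as
    the 'to'-side, which is why join is one of the components that does
    not work correctly in distributed (multi-shard) mode.
    """
    return "{!join from=%s to=%s}%s" % (from_field, to_field, inner_query)

q = join_query("parent_id", "id", "type:parent")
```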
Joseph Obernberger [j...@lovehorsepower.com] wrote:
[HDFS, 9M docs, 2.9TB, 22 shards, 11 bare metal boxes]
A typical query takes about 7 seconds to run, but we also do faceting
and clustering. Those can take in the 3 - 5 minute range depending on
what was queried, but can be as little as 10
Indeed, it is all about the numbers. So, Danesh, what are your numbers -
number of tenants and number of documents per tenant. What is the expected
distribution curve of documents per tenant?
The only limit I would suggest is that you not have more than low
hundreds of cores/tenants.
Will
: However the facets I am getting for the date only go up to last month; say today
: is 24th December and I am getting them till 24th November. How should I
: modify my query to obtain results till today? Tried a few options by hit
: and trial :) but could not arrive at a solution.
it's not clear
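Without seeing the original query it is hard to say, but one common cause is a fixed or mis-rounded `facet.range.end`. A hedged sketch using Solr date math to extend the range through today (the field name and start value are assumptions):

```python
def date_range_facet(field, start, gap):
    """Range-facet params whose end is rounded up past 'now'.

    Setting facet.range.end to NOW/DAY+1DAY (Solr date math: rounded to
    the start of tomorrow) ensures today's documents fall inside the
    last bucket instead of being cut off at an earlier fixed date.
    """
    return {
        "facet": "true",
        "facet.range": field,
        "facet.range.start": start,
        "facet.range.end": "NOW/DAY+1DAY",
        "facet.range.gap": gap,
    }

p = date_range_facet("pub_date", "NOW/DAY-30DAYS", "+1DAY")
```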
Thank you Toke - yes - the data is indexed throughout the day. We are
handling very few searches - probably 50 a day; this is an R&D system.
Our HDFS cache, I believe, is too small at 10GBytes per shard. This
comes out to 20GBytes of HDFS cache per physical machine plus about 10G
each for the
You shouldn't _have_ to keep track of this yourself since Solr 4.4,
see SOLR-4965 and the associated Lucene JIRA. Those are supposed to
make issuing a commit on an index that hasn't changed a no-op.
If you do issue commits and do open new searchers when the index has
NOT changed, it's worth a
Hi All,
I am working on coming up with a solr architecture layout for my use
case.We are a very write heavy application with no down time tolerance and
have low SLAs on reads when compared with writes.I am looking at around
12K tps with average index size of solr document in the range of
This is described as “write heavy”, so I think that is 12,000 writes/second,
not queries.
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/
On Jan 7, 2015, at 5:16 PM, Shawn Heisey apa...@elyograg.org wrote:
On 1/7/2015 3:29 PM, Nishanth S wrote:
I am working on coming
: It's a single Solr Instance, and in my files, I used 'doc_key' everywhere,
: but I changed it to id in the email I sent out wanting to make it easier
: to read, sorry don't mean to confuse you :)
https://wiki.apache.org/solr/UsingMailingLists
- what version of solr?
- how exactly are you
Thanks Shawn and Walter. Yes, those are 12,000 writes/second. Reads for the
moment would be in the 1,000 reads/second range. I guess finding out the right
number of shards would be my starting point.
Thanks,
Nishanth
On Wed, Jan 7, 2015 at 6:28 PM, Walter Underwood wun...@wunderwood.org
wrote:
This is
1,000 queries/second is not trivial either. My starting point for QPS
is about 50.
But that's entirely a straw man and (as the link Shawn provided indicates)
only testing will determine if that's realistic.
So going for 1,000 queries/second, you're talking 20 replicas for
each shard.
And
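The replica arithmetic above can be sketched as a straw-man capacity estimate, using ~50 QPS per replica as the stated starting point (only load testing can confirm that figure):

```python
import math

def replicas_needed(target_qps, qps_per_replica=50):
    """Straw-man estimate: replicas per shard needed to serve
    target_qps, assuming each replica sustains qps_per_replica.
    The per-replica figure is a starting assumption, not a measurement.
    """
    return math.ceil(target_qps / qps_per_replica)

replicas_needed(1000)  # 1000 / 50 = 20 replicas per shard
```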
Anybody on the list have a feel for how many simultaneous queries Solr can
handle in parallel? Will it be linear WRT the number of CPU cores? Or are
there other bottlenecks or locks in Lucene or Solr such that even with more
CPU cores the Solr server will be saturated with fewer queries than the
On 1/7/2015 7:14 PM, Nishanth S wrote:
Thanks Shawn and Walter. Yes, those are 12,000 writes/second. Reads for the
moment would be in the 1,000 reads/second range. I guess finding out the right
number of shards would be my starting point.
I don't think indexing 12000 docs per second would be too much
Sandy,
Export uses a very different approach than the normal select approach.
Export uses an incremental stream sorting approach that won't run out of
memory when sorting very large result sets. And Export does not use stored
fields to return results, it uses docValues caches to return results.
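A hedged sketch of how the two approaches compare at the request level: `/export` streams the whole sorted result set (sort and `fl` fields need docValues), while deep paging uses `/select` with `cursorMark`. The parameter names are standard Solr; the field names are assumptions:

```python
def export_request(query, sort_fields, fl):
    """Params for Solr's /export handler: streams the entire sorted
    result set. Sort and fl fields must have docValues enabled, since
    /export reads docValues rather than stored fields."""
    return {
        "q": query,
        "sort": sort_fields,
        "fl": fl,
    }

def first_cursor_request(query, sort_with_uniquekey, rows=1000):
    """First page of a cursorMark deep-paging loop on /select.
    The sort must include the uniqueKey field as a tie-breaker;
    each later request passes the nextCursorMark Solr returned."""
    return {
        "q": query,
        "sort": sort_with_uniquekey,
        "rows": rows,
        "cursorMark": "*",
    }
```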
Not sure about AggressiveOpts, but G1 has been working nicely for us.
We've successfully used it with HBase, Hadoop, Elasticsearch, and other
custom Java apps (all still Java 7, but Java 8 should be even better). Not
sure if we are using it on our Solr instances.
e.g. see
See below:
On Wed, Jan 7, 2015 at 1:25 AM, Bram Van Dam bram.van...@intix.eu wrote:
On 01/06/2015 07:54 PM, Erick Erickson wrote:
Have you considered pre-supposing SolrCloud and using the SPLITSHARD
API command?
I think that's the direction we'll probably be going. Index size (at least
I am new to Lucene/Solr. I downloaded Solr 4.10.3 and installed it on Windows
Server 2008.
I tried to start the server following the README in the example DIH template:
java -Dsolr.solr.home=./example-DIH/solr/ -jar start.jar
There is no error message in the command line console.
When I use a browser to
And keep in mind that starving the OS of memory to
give it to the JVM is an anti-pattern, see Uwe's
excellent blog on MMapDirectory here:
http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html
Best,
Erick
On Wed, Jan 7, 2015 at 5:55 AM, Shawn Heisey apa...@elyograg.org wrote:
Kinda late to the party on this very interesting thread, but I'm
wondering if anyone has been using SolrCloud with HDFS at large scales?
We really like this capability since our data is inside of Hadoop and we
can run the Solr shards on the same nodes, and we only need to manage
one pool of
No, that’s not mandatory. That is just an example of how a request handler
could spell that out, but those parameters can be (and often are, depending on
the nature of the application) specified per request.
Erik
On Jan 7, 2015, at 1:27 PM, Vishal Swaroop vishal@gmail.com wrote:
: I am exploring faceting in Solr with the collection1 example. Faceting fields are
: defined in solrconfig.xml under the browse request handler, which is used by the
: built-in VelocityResponseWriter
context is everything -- you cut out the key line that would answer
your question...
Hi,
I am exploring faceting in Solr with the collection1 example. Faceting fields are
defined in solrconfig.xml under the browse request handler, which is used by the
built-in VelocityResponseWriter:
<requestHandler name="/browse" class="solr.SearchHandler">
...
<str name="facet">on</str>
<str
bq: I'm wondering if anyone has been using SolrCloud with HDFS at large scales
Absolutely, there are several companies doing this, see Lucidworks and
Cloudera for two instances.
Solr itself has the MapReduceIndexerTool for indexing to Solr
running on HDFS, FWIW.
About needing 3x the memory..
I have implemented an update processor as described above.
On a single Solr instance it works fine.
When I test it on SolrCloud with several nodes and try to index a few
documents, when some of them are incorrect, each instance creates its own
response, but it is not aggregated by the
I had a similar issue, which was caused by
https://issues.apache.org/jira/browse/SOLR-6763. Are you getting long GC
pauses or similar before the leader mismatches occur?
Alan Woodward
www.flax.co.uk
On 7 Jan 2015, at 10:01, Thomas Lamy wrote:
Hi there,
we are running a 3 server cloud
On 1/6/2015 1:10 PM, Abhishek Sharma wrote:
*Q* - I am forced to set Java Xmx as high as 3.5g for my Solr app. If I
keep this low, my CPU hits 100% and response time for indexing increases a
lot. And I have hit OOM errors as well when this value is low.
Is this too high? If so, how can I
Hey Ganesh,
This was not for clustering. I do not think you would need clustering with
SolrCloud. With SolrCloud, when you create a collection from scratch it
creates the data directories under Solr home. Now if your drives are mounted
as (/d/1, /d/2 etc.) you would want to use all the storage