Re: Optimal configuration for high throughput indexing

2015-05-04 Thread Vinay Pothnis
CloudSolrServer and testing it. Thanks Vinay On 4 May 2015 at 16:09, Shawn Heisey wrote: > On 5/4/2015 2:36 PM, Vinay Pothnis wrote: > > But nonetheless, we will give the latest solrJ client + cloudSolrServer a > > try. > > > > * Yes, the documents are pretty s

Re: Optimal configuration for high throughput indexing

2015-05-04 Thread Vinay Pothnis
y, I'd concentrate on the nodes going into recovery before > anything else. Until that's > fixed any other things you do will not be predictive of much. > > BTW, I typically start with batch sizes of 1,000 FWIW. Sometimes > that's too big, sometimes > too small but i
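
The batch size of 1,000 mentioned above can be tried with a small chunking helper. This is only a sketch: `batches` is a hypothetical name, and the actual call that sends each batch to Solr is left out.

```python
from itertools import islice


def batches(docs, size=1000):
    """Yield successive lists of at most `size` documents."""
    it = iter(docs)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


# 2,500 placeholder documents split into batches of 1,000.
print([len(b) for b in batches(range(2500))])  # [1000, 1000, 500]
```

Keeping the size a parameter makes it easy to test larger or smaller batches, since 1,000 is described only as a starting point.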

Optimal configuration for high throughput indexing

2015-04-30 Thread Vinay Pothnis
Hello, I have a use case with the following characteristics: - High index update rate (adds/updates) - High query rate - Low index size (~800MB for 2.4 million docs) - The documents that are created at the high rate eventually "expire" and are deleted regularly at half-hour intervals I current

Re: clarification on index-to-ram ratio

2014-06-19 Thread Vinay Pothnis
Thanks! And yes, the replica belongs to a different shard - not the same data. -Vinay On 19 June 2014 11:21, Toke Eskildsen wrote: > Vinay Pothnis [poth...@gmail.com] wrote: > > *"... Let's say that you have a Solr index size of 8GB. If your OS, > Solr's > >

clarification on index-to-ram ratio

2014-06-19 Thread Vinay Pothnis
Hello All, The documentation and general feedback on the mailing list suggest the following: *"... Let's say that you have a Solr index size of 8GB. If your OS, Solr's Java heap, and all other running programs require 4GB of memory, then an ideal memory size for that server is at least 12GB ..."*
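
The arithmetic in the quoted guideline can be written out as a tiny calculation. The function name is hypothetical; the 8GB and 4GB figures are the ones from the quote.

```python
def ideal_ram_gb(index_gb, other_gb):
    """RAM needed so the OS disk cache can hold the whole index
    on top of what the OS, the Solr heap and other programs use."""
    return index_gb + other_gb


# 8GB index + 4GB for OS, Solr's Java heap and other programs.
print(ideal_ram_gb(8, 4))  # 12
```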

Re: deleting large amount data from solr cloud

2014-04-17 Thread Vinay Pothnis
e them from > the 1 big segment. Your merge policy will, at some point, select this > segment to merge and it'll disappear... > > FWIW, > er...@pedantic.com > > On Thu, Apr 17, 2014 at 7:24 AM, Vinay Pothnis wrote: > > Thanks a lot Shalin! > > > > > >

Re: deleting large amount data from solr cloud

2014-04-17 Thread Vinay Pothnis
Thanks a lot Shalin! On 16 April 2014 21:26, Shalin Shekhar Mangar wrote: > You can specify maxSegments parameter e.g. maxSegments=5 while optimizing. > > > On Thu, Apr 17, 2014 at 6:46 AM, Vinay Pothnis wrote: > > > Hello, > > > > Couple of follow up quest
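
For the maxSegments suggestion above, a sketch of how the request URL might be built. The host, port and collection name are placeholders, not values from this thread; `optimize` and `maxSegments` are parameters of Solr's update handler.

```python
from urllib.parse import urlencode


def optimize_url(base_url, collection, max_segments):
    """Build an update-handler URL that optimizes down to `max_segments`."""
    params = urlencode({"optimize": "true", "maxSegments": max_segments})
    return f"{base_url}/{collection}/update?{params}"


# Placeholder host and collection; adjust to the actual cluster.
print(optimize_url("http://localhost:8983/solr", "collection1", 5))
# http://localhost:8983/solr/collection1/update?optimize=true&maxSegments=5
```

The resulting URL can be requested with any HTTP client.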

Re: deleting large amount data from solr cloud

2014-04-16 Thread Vinay Pothnis
documents space? In other words, can I issue "forceMerge=20"? If so, what would the command look like? Any examples for this? Thanks Vinay On 16 April 2014 07:59, Vinay Pothnis wrote: > Thank you Erick! > Yes - I am using the expunge deletes option. > > Thanks for the note

Re: deleting large amount data from solr cloud

2014-04-16 Thread Vinay Pothnis
> not seem to get cleaned up. Is there any way to monitor/follow the > progress > > of index compaction? > > > > Also, does triggering "optimize" from the admin UI help to compact the > > index size on disk? > > > > Thanks > > Vinay > > >

Re: Tipping point of solr shards (Num of docs / size)

2014-04-15 Thread Vinay Pothnis
You could look at this link to understand the factors that affect SolrCloud performance: http://wiki.apache.org/solr/SolrPerformanceProblems Especially the sections about RAM and disk cache. If the index grows too big for one node, it can lead to performance issues. From the looks of it

Re: deleting large amount data from solr cloud

2014-04-15 Thread Vinay Pothnis
admin UI help to compact the index size on disk? Thanks Vinay On 14 April 2014 12:19, Vinay Pothnis wrote: > Some update: > > I removed the auto warm configurations for the various caches and reduced > the cache sizes. I then issued a call to delete a day's worth of data (

Re: deleting large amount data from solr cloud

2014-04-14 Thread Vinay Pothnis
ad. If not, the only option I see is to do a "trickle" delete of 100 documents per second or something. Also - the other suggestion of using "distributed=false" might not help because the issue currently is that the replication is going to "full copy". Any thoug
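
The "trickle" delete idea could look roughly like the sketch below. It assumes the ids to delete are already known, and `send` is a hypothetical callback that posts the XML body to the update handler. The `<delete><id>...</id></delete>` message is Solr's XML delete-by-id format.

```python
import time
from xml.sax.saxutils import escape


def delete_xml(ids):
    """Build one XML delete-by-id message for a list of ids."""
    return "<delete>" + "".join(f"<id>{escape(str(i))}</id>" for i in ids) + "</delete>"


def trickle_delete(ids, send, per_second=100, sleep=time.sleep):
    """Send deletes in slices of `per_second` ids, pausing one second between slices."""
    for start in range(0, len(ids), per_second):
        send(delete_xml(ids[start:start + per_second]))
        if start + per_second < len(ids):
            sleep(1.0)


# Dry run: collect the messages instead of posting them, and skip the pauses.
sent = []
trickle_delete([str(i) for i in range(250)], sent.append, sleep=lambda s: None)
print(len(sent))  # 3
```

Passing `send` and `sleep` as arguments keeps the pacing logic testable without a running cluster.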

Re: deleting large amount data from solr cloud

2014-04-14 Thread Vinay Pothnis
to check whether you still get an OOM or not. > > Thanks; > Furkan KAMACI > > > 2014-04-14 7:09 GMT+03:00 Vinay Pothnis : > > > Aman, > > Yes - Will do! > > > > Furkan, > > What do you mean by 'bulk delete'? > > > >

Re: deleting large amount data from solr cloud

2014-04-13 Thread Vinay Pothnis
p you not to > hit OOM. > > Thanks; > Furkan KAMACI > > > 2014-04-12 8:22 GMT+03:00 Aman Tandon : > > > Vinay please share your experience after trying this solution. > > > > > > On Sat, Apr 12, 2014 at 4:12 AM, Vinay Pothnis > wrote: > > > >

Re: deleting large amount data from solr cloud

2014-04-11 Thread Vinay Pothnis
nputs! -Vinay On 11 April 2014 15:28, Shawn Heisey wrote: > On 4/10/2014 7:25 PM, Vinay Pothnis wrote: > >> When we tried to delete the data through a query - say 1 day/month's worth >> of data. But after deleting just 1 month's worth of data, the master node >>

Re: deleting large amount data from solr cloud

2014-04-11 Thread Vinay Pothnis
iving enough memory to your JVM and this is > just the first OOM you've hit. Look on the Solr admin page and see how > much is being reported, if it's near the limit of your 16G that's the > "smoking gun"... > > Best, > Erick > > On Fri, Apr 11,

Re: deleting large amount data from solr cloud

2014-04-11 Thread Vinay Pothnis
document, even by query should just mark the docs as deleted, a pretty > low-cost operation. > > how much memory are you giving the JVM? > > Best, > Erick > > On Thu, Apr 10, 2014 at 6:25 PM, Vinay Pothnis wrote: > > [solr version 4.3.1] > > > > Hello, >

deleting large amount data from solr cloud

2014-04-10 Thread Vinay Pothnis
[solr version 4.3.1] Hello, I have a solr cloud (4 nodes - 2 shards) with a fairly large amount of documents (~360G of index per shard). Now, a major portion of the data is not required and I need to delete those documents. I would need to delete around 75% of the data. One of the solutions could b

Re: Solr + SPDY

2013-10-26 Thread Vinay Pothnis
.@gmail.com> wrote: > I'm rusty on SPDY. Can you summarize the benefits in Solr context? Thanks. > > Otis > Solr & ElasticSearch Support > http://sematext.com/ > On Oct 25, 2013 10:46 AM, "Vinay Pothnis" wrote: > > > Hello, > > > > Co

Solr + SPDY

2013-10-25 Thread Vinay Pothnis
Hello, Couple of questions related to using SPDY with solr. 1. Does anybody have experience running Solr on Jetty 9 with SPDY support - and using Jetty Client (SPDY capable client) to talk to Solr over SPDY? 2. This is related to Solr - Cloud - inter node communication. This might not be a user-

Re: [solr cloud] solr hangs when indexing large number of documents from multiple threads

2013-06-26 Thread Vinay Pothnis
s, perhaps even one and just rack > together a zillion threads to get throughput. > > FWIW, > Erick > > On Tue, Jun 25, 2013 at 8:55 AM, Vinay Pothnis wrote: > > Jason and Scott, > > > > Thanks for the replies and pointers! > > Yes, I will consider the '

Re: [solr cloud] solr hangs when indexing large number of documents from multiple threads

2013-06-25 Thread Vinay Pothnis
3, Jason Hellman wrote: > > > >> Vinay, > >> > >> You may wish to pay attention to how many transaction logs are being > >> created along the way to your hard autoCommit, which should truncate the > >> open handles for those files. I

Re: [solr cloud] solr hangs when indexing large number of documents from multiple threads

2013-06-24 Thread Vinay Pothnis
I have 'softAutoCommit' at 1 second and 'hardAutoCommit' at 30 seconds. On Mon, Jun 24, 2013 at 1:54 PM, Jason Hellman < jhell...@innoventsolutions.com> wrote: > Vinay, > > What autoCommit settings do you have for your indexing process? > > Jason >

Re: [solr cloud] solr hangs when indexing large number of documents from multiple threads

2013-06-24 Thread Vinay Pothnis
e same issues because they are reach the ulimit maximum defined for > descriptor and process. > > -- > Yago Riveiro > Sent with Sparrow (http://www.sparrowmailapp.com/?sig) > > > On Monday, June 24, 2013 at 7:49 PM, Vinay Pothnis wrote: > > > Hello All, > >

[solr cloud] solr hangs when indexing large number of documents from multiple threads

2013-06-24 Thread Vinay Pothnis
Hello All, I have the following set up of solr cloud. * solr version 4.3.1 * 3 node solr cloud + replication factor 2 * 3 zoo keepers * load balancer in front of the 3 solr nodes I am seeing this strange behavior when I am indexing a large number of documents (10 mil). When I have more than 3-5

Re: [solr cloud 4.1] Issue with order in a batch of commands

2013-02-20 Thread Vinay Pothnis
in the same buffer in the right order. > > Even then though, I don't think SolrJ update requests order deletes and > adds in the same request either, so that would also need to be addressed. > Pretty sure solrj will do the adds then the deletes. > > - Mark > > On Feb 1

Re: [solr cloud 4.1] Issue with order in a batch of commands

2013-02-19 Thread Vinay Pothnis
Also, I was referring to this wiki page: http://wiki.apache.org/solr/UpdateJSON#Update_Commands Thanks Vinay On Tue, Feb 19, 2013 at 6:12 PM, Vinay Pothnis wrote: > Thanks for the reply Eric. > > * I am not using SolrJ > * I am using plain http (apache http client) to send a batch

Re: [solr cloud 4.1] Issue with order in a batch of commands

2013-02-19 Thread Vinay Pothnis
em? SolrJ? and in a single > server.add(doclist) format or with individual adds? > > Individual commands being sent can come 'round out of sequence, that's what > the whole optimistic locking bit is about. > > I guess my other question is what's your evidence that this isn&

[solr cloud 4.1] Issue with order in a batch of commands

2013-02-19 Thread Vinay Pothnis
Hello, I have the following set up: * solr cloud 4.1.0 * 2 shards with embedded zookeeper * plain http to communicate with solr I am testing a scenario where I am batching multiple commands and sending to solr. Since this is the solr cloud setup, I am always sending the updates to one of the nod