Re: change index/store at indexing time

2012-04-28 Thread Jeevanandam

Maria,

Thanks for the detailed explanation.
As per schema.xml, the stored and indexed attributes should be defined at design time.
Per my understanding, defining them at runtime is not feasible.

BTW, you can have multiValued="true" attribute for dynamic fields too.
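
For example, a minimal schema.xml sketch (these field name patterns are
made up for illustration, not from the original mail):

<dynamicField name="*_st_idx"   type="string" indexed="true"  stored="true"  multiValued="true"/>
<dynamicField name="*_nost_idx" type="string" indexed="true"  stored="false" multiValued="true"/>
<dynamicField name="*_st_noidx" type="string" indexed="false" stored="true"  multiValued="true"/>

A document could then write its geo IDs to geoids_st_idx,
geoids_nost_idx, or geoids_st_noidx depending on the condition.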

- Jeevanandam

On 29-04-2012 2:06 am, Vazquez, Maria (STM) wrote:

Thanks Jeevanandam.
That still doesn't have the same behavior as Lucene, since multiple
fields with different names have to be created.
What I want is exactly this (multi-valued field):

document.add(new Field("geoids", geoId, Field.Store.YES,
Field.Index.NOT_ANALYZED_NO_NORMS));

document.add(new Field("geoids", geoId, Field.Store.NO,
Field.Index.NOT_ANALYZED_NO_NORMS));

In Lucene I can save geoids first as stored and in the next line as
not stored and it will do exactly that. I want to duplicate this
behavior in Solr, but I can't do it having only one field in the
schema called geoids that I can manipulate at index time, storing or
not depending on a condition.

Thanks again for the help; hope this explanation makes what I'm
trying to do clearer.

Maria

On Apr 28, 2012, at 11:49 AM, "Jeevanandam"
<je...@myjeeva.com> wrote:

Maria,

For your need, please define a unique pattern using a dynamic field in
schema.xml.


Please have a look 
http://wiki.apache.org/solr/SchemaXml#Dynamic_fields


Hope that helps!

-Jeevanandam

Technology keeps you connected!

On Apr 28, 2012, at 10:33 PM, "Vazquez, Maria (STM)"
<maria.vazq...@dexone.com> wrote:

I can call a script for the logic part but what I want to figure out
is how to save the same field sometimes as stored and indexed,
sometimes as stored not indexed, etc. From a transformer or a script I
didn't see anything where I can modify that at indexing time.
Thanks a lot,
Maria


On Apr 27, 2012, at 18:38, "Bill Bell"
<billnb...@gmail.com> wrote:

Yes you can. Just use a script that is called for each row.

Bill Bell
Sent from mobile


On Apr 27, 2012, at 6:38 PM, "Vazquez, Maria (STM)"
<maria.vazq...@dexone.com> wrote:

Hi,
I'm migrating a project from Lucene 2.9 to Solr 3.4.
There is a special case in the code that indexes the same field in
two different ways, which is completely legal in Lucene directly but I
don't know how to duplicate this same behavior in Solr:

if (isFirstGeo) {
 document.add(new Field("geoids", geoId, Field.Store.YES,
Field.Index.NOT_ANALYZED_NO_NORMS));
 isFirstGeo = false;
} else {
 if (countProducts < 100)
  document.add(new Field("geoids", geoId, Field.Store.NO,
Field.Index.NOT_ANALYZED_NO_NORMS));
 else
  document.add(new Field("geoids", geoId, Field.Store.YES,
Field.Index.NO));
}

Is there any way to do this in Solr in a Transformer? I'm using the
DIH to index and I can't see a way to do this other than having three
fields in the schema like geoids_store_index, geoids_nostore_index,
and geoids_store_noindex.

Thanks a lot in advance.
Maria




Re: CJKBigram filter questions: single character queries, bigrams created across script/character types

2012-04-28 Thread Lance Norskog
This does not address the question. A single-ideogram query will not
find ideograms in the middle of phrases.

I have also found that phrase slop does not work with bigrams. At all.
I created a separate field type with unigrams. The CJK fields use the
StandardAnalyzer. I made a stack with just the SA, which gives raw European
text and single terms for CJK ideograms. This worked well for direct
phrase and phrase-slop queries. You should use both kinds of fields;
the bigram search helps boost similar phrases.
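
A minimal sketch of such a unigram field type, assuming the stock
StandardTokenizerFactory (the type name here is made up):

<fieldType name="text_cjk_unigram" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
  </analyzer>
</fieldType>

Since Lucene 3.1 the StandardTokenizer emits one token per CJK
ideogram, which is what gives the unigram behavior described above.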

You should also try the SmartChineseAnalyzer and new Japanese analyzer
suite. I've discovered that CJK search is a very tricky thing, and
different use cases like different strategies.

On Fri, Apr 27, 2012 at 10:57 AM, Walter Underwood
 wrote:
> Bigrams across character types seem like a useful thing, especially for 
> indexing adjective and verb endings.
>
> An n-gram approach is always going to generate a lot of junk along with the 
> gold. Tighten the rules and good stuff is missed, guaranteed. The only way to 
> sort it out is to use a tokenizer with some linguistic rules.
>
> wunder
>
> On Apr 27, 2012, at 10:43 AM, Burton-West, Tom wrote:
>
>> I have a few questions about the CJKBigram filter.
>>
>> About 10% of our queries that contain Han characters are single character 
>> queries.   It looks like the CJKBigram filter only outputs single characters 
>> when there are no adjacent bigrammable characters in the input.   This means 
>> we would have to create a separate field to index Han unigrams in order to 
>> address single character queries.  Is this correct?
>>
>> For Japanese, the default settings form bigrams across character types.  So 
>> for a string containing Hiragana and Han characters bigrams containing a 
>> mixture of Hiragana and Han characters are formed:
>> いろは革命歌   =>    “いろ” ”ろは“  “は革”   ”革命” “命歌”
>>
>> Is there a way to specify that you don’t want bigrams across character types?
>>
>> Tom
>>
>> Tom Burton-West
>> Digital Library Production Service
>> University of Michigan Library
>>
>> http://www.hathitrust.org/blogs/large-scale-search
>>
>
>
>
>
>



-- 
Lance Norskog
goks...@gmail.com


Re: Weird query results with edismax and boolean operator +

2012-04-28 Thread Jan Høydahl
Hi,

What is your "qf" parameter?
Can you run the queries with debugQuery=true&echoParams=all and attach 
parsed query and all params? It will probably explain what is happening.
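
For example (host and core here are hypothetical):

http://localhost:8983/solr/select?q=nascar+%2Bauthor:serg*&defType=edismax&debugQuery=true&echoParams=all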

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com
Solr Training - www.solrtraining.com

On 27. apr. 2012, at 11:21, Vadim Kisselmann wrote:

> Hi folks,
> 
> I use Solr 4.0 from trunk, and edismax as the standard query handler.
> In my schema I defined this:
> 
> I have this simple problem:
> 
> nascar +author:serg* (3500 matches)
> 
> +nascar +author:serg* (1 match)
> 
> nascar author:serg* (5200 matches)
> 
> nascar AND author:serg* (1 match)
> 
> I think I understand the query syntax, but this behavior confuses me.
> Why these differences in match counts?
> 
> By the way, in all matches I get at least one of my terms,
> but not always both.
> 
> Best regards
> Vadim



Re: change index/store at indexing time

2012-04-28 Thread Vazquez, Maria (STM)
Thanks Jeevanandam.
That still doesn't have the same behavior as Lucene, since multiple fields with 
different names have to be created.
What I want is exactly this (multi-valued field):

document.add(new Field("geoids", geoId, Field.Store.YES, 
Field.Index.NOT_ANALYZED_NO_NORMS));

document.add(new Field("geoids", geoId, Field.Store.NO, 
Field.Index.NOT_ANALYZED_NO_NORMS));

In Lucene I can save geoids first as stored and in the next line as not stored 
and it will do exactly that. I want to duplicate this behavior in Solr, but I 
can't do it having only one field in the schema called geoids that I can 
manipulate at index time, storing or not depending on a condition.

Thanks again for the help; hope this explanation makes what I'm trying to do 
clearer.

Maria

On Apr 28, 2012, at 11:49 AM, "Jeevanandam" 
<je...@myjeeva.com> wrote:

Maria,

For your need, please define a unique pattern using a dynamic field in schema.xml.

Please have a look http://wiki.apache.org/solr/SchemaXml#Dynamic_fields

Hope that helps!

-Jeevanandam

Technology keeps you connected!

On Apr 28, 2012, at 10:33 PM, "Vazquez, Maria (STM)" 
<maria.vazq...@dexone.com> wrote:

I can call a script for the logic part but what I want to figure out is how to 
save the same field sometimes as stored and indexed, sometimes as stored not 
indexed, etc. From a transformer or a script I didn't see anything where I can 
modify that at indexing time.
Thanks a lot,
Maria


On Apr 27, 2012, at 18:38, "Bill Bell" 
<billnb...@gmail.com> wrote:

Yes you can. Just use a script that is called for each row.

Bill Bell
Sent from mobile


On Apr 27, 2012, at 6:38 PM, "Vazquez, Maria (STM)" 
<maria.vazq...@dexone.com> wrote:

Hi,
I'm migrating a project from Lucene 2.9 to Solr 3.4.
There is a special case in the code that indexes the same field in two 
different ways, which is completely legal in Lucene directly but I don't know 
how to duplicate this same behavior in Solr:

if (isFirstGeo) {
 document.add(new Field("geoids", geoId, Field.Store.YES, 
Field.Index.NOT_ANALYZED_NO_NORMS));
 isFirstGeo = false;
} else {
 if (countProducts < 100)
  document.add(new Field("geoids", geoId, Field.Store.NO, 
Field.Index.NOT_ANALYZED_NO_NORMS));
 else
  document.add(new Field("geoids", geoId, Field.Store.YES, Field.Index.NO));
}

Is there any way to do this in Solr in a Transformer? I'm using the DIH to index 
and I can't see a way to do this other than having three fields in the schema 
like geoids_store_index, geoids_nostore_index, and geoids_store_noindex.

Thanks a lot in advance.
Maria





Re: change index/store at indexing time

2012-04-28 Thread Jeevanandam
Maria,

For your need, please define a unique pattern using a dynamic field in schema.xml.

Please have a look http://wiki.apache.org/solr/SchemaXml#Dynamic_fields

Hope that helps!

-Jeevanandam

Technology keeps you connected!

On Apr 28, 2012, at 10:33 PM, "Vazquez, Maria (STM)"  
wrote:

> I can call a script for the logic part but what I want to figure out is how 
> to save the same field sometimes as stored and indexed, sometimes as stored 
> not indexed, etc. From a transformer or a script I didn't see anything where 
> I can modify that at indexing time.
> Thanks a lot,
> Maria
> 
> 
> On Apr 27, 2012, at 18:38, "Bill Bell"  wrote:
> 
>> Yes you can. Just use a script that is called for each row.
>> 
>> Bill Bell
>> Sent from mobile
>> 
>> 
>> On Apr 27, 2012, at 6:38 PM, "Vazquez, Maria (STM)" 
>>  wrote:
>> 
>>> Hi,
>>> I'm migrating a project from Lucene 2.9 to Solr 3.4.
>>> There is a special case in the code that indexes the same field in two 
>>> different ways, which is completely legal in Lucene directly but I don't 
>>> know how to duplicate this same behavior in Solr:
>>> 
>>> if (isFirstGeo) {
>>>   document.add(new Field("geoids", geoId, Field.Store.YES, 
>>> Field.Index.NOT_ANALYZED_NO_NORMS));
>>>   isFirstGeo = false;
>>> } else {
>>>   if (countProducts < 100)
>>>document.add(new Field("geoids", geoId, Field.Store.NO, 
>>> Field.Index.NOT_ANALYZED_NO_NORMS));
>>>   else
>>>document.add(new Field("geoids", geoId, Field.Store.YES, 
>>> Field.Index.NO));
>>> }
>>> 
>>> Is there any way to do this in Solr in a Transformer? I'm using the DIH to 
>>> index and I can't see a way to do this other than having three fields in 
>>> the schema like geoids_store_index, geoids_nostore_index, and 
>>> geoids_store_noindex.
>>> 
>>> Thanks a lot in advance.
>>> Maria
>>> 
>>> 
>>> 


SolrJ core admin - can it share server objects with queries?

2012-04-28 Thread Shawn Heisey
I have a SolrJ application that uses the core admin as well as doing 
queries against each core.  I have an object of my own design for each 
core that uses SolrJ directly.  Two of my core objects (one for build 
and one for live) are used in an object that represents a shard, and 
multiple shard objects are used in an object that represents an entire 
index chain.


Within the core object, I am currently creating two solr server 
objects.  One has the URL ending in "/solr" and is used for 
CoreAdminRequest.  The other includes the core name and is used for 
updates/queries.  I have taken steps to share the first server object 
between core objects when the host and port are the same.
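
For reference, here is a minimal sketch of the two-object setup described 
above, assuming SolrJ 3.x; the host, port, and core name are made up:

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.client.solrj.request.CoreAdminRequest;

public class CoreClientSketch {
    public static void main(String[] args) throws Exception {
        // one shared server per host:port, URL ends in /solr,
        // used only for core admin requests
        CommonsHttpSolrServer adminServer =
            new CommonsHttpSolrServer("http://idxhost1:8983/solr");
        CoreAdminRequest.getStatus("live0", adminServer);

        // one server per core, URL includes the core name,
        // used for queries and updates
        CommonsHttpSolrServer coreServer =
            new CommonsHttpSolrServer("http://idxhost1:8983/solr/live0");
        coreServer.query(new SolrQuery("*:*"));
    }
}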


My question - could I use one server object for both of these, or am I 
doing things correctly?  I guess it comes down to whether or not there 
is any way to specify a core when doing queries or updates.  I have not 
been able to see a way to do it.  If there is a way, I could reduce the 
number of server objects that my program uses.


Currently there are two index chains, each of which has seven shards.  
With two cores per shard, the program builds 28 server objects for 
queries.  Since I have four servers, I also end up with four shared 
server objects for CoreAdminRequest.  If there's a way to specify the 
core for queries, I would only need those four shared objects.  If such 
a capability doesn't already exist, should I file a jira issue?


Thanks,
Shawn



Re: change index/store at indexing time

2012-04-28 Thread Vazquez, Maria (STM)
I can call a script for the logic part but what I want to figure out is how to 
save the same field sometimes as stored and indexed, sometimes as stored not 
indexed, etc. From a transformer or a script I didn't see anything where I can 
modify that at indexing time.
Thanks a lot,
Maria


On Apr 27, 2012, at 18:38, "Bill Bell"  wrote:

> Yes you can. Just use a script that is called for each row.
> 
> Bill Bell
> Sent from mobile
> 
> 
> On Apr 27, 2012, at 6:38 PM, "Vazquez, Maria (STM)" 
>  wrote:
> 
>> Hi,
>> I'm migrating a project from Lucene 2.9 to Solr 3.4.
>> There is a special case in the code that indexes the same field in two 
>> different ways, which is completely legal in Lucene directly but I don't 
>> know how to duplicate this same behavior in Solr:
>> 
>> if (isFirstGeo) {
>>document.add(new Field("geoids", geoId, Field.Store.YES, 
>> Field.Index.NOT_ANALYZED_NO_NORMS));
>>isFirstGeo = false;
>> } else {
>>if (countProducts < 100)
>> document.add(new Field("geoids", geoId, Field.Store.NO, 
>> Field.Index.NOT_ANALYZED_NO_NORMS));
>>else
>> document.add(new Field("geoids", geoId, Field.Store.YES, 
>> Field.Index.NO));
>> }
>> 
>> Is there any way to do this in Solr in a Transformer? I'm using the DIH to 
>> index and I can't see a way to do this other than having three fields in the 
>> schema like geoids_store_index, geoids_nostore_index, and 
>> geoids_store_noindex.
>> 
>> Thanks a lot in advance.
>> Maria
>> 
>> 
>> 


Re: commit fail

2012-04-28 Thread Yonik Seeley
On Sat, Apr 28, 2012 at 7:02 AM, mav.p...@holidaylettings.co.uk
 wrote:
> Hi,
>
> This is what the thread dump looks like.
>
> Any ideas?

Looks like the thread taking up CPU is in LukeRequestHandler

> '1062730578@qtp-1535043768-5' Id=16, RUNNABLE on lock=, total cpu
> time=16156160 ms, user time=16153110 ms
> at org.apache.solr.handler.admin.LukeRequestHandler.getIndexedFieldsInfo(LukeRequestHandler.java:320)

That probably accounts for the 1 CPU doing things... but it's not
clear at all why commits are failing.

Perhaps the commit is succeeding, but the client is just not waiting
long enough for it to complete?
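
For example, you could issue an explicit commit that waits for the new
searcher (stock /update handler; the host is hypothetical):

curl 'http://localhost:8983/solr/update?commit=true&waitFlush=true&waitSearcher=true'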

-Yonik
lucenerevolution.com - Lucene/Solr Open Source Search Conference.
Boston May 7-10


Re: Scaling Solr - Suggestions !!

2012-04-28 Thread Michael Della Bitta
Just my opinion, but I'm not sure I see the value in deploying the cores
to different webapps in a single container on a single machine to avoid
a single point of failure... You still have a single point of failure at
the process level down to the hardware, which, when you think about it,
is mostly everything. But perhaps you're at least using more than one
container.

It sounds to me that the easiest route to scalability for you would be
to add more machines. Unless your cores are particularly complex or your
traffic is heavy, a 3GB core should be no match for a single machine.
And the traffic problem can be solved by replication and load balancing.
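
As a rough sketch, master-side replication in solrconfig.xml looks
something like this (stock ReplicationHandler; the confFiles list is
just an example):

<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="master">
    <str name="replicateAfter">commit</str>
    <str name="confFiles">schema.xml,stopwords.txt</str>
  </lst>
</requestHandler>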

Michael

On Sat, 2012-04-28 at 13:24 +0530, Sujatha Arun wrote:
> Hello,
> 
> *Background*: For each of our customers, we create 3 Solr webapps with
> different search schemas, serving different search requirements, and we
> have about 70 customers. So we have about 210 webapps currently.
> 
> *Hardware*: Single server, one JVM, heap memory 19GB, total RAM 32GB,
> permgen initially 1GB, now increased to 2GB.
> 
> *Solr Indexes*: Most are on the order of a few MB, with about 2 big
> indexes of about 3GB each.
> 
> *Scaling Step 1*: We saw the permgen value go up to nearly 850 MB when
> we created so many webapps, hence now we are moving to Solr cores and we
> are going to have about 50 cores per webapp, bringing the number of
> webapps to about 5. We want to distribute the cores across multiple
> webapps to avoid a single point of failure.
> 
> 
> *Requirement* :
> 
> 
> - We need to scale horizontally only the cores whose index sizes are
>   big.
> - We also require permission-based search for each webapp. Would Solr
>   NRT fit our needs, where we can index the permissions into the
>   document? This would mean frequent addition and deletion of
>   permissions on the documents across cores.
> - We also require automatic failover.
> 
> What technology would be an ideal fit: SolrCloud, Katta, Solandra,
> Lily, ElasticSearch, etc. [preferably open source; we would be required
> to maintain many webapps with multicores]? And what about the
> commercial offerings, given our use case?
> 
> Thanks.
> 
> Regards,
> Sujatha




question about NRT(soft commit) and Transaction Log in trunk

2012-04-28 Thread Li Li
hi
   I checked out the trunk and played with its new soft commit
feature. It's cool, but I've got a few questions about it.
   From reading some introductory articles and the wiki, plus a hasty
read of the code, my understanding of its implementation is:
   For a normal commit (hard commit), we flush everything to disk and
commit it. The flush is not very time consuming because of the
OS-level cache; the most time-consuming part is the sync in the commit
process.
   A soft commit just flushes postings and pending deletions to disk
and generates new segments. Then Solr can use a new searcher to read
the latest index, warm up, and register itself.
   If there is no hard commit and the JVM crashes, new data may be lost.
   If my understanding is correct, then why do we need the transaction log?
   I found that in DirectUpdateHandler2, every time a command is
executed, TransactionLog records a line in the log. But the default
sync level in RunUpdateProcessorFactory is flush, which means it will
not sync the log file. Does this make sense?
   In database implementations, we usually write the log and modify
data in memory because the log is smaller than the real data. If the
database crashes, we can redo the unfinished log entries and make the
data correct. Will Solr leverage its log like this? If so, why is it
not synced?
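
(For context, a rough sketch of the trunk-era solrconfig.xml pieces
being discussed, the update log plus a soft autocommit, assuming the
4.0 element names:)

<updateHandler class="solr.DirectUpdateHandler2">
  <updateLog>
    <str name="dir">${solr.data.dir:}</str>
  </updateLog>
  <autoSoftCommit>
    <maxTime>1000</maxTime>
  </autoSoftCommit>
</updateHandler>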


Re: commit fail

2012-04-28 Thread mav.p...@holidaylettings.co.uk
Hi,

This is what the thread dump looks like.

Any ideas?

Mav

Java HotSpot(TM) 64-Bit Server VM 20.1-b02
Thread Count: current=19, peak=20, daemon=6

'DestroyJavaVM' Id=26, RUNNABLE on lock=, total cpu time=198450 ms, user time=196890 ms

'Timer-2' Id=25, TIMED_WAITING on lock=java.util.TaskQueue@33799a1e, total cpu time=0 ms, user time=0 ms
at java.lang.Object.wait(Native Method)
at java.util.TimerThread.mainLoop(Timer.java:509)
at java.util.TimerThread.run(Timer.java:462)

'pool-3-thread-1' Id=24, WAITING on lock=java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@747541f8, total cpu time=0 ms, user time=0 ms
at sun.misc.Unsafe.park(Native Method)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:947)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
at java.lang.Thread.run(Thread.java:662)

'pool-1-thread-1' Id=23, WAITING on lock=java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@3e3e3c83, total cpu time=480 ms, user time=460 ms
at sun.misc.Unsafe.park(Native Method)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399)
at java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:947)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
at java.lang.Thread.run(Thread.java:662)

'Timer-1' Id=21, TIMED_WAITING on lock=java.util.TaskQueue@67f6dc61, total cpu time=180 ms, user time=120 ms
at java.lang.Object.wait(Native Method)
at java.util.TimerThread.mainLoop(Timer.java:509)
at java.util.TimerThread.run(Timer.java:462)

'2021372560@qtp-1535043768-9 - Acceptor0 SocketConnector@0.0.0.0:8983' Id=20, RUNNABLE on lock=, total cpu time=60 ms, user time=60 ms
at java.net.PlainSocketImpl.socketAccept(Native Method)
at java.net.PlainSocketImpl.accept(PlainSocketImpl.java:408)
at java.net.ServerSocket.implAccept(ServerSocket.java:462)
at java.net.ServerSocket.accept(ServerSocket.java:430)
at org.mortbay.jetty.bio.SocketConnector.accept(SocketConnector.java:99)
at org.mortbay.jetty.AbstractConnector$Acceptor.run(AbstractConnector.java:708)
at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:582)

'1384828782@qtp-1535043768-8' Id=19, TIMED_WAITING on lock=org.mortbay.thread.QueuedThreadPool$PoolThread@528acf6e, total cpu time=274160 ms, user time=273060 ms
at java.lang.Object.wait(Native Method)
at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:626)

'1715374531@qtp-1535043768-7' Id=18, RUNNABLE on lock=, total cpu time=15725890 ms, user time=15723380 ms
at sun.management.ThreadImpl.getThreadInfo1(Native Method)
at sun.management.ThreadImpl.getThreadInfo(ThreadImpl.java:154)
at org.apache.jsp.admin.threaddump_jsp._jspService(org.apache.jsp.admin.threaddump_jsp:264)
at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:109)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
at org.apache.jasper.servlet.JspServletWrapper.service(JspServletWrapper.java:389)
at org.apache.jasper.servlet.JspServlet.serviceJspFile(JspServlet.java:486)
at org.apache.jasper.servlet.JspServlet.service(JspServlet.java:380)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:511)
at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:401)
at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
at org.mortbay.jetty.servlet.Dispatcher.forward(Dispatcher.java:327)
at org.mortbay.jetty.servlet.Dispatcher.forward(Dispatcher.java:126)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:275)
at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1212)
at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:399)
at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:182)
at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:450)
at org.mortbay.jetty.handler.ContextHandlerCollection.hand

Scaling Solr - Suggestions !!

2012-04-28 Thread Sujatha Arun
Hello,

*Background*: For each of our customers, we create 3 Solr webapps with
different search schemas, serving different search requirements, and we
have about 70 customers. So we have about 210 webapps currently.

*Hardware*: Single server, one JVM, heap memory 19GB, total RAM 32GB,
permgen initially 1GB, now increased to 2GB.

*Solr Indexes*: Most are on the order of a few MB, with about 2 big
indexes of about 3GB each.

*Scaling Step 1*: We saw the permgen value go up to nearly 850 MB when
we created so many webapps, hence now we are moving to Solr cores and we
are going to have about 50 cores per webapp, bringing the number of
webapps to about 5. We want to distribute the cores across multiple
webapps to avoid a single point of failure.


*Requirement* :


   - We need to scale horizontally only the cores whose index sizes are
     big.
   - We also require permission-based search for each webapp. Would Solr
     NRT fit our needs, where we can index the permissions into the
     document? This would mean frequent addition and deletion of
     permissions on the documents across cores.
   - We also require automatic failover.

What technology would be an ideal fit: SolrCloud, Katta, Solandra,
Lily, ElasticSearch, etc. [preferably open source; we would be required
to maintain many webapps with multicores]? And what about the
commercial offerings, given our use case?

Thanks.

Regards,
Sujatha