Re: Commit Issue in Solr 3.4

2014-02-08 Thread samarth s
Yes, it is Amazon EC2 indeed.

To expand on that: this Solr deployment had been working fine, handling the
same load, on a 34 GB instance with EBS storage for quite some time. To
reduce the time taken by a commit, I shifted it to a 30 GB SSD instance. It
certainly performed better on writes and commits. But since last week I have
been facing this problem of infinite back-to-back commits. Not being able to
resolve this, I have finally switched back to a 34 GB machine with EBS
storage, and now the commits are working fine, though slowly.

Any thoughts?
On 6 Feb 2014 23:00, "Shawn Heisey"  wrote:

> On 2/6/2014 9:56 AM, samarth s wrote:
> > Size of index = 260 GB
> > Total Docs = 100mn
> > Usual writing speed = 50K per hour
> > autoCommit-maxDocs = 400,000
> > autoCommit-maxTime = 1,500,000 ms (25 mins)
> > merge factor = 10
> >
> > M/c memory = 30 GB, Xmx = 20 GB
> > Server - Jetty
> > OS - Cent OS 6
>
> With 30GB of RAM (is it Amazon EC2, by chance?) and a 20GB heap, you
> have about 10GB of RAM left for caching your Solr index.  If that server
> has all 260GB of index, I am really surprised that you have only been
> having problems for a short time.  I would have expected problems from
> day one.  Even if it only has half or one quarter of the index, there is
> still a major discrepancy in RAM vs. index size.
>
> You either need more memory or you need to reduce the size of your
> index.  The size of the indexed portion generally has more of an impact
> on performance than the size of the stored portion, but they do both
> have an impact, especially on indexing and committing.  With regular
> disks, it's best to have at least 50% of your index size available to
> the OS disk cache, but 100% is better.
>
> http://wiki.apache.org/solr/SolrPerformanceProblems#OS_Disk_Cache
>
> If you are already using SSD, you might think there can't be
> memory-related performance problems ... but you still need a pretty
> significant chunk of disk cache.
>
> https://wiki.apache.org/solr/SolrPerformanceProblems#SSD
>
> Thanks,
> Shawn
>
>


Commit Issue in Solr 3.4

2014-02-06 Thread samarth s
Hi,

I have been using Solr 3.4 in a project for more than a year. It is only now
that I have started facing a weird problem of never-ending back-to-back
commit cycles. I can see from the InfoStream logs that, as soon as one
commit cycle finishes, another one spawns almost immediately. My writer
processes, which use SolrJ as the client, do not get a chance to write even
a single document between these commits. I have waited for hours to let
these commits run their course and finish naturally, but they don't.
Finally, I had to restart the Solr server. After that, my writers could get
away with writing a few thousand docs, after which the same infinite commit
cycles started again. I could not find any related JIRA issue on this.

Size of index = 260 GB
Total Docs = 100mn
Usual writing speed = 50K per hour
autoCommit-maxDocs = 400,000
autoCommit-maxTime = 1,500,000 ms (25 mins)
merge factor = 10

M/c memory = 30 GB, Xmx = 20 GB
Server - Jetty
OS - Cent OS 6


Please let me know if any other details are needed on the setup. Any help
is highly appreciated. Thanks.
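
For context, the writer processes do roughly the following with SolrJ (a
simplified sketch; the URL and field names here are placeholders):

    import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
    import org.apache.solr.common.SolrInputDocument;

    CommonsHttpSolrServer server =
        new CommonsHttpSolrServer("http://localhost:8983/solr");  // placeholder URL
    SolrInputDocument doc = new SolrInputDocument();
    doc.addField("id", "doc-1");  // placeholder field
    server.add(doc);  // no explicit commit from the client; visibility is
                      // left to autoCommit (400,000 docs / 25 mins)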

-- 
Regards,
Samarth


Sum as a Projection for Facet Queries

2013-07-01 Thread samarth s
Hi,

We need to find the sum of a field for each facet.query. We have looked at
StatsComponent, but that supports only facet.field. Has anyone written a
patch over StatsComponent that supports the same for facet.query, along with
some performance measurements?

Is there any way we can do this using the 'sum' function query?
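
One workaround I can think of is issuing one StatsComponent request per
facet.query, passing the query as a filter; a rough SolrJ sketch (the field
name and filter below are made up, and 'server' is a configured SolrServer):

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.response.FieldStatsInfo;
    import org.apache.solr.client.solrj.response.QueryResponse;

    SolrQuery q = new SolrQuery("*:*");
    q.setRows(0);                      // only the aggregate is needed
    q.set("stats", true);
    q.set("stats.field", "amount");    // hypothetical numeric field to sum
    q.addFilterQuery("type:invoice");  // stands in for one facet.query
    QueryResponse rsp = server.query(q);
    FieldStatsInfo info = rsp.getFieldStatsInfo().get("amount");
    Object sum = info.getSum();        // sum of 'amount' over the filtered set

But this costs one request per facet.query, which is why a patched
StatsComponent would be preferable.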

-- 
Regards,
Samarth


Re: updateLog in Solr 4.2

2013-04-14 Thread samarth s
I have a similar problem on this one. The reason is that my application
performs back-to-back updates, and, as my performance tests showed, an
update issued immediately after another one seems to be a lot slower with
the update log enabled than without it.

Is this a genuine effect, or did I miss something in my performance tests?
Any pointers on this one are highly appreciated.



On Fri, Apr 12, 2013 at 6:47 PM, vicky desai wrote:

> If I disable the update log in Solr 4.2, I get the following exception:
>
> SEVERE: :java.lang.NullPointerException
>         at org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:190)
>         at org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:156)
>         at org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:100)
>         at org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:266)
>         at org.apache.solr.cloud.ZkController.joinElection(ZkController.java:935)
>         at org.apache.solr.cloud.ZkController.register(ZkController.java:761)
>         at org.apache.solr.cloud.ZkController.register(ZkController.java:727)
>         at org.apache.solr.core.CoreContainer.registerInZk(CoreContainer.java:908)
>         at org.apache.solr.core.CoreContainer.registerCore(CoreContainer.java:892)
>         at org.apache.solr.core.CoreContainer.register(CoreContainer.java:841)
>         at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:638)
>         at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:629)
>         at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
>         at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         at java.lang.Thread.run(Thread.java:619)
>
> Apr 12, 2013 6:39:56 PM org.apache.solr.common.SolrException log
> SEVERE: null:org.apache.solr.common.cloud.ZooKeeperException:
>         at org.apache.solr.core.CoreContainer.registerInZk(CoreContainer.java:931)
>         at org.apache.solr.core.CoreContainer.registerCore(CoreContainer.java:892)
>         at org.apache.solr.core.CoreContainer.register(CoreContainer.java:841)
>         at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:638)
>         at org.apache.solr.core.CoreContainer$3.call(CoreContainer.java:629)
>         at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
>         at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>         at java.lang.Thread.run(Thread.java:619)
> Caused by: java.lang.NullPointerException
>         at org.apache.solr.cloud.ShardLeaderElectionContext.runLeaderProcess(ElectionContext.java:190)
>         at org.apache.solr.cloud.LeaderElector.runIamLeaderProcess(LeaderElector.java:156)
>         at org.apache.solr.cloud.LeaderElector.checkIfIamLeader(LeaderElector.java:100)
>         at org.apache.solr.cloud.LeaderElector.joinElection(LeaderElector.java:266)
>         at org.apache.solr.cloud.ZkController.joinElection(ZkController.java:935)
>         at org.apache.solr.cloud.ZkController.register(ZkController.java:761)
>         at org.apache.solr.cloud.ZkController.register(ZkController.java:727)
>         at org.apache.solr.core.CoreContainer.registerInZk(CoreContainer.java:908)
>         ... 12 more
>
> and Solr fails to start. However, if I add updateLog to my solrconfig.xml,
> it starts. Is the updateLog parameter mandatory in Solr 4.2?
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/updateLog-in-Solr-4-2-tp4055548.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>



-- 
Regards,
Samarth


Re: Error on using the projection parameter - fl - in Solr 4

2013-01-26 Thread samarth s
Thanks, Erick.

I will take your advice as a long-term solution. For now I am working around
it by using the regex capability added to the parsing of the 'fl' parameter:
'fl=E_*'
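
Through SolrJ, the workaround is simply (a sketch):

    import org.apache.solr.client.solrj.SolrQuery;

    SolrQuery q = new SolrQuery("*:*");
    q.set("fl", "E_*");  // glob in fl: returns all stored fields prefixed E_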


On Wed, Jan 9, 2013 at 6:07 AM, Erick Erickson wrote:

> You really have a field name with '@' symbols in it? If it worked in 3.6,
> it was probably not intentional, classic "undocumented behavior".
>
> The first thing I'd try is replacing the @ with __ in my schema...
>
> Best
> Erick
>
> > On Tue, Jan 8, 2013 at 6:58 AM, samarth s wrote:
>
> > q=*:*&fl=E_abc@@xyz
>



-- 
Regards,
Samarth


Error on using the projection parameter - fl - in Solr 4

2013-01-08 Thread samarth s
Hi all,

I am in the process of migrating my application from Solr 3.6 to Solr 4. A
query that used to work is now giving an error with Solr 4.

The query looks like:
q=*:*&fl=E_abc@@xyz

The error displayed on the admin page is:
can not use FieldCache on multivalued field: E_abc

The field name printed in the error has dropped the part after the character '@'.

I could not find any useful pointers on the forums, except one thread with a
similar issue, though triggered via the 'qt' parameter:
Subject: "multivalued filed question (FieldCache error)" on the solr-user forums

Thanks for any pointers.

-- 
Regards,
Samarth


Re: Atomicity of commits (soft OR hard) across replicas - Solr Cloud

2013-01-07 Thread samarth s
Thanks, Tomás! This was useful.


On Mon, Dec 31, 2012 at 6:03 PM, Tomás Fernández Löbbe <
tomasflo...@gmail.com> wrote:

> If by "cronned commit" you mean "auto-commit": auto-commits are local to
> each node, are not distributed, so there is no something like a
> "cluster-wide" atomicity there. The commit may be performed in one node
> now, and in other nodes in 5 minutes (depending on the "maxTime" you have
> configured).
> If you mean that you are issuing commits from outside Solr, those are going
> to be by default distributed to all the nodes. The operation will succeed
> only if all nodes succeed, but if one of the nodes fail, the operation will
> fail. However, the nodes that did succeed WILL have a new view of the index
> at this point. (I'm not sure if something is done in this situation with
> the failing node).
>
> The local commit operation in one node *is* atomic.
>
> Tomás
>
>
> > On Mon, Dec 31, 2012 at 7:04 AM, samarth s wrote:
>
> > Tried reading articles online, but could not find one that confirmed the
> > same 100% :).
> >
> > Does a cronned soft commit complete its commit cycle only after all the
> > replicas have the newest data visible ?
> >
> > --
> > Regards,
> > Samarth
> >
>



-- 
Regards,
Samarth


Solr cloud in 4.0 with NRT performance

2012-09-14 Thread samarth s
Hi,

I am currently using features like faceting and grouping/collapsing on Solr
3.6. The frequency of writing is user-driven, and updates are expected to be
visible in real time, or at least near real time. These updates should be
consistent in facet and group results as well. Also, to handle the query
load, I may have to use replication/sharding, with or without Solr Cloud.

I am planning to migrate to Solr 4.0 and use its powerful NRT (soft commit)
and Solr Cloud (ZooKeeper-based) features to achieve the above requirements.

Is a Solr Cloud with a replication level greater than 1 capable of giving
NRT results?
If yes, do these NRT results work with all kinds of querying, such as
faceting and grouping?

It would be great if someone could share their insights and numbers on
these questions.
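
To make the write path concrete, commitWithin is one way to ask for
near-real-time visibility per request (a sketch with a placeholder document
and a configured SolrServer 'server'; whether this behaves as a soft commit
across replicas in 4.0 is exactly what I am asking):

    import org.apache.solr.client.solrj.request.UpdateRequest;
    import org.apache.solr.common.SolrInputDocument;

    UpdateRequest req = new UpdateRequest();
    SolrInputDocument doc = new SolrInputDocument();
    doc.addField("id", "doc-42");  // placeholder document
    req.add(doc);
    req.setCommitWithin(1000);     // request visibility within ~1 second
    req.process(server);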

-- 
Regards,
Samarth


Re: Solr 4.0 Beta Release

2012-09-13 Thread samarth s
Thanks Jack.

On Wed, Sep 12, 2012 at 8:08 PM, Jack Krupansky wrote:

> Yes, it has been released. Read the details here (including download
> instructions/links):
> http://lucene.apache.org/solr/solrnews.html
>
> -- Jack Krupansky
>
> -Original Message- From: samarth s
> Sent: Wednesday, September 12, 2012 9:54 AM
> To: solr-user@lucene.apache.org
> Subject: Solr 4.0 Beta Release
>
>
> Hi All,
>
> Would just like to verify if Solr 4.0 Beta has been released. Does the
> following url give the official beta release:
> http://www.apache.org/dyn/closer.cgi/lucene/solr/4.0.0-BETA
>
> --
> Regards,
> Samarth
>



-- 
Regards,
Samarth


Solr 4.0 Beta Release

2012-09-12 Thread samarth s
Hi All,

I would just like to verify whether Solr 4.0 Beta has been released. Does
the following URL give the official beta release:
http://www.apache.org/dyn/closer.cgi/lucene/solr/4.0.0-BETA

-- 
Regards,
Samarth


Re: Too many connections in CLOSE_WAIT state on master solr server

2012-03-18 Thread samarth s
Hi Ranveer,

You can try '-Dhttp.maxConnections'; it may resolve the issue. But the root
cause, I figured, may lie in some queries made to Solr being too heavy to
have decent turnaround times. As a result, the client may close the
connection abruptly, leaving half-closed connections behind. You can also
try adding a search timeout to Solr queries:
https://issues.apache.org/jira/browse/SOLR-502
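
With SolrJ, the search timeout from SOLR-502 can be set per query (a sketch;
the query itself is a placeholder):

    import org.apache.solr.client.solrj.SolrQuery;

    SolrQuery q = new SolrQuery("field:value");  // placeholder query
    q.setTimeAllowed(30000);  // let Solr stop searching after ~30 seconds
                              // and return partial results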

On Tue, Jan 10, 2012 at 8:06 AM, Ranveer  wrote:
> Hi,
>
> I am facing same problem. Did  -Dhttp.maxConnections resolve the problem ?
>
> Please let us know!
>
> regards
> Ranveer
>
>
>
> On Thursday 15 December 2011 11:30 AM, samarth s wrote:
>>
>> Thanks Erick and Mikhail. I'll try this out.
>>
>> On Wed, Dec 14, 2011 at 7:11 PM, Erick Erickson
>>  wrote:
>>>
>>> I'm guessing (and it's just a guess) that what's happening is that
>>> the container is queueing up your requests while waiting
>>> for the other connections to close, so Mikhail's suggestion
>>> seems like a good idea.
>>>
>>> Best
>>> Erick
>>>
>>> On Wed, Dec 14, 2011 at 12:28 AM, samarth s
>>>   wrote:
>>>>
>>>> The updates to the master are user driven, and are needed to be
>>>> visible quickly. Hence, the high frequency of replication. It may be
>>>> that too many replication requests are being handled at a time, but
>>>> why should that result in half closed connections?
>>>>
>>>> On Wed, Dec 14, 2011 at 2:47 AM, Erick Erickson
>>>>  wrote:
>>>>>
>>>>> Replicating 40 cores every 20 seconds is just *asking* for trouble.
>>>>> How often do your cores change on the master? How big are
>>>>> they? Is there any chance you just have too many cores replicating
>>>>> at once?
>>>>>
>>>>> Best
>>>>> Erick
>>>>>
>>>>> On Tue, Dec 13, 2011 at 3:52 PM, Mikhail Khludnev
>>>>>   wrote:
>>>>>>
>>>>>> You can try to reuse your connections (prevent them from closing) by
>>>>>> specifying -Dhttp.maxConnections=N in the JVM startup params (see
>>>>>> http://download.oracle.com/javase/1.4.2/docs/guide/net/properties.html).
>>>>>> At the client JVM! The number should be chosen considering the number
>>>>>> of connections you'd like to keep alive.
>>>>>>
>>>>>> Let me know if it works for you.
>>>>>>
>>>>>> On Tue, Dec 13, 2011 at 2:57 PM, samarth s wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I am using solr replication and am experiencing a lot of connections
>>>>>>> in the state CLOSE_WAIT at the master solr server. These disappear
>>>>>>> after a while, but till then the master solr stops responding.
>>>>>>>
>>>>>>> There are about 130 open connections on the master server with the
>>>>>>> client as the slave m/c and all are in the state CLOSE_WAIT. Also,
>>>>>>> the
>>>>>>> client port specified on the master solr server netstat results is
>>>>>>> not
>>>>>>> visible in the netstat results on the client (slave solr) m/c.
>>>>>>>
>>>>>>> Following is my environment:
>>>>>>> - 40 cores in the master solr on m/c 1
>>>>>>> - 40 cores in the slave solr on m/c 2
>>>>>>> - The replication poll interval is 20 seconds.
>>>>>>> - Replication part in solrconfig.xml in the slave solr:
>>>>>>> <requestHandler name="/replication" class="solr.ReplicationHandler">
>>>>>>>   <lst name="slave">
>>>>>>>     <str name="masterUrl">$mastercorename/replication</str>
>>>>>>>     <str name="pollInterval">00:00:20</str>
>>>>>>>     <!-- the two timeout parameter names below are reconstructed -->
>>>>>>>     <str name="httpConnTimeout">5000</str>
>>>>>>>     <str name="httpReadTimeout">1</str>
>>>>>>>   </lst>
>>>>>>> </requestHandler>
>>>>>>>
>>>>>>> Thanks for any pointers.
>>>>>>>
>>>>>>> --
>>>>>>> Regards,
>>>>>>> Samarth
>>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Sincerely yours
>>>>>> Mikhail Khludnev
>>>>>> Developer
>>>>>> Grid Dynamics
>>>>>> tel. 1-415-738-8644
>>>>>> Skype: mkhludnev
>>>>>> <http://www.griddynamics.com>
>>>>>>  
>>>>
>>>>
>>>>
>>>> --
>>>> Regards,
>>>> Samarth
>>
>>
>>
>



-- 
Regards,
Samarth


Request Timeout Parameter in update queries

2012-03-16 Thread samarth s
Hi,

Does an update query to Solr work well when sent with a timeout parameter?
https://issues.apache.org/jira/browse/SOLR-502
For example, suppose an update query was fired with a timeout of 30 seconds,
and the request got aborted halfway due to the timeout. Can this corrupt the
index in any way?
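
For illustration, the abort scenario can also come from a plain client-side
socket timeout on the SolrJ connection, set roughly like this (a sketch with
placeholder values; this aborts the client's HTTP request, not necessarily
the work already under way on the server):

    import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;

    CommonsHttpSolrServer server =
        new CommonsHttpSolrServer("http://localhost:8983/solr");  // placeholder
    server.setConnectionTimeout(5000);  // fail fast when connecting
    server.setSoTimeout(30000);         // give up on the response after 30s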

-- 
Regards,
Samarth


Re: Too many connections in CLOSE_WAIT state on master solr server

2011-12-14 Thread samarth s
Thanks Erick and Mikhail. I'll try this out.

On Wed, Dec 14, 2011 at 7:11 PM, Erick Erickson  wrote:
> I'm guessing (and it's just a guess) that what's happening is that
> the container is queueing up your requests while waiting
> for the other connections to close, so Mikhail's suggestion
> seems like a good idea.
>
> Best
> Erick
>
> On Wed, Dec 14, 2011 at 12:28 AM, samarth s
>  wrote:
>> The updates to the master are user driven, and are needed to be
>> visible quickly. Hence, the high frequency of replication. It may be
>> that too many replication requests are being handled at a time, but
>> why should that result in half closed connections?
>>
>> On Wed, Dec 14, 2011 at 2:47 AM, Erick Erickson  
>> wrote:
>>> Replicating 40 cores every 20 seconds is just *asking* for trouble.
>>> How often do your cores change on the master? How big are
>>> they? Is there any chance you just have too many cores replicating
>>> at once?
>>>
>>> Best
>>> Erick
>>>
>>> On Tue, Dec 13, 2011 at 3:52 PM, Mikhail Khludnev
>>>  wrote:
>>>> You can try to reuse your connections (prevent them from closing) by
>>>> specifying -Dhttp.maxConnections=N in the JVM startup params (see
>>>> http://download.oracle.com/javase/1.4.2/docs/guide/net/properties.html).
>>>> At the client JVM! The number should be chosen considering the number
>>>> of connections you'd like to keep alive.
>>>>
>>>> Let me know if it works for you.
>>>>
>>>> On Tue, Dec 13, 2011 at 2:57 PM, samarth s 
>>>> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> I am using solr replication and am experiencing a lot of connections
>>>>> in the state CLOSE_WAIT at the master solr server. These disappear
>>>>> after a while, but till then the master solr stops responding.
>>>>>
>>>>> There are about 130 open connections on the master server with the
>>>>> client as the slave m/c and all are in the state CLOSE_WAIT. Also, the
>>>>> client port specified on the master solr server netstat results is not
>>>>> visible in the netstat results on the client (slave solr) m/c.
>>>>>
>>>>> Following is my environment:
>>>>> - 40 cores in the master solr on m/c 1
>>>>> - 40 cores in the slave solr on m/c 2
>>>>> - The replication poll interval is 20 seconds.
>>>>> - Replication part in solrconfig.xml in the slave solr:
>>>>> <requestHandler name="/replication" class="solr.ReplicationHandler">
>>>>>   <lst name="slave">
>>>>>     <str name="masterUrl">$mastercorename/replication</str>
>>>>>     <str name="pollInterval">00:00:20</str>
>>>>>     <!-- the two timeout parameter names below are reconstructed -->
>>>>>     <str name="httpConnTimeout">5000</str>
>>>>>     <str name="httpReadTimeout">1</str>
>>>>>   </lst>
>>>>> </requestHandler>
>>>>>
>>>>> Thanks for any pointers.
>>>>>
>>>>> --
>>>>> Regards,
>>>>> Samarth
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Sincerely yours
>>>> Mikhail Khludnev
>>>> Developer
>>>> Grid Dynamics
>>>> tel. 1-415-738-8644
>>>> Skype: mkhludnev
>>>> <http://www.griddynamics.com>
>>>>  
>>
>>
>>
>> --
>> Regards,
>> Samarth



-- 
Regards,
Samarth


Re: Too many connections in CLOSE_WAIT state on master solr server

2011-12-13 Thread samarth s
The updates to the master are user-driven and need to be visible quickly;
hence the high frequency of replication. It may be that too many replication
requests are being handled at a time, but why should that result in
half-closed connections?

On Wed, Dec 14, 2011 at 2:47 AM, Erick Erickson  wrote:
> Replicating 40 cores every 20 seconds is just *asking* for trouble.
> How often do your cores change on the master? How big are
> they? Is there any chance you just have too many cores replicating
> at once?
>
> Best
> Erick
>
> On Tue, Dec 13, 2011 at 3:52 PM, Mikhail Khludnev
>  wrote:
>> You can try to reuse your connections (prevent them from closing) by
>> specifying -Dhttp.maxConnections=N in the JVM startup params (see
>> http://download.oracle.com/javase/1.4.2/docs/guide/net/properties.html).
>> At the client JVM! The number should be chosen considering the number
>> of connections you'd like to keep alive.
>>
>> Let me know if it works for you.
>>
>> On Tue, Dec 13, 2011 at 2:57 PM, samarth s 
>> wrote:
>>
>>> Hi,
>>>
>>> I am using solr replication and am experiencing a lot of connections
>>> in the state CLOSE_WAIT at the master solr server. These disappear
>>> after a while, but till then the master solr stops responding.
>>>
>>> There are about 130 open connections on the master server with the
>>> client as the slave m/c and all are in the state CLOSE_WAIT. Also, the
>>> client port specified on the master solr server netstat results is not
>>> visible in the netstat results on the client (slave solr) m/c.
>>>
>>> Following is my environment:
>>> - 40 cores in the master solr on m/c 1
>>> - 40 cores in the slave solr on m/c 2
>>> - The replication poll interval is 20 seconds.
>>> - Replication part in solrconfig.xml in the slave solr:
>>> <requestHandler name="/replication" class="solr.ReplicationHandler">
>>>   <lst name="slave">
>>>     <str name="masterUrl">$mastercorename/replication</str>
>>>     <str name="pollInterval">00:00:20</str>
>>>     <!-- the two timeout parameter names below are reconstructed -->
>>>     <str name="httpConnTimeout">5000</str>
>>>     <str name="httpReadTimeout">1</str>
>>>   </lst>
>>> </requestHandler>
>>>
>>> Thanks for any pointers.
>>>
>>> --
>>> Regards,
>>> Samarth
>>>
>>
>>
>>
>> --
>> Sincerely yours
>> Mikhail Khludnev
>> Developer
>> Grid Dynamics
>> tel. 1-415-738-8644
>> Skype: mkhludnev
>> <http://www.griddynamics.com>
>>  



-- 
Regards,
Samarth


Too many connections in CLOSE_WAIT state on master solr server

2011-12-13 Thread samarth s
Hi,

I am using Solr replication and am seeing a lot of connections in the
CLOSE_WAIT state on the master Solr server. These disappear after a while,
but until then the master Solr stops responding.

There are about 130 open connections on the master server with the slave m/c
as the client, all in the CLOSE_WAIT state. Also, the client port shown in
the netstat results on the master Solr server is not visible in the netstat
results on the client (slave Solr) m/c.

Following is my environment:
- 40 cores in the master solr on m/c 1
- 40 cores in the slave solr on m/c 2
- The replication poll interval is 20 seconds.
- Replication part in solrconfig.xml in the slave solr:

  <requestHandler name="/replication" class="solr.ReplicationHandler">
    <lst name="slave">
      <str name="masterUrl">$mastercorename/replication</str>
      <str name="pollInterval">00:00:20</str>
      <!-- the two timeout parameter names below are reconstructed -->
      <str name="httpConnTimeout">5000</str>
      <str name="httpReadTimeout">1</str>
    </lst>
  </requestHandler>

Thanks for any pointers.

--
Regards,
Samarth


Re: Solr Open File Descriptors

2011-10-22 Thread samarth s
Thanks for sharing your insights, Shawn.

On Mon, Oct 17, 2011 at 1:27 AM, Shawn Heisey  wrote:

> On 10/16/2011 12:01 PM, samarth s wrote:
>
>> Hi,
>>
>> Is it safe to assume that with a mergeFactor of 10, the open file
>> descriptors required by Solr would be around (1 + 10) * 10 = 110?
>> ref: http://onjava.com/pub/a/onjava/2003/03/05/lucene.html#indexing_speed
>> The Solr wiki:
>> http://wiki.apache.org/solr/SolrPerformanceFactors#Optimization_Considerations
>> states that the FDs required per segment are around 7.
>>
>> Are these estimates appropriate? Do they in any way depend on the size
>> of the index & number of docs (assuming the same number of segments in
>> either case) as well?
>> well?
>>
>
> My index has 10 files per normal segment (the usual 7 plus three more for
> termvectors).  Some of the segments also have a ".del" file, and there is a
> segments_* file and a segments.gen file.  Your servlet container and other
> parts of the OS will also have to open files.
>
> I have personally seen three levels of segment merging taking place at the
> same time on a slow filesystem during a full-import, along with new content
> coming in at the same time.  With a mergeFactor of 10, each merge is 11
> segments - the ten that are being merged and the merged segment.  If you
> have three going on at the same time, that's 33 segments, and you can have
> up to 10 more that are actively being built by ongoing index activity, so
> that's 43 potential segments.  If your filesystem is REALLY slow, you might
> end up with even more segments as existing merges are paused for new ones
> to start, but if you run into that, you'll want to update your hardware, so
> I won't consider it.
>
> Multiplying 43 segments by 11 files per segment yields a working
> theoretical maximum of 473 files.  Add in the segments files, you're up to
> 475.
>
> Most operating systems have a default FD limit that's at least 1024.  If
> you only have one index (core) on your Solr server, Solr is the only thing
> running on that server, and it's using the default mergeFactor of 10, you
> should be fine with the default.  If you are going to have more than one
> index on your Solr server (such as a build core and a live core), you plan
> to run other things on the server, or you want to increase your mergeFactor
> significantly, you might need to adjust the OS configuration to allow more
> file descriptors.
>
> Thanks,
> Shawn
>
>


-- 
Regards,
Samarth


Solr Open File Descriptors

2011-10-16 Thread samarth s
Hi,

Is it safe to assume that with a mergeFactor of 10, the open file
descriptors required by Solr would be around (1 + 10) * 10 = 110?
ref: http://onjava.com/pub/a/onjava/2003/03/05/lucene.html#indexing_speed
The Solr wiki:
http://wiki.apache.org/solr/SolrPerformanceFactors#Optimization_Considerations
states that the FDs required per segment are around 7.

Are these estimates appropriate? Do they in any way depend on the size of
the index & number of docs (assuming the same number of segments in either
case) as well?


-- 
Regards,
Samarth


Field Cache

2011-05-07 Thread samarth s
Hi,

I have read that the Lucene field cache is used for faceting and sorting. Is
it also populated/used when only selected fields are retrieved, via the 'fl'
or 'included fields in collapse' parameters? Is it also used for collapsing?

-- 
Regards,
Samarth


Re: exception obtaining write lock on startup

2011-01-17 Thread samarth s
In that case, why is there a separate SingleInstanceLockFactory at all?

On Fri, Dec 31, 2010 at 6:25 AM, Lance Norskog  wrote:
> This will not work. At all.
>
> You can only have one Solr core instance changing an index.
>
> On Thu, Dec 30, 2010 at 4:38 PM, Tri Nguyen  wrote:
>> Hi,
>>
>> I'm getting this exception when I have 2 cores as masters.  Seems like one 
>> of the cores obtains a lock (file) and then the other tries to obtain the 
>> same one.   However, the first one is not deleted.
>>
>> How do I fix this?
>>
>> Dec 30, 2010 4:34:48 PM org.apache.solr.handler.ReplicationHandler inform
>> WARNING: Unable to get IndexCommit on startup
>> org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out:
>> NativeFSLock@..\webapps\solr\tnsolr\data\index\lucene-fe3fc928a4bbfeb55082e49b32a70c10-write.lock
>>     at org.apache.lucene.store.Lock.obtain(Lock.java:85)
>>     at org.apache.lucene.index.IndexWriter.init(IndexWriter.java:1565)
>>     at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:1421)
>>     at org.apache.solr.update.SolrIndexWriter.<init>(SolrIndexWriter.java:191)
>>     at org.apache.solr.update.UpdateHandler.createMainIndexWriter(UpdateHandler.java:98)
>>     at org.apache.solr.update.DirectUpdateHandler2.openWriter(DirectUpdateHandler2.java:173)
>>     at org.apache.solr.update.DirectUpdateHandler2.forceOpenWriter(DirectUpdateHandler2.java:376)
>>     at org.apache.solr.handler.ReplicationHandler.inform(ReplicationHandler.
>>
>>
>>
>> Tri
>
>
>
> --
> Lance Norskog
> goks...@gmail.com
>


Collapsing with start, rows parameters

2010-12-29 Thread samarth s
Hi,

I am using collapsing with the start & rows parameters.
For start=0 & rows=10, my query looks like:
q=f1:v1+AND+f2:v2&date:[*+TO+*]&rows=10&start=0&fl=rootId&collapse.field=rootId&collapse.threshold=1&collapse.type=normal&collapse.includeCollapsedDocs.fl=id

The same query with start=10 gives me overlapping results, i.e. the last two
collapse groups of the first query appear as the first two groups in the
second query's results. As the value of the start parameter increases, the
number of overlapping groups changes somewhat arbitrarily; sometimes it is
3, 5, etc.

I am working with a patch for SOLR-236 dated 2010-06-17 03:08 PM. Is this a
known issue that has since been fixed?

Thanks for any pointers,
Samarth


Re: Total number of groups after collapsing

2010-12-23 Thread samarth s
Hi,

I figured out a better way of doing this. The following query is a better
option:
q=*:*&start=2147483647&rows=0&collapse=true&collapse.field=abc&collapse.threshold=1
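
The same trick through SolrJ looks like this (a sketch; the collapse.*
parameters come from the SOLR-236 patch, not stock Solr, and 'server' is a
configured SolrServer):

    import org.apache.solr.client.solrj.SolrQuery;

    SolrQuery q = new SolrQuery("*:*");
    q.setStart(Integer.MAX_VALUE);  // skip past every group
    q.setRows(0);                   // fetch no rows, just the count
    q.set("collapse", true);
    q.set("collapse.field", "abc");
    q.set("collapse.threshold", 1);
    long groups = server.query(q).getResults().getNumFound();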

Thanks,
Samarth

On Thu, Dec 23, 2010 at 8:57 PM, samarth s wrote:

> Hi,
>
> I have been using collapsing in my application. I have a requirement of
> finding the no of groups matching some filter criteria.
> Something like a COUNT(DISTINCT columnName). The only solution I can
> currently think of is using the query:
>
>
> q=*:*&rows=Integer.MAX_VALUE&start=0&fl=score&collapse.field=abc&collapse.threshold=1&collapse.type=normal
>
> I get the number of groups from 'numFound', but this seems like a bad
> solution in terms of performance. Is there a cleaner way?
>
> Thanks,
> Samarth
>


Total number of groups after collapsing

2010-12-23 Thread samarth s
Hi,

I have been using collapsing in my application. I have a requirement of
finding the number of groups matching some filter criteria, something like
a COUNT(DISTINCT columnName). The only solution I can currently think of is
the query:

q=*:*&rows=Integer.MAX_VALUE&start=0&fl=score&collapse.field=abc&collapse.threshold=1&collapse.type=normal

I get the number of groups from 'numFound', but this seems like a bad
solution in terms of performance. Is there a cleaner way?

Thanks,
Samarth


Glob in fl parameter

2010-12-22 Thread samarth s
Hi,

Is there any support for globs in the 'fl' param? This would be very useful
for retrieving dynamic fields. I have read the FieldAliasesAndGlobsInParams
wiki page. Is there any related patch?

Thanks for any pointers,
Samarth


Re: solr dynamic core creation

2010-11-21 Thread samarth s
Hi nizan,

I have the same requirement of creating cores on the fly. I was looking for
an API exposed over HTTP by the Solr server. Currently I am working around
it by running my own shell script on the server (the Solr server :) ). Any
better leads on the same?

Thanks,
Samarth

On Thu, Nov 11, 2010 at 9:27 PM, Robert Sandiford
 wrote:
>
> No - in reading what you just wrote, and what you originally wrote, I think
> the misunderstanding was mine, based on the architecture of my code.  In my
> code, it is our 'server' level that does the SolrJ indexing calls, but you
> meant 'server' to be the Solr instance, and what you mean by 'client' is
> what I was thinking of (without thinking) as the 'server'...
>
> Sorry about that.  Hopefully someone else can chime in on your specific
> issue...
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/solr-dynamic-core-creation-tp1867705p1883354.html
> Sent from the Solr - User mailing list archive at Nabble.com.


Dynamically create new core

2010-11-02 Thread samarth s
Hi,


I have a requirement of dynamically creating new (master) cores. Each core
should have a replicated slave core.
I am working in Java, using SolrJ as my Solr client. I came across the
CoreAdminRequest class, and it looks like the way to go.

CoreAdminRequest.createCore("NewCore1", "NewCore1", solrServer);
creates a new core programmatically.
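
Fleshed out slightly (a sketch; the URL is a placeholder, the server must
point at the Solr root rather than at a core, and the instance directory is
expected to already hold a conf/ with solrconfig.xml and schema.xml):

    import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
    import org.apache.solr.client.solrj.request.CoreAdminRequest;

    CommonsHttpSolrServer adminServer =
        new CommonsHttpSolrServer("http://localhost:8983/solr");  // placeholder
    CoreAdminRequest.createCore("NewCore1", "NewCore1", adminServer);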

Also, for the newly created core, I want to use an existing solrconfig.xml
and modify certain parameters. Can I achieve this using SolrJ?

Are there any better approaches for this requirement?

Thanks for any pointers,