[jira] Commented: (LUCENE-1337) [PATCH] improve searching under high concurrancy

2008-07-19 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12614971#action_12614971
 ] 

Michael McCandless commented on LUCENE-1337:


bq. Yonik checked in a modification of FSDirectory into LUCENE-753. I took that 
code and made NIOFSDirectory which is standalone so that it can be committed. 
It is checked into LUCENE-753 as lucene-753.patch. 

OK.  I think (?) it's a good idea to separately offer an FSDirectory 
implementation that uses positional reads (via FileChannel) to avoid 
synchronization.
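
To make "positional reads" concrete, here is a minimal sketch. The names
PositionalReader and readAt are made up for illustration and are not taken from
the LUCENE-753 patch; the only real API relied on is the JDK's
FileChannel.read(ByteBuffer, long), which reads at an explicit offset and does
not modify the channel's own position, so callers need no shared lock.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Hypothetical helper (not Lucene code): lock-free reads from one shared channel. */
final class PositionalReader implements AutoCloseable {
    private final FileChannel channel;

    PositionalReader(Path file) throws IOException {
        this.channel = FileChannel.open(file, StandardOpenOption.READ);
    }

    /** Reads exactly length bytes starting at position, without seeking. */
    byte[] readAt(long position, int length) throws IOException {
        ByteBuffer buf = ByteBuffer.allocate(length);
        while (buf.hasRemaining()) {
            // The offset is passed per call, so no shared file pointer is touched.
            int n = channel.read(buf, position + buf.position());
            if (n < 0) {
                throw new EOFException("read past EOF at offset " + position);
            }
        }
        return buf.array();
    }

    @Override
    public void close() throws IOException {
        channel.close();
    }

    private static void dump(PositionalReader reader, long position, int length) {
        try {
            byte[] bytes = reader.readAt(position, length);
            System.out.println(position + ": " + new String(bytes, StandardCharsets.US_ASCII));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws Exception {
        Path file = Files.createTempFile("positional", ".bin");
        Files.write(file, "0123456789".getBytes(StandardCharsets.US_ASCII));
        try (PositionalReader reader = new PositionalReader(file)) {
            // Two threads read different regions of the same channel concurrently.
            Thread a = new Thread(() -> dump(reader, 0, 4));
            Thread b = new Thread(() -> dump(reader, 6, 4));
            a.start();
            b.start();
            a.join();
            b.join();
        } finally {
            Files.delete(file);
        }
    }
}
```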

I'd also like to somehow make that implementation the default on those 
platforms (all except Windows?) where there are clear concurrency gains.  I.e., 
maybe change FSDirectory.getDirectory to return NIOFSDirectory if it's not on 
Windows, but also offer a getDirectory that takes the IMPL so you can force it 
to pick a different IMPL.  In general I think Lucene should default to good 
out-of-the-box performance, i.e., without requiring special knowledge/tuning on 
the user's part, so long as there's no difficult tradeoff.
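
A rough sketch of that selection logic, under stated assumptions: the class
DirectoryImplChooser, the Impl enum and both method names are hypothetical and
are not existing FSDirectory API. The sketch only shows the decision (platform
default plus an explicit override), not how a directory would be constructed.

```java
import java.util.Locale;

/** Hypothetical chooser (not Lucene API): platform default with an explicit override. */
final class DirectoryImplChooser {
    /** Stand-ins for the synchronized implementation and the positional-read one. */
    enum Impl { SIMPLE, NIO }

    /** Default for a given os.name value: positional reads everywhere except Windows. */
    static Impl defaultFor(String osName) {
        boolean windows = osName.toLowerCase(Locale.ROOT).startsWith("windows");
        return windows ? Impl.SIMPLE : Impl.NIO;
    }

    /** A non-null argument forces that implementation; null means "use the default". */
    static Impl choose(Impl forced) {
        return forced != null ? forced : defaultFor(System.getProperty("os.name", ""));
    }

    public static void main(String[] args) {
        System.out.println("default on this platform: " + choose(null));
        System.out.println("forced: " + choose(Impl.SIMPLE));
    }
}
```

The point of the two entry points is that the common call needs no tuning, while
a caller who knows better can still pin a specific implementation.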

We should probably also change the name to something less generic than "nio", 
though I can't think of an alternative offhand.

But one question: it looks like NIOFSIndexInput copies most of 
BufferedIndexInput source rather than subclassing -- why was that?  Can we 
change that back to a subclass, perhaps opening up members of 
BufferedIndexInput a bit if necessary?
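
To illustrate the subclassing shape I have in mind, here is a simplified
sketch. BufferedInput below is a stand-in written for this example, not
Lucene's BufferedIndexInput, and its readInternal signature (with an explicit
file offset) is an assumption chosen to suit positional reads. The base class
owns all buffering; the subclass supplies only the raw read.

```java
import java.io.EOFException;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class BufferedInputDemo {
    /** Simplified stand-in for a buffered input base class (not Lucene's). */
    abstract static class BufferedInput {
        private final byte[] buffer;
        private long bufferStart = 0;   // file offset of buffer[0]
        private int bufferLength = 0;   // number of valid bytes in buffer
        private int bufferPosition = 0; // index of the next byte to return

        BufferedInput(int bufferSize) {
            this.buffer = new byte[bufferSize];
        }

        final byte readByte() throws IOException {
            if (bufferPosition >= bufferLength) {
                refill();
            }
            return buffer[bufferPosition++];
        }

        private void refill() throws IOException {
            long start = bufferStart + bufferPosition;
            long remaining = length() - start;
            if (remaining <= 0) {
                throw new EOFException("read past EOF");
            }
            int toRead = (int) Math.min(buffer.length, remaining);
            readInternal(buffer, 0, toRead, start);
            bufferStart = start;
            bufferLength = toRead;
            bufferPosition = 0;
        }

        /** The only I/O hook a subclass must provide. */
        protected abstract void readInternal(byte[] dst, int offset, int len, long filePos)
                throws IOException;

        abstract long length() throws IOException;
    }

    /** Subclass that fills the buffer with positional FileChannel reads. */
    static final class ChannelInput extends BufferedInput implements AutoCloseable {
        private final FileChannel channel;

        ChannelInput(Path file, int bufferSize) throws IOException {
            super(bufferSize);
            this.channel = FileChannel.open(file, StandardOpenOption.READ);
        }

        @Override
        protected void readInternal(byte[] dst, int offset, int len, long filePos)
                throws IOException {
            ByteBuffer bb = ByteBuffer.wrap(dst, offset, len);
            while (bb.hasRemaining()) {
                // bb.position() - offset is the number of bytes read so far.
                int n = channel.read(bb, filePos + (bb.position() - offset));
                if (n < 0) {
                    throw new EOFException("read past EOF");
                }
            }
        }

        @Override
        long length() throws IOException {
            return channel.size();
        }

        @Override
        public void close() throws IOException {
            channel.close();
        }
    }

    public static void main(String[] args) throws Exception {
        Path file = Files.createTempFile("buffered", ".bin");
        Files.write(file, "abcdefghij".getBytes(StandardCharsets.US_ASCII));
        StringBuilder out = new StringBuilder();
        try (ChannelInput in = new ChannelInput(file, 4)) {
            for (int i = 0; i < 10; i++) {
                out.append((char) in.readByte());
            }
        } finally {
            Files.delete(file);
        }
        System.out.println(out);
    }
}
```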

> [PATCH] improve searching under high concurrancy
> 
>
> Key: LUCENE-1337
> URL: https://issues.apache.org/jira/browse/LUCENE-1337
> Project: Lucene - Java
> Issue Type: Improvement
> Components: Search
> Affects Versions: 2.3.1
> Environment: Linux
> Reporter: Brian Gardner
> Priority: Minor
> Attachments: lucene.patch
>
>
> I was trying to load test my web server and kept running into a condition 
> where the web server would become unresponsive even though the load was below 
> one.  Turns out Lucene has synchronization blocks around reading the index.  
> It appears this was only necessary to synchronize access to a descriptor 
> which contains a RandomAccessFile and information about the state of this 
> file.  My solution was to use a pool of descriptors so that they could be 
> reused on subsequent reads.  During periods of low contention only one or a 
> few Descriptors will be created, but under heavy loads many Descriptors can 
> be created to avoid synchronization.  After creating and applying my patch, I 
> was able to triple my searching throughput and fully utilize the resources, 
> with the CPUs becoming the new bottleneck.  My patch modifies FSDirectory 
> directly, but I'm not entirely sure that's the proper implementation.  I'd 
> like to help resolve this synchronization issue for other Lucene users, so 
> please let me know how I can help.
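
The pooling idea described in the issue could look roughly like the sketch
below. This is not the code in lucene.patch: DescriptorPool and its methods are
hypothetical names, and the pool is deliberately simplified to a single file.
A reader borrows an idle RandomAccessFile if one exists, opens a new one
otherwise, and returns it after the read, so no two threads ever share a file
pointer.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical pool of read-only descriptors for one file (not the patch itself). */
final class DescriptorPool implements AutoCloseable {
    private final File file;
    private final ConcurrentLinkedQueue<RandomAccessFile> idle = new ConcurrentLinkedQueue<>();
    private final AtomicInteger opened = new AtomicInteger();

    DescriptorPool(File file) {
        this.file = file;
    }

    /** Borrows a descriptor for one read, opening a new one only if none is idle. */
    byte[] read(long position, int length) throws IOException {
        RandomAccessFile raf = idle.poll();
        if (raf == null) {
            raf = new RandomAccessFile(file, "r");
            opened.incrementAndGet();
        }
        try {
            byte[] out = new byte[length];
            raf.seek(position); // safe: this thread owns the descriptor for now
            raf.readFully(out);
            return out;
        } finally {
            idle.offer(raf); // hand the descriptor back for reuse
        }
    }

    /** Number of descriptors opened so far. */
    int openedCount() {
        return opened.get();
    }

    @Override
    public void close() throws IOException {
        RandomAccessFile raf;
        while ((raf = idle.poll()) != null) {
            raf.close();
        }
    }

    public static void main(String[] args) throws Exception {
        File file = File.createTempFile("pool", ".bin");
        Files.write(file.toPath(), "0123456789".getBytes(StandardCharsets.US_ASCII));
        try (DescriptorPool pool = new DescriptorPool(file)) {
            System.out.println(new String(pool.read(2, 3), StandardCharsets.US_ASCII));
            System.out.println(new String(pool.read(7, 3), StandardCharsets.US_ASCII));
            System.out.println("descriptors opened: " + pool.openedCount());
        } finally {
            Files.delete(file.toPath());
        }
    }
}
```

Note that this sketch puts no upper bound on the pool, so under heavy
concurrency it can hold one open file handle per simultaneous reader.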

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



[jira] Commented: (LUCENE-1337) [PATCH] improve searching under high concurrancy

2008-07-17 Thread Jason Rutherglen (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12614519#action_12614519
 ] 

Jason Rutherglen commented on LUCENE-1337:
--

Yonik checked in a modification of FSDirectory into LUCENE-753.  I took that 
code and made NIOFSDirectory which is standalone so that it can be committed.  
It is checked into LUCENE-753 as lucene-753.patch.  




[jira] Commented: (LUCENE-1337) [PATCH] improve searching under high concurrancy

2008-07-17 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12614483#action_12614483
 ] 

Michael McCandless commented on LUCENE-1337:


Jason, are you thinking of LUCENE-414 (NIOFSDirectory)?




[jira] Commented: (LUCENE-1337) [PATCH] improve searching under high concurrancy

2008-07-17 Thread Jason Rutherglen (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12614475#action_12614475
 ] 

Jason Rutherglen commented on LUCENE-1337:
--

The problem is the same but the solution is not.  Do they each need separate 
patches listing more specifically how they solved the problem?  Each solution 
has pluses and minuses.  The NIOFSDirectory doesn't work on Windows.  
DescriptorsFSDirectory will, on many Lucene installations, quickly max out the 
file descriptors.  

I would like to see both committed to trunk.  MMapDirectory is in the trunk and 
it has limitations as well, mainly that (at least as I understand it) it loads 
all the files into RAM.  




[jira] Commented: (LUCENE-1337) [PATCH] improve searching under high concurrancy

2008-07-16 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1337?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12614031#action_12614031
 ] 

Yonik Seeley commented on LUCENE-1337:
--

Thanks Brian, also see LUCENE-753 for more history and a bunch of options.
