Re: svn commit: r899979 - /lucene/solr/trunk/example/solr/conf/solrconfig.xml

2010-02-02 Thread Chris Hostetter

:  So what/how should we document all of this?
...
:  I've got more info on this.

Mark: most of what you wrote is above my head, but since you fixed a
grammar error in my updated example solrconfig.xml comment w/o making any
content changes, I'm assuming you feel what I put there is sufficient.

Most of your comments feel like they should be raised over in Lucene-Java
land, at a minimum in documentation (added to the AvailableLockFactories
page perhaps) or possibly in some code changes (should we change the
default LockFactory depending on Java version?)

I'll leave that up to you, since (as I mentioned) I didn't understand half
of it.

:  Checking for OverlappingFileLockException *should* actually work when
:  using Java 1.6. Java 1.6 started using a *system wide* thread safe check
:  for this.
: 
:  Prior to Java 1.6, checks for this *were* limited to an instance of
:  FileChannel - the FileChannel maintained its own personal lock list. So
:  you have to use the same Channel to even have any hope of seeing an
:  OverlappingFileLockException. Even then, though, it's not properly thread
:  safe. They did not sync across checking if the lock exists and acquiring
:  the lock - they separately sync each action - leaving room to acquire the
:  lock twice from two different threads, like I was seeing.
: 
:  Interestingly, Java 1.6 has a back compat mode you can turn on that
:  doesn't use the system wide lock list, and they have fixed this thread
:  safety issue in that impl - there is a sync across checking
:  and getting the lock so that it is properly thread safe - but not in
:  Java 1.4, 1.5.
: 
:  Looking at GCC - uh ... I don't think you want to use GCC - they don't
:  appear to use a lock list and check for this at all :)
: 
:  But the point is, this is fixable on Java 6 if we check for
:  OverlappingFileLockException - it *should* work across webapps, and it
:  is actually thread safe, unlike Java 1.4,1.5.
: 
:
: Another interesting fact:
: 
: On Windows, if you attempt to lock the same file with different channel
: instances pre Java 1.6 - the code will deadlock.
: 
: -- 
: - Mark
: 
: http://www.lucidimagination.com
: 
: 
: 



-Hoss



Re: svn commit: r899979 - /lucene/solr/trunk/example/solr/conf/solrconfig.xml

2010-01-28 Thread Mark Miller
Chris Hostetter wrote:
 :  At a minimum, shouldn't NativeFSLock.obtain() be checking for
 :  OverlappingFileLockException and treating that as a failure to acquire
 :  the lock?
   ...
 : Perhaps - that should make it work in more cases - but in my simple
 : testing it's not 100% reliable.
   ...
 : File locks are held on behalf of the entire Java virtual machine.
 :  * They are not suitable for controlling access to a file by multiple
 :  * threads within the same virtual machine.

 ...Grrr  so where does that leave us?

 Yonik's added comment was that native isn't recommended when running
 multiple webapps in the same container.  In truth, native *can*
 work when running multiple webapps in the same container, just as long as
 those webapps don't reference the same data dirs.

 I'm worried that we should recommend people avoid native altogether
 because even if you are only running one webapp, it seems like a reload
 of that app could trigger some similar bad behavior.

 So what/how should we document all of this?

 -Hoss

   
I've got more info on this.

Checking for OverlappingFileLockException *should* actually work when
using Java 1.6. Java 1.6 started using a *system wide* thread safe check
for this.

Prior to Java 1.6, checks for this *were* limited to an instance of
FileChannel - the FileChannel maintained its own personal lock list. So
you have to use the same Channel to even have any hope of seeing an
OverlappingFileLockException. Even then, though, it's not properly thread
safe. They did not sync across checking if the lock exists and acquiring
the lock - they separately sync each action - leaving room to acquire the
lock twice from two different threads, like I was seeing.

Interestingly, Java 1.6 has a back compat mode you can turn on that
doesn't use the system wide lock list, and they have fixed this thread
safety issue in that impl - there is a sync across checking
and getting the lock so that it is properly thread safe - but not in
Java 1.4, 1.5.

Looking at GCC - uh ... I don't think you want to use GCC - they don't
appear to use a lock list and check for this at all :)

But the point is, this is fixable on Java 6 if we check for
OverlappingFileLockException - it *should* work across webapps, and it
is actually thread safe, unlike Java 1.4,1.5.

-- 
- Mark

http://www.lucidimagination.com





Re: svn commit: r899979 - /lucene/solr/trunk/example/solr/conf/solrconfig.xml

2010-01-28 Thread Mark Miller
Another interesting fact:

On Windows, if you attempt to lock the same file with different channel
instances pre Java 1.6 - the code will deadlock.

-- 
- Mark

http://www.lucidimagination.com





Re: svn commit: r899979 - /lucene/solr/trunk/example/solr/conf/solrconfig.xml

2010-01-20 Thread Sanne Grinovero
Thanks for the heads-up, this is good to know.
I've updated http://wiki.apache.org/lucene-java/AvailableLockFactories
which I recently created as a guide to help in choosing between
different LockFactories.

I believe the Native LockFactory is very useful. I wouldn't consider
this a bug nor consider discouraging its use; people just need to be
informed of the behavior and know that no LockFactory impl is good for
all cases.

Adding some lines to its javadoc seems appropriate.

Regards,
Sanne

2010/1/20 Chris Hostetter hossman_luc...@fucit.org:

 :  At a minimum, shouldn't NativeFSLock.obtain() be checking for
 :  OverlappingFileLockException and treating that as a failure to acquire the
 :  lock?
        ...
 : Perhaps - that should make it work in more cases - but in my simple
 : testing it's not 100% reliable.
        ...
 : File locks are held on behalf of the entire Java virtual machine.
 :      * They are not suitable for controlling access to a file by multiple
 :      * threads within the same virtual machine.

 ...Grrr  so where does that leave us?

 Yonik's added comment was that native isn't recommended when running
 multiple webapps in the same container.  In truth, native *can*
 work when running multiple webapps in the same container, just as long as
 those webapps don't reference the same data dirs.

 I'm worried that we should recommend people avoid native altogether
 because even if you are only running one webapp, it seems like a reload
 of that app could trigger some similar bad behavior.

 So what/how should we document all of this?

 -Hoss




Re: svn commit: r899979 - /lucene/solr/trunk/example/solr/conf/solrconfig.xml

2010-01-19 Thread Chris Hostetter

: again. I don't think it matters if it's the same FileChannel or not - you
: just can't use Native Locks within the same JVM, as the lock is held by
: the JVM - they are per process - so Lucene does its own little static
: map stuff to lock within JVM (simple in memory lock tracking) and uses
: the actual Native Lock for multiple JVMs (which is all it's good for -
: process granularity). But obviously, the in memory locking doesn't work
: across webapps.

Assuming I'm understanding all of this correctly, that implies a bug in 
Lucene's NativeFSLockFactory when used in a multiple classloader type 
situation -- including any app running in a servlet container.

At a minimum, shouldn't NativeFSLock.obtain() be checking for
OverlappingFileLockException and treating that as a failure to acquire the
lock?
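
A minimal sketch of that idea, assuming Java 6 or later with its default system-wide overlap check. The names `OverlapAwareLock`, `tryObtain` and `demo` are hypothetical and exist only for illustration; this is not Lucene's NativeFSLock code. Only `FileChannel.tryLock()` and `OverlappingFileLockException` are real JDK API here.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical helper for illustration; not Lucene's NativeFSLock.
final class OverlapAwareLock {

    // Returns the lock, or null if it could not be obtained, whether the
    // holder is another process or another channel inside this JVM.
    static FileLock tryObtain(FileChannel channel) throws IOException {
        try {
            return channel.tryLock(); // null when another process holds the lock
        } catch (OverlappingFileLockException e) {
            // This JVM already holds an overlapping lock on the file:
            // report a plain failure instead of letting the exception escape.
            return null;
        }
    }

    // Opens two channels on one temp file and tries to lock through each.
    // Returns { firstObtained, secondObtained }.
    static boolean[] demo() {
        try {
            Path file = Files.createTempFile("write", ".lock");
            try (FileChannel a = FileChannel.open(file, StandardOpenOption.WRITE);
                 FileChannel b = FileChannel.open(file, StandardOpenOption.WRITE)) {
                FileLock first = tryObtain(a);
                FileLock second = tryObtain(b);
                return new boolean[] { first != null, second != null };
            } finally {
                // Channels are already closed here, which released any lock.
                Files.deleteIfExists(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] r = demo();
        System.out.println("first=" + r[0] + " second=" + r[1]);
    }
}
```

Under those assumptions the second attempt should be reported as an ordinary failure to acquire the lock rather than as an exception. Whether the exception is raised reliably on older JVMs is a separate question.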



-Hoss



Re: svn commit: r899979 - /lucene/solr/trunk/example/solr/conf/solrconfig.xml

2010-01-19 Thread Mark Miller
Chris Hostetter wrote:
 : again. I don't think it matters if it's the same FileChannel or not - you
 : just can't use Native Locks within the same JVM, as the lock is held by
 : the JVM - they are per process - so Lucene does its own little static
 : map stuff to lock within JVM (simple in memory lock tracking) and uses
 : the actual Native Lock for multiple JVMs (which is all it's good for -
 : process granularity). But obviously, the in memory locking doesn't work
 : across webapps.

 Assuming I'm understanding all of this correctly, that implies a bug in
 Lucene's NativeFSLockFactory when used in a multiple classloader type
 situation -- including any app running in a servlet container.

 At a minimum, shouldn't NativeFSLock.obtain() be checking for
 OverlappingFileLockException and treating that as a failure to acquire the
 lock?



 -Hoss

   
Perhaps - that should make it work in more cases - but in my simple
testing it's not 100% reliable.

If I start up two threads and try to get a lock (with the same
channel, with different channels) with first one thread and then the
other - sometimes it throws OverlappingFileLockException
... and sometimes it doesn't. From what I can tell, you certainly can't
count on it.

If you pause between attempts, it does appear to always work - so it
certainly would give us a lot of ground it would seem - but if the
attempts are back to back, both threads can still successfully get the lock.
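
A rough harness for this kind of back-to-back test could look like the following. It is a sketch under stated assumptions, not the code used for the testing described above: `BackToBackLockTest`, `race` and `attempt` are hypothetical names, and each thread uses its own channel on a shared temp file.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CountDownLatch;

// Hypothetical test harness for illustration only.
final class BackToBackLockTest {

    // Lets two threads call tryLock() at (nearly) the same moment on
    // different channels and returns how many of them obtained the lock.
    static int race() {
        try {
            Path file = Files.createTempFile("race", ".lock");
            try (FileChannel a = FileChannel.open(file, StandardOpenOption.WRITE);
                 FileChannel b = FileChannel.open(file, StandardOpenOption.WRITE)) {
                // Keep strong references so obtained locks stay held until the end.
                Queue<FileLock> held = new ConcurrentLinkedQueue<>();
                CountDownLatch start = new CountDownLatch(1);
                Thread t1 = new Thread(() -> attempt(a, start, held));
                Thread t2 = new Thread(() -> attempt(b, start, held));
                t1.start();
                t2.start();
                start.countDown(); // release both threads back to back
                t1.join();
                t2.join();
                return held.size();
            } finally {
                Files.deleteIfExists(file); // channels are closed by now
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    private static void attempt(FileChannel ch, CountDownLatch start, Queue<FileLock> held) {
        try {
            start.await();
            FileLock lock = ch.tryLock();
            if (lock != null) {
                held.add(lock); // this thread obtained the lock
            }
        } catch (OverlappingFileLockException e) {
            // The JVM noticed the overlap: this thread did not obtain the lock.
        } catch (IOException | InterruptedException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("threads holding the lock: " + race());
    }
}
```

A result of 2 would reproduce the double acquisition described here. A runtime that checks for overlaps JVM-wide and in a thread safe way would be expected to report 1.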

This behavior could be OS dependent as it's using OS level locks.

FileChannel does appear to say that this should work (though it's
obviously not completely thread safe from what I can tell), but it also
says:

File locks are held on behalf of the entire Java virtual machine.
 * They are not suitable for controlling access to a file by multiple
 * threads within the same virtual machine.

Which seems to be the case.

-- 
- Mark

http://www.lucidimagination.com





Re: svn commit: r899979 - /lucene/solr/trunk/example/solr/conf/solrconfig.xml

2010-01-19 Thread Chris Hostetter

:  At a minimum, shouldn't NativeFSLock.obtain() be checking for
:  OverlappingFileLockException and treating that as a failure to acquire the
:  lock?
...
: Perhaps - that should make it work in more cases - but in my simple
: testing it's not 100% reliable.
...
: File locks are held on behalf of the entire Java virtual machine.
:  * They are not suitable for controlling access to a file by multiple
:  * threads within the same virtual machine.

...Grrr  so where does that leave us?

Yonik's added comment was that native isn't recommended when running
multiple webapps in the same container.  In truth, native *can*
work when running multiple webapps in the same container, just as long as
those webapps don't reference the same data dirs.

I'm worried that we should recommend people avoid native altogether
because even if you are only running one webapp, it seems like a reload
of that app could trigger some similar bad behavior.

So what/how should we document all of this?

-Hoss



Re: svn commit: r899979 - /lucene/solr/trunk/example/solr/conf/solrconfig.xml

2010-01-18 Thread Yonik Seeley
On Mon, Jan 18, 2010 at 1:17 AM, Chris Hostetter
hossman_luc...@fucit.org wrote:
 : Right... for stock Solr usage (i.e. as long as they don't try to lock
 : the same thing.)
 : It is funny that native locks always work across different processes,
 : but not always in the same JVM though.

 Actually, the more I think about this the less I understand it ... why
 don't native locks work within the same VM? ... and by work I mean why
 didn't he just get a lock timeout error?

Within the same VM, you need the same FileChannel for some reason.
Lucene uses a static hashmap so that multiple NativeFSLockFactory
instances will end up using the same FileChannel for locking.  But
multiple webapps obviously breaks that.

-Yonik
http://www.lucidimagination.com


Re: svn commit: r899979 - /lucene/solr/trunk/example/solr/conf/solrconfig.xml

2010-01-18 Thread Mark Miller
Yonik Seeley wrote:
 On Mon, Jan 18, 2010 at 1:17 AM, Chris Hostetter
 hossman_luc...@fucit.org wrote:
   
 : Right... for stock Solr usage (i.e. as long as they don't try to lock
 : the same thing.)
 : It is funny that native locks always work across different processes,
 : but not always in the same JVM though.

 Actually, the more I think about this the less I understand it ... why
 don't native locks work within the same VM? ... and by work I mean why
 didn't he just get a lock timeout error?
 

 Within the same VM, you need the same FileChannel for some reason.
 Lucene uses a static hashmap so that multiple NativeFSLockFactory
 instances will end up using the same FileChannel for locking.  But
 multiple webapps obviously breaks that.

 -Yonik
 http://www.lucidimagination.com
   
Native Locks are obtained at the JVM level - so if you try and lock the
same Channel twice, since the same JVM already has the lock, it's granted
again. I don't think it matters if it's the same FileChannel or not - you
just can't use Native Locks within the same JVM, as the lock is held by
the JVM - they are per process - so Lucene does its own little static
map stuff to lock within JVM (simple in memory lock tracking) and uses
the actual Native Lock for multiple JVMs (which is all it's good for -
process granularity). But obviously, the in memory locking doesn't work
across webapps.
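
The two-level scheme described here can be sketched roughly as follows. This is an illustration under assumptions, not Lucene's implementation: `TwoLevelLock`, `HELD`, `obtain`, `release` and `demo` are hypothetical names. The static set covers callers inside one JVM and the native lock covers other processes.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;
import java.nio.channels.OverlappingFileLockException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.HashSet;
import java.util.Set;

// Hypothetical two-level lock for illustration; not Lucene's code.
final class TwoLevelLock {
    // In-memory tracking of locked paths. There is one set per loaded copy
    // of this class, so a second classloader (another webapp) gets its own.
    private static final Set<String> HELD = new HashSet<>();

    private final Path path;
    private FileChannel channel;
    private FileLock lock;

    TwoLevelLock(Path path) {
        this.path = path.toAbsolutePath().normalize();
    }

    synchronized boolean obtain() throws IOException {
        if (lock != null) return false;                // already held by this instance
        synchronized (HELD) {
            if (!HELD.add(path.toString())) return false; // held elsewhere in this JVM
        }
        boolean ok = false;
        try {
            channel = FileChannel.open(path, StandardOpenOption.CREATE, StandardOpenOption.WRITE);
            try {
                lock = channel.tryLock();              // null if another process holds it
            } catch (OverlappingFileLockException e) {
                lock = null;                           // held by this JVM via another channel
            }
            ok = lock != null;
            return ok;
        } finally {
            if (!ok) cleanup();
        }
    }

    synchronized void release() throws IOException {
        if (lock != null) cleanup();
    }

    private void cleanup() throws IOException {
        try {
            if (channel != null) channel.close();      // closing the channel releases the lock
        } finally {
            channel = null;
            lock = null;
            synchronized (HELD) { HELD.remove(path.toString()); }
        }
    }

    // Returns { a obtains, b obtains while a holds, b obtains after a releases }.
    static boolean[] demo() {
        try {
            Path file = Files.createTempFile("two-level", ".lock");
            TwoLevelLock a = new TwoLevelLock(file);
            TwoLevelLock b = new TwoLevelLock(file);
            boolean first = a.obtain();
            boolean second = b.obtain();
            a.release();
            boolean third = b.obtain();
            b.release();
            Files.deleteIfExists(file);
            return new boolean[] { first, second, third };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Because `HELD` is a static field, each classloader that loads the class sees a separate set, which is why this kind of in-memory tracking cannot coordinate two webapps in one container.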

-- 
- Mark

http://www.lucidimagination.com





Re: svn commit: r899979 - /lucene/solr/trunk/example/solr/conf/solrconfig.xml

2010-01-18 Thread Mark Miller
Also, the javadocs in Lucene are wrong:

  /*
   * The javadocs for FileChannel state that you should have
   * a single instance of a FileChannel (per JVM) for all
   * locking against a given file.  To ensure this, we have
   * a single (static) HashSet that contains the file paths
   * of all currently locked locks.  This protects against
   * possible cases where different Directory instances in
   * one JVM (each with their own NativeFSLockFactory
   * instance) have set the same lock dir and lock prefix.
   */

The javadocs for FileChannel don't say this at all - and this implies
that Lucene is doing something that it is not. The javadocs say don't
expect native locks to work for locking within a JVM, because it
doesn't. And Lucene doesn't try and use the same FileChannel per JVM (it
wouldn't help anyway) - Lucene simply attempts to track per JVM locks in
a static map (which doesn't work per JVM when you are dealing with
different classloaders).

-- 
- Mark

http://www.lucidimagination.com





Re: svn commit: r899979 - /lucene/solr/trunk/example/solr/conf/solrconfig.xml

2010-01-18 Thread Yonik Seeley
Ah thanks - I was going by that comment :-)






Re: svn commit: r899979 - /lucene/solr/trunk/example/solr/conf/solrconfig.xml

2010-01-17 Thread Chris Hostetter

: Right... for stock Solr usage (i.e. as long as they don't try to lock
: the same thing.)
: It is funny that native locks always work across different processes,
: but not always in the same JVM though.

Actually, the more I think about this the less I understand it ... why
don't native locks work within the same VM? ... and by work I mean why
didn't he just get a lock timeout error?

If the behavior of Native Locks is really that you don't get the same
behavior if both clients are in the same JVM, then shouldn't the Lucene
NativeLockFactory be doing something like wrapping a
SingleInstanceLockFactory around the NativeFSLockFactory?

: #2) native lock factory fails if it's two different Solr webapps in
: the same JVM trying to lock the same thing.
...
: Should we clarify "Do not use with multiple solr webapps in the same
: JVM" or just remove it?

I'm starting to think we should remove support for native locks altogether --
if it can fail in the situation of multiple wars in the same JVM trying to
use the same solr home, that implies that it can also fail if something
goes wrong during a hot deploy of the solr.war ... if the shutdown of
the older instance of solr.war fails for some reason, then there could be a
stale lock, created in the same JVM, left over when the newer instance is
brought online.

correct?


-Hoss



Re: svn commit: r899979 - /lucene/solr/trunk/example/solr/conf/solrconfig.xml

2010-01-16 Thread Chris Hostetter

: doc: note about native locks not working for multiple webapps in same JVM

Is this in response to the OverlappingFileLockException thread started by
Joe Kessel? ...

: +  native = NativeFSLockFactory  - uses OS native file locking.
: +   Do not use with multiple solr webapps in the same JVM.

I think there's a misunderstanding about the root cause of the problem.
There shouldn't be any inherent problem with using Native locks
and multiple webapps -- I believe the underlying source of the exception
was that he was using multiple webapps w/o realizing it -- so presumably
both webapps were trying to use the same solr home dir.


-Hoss



Re: svn commit: r899979 - /lucene/solr/trunk/example/solr/conf/solrconfig.xml

2010-01-16 Thread Yonik Seeley
On Sat, Jan 16, 2010 at 3:40 PM, Chris Hostetter
hossman_luc...@fucit.org wrote:

 : doc: note about native locks not working for multiple webapps in same JVM

 Is this in response to the OverlappingFileLockException thread started by
 Joe Kessel? ...

 : +      native = NativeFSLockFactory  - uses OS native file locking.
 : +               Do not use with multiple solr webapps in the same JVM.

 I think there's a misunderstanding about the root cause of the problem.
 There shouldn't be any inherent problem with using Native locks
 and multiple webapps

Right... for stock Solr usage (i.e. as long as they don't try to lock
the same thing.)
It is funny that native locks always work across different processes,
but not always in the same JVM though.

 -- I believe the underlying source of the exception
 was that he was using multiple webapps w/o realizing it -- so presumably
 both webapps were trying to use the same solr home dir.

Right... it's really two issues:
#1) two separate solr instances trying to use the same solr index
#2) native lock factory fails if it's two different Solr webapps in
the same JVM trying to lock the same thing.

I do recall expert level stuff like people having multiple solr
instances pointing to the same data directory in the past though, but
not sure if it was from the same JVM or not.

Should we clarify "Do not use with multiple solr webapps in the same
JVM" or just remove it?

-Yonik
http://www.lucidimagination.com