[jira] Commented: (LUCENE-2815) MultiFields not thread safe

2010-12-17 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12972505#action_12972505
 ] 

Yonik Seeley commented on LUCENE-2815:
--

Hmmm, this patch causes test failures because ConcurrentHashMap doesn't handle 
null values.


 MultiFields not thread safe
 ---

 Key: LUCENE-2815
 URL: https://issues.apache.org/jira/browse/LUCENE-2815
 Project: Lucene - Java
  Issue Type: Bug
Affects Versions: 4.0
Reporter: Yonik Seeley
 Fix For: 4.0

 Attachments: LUCENE-2815.patch


 MultiFields looks like it has thread safety issues

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Commented: (LUCENE-2815) MultiFields not thread safe

2010-12-16 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12972057#action_12972057
 ] 

Michael McCandless commented on LUCENE-2815:


Ugh, nice finds Yonik!  We should fix these.

Maybe MultiFields should just pre-build its MapString,Term on init?

You're right, we do reuse MultiFields today (we stuff the instance of 
MultiFields onto the IndexReader with IndexReader.store/retrieveFields), but I 
wonder whether we really should?  (In fact I thought at one point we decided to 
stop doing that... yet, we still are... can't remember the details; maybe perf 
hit was too high eg for MTQs/Solr facets/etc.).

What do we need to do to make the publication safe?  Is making 
IR.store/retrieveFields sync'd sufficient?

Aside: Java concurrency is a *mess*.  I understand why JMM is needed, to get 
good perf on modern CPUs, but allowing the low level CPU cache coherency 
requirements to bubble all the way up to complex requirements in the language 
itself, is a disaster.

 MultiFields not thread safe
 ---

 Key: LUCENE-2815
 URL: https://issues.apache.org/jira/browse/LUCENE-2815
 Project: Lucene - Java
  Issue Type: Bug
Affects Versions: 4.0
Reporter: Yonik Seeley

 MultiFields looks like it has thread safety issues

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Commented: (LUCENE-2815) MultiFields not thread safe

2010-12-16 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12972096#action_12972096
 ] 

Yonik Seeley commented on LUCENE-2815:
--

bq. but I wonder whether we really should? (In fact I thought at one point we 
decided to stop doing that... yet, we still are... can't remember the details; 
maybe perf hit was too high eg for MTQs/Solr facets/etc.).

It wouldn't be solr facets... that code asks for fields() once up front (per 
facet request) and the rest of the work will dwarf that.
I think there probably are a lot of random places that use it where the 
overhead could be significant.  For example IndexReader.deleteDocuments(), 
ParallelReader, FuzzyLikeThisQuery, and anyone else that uses any of the static 
methods on Field on a non-segment reader.

bq. What do we need to do to make the publication safe? Is making 
IR.store/retrieveFields sync'd sufficient?

More than sufficient.  A volatile would also work fine provided that a race 
shouldn't matter (i.e. more than one MultiFields object could be constructed).

bq. Maybe MultiFields should just pre-build its MapString,Term on init?

Ouch... those folks with 1000s of fields wouldn't be happy about that.

 MultiFields not thread safe
 ---

 Key: LUCENE-2815
 URL: https://issues.apache.org/jira/browse/LUCENE-2815
 Project: Lucene - Java
  Issue Type: Bug
Affects Versions: 4.0
Reporter: Yonik Seeley

 MultiFields looks like it has thread safety issues

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Commented: (LUCENE-2815) MultiFields not thread safe

2010-12-16 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12972258#action_12972258
 ] 

Yonik Seeley commented on LUCENE-2815:
--

bq. It looks like MultiReaderBits also has issues with safe object publication. 

Actually, it looks like this one is OK with most of our current code.
SegmentReader.getDeletedDocs() returns an object stored in a volatile, so that 
counts as a safe publish.  Other implementations seem to either throw an 
exception or directly call a segment reader.  One exception is instantiated 
index (I think).

We can't call getDeletedDocs() just once up-front, because an IndexReader may 
still be used to delete documents.

 MultiFields not thread safe
 ---

 Key: LUCENE-2815
 URL: https://issues.apache.org/jira/browse/LUCENE-2815
 Project: Lucene - Java
  Issue Type: Bug
Affects Versions: 4.0
Reporter: Yonik Seeley
 Fix For: 4.0


 MultiFields looks like it has thread safety issues

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Commented: (LUCENE-2815) MultiFields not thread safe

2010-12-16 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12972281#action_12972281
 ] 

Yonik Seeley commented on LUCENE-2815:
--

I was going to fix InstantiatedIndex, but while I was in there, I saw a lot of 
non-threadsafe code.  I think that really deserves it's own issue.
What range of docs is InstantiatedIndex faster for, and is it something we want 
to continue to maintain?

 MultiFields not thread safe
 ---

 Key: LUCENE-2815
 URL: https://issues.apache.org/jira/browse/LUCENE-2815
 Project: Lucene - Java
  Issue Type: Bug
Affects Versions: 4.0
Reporter: Yonik Seeley
 Fix For: 4.0


 MultiFields looks like it has thread safety issues

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Commented: (LUCENE-2815) MultiFields not thread safe

2010-12-15 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12971909#action_12971909
 ] 

Yonik Seeley commented on LUCENE-2815:
--

So I started looking into MultiFields more closely, just from a performance 
perspective, and noticed a couple of thread safety issues:
- MultiFields is reused (a good thing), but isn't safely published to other 
threads.
- MultiFields has a HashMap that is used and modified unsynchronized to cache 
MultiTerms instances

 MultiFields not thread safe
 ---

 Key: LUCENE-2815
 URL: https://issues.apache.org/jira/browse/LUCENE-2815
 Project: Lucene - Java
  Issue Type: Bug
Affects Versions: 4.0
Reporter: Yonik Seeley

 MultiFields looks like it has thread safety issues

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org



[jira] Commented: (LUCENE-2815) MultiFields not thread safe

2010-12-15 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2815?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12971930#action_12971930
 ] 

Yonik Seeley commented on LUCENE-2815:
--

It looks like MultiReaderBits also has issues with safe object publication.

 MultiFields not thread safe
 ---

 Key: LUCENE-2815
 URL: https://issues.apache.org/jira/browse/LUCENE-2815
 Project: Lucene - Java
  Issue Type: Bug
Affects Versions: 4.0
Reporter: Yonik Seeley

 MultiFields looks like it has thread safety issues

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org