[jira] Commented: (LUCENE-1749) FieldCache introspection API

2009-08-10 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12741479#action_12741479
 ] 

Hoss Man commented on LUCENE-1749:
--

bq. Maybe we should simply print a warning, eg to System.err, on detecting that 
2X RAM usage has occurred, pointing people to the sanity checker? We could eg 
do it once only so we don't spam the stderr logs

I'm not really comfortable dumping anything to System.err without the user 
requesting it ... but this is a really interesting idea.  (I suppose we could 
add an infoStream-style option to FieldCache to expose this.)

FieldCacheImpl.Cache.get could use the FieldCacheSanityChecker to inspect 
itself immediately after calling createValue, and could even test if any of the 
Insanity instances returned are related to the current call (by comparing the 
CacheEntry with the Entry it's using) ... it could even log a useful stack 
trace since the sanity check would be happening in the same call stack as at 
least one of the CacheEntries in the Insanity object.
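
A minimal sketch of that idea follows, assuming a much simpler cache than FieldCacheImpl. Every name in it (SelfCheckingCache, Key, setInfoStream, checkAfterCreate) is hypothetical and not taken from the patch. It only illustrates checking right after the value is created, warning once, and staying silent unless the user supplies a stream.

```java
import java.io.PrintStream;
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Supplier;

/** Hypothetical stand-in for a field cache; none of these names are Lucene's. */
class SelfCheckingCache {

    /** Cache key: which reader, which field, which value type. */
    static final class Key {
        final Object reader;
        final String field;
        final Class<?> type;

        Key(Object reader, String field, Class<?> type) {
            this.reader = reader;
            this.field = field;
            this.type = type;
        }

        @Override public boolean equals(Object o) {
            if (!(o instanceof Key)) return false;
            Key k = (Key) o;
            return reader == k.reader && field.equals(k.field) && type.equals(k.type);
        }

        @Override public int hashCode() {
            return Objects.hash(System.identityHashCode(reader), field, type);
        }
    }

    private final Map<Key, Object> entries = new HashMap<>();
    private final AtomicBoolean warned = new AtomicBoolean(false);
    private volatile PrintStream infoStream; // null (the default) means stay silent

    void setInfoStream(PrintStream out) { infoStream = out; }

    synchronized Object get(Key key, Supplier<Object> createValue) {
        Object value = entries.get(key);
        if (value == null) {
            value = createValue.get();
            entries.put(key, value);
            checkAfterCreate(key); // runs in the call stack of the caller that created the entry
        }
        return value;
    }

    /** Warns at most once if the new entry shares reader and field with an entry of another type. */
    private void checkAfterCreate(Key created) {
        PrintStream out = infoStream;
        if (out == null) return;
        for (Key other : entries.keySet()) {
            boolean sameSlot = other.reader == created.reader && other.field.equals(created.field);
            if (sameSlot && !other.type.equals(created.type) && warned.compareAndSet(false, true)) {
                out.println("WARNING: field '" + created.field + "' is cached as both "
                        + other.type.getSimpleName() + " and " + created.type.getSimpleName());
                new Throwable("entry created here").printStackTrace(out);
            }
        }
    }
}
```

Because the check runs inside get, the stack trace points at the caller that created the second entry, which is the useful part of the proposal.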

I've opened LUCENE-1798 to track implementing something like this once the 
FieldCacheSanityChecker gets committed.

 FieldCache introspection API
 

 Key: LUCENE-1749
 URL: https://issues.apache.org/jira/browse/LUCENE-1749
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Search
Reporter: Hoss Man
Priority: Minor
 Fix For: 2.9

 Attachments: fieldcache-introspection.patch, 
 LUCENE-1749-hossfork.patch, LUCENE-1749.patch, LUCENE-1749.patch, 
 LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, 
 LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, 
 LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch


 FieldCache should expose an Expert level API for runtime introspection of the 
 FieldCache to provide info about what is in the FieldCache at any given 
 moment.  We should also provide utility methods for sanity checking that the 
 FieldCache doesn't contain anything odd...
* entries for the same reader/field with different types/parsers
* entries for the same field/type/parser in a reader and its subreader(s)
* etc...
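
The two checks listed above can be sketched as follows. This is an illustration under stated assumptions, not the FieldCacheSanityChecker from the patch: CacheSanitySketch, Entry, typeMismatches and subreaderOverlaps are hypothetical names, readers are plain objects compared by identity, parsers are left out, and only direct subreaders are considered.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of the two sanity checks; not the checker from the patch. */
class CacheSanitySketch {

    /** One cached value: the reader it belongs to, the field, and the value type. */
    static final class Entry {
        final Object reader;
        final String field;
        final Class<?> type;

        Entry(Object reader, String field, Class<?> type) {
            this.reader = reader;
            this.field = field;
            this.type = type;
        }
    }

    /** Check 1: entries whose reader and field are cached under more than one type. */
    static List<Entry> typeMismatches(List<Entry> entries) {
        Map<Object, Map<String, Set<Class<?>>>> seen = new IdentityHashMap<>();
        for (Entry e : entries) {
            seen.computeIfAbsent(e.reader, r -> new HashMap<>())
                .computeIfAbsent(e.field, f -> new HashSet<>())
                .add(e.type);
        }
        List<Entry> bad = new ArrayList<>();
        for (Entry e : entries) {
            if (seen.get(e.reader).get(e.field).size() > 1) bad.add(e);
        }
        return bad;
    }

    /** Check 2: subreader entries that duplicate a field and type already cached for the parent. */
    static List<Entry> subreaderOverlaps(List<Entry> entries, Map<Object, List<Object>> subReadersOf) {
        List<Entry> bad = new ArrayList<>();
        for (Entry parent : entries) {
            List<Object> subs = subReadersOf.get(parent.reader);
            if (subs == null) continue; // this reader has no known subreaders
            for (Entry child : entries) {
                boolean isSub = false;
                for (Object s : subs) {
                    if (s == child.reader) isSub = true;
                }
                if (isSub && child.field.equals(parent.field) && child.type.equals(parent.type)) {
                    bad.add(child);
                }
            }
        }
        return bad;
    }
}
```

Both methods take a snapshot list of entries, so they can run against whatever an introspection call returns without touching the cache itself.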

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org



[jira] Commented: (LUCENE-1749) FieldCache introspection API

2009-08-07 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740624#action_12740624
 ] 

Michael McCandless commented on LUCENE-1749:


Maybe we should simply print a warning, eg to System.err, on detecting that 2X 
RAM usage has occurred, pointing people to the sanity checker?  We could eg do 
it once only so we don't spam the stderr logs...




[jira] Commented: (LUCENE-1749) FieldCache introspection API

2009-08-06 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740155#action_12740155
 ] 

Mark Miller commented on LUCENE-1749:
-

P.S. I'm not sure we want to go with the way I have changed the tests here.

I switched things to go per subreader rather than use the overall reader - this 
is how things happen in IndexSearcher now. But we lose the top level reader 
test. We might want to do it both ways, and when doing it by top reader, ignore 
the triggered insanity check?




[jira] Commented: (LUCENE-1749) FieldCache introspection API

2009-08-06 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740256#action_12740256
 ] 

Hoss Man commented on LUCENE-1749:
--

Mark: I'll start working on improving the docs (and other things from my 
previous todo list)

bq. P.S. I'm not sure we want to go with the way I have changed the tests here.

Are you talking about TestOrdValues and TestFieldScoreQuery ?

if we expect OrdValues and FieldScoreQuery to use subReader based field caches, 
then the test seems to be doing the correct thing (in your patch) ... inspecting 
the fieldcaches per subreader.  Is there a code path where we expect those 
methods to get called on a MultiReader?

(Actually: that seems like a worthwhile improvement to make to these classes: 
a MultiDocValues impl that all of the getValues(IndexReader) methods use when 
passed a MultiReader ... it uses getSequentialSubReaders to construct DocValue 
instances for each so you don't get FieldCache explosions if code inadvertently 
passes the wrong reader to getValues.  What do you think? ... new issue?)
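
A rough sketch of the MultiDocValues idea, written against stand-in types rather than Lucene's classes: Values and MultiValuesSketch are hypothetical names, and the per-subreader maxDoc counts are passed in directly instead of being read from getSequentialSubReaders. The point is only the delegation: a top-level doc id is mapped to the subreader that owns it, so values are only ever loaded per subreader.

```java
/** Hypothetical per-document value source; stands in for a DocValues-like class. */
interface Values {
    float floatVal(int doc);
}

/** Hypothetical composite that answers top-level doc ids by delegating to per-subreader values. */
class MultiValuesSketch implements Values {
    private final Values[] subs;
    private final int[] starts; // starts[i] is the first top-level doc id of sub i; the last slot is the total

    MultiValuesSketch(Values[] subs, int[] maxDocs) {
        if (subs.length != maxDocs.length) {
            throw new IllegalArgumentException("one maxDoc per subreader is required");
        }
        this.subs = subs;
        this.starts = new int[subs.length + 1];
        for (int i = 0; i < subs.length; i++) {
            starts[i + 1] = starts[i] + maxDocs[i];
        }
    }

    @Override
    public float floatVal(int doc) {
        if (doc < 0 || doc >= starts[subs.length]) {
            throw new IndexOutOfBoundsException("doc " + doc);
        }
        int i = 0;
        while (doc >= starts[i + 1]) {
            i++; // skip subreaders that end before this doc (also skips empty ones)
        }
        return subs[i].floatVal(doc - starts[i]); // rebase to the subreader's local doc id
    }
}
```

A linear scan keeps the sketch short; with many subreaders a binary search over starts would be the obvious refinement.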




[jira] Commented: (LUCENE-1749) FieldCache introspection API

2009-08-06 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740265#action_12740265
 ] 

Hoss Man commented on LUCENE-1749:
--

H...

actually mark, testing out your latest patch against the trunk i'm seeing 
(FieldCache sanity) failures from TestCustomScoreQuery, TestFieldScoreQuery, 
TestOrdValues, and TestSort ... have you seen these?  did some other recent 
change on the trunk trigger these?




[jira] Commented: (LUCENE-1749) FieldCache introspection API

2009-08-06 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740272#action_12740272
 ] 

Mark Miller commented on LUCENE-1749:
-

I think that TestCustomScoreQuery, TestFieldScoreQuery, and TestOrdValues all 
fail because the fix for them is now in another issue.

TestSort I didn't notice. It looks like it's considering String[] and 
StringIndex the same for the two multi and parallel sort tests - merged to 
trunk, so perhaps something has gone awry there? I've looked over the patch and 
I don't see any obvious mistake - I don't know that I have time to dig more 
now, but since you are more familiar with that code anyway, perhaps you can 
tell me why it's now considering them the same anyway? Otherwise I will look 
more before too long.




[jira] Commented: (LUCENE-1749) FieldCache introspection API

2009-08-06 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740275#action_12740275
 ] 

Mark Miller commented on LUCENE-1749:
-

Here is the output - it appears to think String[] and StringIndex are both 
string:

VALUEMISMATCH: Multiple distinct value objects for org.apache.lucene.index.compoundfilereader$csindexin...@56d73c7a+string
    'org.apache.lucene.index.compoundfilereader$csindexin...@56d73c7a'='string',class org.apache.lucene.search.FieldCache$StringIndex,3,null,null=org.apache.lucene.search.FieldCache$StringIndex#279807577 (size =~ 152 bytes)
    'org.apache.lucene.index.compoundfilereader$csindexin...@56d73c7a'='string',class java.lang.String,11,null,null=[Ljava.lang.String;#647057258 (size =~ 108 bytes)




[jira] Commented: (LUCENE-1749) FieldCache introspection API

2009-08-06 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740278#action_12740278
 ] 

Mark Miller commented on LUCENE-1749:
-

{quote}(Actually: that seems like a worthwhile improvement to make to these 
classes: a MultiDocValues impl that all of the getValues(IndexReader) methods 
use when passed a MultiReader ... it uses getSequentialSubReaders to construct 
DocValue instances for each so you don't get FieldCache explosions if code 
inadvertently passes the wrong reader to getValues. What do you think? ... new 
issue?){quote}

Very interesting idea - def a new issue I think. Not sure it's worth it if you 
can't protect general fieldcache access as well though ...





[jira] Commented: (LUCENE-1749) FieldCache introspection API

2009-08-06 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740308#action_12740308
 ] 

Mark Miller commented on LUCENE-1749:
-

Okay, sorry - I messed up when merging with trunk.

In TestSort you had moved the local sorting to the bottom in the multi sort 
test - I kept that, but I also kept them higher up. So that's the fail - they 
just have to be removed.

Lines 953-957 it looks like - sorry about that - just didn't notice it fail with 
the other 3.




[jira] Commented: (LUCENE-1749) FieldCache introspection API

2009-08-06 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740311#action_12740311
 ] 

Hoss Man commented on LUCENE-1749:
--

bq. I think that TestCustomScoreQuery, TestFieldScoreQuery, and TestOrdValues 
all fail because the fix for them is now in another issue.

ah ... are you talking about LUCENE-1771 ? (the jira dependency seems backwards 
in that case)

bq. In TestSort you had moved the local sorting to the bottom in the multi sort 
test - I kept that, but I also kept them higher up. So that's the fail - they 
just have to be removed.

yeah .. i just caught that and was starting to reply ... the interesting thing 
is that the CacheEntry.toString() doesn't show that Locale.US was used when 
getting the String[] FieldCache ... i'm currently trying to figure out why 
(because that could confuse people as well)




[jira] Commented: (LUCENE-1749) FieldCache introspection API

2009-08-05 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12739743#action_12739743
 ] 

Mark Miller commented on LUCENE-1749:
-

Patch is coming soon - I've merged to trunk and pulled the separate issues - 
just want to look over some a bit later. Would have had this sooner, but 
Eclipse decided to start crashing every 5 minutes this morning because firefox 
brought in a new xulrunner and ... ugh - at least it's not Windows ... coming 
though.




[jira] Commented: (LUCENE-1749) FieldCache introspection API

2009-08-04 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12739080#action_12739080
 ] 

Michael McCandless commented on LUCENE-1749:


bq. Right - that code was well tested and exercised via MultiSearcher in the 
past (all idf values had to come from Weight to avoid getting idfs per 
sub-searcher).

Ahh right.

bq. One thing that's missing for explain() is that there is no way to get df 
as opposed to idf from the Weight.

But this only affects the atomic queries, right?  So eg TermWeight
could simply hold onto this value and then use it during explain.
Hmm... though TermQuery's ctor doesn't get the df directly, because it
calls similarity.idf(term, searcher).  I don't really like making a
separate additional call to docFreq.

How about, for queries that need to go and look up docFreq, their
QueryWeight impls simply hold onto the [top-level] IndexSearcher that
had been passed to their ctor, and then do the docFreq call against
that, if explain is invoked?
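
As a sketch of that proposal, assuming stand-in types: DocFreqSource plays the role of the top-level searcher and TermWeightSketch the role of a weight. Both names are hypothetical, and the idf is simply handed to the constructor. The weight keeps the searcher reference and only asks it for docFreq when explain is called, so normal scoring pays no extra lookup.

```java
/** Hypothetical stand-in for the top-level searcher: only the docFreq lookup is modelled. */
interface DocFreqSource {
    int docFreq(String term);
}

/** Hypothetical weight that defers the docFreq lookup until explain() is invoked. */
class TermWeightSketch {
    private final String term;
    private final float idf;
    private final DocFreqSource searcher; // kept only so explain() can ask for docFreq later

    TermWeightSketch(String term, float idf, DocFreqSource searcher) {
        this.term = term;
        this.idf = idf;          // computed elsewhere; no docFreq call happens here
        this.searcher = searcher;
    }

    float getIdf() {
        return idf;
    }

    /** Looks up docFreq on demand, so the cost is paid only when an explanation is requested. */
    String explain() {
        int df = searcher.docFreq(term);
        return "idf(term=" + term + ", docFreq=" + df + ")=" + idf;
    }
}
```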

bq. Right.. it doesn't belong there. Perhaps deprecate and remove from the 
Scorer base in 3.0? (since one can't reliably call it now anyway).

+1

Hoss/Mark do you want to fold it in to the patch, here?  Or I can open
a new issue?





[jira] Commented: (LUCENE-1749) FieldCache introspection API

2009-08-04 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12739089#action_12739089
 ] 

Yonik Seeley commented on LUCENE-1749:
--

{quote}
How about, for queries that need to go and look up docFreq, their QueryWeight 
impls simply hold onto the [top-level] IndexSearcher that
had been passed to their ctor, and then do the docFreq call against
that, if explain is invoked?
{quote}

Asking the searcher for the docFreq is the right thing to do... but people who 
rely on Weight being serializable might be in for a nasty surprise.
Of course... one might wonder if we should bother supporting serializable in 
Lucene longer term at all - anyone dealing with distributed systems has found 
it to have too many shortcomings anyway.
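
The serialization concern can be illustrated with plain java.io, using hypothetical classes (SearcherStub, WeightHoldingSearcher, WeightWithTransientSearcher, SerializationSketch) that stand in for the real ones. A Serializable object that holds a reference to a non-serializable searcher fails at write time, and marking the field transient only moves the problem, because the reference is then missing after deserialization.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

/** Hypothetical searcher stand-in; deliberately not Serializable. */
class SearcherStub { }

/** Hypothetical weight that keeps a hard reference to the searcher. */
class WeightHoldingSearcher implements Serializable {
    private static final long serialVersionUID = 1L;
    final float idf = 1.5f;
    final SearcherStub searcher = new SearcherStub();
}

/** Same weight, but the searcher is skipped during serialization (and null after reading back). */
class WeightWithTransientSearcher implements Serializable {
    private static final long serialVersionUID = 1L;
    final float idf = 1.5f;
    transient SearcherStub searcher = new SearcherStub();
}

class SerializationSketch {
    /** Returns false if writing the object fails because some field is not serializable. */
    static boolean canSerialize(Object o) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With the transient variant the write succeeds, but an explain call on the deserialized copy would have no searcher to ask, which is the "nasty surprise" in a different form.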





[jira] Commented: (LUCENE-1749) FieldCache introspection API

2009-08-04 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12739092#action_12739092
 ] 

Hoss Man commented on LUCENE-1749:
--

General Comments on mark's latest patch...
* the changes that i understand all seem good ... some of the details in 
reader/searcher/query internals elude me but it sounds like Yonik & McCandless 
have their eyes on them so i trust the three of you have it covered.
* we still need to fill in some empty/sparse javadocs, but that can be done 
after an initial commit.
* is it a bug that AverageGuessMemoryModel.getSize() will NPE on a 
non-primitive class ... or should/will the docs for that API say it only works 
on primitives?
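
One way such an NPE can arise, shown with a hypothetical class (SizeTableSketch) and an assumed design in which sizes live in a map keyed by class. The patch's actual implementation is not shown in this thread, so this is only an illustration of the failure mode and of one documented alternative.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical size table; assumes sizes are stored in a map keyed by primitive class. */
class SizeTableSketch {
    private final Map<Class<?>, Integer> sizes = new HashMap<>();

    SizeTableSketch() {
        sizes.put(boolean.class, 1);
        sizes.put(byte.class, 1);
        sizes.put(char.class, 2);
        sizes.put(short.class, 2);
        sizes.put(int.class, 4);
        sizes.put(float.class, 4);
        sizes.put(long.class, 8);
        sizes.put(double.class, 8);
    }

    /** Unboxes the lookup result directly, so an unknown class causes a NullPointerException. */
    int getSizeUnchecked(Class<?> c) {
        return sizes.get(c);
    }

    /** Fails with a descriptive exception instead, making the primitives-only contract explicit. */
    int getSizeChecked(Class<?> c) {
        Integer size = sizes.get(c);
        if (size == null) {
            throw new IllegalArgumentException("not a primitive class: " + c.getName());
        }
        return size;
    }
}
```

Either behaviour is defensible as long as the javadocs state it; the checked variant just makes the error message say what went wrong.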

Big Questions I Still Have
* does anyone have any reservations about the new APIs introduced?
  * FieldCache.CreationPlaceholder (promoted from FieldCacheImpl)
  * FieldCache.CacheEntry
  * FieldCache.getCacheEntries()
  * FieldCache.purgeAllCaches()
  * FieldCacheSanityChecker
  * RamUsageEstimator
* does anyone have any reservations about the refactoring done in 
FieldCacheImpl to make this new API possible? (ie: did i break the thread 
safety in a way i'm not noticing?)
* is the FieldCacheImpl.Entry.type (the SortField int type) still needed by 
FieldCacheImpl.Entry? ... nothing seems to use it so it would be nice to 
eliminate it and simplify the CacheEntry API as well.  (i suspect it got 
refactored into obsolescence when the Sorting got moved into the subreaders)
* The sanity checking ignores CreationPlaceholder -- largely because of the way 
the numeric caches first try one parser, and then if they get an NFE try a 
different parser -- but this leaves the CreationPlaceholder in the cache.  It's 
not a big object, so i assume it was implemented this way on purpose and the 
sanity checker is doing the correct thing by ignoring it, but i wanted to make 
sure people are aware/ok with this behavior.
 
Lastly: This patch feels unnecessarily large at this point.  Several of the 
bugs/improvements we've uncovered here don't seem like they belong in this 
patch, and should be tracked in separate Jira issues, which can be committed 
independently and enumerated in CHANGES.txt
  * new ReaderUtil and the usage in DirectoryReader, MultiReader, MultiSearcher 
& IndexSearcher
  * explain subreader bug fixes in ConstantScoreQuery, QueryWeight, 
ValueSourceQuery, CustomScoreQuery, etc...
...i think this issue (and this patch) should be reduced to just the new sanity 
checking API, and *tests* that have been changed to be more sane (where the 
underlying code was already fine)

Mark: would you mind splitting up the latest patch you have (you mentioned some 
additional minor tweaks) and opening new issues for these peripheral changes 
and then attaching back what's left for this patch?  Then I'll take the conch 
back and work on the missing javadocs.

(I'll happily commit once i get at least one thumbs up from someone on the Big 
Questions above ... we can always tweak the javadocs further in subsequent 
commits)






[jira] Commented: (LUCENE-1749) FieldCache introspection API

2009-08-04 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12739093#action_12739093
 ] 

Hoss Man commented on LUCENE-1749:
--

bq. Hoss/Mark do you want to fold it in to the patch, here? Or I can open a new 
issue?

as i alluded to above, i'm in favor of individual issues for each bug 
uncovered by this issue so they can be tracked separately.




[jira] Commented: (LUCENE-1749) FieldCache introspection API

2009-08-04 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12739097#action_12739097
 ] 

Mark Miller commented on LUCENE-1749:
-

bq. Mark: would you mind splitting up the latest patch you have (you mentioned 
some additional minor tweaks) and opening new issues for these peripheral 
changes and then attaching back what's left for this patch. 

I've already got separate issues and patches up where it makes sense (not the 
last one Mike mentions) - I wanted to keep them here too though until the 
insanity tests were complete - the tests that the fixes are somewhat correct 
are in this patch, and I don't like to manage layers of patches. If we don't 
plan on doing any more with the insanity tests here though, I'll spin them out 
of this patch now.

I'll put up one more version and then you can have it back.




[jira] Commented: (LUCENE-1749) FieldCache introspection API

2009-08-04 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12739108#action_12739108
 ] 

Mark Miller commented on LUCENE-1749:
-

bq. is it a bug that AverageGuessMemoryModel.getSize() will NPE on a non 
primitive class ... or should/will the docs for that API say it only works on 
primitives?

It's only meant to work with primitives. I'll change the name to 
getPrimitiveSize - on my last pass through, I'll also review the javadoc for 
the classes I added.
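
To illustrate the primitives-only contract, here is a minimal sketch of a size 
lookup that rejects non-primitive classes with an explicit exception rather 
than an NPE. The class name PrimitiveSizeSketch and the use of 
IllegalArgumentException are assumptions made for this example, not the code 
in the patch.

```java
import java.util.IdentityHashMap;
import java.util.Map;

/** Hypothetical stand-in for a memory model that only knows primitive sizes. */
final class PrimitiveSizeSketch {
    // Sizes in bytes as defined by the Java language for each primitive type.
    private static final Map<Class<?>, Integer> SIZES = new IdentityHashMap<>();
    static {
        SIZES.put(boolean.class, 1);
        SIZES.put(byte.class, 1);
        SIZES.put(char.class, 2);
        SIZES.put(short.class, 2);
        SIZES.put(int.class, 4);
        SIZES.put(float.class, 4);
        SIZES.put(long.class, 8);
        SIZES.put(double.class, 8);
    }

    /** Returns the size in bytes of a primitive type; anything else is rejected explicitly. */
    static int getPrimitiveSize(Class<?> clazz) {
        Integer size = SIZES.get(clazz);
        if (size == null) {
            // Assumption: fail loudly instead of letting a null lookup turn into an NPE.
            throw new IllegalArgumentException("not a primitive type: " + clazz);
        }
        return size;
    }
}
```

With this shape, getPrimitiveSize(int.class) returns 4, while 
getPrimitiveSize(String.class) fails with a message naming the offending class.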




[jira] Commented: (LUCENE-1749) FieldCache introspection API

2009-08-04 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12739122#action_12739122
 ] 

Michael McCandless commented on LUCENE-1749:


{quote}
bq. Hoss/Mark do you want to fold it in to the patch, here? Or I can open a new 
issue?

as i alluded to above, i'm in favor of individual issues for each bug 
uncovered by this issue so they can be tracked separately.
{quote}

OK I'll open a new issue for this one (deprecate Scorer.explain).




[jira] Commented: (LUCENE-1749) FieldCache introspection API

2009-08-04 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12739120#action_12739120
 ] 

Michael McCandless commented on LUCENE-1749:


{quote}
Asking the searcher for the docFreq is the right thing to do... but people who 
rely on Weight being serializable might be in for a nasty surprise.
{quote}

Argh, right.

bq. If we decide not to ask weight to hang onto its searcher, then the other 
way to do it right is to change explain() to accept a Searcher as well as an 
IndexReader.

+1

bq. Of course... one might wonder if we should bother supporting serializable 
in Lucene longer term at all - anyone dealing with distributed systems has 
found it to have too many shortcomings anyway.

Yeah this was never really settled in LUCENE-1473.  Lucene currently
supports live serialization, but not cross-version
serialization... and we have moved RemoteSearchable to contrib and
removed RMI from Searchable.

Does Solr ever rely on Lucene's implements Serializable?





[jira] Commented: (LUCENE-1749) FieldCache introspection API

2009-08-04 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12739163#action_12739163
 ] 

Yonik Seeley commented on LUCENE-1749:
--

bq. Does Solr ever rely on Lucene's implements Serializable?

Nope - intra-node communications use the same mechanism as clients... generic 
data structures (Map,List,Document,etc) that have custom serialization to 
XML,JSON,Python,Ruby or Binary (and binary is now used by default).





[jira] Commented: (LUCENE-1749) FieldCache introspection API

2009-08-03 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12738698#action_12738698
 ] 

Mark Miller commented on LUCENE-1749:
-

I've got one more draft here with the smallest of tweaks - javadoc spelling 
errors, and perhaps one or two other tiny things - stuff I just would toss 
out rather than merge - but are you doing anything here right now Hoss? I think 
not at the moment, so if that's the case I'll put up one more patch before you 
grab the conch back. Otherwise I'll hold off on anything till you put something 
up.




[jira] Commented: (LUCENE-1749) FieldCache introspection API

2009-08-03 Thread Chris Hostetter (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12738721#action_12738721
 ] 

Chris Hostetter commented on LUCENE-1749:
-



: I've got one more draft here with the smallest of tweaks - javadoc 
: spelling errors, and perhaps one or two other tiny things - stuff I 
: just would toss out rather than merge - but are you doing anything here 
: right now Hoss? I think not at the moment, so if that's the case I'll put 
: up one more patch before you grab the conch back. Otherwise I'll hold 
: off on anything till you put something up.


you have the conch ... i haven't worked on anything related to this issue 
since my last patch.

i'll try to look at it again tomorrow.



-Hoss






[jira] Commented: (LUCENE-1749) FieldCache introspection API

2009-08-02 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12738100#action_12738100
 ] 

Michael McCandless commented on LUCENE-1749:


bq. That was my first thought... but it would probably break more than it 
helped right now (by exposing more limitations) - for example idf in 
TermWeight.explain()

Ugh, you're right.

I think it shouldn't be doing that?  Ie, a Weight instance should
capture all stats needed from the top-level searcher, on creation,
and then when we ask for a scorer or an explain (or other future
things that take an IndexReader) we should always pass in a single
segment reader.  This way we don't have to duplicate the "go find the
right sub-reader" logic in many places.
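
A rough sketch of that idea, assuming a deliberately simplified weight class
(CapturedStatsWeightSketch and its constructor arguments are hypothetical, not
Lucene's Weight API).  The docFreq and maxDoc are taken from the top-level
searcher once, at creation, so a later explain against a single segment never
needs the top-level reader.  The idf formula is the classic
log(numDocs/(docFreq+1)) + 1 and is used here only for illustration.

```java
/** Hypothetical, simplified weight: all top-level statistics are captured in the constructor. */
final class CapturedStatsWeightSketch {
    private final int docFreq; // document frequency, taken from the top-level searcher
    private final int maxDoc;  // document count of the whole index, not of one segment
    private final double idf;  // derived once from the two values above

    CapturedStatsWeightSketch(int topLevelDocFreq, int topLevelMaxDoc) {
        this.docFreq = topLevelDocFreq;
        this.maxDoc = topLevelMaxDoc;
        // Classic idf formula, shown for illustration only.
        this.idf = Math.log((double) topLevelMaxDoc / (topLevelDocFreq + 1)) + 1.0;
    }

    double idf() {
        return idf;
    }

    /** Builds an explanation for a segment-local doc id without touching any reader for stats. */
    String explain(int segmentLocalDoc, float termFreq) {
        double score = Math.sqrt(termFreq) * idf;
        return "doc=" + segmentLocalDoc + " score=" + score
            + " idf(docFreq=" + docFreq + ", maxDoc=" + maxDoc + ")=" + idf;
    }
}
```

Keeping the raw docFreq next to the derived idf also means an explanation can
report both numbers, not just the idf.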

It's interesting that we didn't (I think?) have a similar problem w/
scorer when we switched to passing it the sub-reader.

bq. I'll leave the 'explain at multiple levels' for another issue

It looks like it's up to each query, which level does what.
IndexSearcher's explain always calls Weight.explain, but then some
Query impls (eg BooleanQuery) do everything in Weight.explain, while
others (eg TermQuery, PhraseQuery) do some work in Weight.explain and
some in the scorer.

I guess this makes sense: atomic Queries (TermQuery, PhraseQuery)
will need to fire up a scorer since there's real work to be done to
see the specifics of how that doc was matched.  Whereas BooleanQuery simply
glues together other queries so it doesn't need to forward to its
[many] scorers.

So the only odd thing is why explain is part of Scorer base
class... seems like the method could/should live privately to only
those queries that need it.

But I agree let's leave this be for now...





[jira] Commented: (LUCENE-1749) FieldCache introspection API

2009-08-02 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12738140#action_12738140
 ] 

Yonik Seeley commented on LUCENE-1749:
--

bq. It's interesting that we didn't (I think?) have a similar problem w/ scorer 
when we switched to passing it the sub-reader.

Right - that code was well tested and exercised via MultiSearcher in the past 
(all idf values had to come from Weight to avoid getting idfs per sub-searcher).
One thing that's missing for explain() is that there is no way to get df as 
opposed to idf from the Weight.

bq. So the only odd thing is why explain is part of Scorer base class

Right.. it doesn't belong there.  Perhaps deprecate and remove from the Scorer 
base in 3.0? (since one can't reliably call it now anyway).





[jira] Commented: (LUCENE-1749) FieldCache introspection API

2009-08-01 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12737871#action_12737871
 ] 

Michael McCandless commented on LUCENE-1749:


This was an excellent idea, and it's great that it uncovered some
dangerous and very unexpected places where we are passing top-level
reader to the FieldCache (eg that explain() could suddenly populate
the FieldCache w/ top-level values is quite shocking!).

ReaderUtil.subSearcher is doing the same thing as
DirectoryReader.readerIndex.

I love the RAMUsageEstimator... we have other places that estimate RAM
(eg IndexWriter does so for added & deleted docs) that we should
eventually cut over to this new API.

I particularly love the new class named Insanity:

{code}
  public static Insanity[] checkSanity(FieldCache cache)
{code}
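
To make one of the checks concrete, here is a minimal sketch of spotting a
reader/field pair that has been cached under more than one type.  SanitySketch
and CachedItem are hypothetical stand-ins invented for this example; they are
not the CacheEntry or Insanity classes from the patch.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class SanitySketch {
    /** Hypothetical, simplified cache entry: which reader, which field, cached as which type. */
    static final class CachedItem {
        final Object readerKey;
        final String field;
        final Class<?> type;

        CachedItem(Object readerKey, String field, Class<?> type) {
            this.readerKey = readerKey;
            this.field = field;
            this.type = type;
        }
    }

    /** Returns one message per reader/field pair that is cached under more than one type. */
    static List<String> check(List<CachedItem> items) {
        // Readers are compared by identity, so two distinct readers never share a bucket.
        Map<Object, Map<String, Set<Class<?>>>> seen = new IdentityHashMap<>();
        for (CachedItem item : items) {
            seen.computeIfAbsent(item.readerKey, k -> new LinkedHashMap<>())
                .computeIfAbsent(item.field, k -> new LinkedHashSet<>())
                .add(item.type);
        }
        List<String> problems = new ArrayList<>();
        for (Map<String, Set<Class<?>>> byField : seen.values()) {
            for (Map.Entry<String, Set<Class<?>>> e : byField.entrySet()) {
                if (e.getValue().size() > 1) {
                    problems.add("field '" + e.getKey() + "' cached with "
                        + e.getValue().size() + " different types for one reader");
                }
            }
        }
        return problems;
    }
}
```

The same field cached as int[] on two different readers is not flagged here;
only a second type for the same reader is.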

MultiDocIdSet/Iterator makes me a bit nervous, because it's further
propagating a non-segment-based iterator deeper into Lucene than I
think we want to.  It's similar to eg using
DirectoryReader.MultiTermDocs (what Lucene used to do), instead of
stepping through the segments yourself.

Also, shouldn't explain most closely match what was done during
searching (ie, run per segment)?  So simply pushing explain down to
the sub-reader that has the doc seems appropriate?  Ie we want it to
share as much of the code path as possible with how searching was in
fact done?

EG for ConstantScoreQuery.explain, it seems like we should 1) locate
the sub-reader that this doc falls in, and 2) get a scorer against
that reader, then 3) build up the explanation from that?  Likewise for
CustomScoreQuery? 

In fact maybe we should simply fix IndexSearcher.explain to do
this for all queries?  Ie, get the top-level weight, locate sub-reader
that has the doc, un-base the doc, and then invoke QueryWeight.explain
with that sub-reader and un-based doc?  Then we don't have to do
anything special for each query.  I think QueryWeight.scorer()
shouldn't be expected to handle a top level reader being passed in.
Ie, higher up in Lucene we should do that switch, so that we don't
have to do it (this valuesFromSubReaders arg) for every scorer.
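
The "locate sub-reader, un-base the doc" step could look roughly like the
sketch below.  SubReaderLocatorSketch and the docStarts array (the first
top-level doc id of each sub-reader) are assumptions for illustration; this is
not the code of ReaderUtil.subSearcher or DirectoryReader.readerIndex.

```java
/** Hypothetical helper; docStarts[i] is the first top-level doc id of sub-reader i. */
final class SubReaderLocatorSketch {
    /** Returns the index of the last sub-reader whose start is at or before the doc id. */
    static int subIndex(int doc, int[] docStarts) {
        int lo = 0;
        int hi = docStarts.length - 1;
        while (lo < hi) {
            int mid = (lo + hi + 1) >>> 1; // round up so the loop always makes progress
            if (docStarts[mid] <= doc) {
                lo = mid;
            } else {
                hi = mid - 1;
            }
        }
        return lo;
    }

    /** Converts a top-level doc id into the id local to its sub-reader. */
    static int unbase(int doc, int[] docStarts) {
        return doc - docStarts[subIndex(doc, docStarts)];
    }
}
```

Choosing the last sub-reader whose start is at or before the doc means an
empty segment (two equal starts in a row) is skipped in favor of the one that
actually holds the doc.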

Hmm: why do we even have explain at both the QueryWeight and Scorer
levels?  It seems like we should pick one level and do it there,
consistently.  Most queries seem to only implement the QueryWeight one
and often simply throw UOE in the Scorer's explain, but eg PhraseQuery
implements in both places.

(BTW: I'll be offline for approx the next 36 hours or so!)





[jira] Commented: (LUCENE-1749) FieldCache introspection API

2009-08-01 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12737877#action_12737877
 ] 

Yonik Seeley commented on LUCENE-1749:
--

bq. In fact maybe we should simply fix IndexSearcher.explain to do this for 
all queries?

That was my first thought... but it would probably break more than it helped 
right now (by exposing more limitations) - for example idf in 
TermWeight.explain()





[jira] Commented: (LUCENE-1749) FieldCache introspection API

2009-08-01 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12737984#action_12737984
 ] 

Mark Miller commented on LUCENE-1749:
-

bq. I was a bit hazy on explain, so for some reason I had it in my head that 
you would have to combine the explanations from multiple subreaders

bq. but it would probably break more than it helped right now (by exposing more 
limitations) - for example idf in TermWeight.explain()

To be a little more clear - this was originally why I went the direction I did 
- I assumed the reader was being used for stats that needed to come from the 
top level reader. Gut reaction seeing it go into scorer. I hadn't really 
checked that, at least for these queries, that wasn't the case - they just use 
it for the filter/fieldcache.




[jira] Commented: (LUCENE-1749) FieldCache introspection API

2009-07-31 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12737509#action_12737509
 ] 

Mark Miller commented on LUCENE-1749:
-

{quote}I don't really follow you on this (i need to take a look at your 
proposed fix) .. i'm not suggesting we push the whole explain down to the 
subreader, just that when the explain method wants to get the FieldCache value 
for a doc, it should fetch the FieldCache for the SegmentReader the doc is in - 
so it gets the exact same value (and same FieldCache entry) as the scoring code 
did when it scores the document. (or maybe i'm completely misunderstanding how 
these classes were reimplemented to use segment based field caches){quote}

The way the per segment stuff went, we don't push down to the sub readers for 
the fieldcache per se - we just search each sub reader separately - so per 
reader fieldcache is just kind of a side effect. Then the top level reader is 
still used for things like stats and explain. 

I switched the explain for the offending stuff (custom/valuesource) to use a 
DocValues class that does push down to each subreader for the fieldcache though 
(while everything else still uses the top reader) - it's in the scorer, so I 
added a switch to push down to subreaders for fieldcache access or not - only 
explain pushes down, while regular scoring doesn't (regular scoring will be 
working per sub reader anyway, because they are searched one at a time). 

I can merge up the patches.




[jira] Commented: (LUCENE-1749) FieldCache introspection API

2009-07-31 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12737517#action_12737517
 ] 

Mark Miller commented on LUCENE-1749:
-

bq. the explain method wants to get the FieldCache value for a doc, it should 
fetch the FieldCache for the SegmentReader the doc is in

One more note to try to be a bit clearer:

First, I wasn't sure how easy this was to do because I don't know the explain 
code or the function package very well at all (e.g. I've never used it). And the 
explain method itself did not grab values from the field cache; it loaded up a 
scorer that did so. So I just wasn't sure how doable this fix was. That's why I 
was saying pushing explain down to the subreader wasn't great, but I wasn't 
sure what else you could do. The fix turned out to be fairly easy though - the 
scorer for valuesource just needed two modes: one for normal scoring and one 
for explain (which breaks up the requests for a fieldcache value per subreader). 
The explain mode would work for both, but there is no reason to break things 
down per reader when it's going to score per reader anyway, so I have both. 
The standard scorer for valuesource works as it did, and explain trips a 
setting to break out subreaders and distribute the fieldcache requests. And 
then the custom query needed a tweak to work right (flip that setting) with 
its underlying valuesource queries.
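
A rough sketch of what "distributing" a fieldcache request to a subreader could look like, assuming the top-level reader exposes the starting doc id of each segment. SubReaderLookupSketch, starts and perSegmentValues are hypothetical names for illustration, not the code in the patch.

```java
// Hypothetical helper: resolves a top-level doc id to a per-segment cached value.
final class SubReaderLookupSketch {
    /**
     * starts[i] is the first top-level doc id of subreader i
     * (ascending, starts[0] == 0). Returns the index of the
     * subreader that contains doc.
     */
    static int subIndex(int doc, int[] starts) {
        int lo = 0;
        int hi = starts.length - 1;
        // Binary search for the last start that is <= doc.
        while (lo < hi) {
            int mid = (lo + hi + 1) >>> 1;
            if (starts[mid] <= doc) {
                lo = mid;
            } else {
                hi = mid - 1;
            }
        }
        return lo;
    }

    /** Reads the value from the segment-level array instead of a top-level one. */
    static float valueFor(int doc, int[] starts, float[][] perSegmentValues) {
        int i = subIndex(doc, starts);
        return perSegmentValues[i][doc - starts[i]];
    }
}
```

Because the lookup reuses the per-segment arrays that scoring already loaded, no second top-level array has to be built just to answer an explain request.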






[jira] Commented: (LUCENE-1749) FieldCache introspection API

2009-07-31 Thread Yonik Seeley (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12737626#action_12737626
 ] 

Yonik Seeley commented on LUCENE-1749:
--

I believe that ConstantScoreQuery will need its explain() fixed too?




[jira] Commented: (LUCENE-1749) FieldCache introspection API

2009-07-31 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12737792#action_12737792
 ] 

Mark Miller commented on LUCENE-1749:
-

In the case that is a caching filter? I hadn't actually looked to see if there 
are any other FieldCache ones either - just what tripped the tests.

I guess it could be dealt with the same way? A DocIdSet that distributes to sub 
readers ...




[jira] Commented: (LUCENE-1749) FieldCache introspection API

2009-07-31 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12737832#action_12737832
 ] 

Mark Miller commented on LUCENE-1749:
-

Already finding some corner cases with the multi docidset stuff - I'll keep 
working along those lines a bit, then maybe look at some of the code you have 
been working on and post another patch later this weekend.




[jira] Commented: (LUCENE-1749) FieldCache introspection API

2009-07-31 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12737834#action_12737834
 ] 

Mark Miller commented on LUCENE-1749:
-

In the insanity check, when you drop into the sequential subreaders - I think 
it's got to be recursive - you might have a multi at the top with other subs, or 
any combo thereof. I can add that to the next patch.




Re: [jira] Commented: (LUCENE-1749) FieldCache introspection API

2009-07-31 Thread Chris Hostetter

: In the insanity check, when you drop into the sequential subreaders - I 
: think its got to be recursive - you might have a multi at the top with 
: other subs, or any combo thereof. I can add to next patch.

i don't have the code in front of me, but i thought i was adding the sub 
readers to the list it's iterating over, so it will eventually recurse all 
the way to the bottom.
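
A small sketch of the "add the sub readers to the list being iterated over" approach, assuming a simplified reader tree. ReaderNodeSketch and ReaderWalkSketch are hypothetical stand-ins, not the classes in the patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical reader node: a name plus zero or more sub readers.
final class ReaderNodeSketch {
    final String name;
    final List<ReaderNodeSketch> subs;

    ReaderNodeSketch(String name, ReaderNodeSketch... subs) {
        this.name = name;
        this.subs = Arrays.asList(subs);
    }
}

final class ReaderWalkSketch {
    /** Visits the top reader and every nested sub reader without explicit recursion. */
    static List<String> allReaders(ReaderNodeSketch top) {
        List<ReaderNodeSketch> work = new ArrayList<>();
        work.add(top);
        List<String> names = new ArrayList<>();
        // Index-based loop on purpose: the list grows while it is being walked,
        // so every level of nesting is eventually reached.
        for (int i = 0; i < work.size(); i++) {
            ReaderNodeSketch r = work.get(i);
            names.add(r.name);
            work.addAll(r.subs);
        }
        return names;
    }
}
```

An index-based loop is used because appending to a list during a for-each loop would throw ConcurrentModificationException; the visit order is breadth-first.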





-Hoss





[jira] Commented: (LUCENE-1749) FieldCache introspection API

2009-07-30 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12737217#action_12737217
 ] 

Hoss Man commented on LUCENE-1749:
--

Mark: thanks for looking into the tests.

If the CustomScoreQuery class(es) push the FieldCache usage into the subReaders 
during scoring, then shouldn't the explain methods do the same thing?  it 
definitely seems like a bug if getting a score explanation from a query causes 
your memory footprint to double.

Last night i thought over what a more useful API for the sanity checker would 
look like ... 
My power is getting turned off for a few hours this afternoon so i'll work on 
it then and should have a much cleaner looking patch to post this evening.

(BTW: random thought that occurred to me last night: wouldn't the simplest way 
to implement the RamEstimator just be to use vanilla java serialization to a 
custom OutputStream that just counted the bytes and sent them to /dev/null ?)
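
A minimal sketch of that serialization idea, using only standard java.io classes. CountingSinkSketch is a hypothetical name, and this is not the RamEstimator from the patch. The result is the serialized size, which only approximates heap usage, and the object graph must be Serializable.

```java
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;

// Hypothetical sink: counts the bytes written and discards them.
final class CountingSinkSketch extends OutputStream {
    private long count;

    @Override
    public void write(int b) {
        count++;
    }

    @Override
    public void write(byte[] b, int off, int len) {
        count += len;
    }

    long count() {
        return count;
    }

    /** Serializes o into the counting sink and returns the number of bytes produced. */
    static long serializedSize(Object o) {
        CountingSinkSketch sink = new CountingSinkSketch();
        try (ObjectOutputStream out = new ObjectOutputStream(sink)) {
            out.writeObject(o);
        } catch (IOException e) {
            // Thrown, for example, if o is not Serializable.
            throw new UncheckedIOException(e);
        }
        return sink.count();
    }
}
```

For example, CountingSinkSketch.serializedSize(new int[1000]) comes to at least 4000 bytes: four bytes per element plus the stream header and class description.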




[jira] Commented: (LUCENE-1749) FieldCache introspection API

2009-07-30 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12737247#action_12737247
 ] 

Mark Miller commented on LUCENE-1749:
-

bq. (BTW: random thought that occurred to me last night: wouldn't the simplest 
way to implement the RamEstimator just be to use vanilla java serialization to 
a custom OutputStream that just counted the bytes and sent them to /dev/null) ?

That's one way to go. It's got its own little issues though - some bookkeeping 
stuff is not serialized, and extra info about the class and fields is serialized. 
Transient fields (niche issue for sure) would also not be serialized. It's 
definitely another way to get an estimate. I chose a different route after 
considering both (googled the topic for a bit and looked at some examples 
before choosing). I'd be open to another route, but I thought this method was 
fairly fast, accurate, and generic.





[jira] Commented: (LUCENE-1749) FieldCache introspection API

2009-07-30 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12737254#action_12737254
 ] 

Mark Miller commented on LUCENE-1749:
-

bq. If the CustomScoreQuery class(es) push the FieldCache usage into the 
subReaders during scoring, then shouldn't the explain methods do the same 
thing? it definitely seems like a bug if getting score explanation from a query 
causes your memory footprint to double.

It *should* do the same thing - but that's sticky. If you push explain to the 
sub readers, you will get why it scored as it did for each subreader - not one 
top level explain. I won't deny it's kind of a bug - but I'm not sure at the 
moment what the best way to address it is. I'll look into the possibility of 
pushing the fieldcache access to the subreaders while leaving everything else 
at the top reader - I have no thoughts about the feasibility of that at the 
moment though. I guess it might be doable.




[jira] Commented: (LUCENE-1749) FieldCache introspection API

2009-07-30 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12737294#action_12737294
 ] 

Mark Miller commented on LUCENE-1749:
-

This issue was a fantastic idea by the way!




[jira] Commented: (LUCENE-1749) FieldCache introspection API

2009-07-30 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12737424#action_12737424
 ] 

Hoss Man commented on LUCENE-1749:
--

Quick responses to some other comments...

bq. I chose a different route after considering both

i trust you to make the right call, just thought i'd point it out in case you 
hadn't thought of it.

bq. If you push explain to the sub readers, you will get why it scored as it 
did for each subreader - not one top level explain

I don't really follow you on this (i need to take a look at your proposed fix) 
.. i'm not suggesting we push the whole explain down to the subreader, just 
that when the explain method wants to get the FieldCache value for a doc, it 
should fetch the FieldCache for the SegmentReader the doc is in -- so it gets 
the exact same value (and same FieldCache entry) as the scoring code did when 
it scored the document.  (or maybe i'm completely misunderstanding how these 
classes were reimplemented to use segment based field caches)

bq. This issue was a fantastic idea by the way!

yeah ... i was pretty out of the loop on all the "push sorting down into the 
segment" discussion, but when i noticed yonik pointing out all the ways solr's 
fieldcache usage was going to explode if we didn't change it, it occurred to me 
that this would probably be a big problem for anyone doing non-trivial stuff 
with Lucene, so it would be nice to have a way to troubleshoot it (i also had 
very little faith in Lucene-Java's test coverage since we only have unit tests 
that verify correct behavior when we make changes -- but nothing sanity 
checks how that behavior happened (at least: not until now))





[jira] Commented: (LUCENE-1749) FieldCache introspection API

2009-07-29 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12736732#action_12736732
 ] 

Mark Miller commented on LUCENE-1749:
-

bq. figure out why previously mentioned tests are breaking (need help with this 
one ... don't know enough about the code these tests exercise)

Eh - it's yucky. There are parts where the tests are passing the top level 
reader (say to a collector) when they should be using the sub readers. I fixed 
one :)
But then there are more - I looked at a couple of more difficult ones that also 
pass the top level reader for the test.

And then there is explain - IndexSearcher passes the top level reader to the 
weight explain, and valuesourcequery will get a fieldcache based on that 
reader. I guess that one is a bug.

And there are prob a few other similar type things...




[jira] Commented: (LUCENE-1749) FieldCache introspection API

2009-07-29 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12736750#action_12736750
 ] 

Mark Miller commented on LUCENE-1749:
-

bq. And then there is explain - IndexSearcher passes the top level reader to 
the weight explain, and valuesourcequery will get a fieldcache based on that 
reader. I guess that one is a bug.

I don't even know what to do about this one. All I can think is that you pump 
out an explain for each sub reader - but that's pretty unhelpful.

Perhaps the best we can do is javadoc the extra requirements that may apply 
when you use explain?




[jira] Commented: (LUCENE-1749) FieldCache introspection API

2009-07-28 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12736304#action_12736304
 ] 

Hoss Man commented on LUCENE-1749:
--

Mark: i have a little time to work on this today ... do you have any updates 
that you've been working on locally (i noticed some patch add/retract from you 
in the history)

Paul: over in LUCENE-831 there was a lot of discussion and work done towards 
making the entire FieldCache internals pluggable so you could customize the 
cache behavior all sorts of ways ... i feel out of the loop on that issue, but 
my understanding is that it was pushed back to 3.1 at the earliest because it 
wasn't clear how the APIs should be set up given the work being done with reopen 
and with moving FieldCache usage down to the subreaders.

for now my goal with this issue (LUCENE-1749) is purely to provide an 
experimental (ie: no back compat expectations) API for app developers to use 
to sanity check that the changes in 2.9 haven't blown their RAM usage sky high.




[jira] Commented: (LUCENE-1749) FieldCache introspection API

2009-07-28 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12736309#action_12736309
 ] 

Mark Miller commented on LUCENE-1749:
-

I do - I removed that last patch because I just realized that it was missing 
everything but one class.

Go ahead though - I'll merge with what you have.




[jira] Commented: (LUCENE-1749) FieldCache introspection API

2009-07-28 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12736320#action_12736320
 ] 

Mark Miller commented on LUCENE-1749:
-

No worries, the updates are to the ram estimator and other minor things (eg if 
something fails the sanity check, the error output comes out twice because of 
the double check in teardown ) - nothing feature wise at the moment. I'll see 
what you add first.




[jira] Commented: (LUCENE-1749) FieldCache introspection API

2009-07-24 Thread Paul Smith (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12735242#action_12735242
 ] 

Paul Smith commented on LUCENE-1749:


You know what would be absolute icing on the cake here? Some way, during 
introspection, for code to look for large sort fields and discard/unload them 
as needed (programmatically).

What I'm thinking of here is a use case we've run into where we have had to sort 
by subject.  Well, the number of unique subjects gets pretty large, and while we 
still need to support the use case, it'd be nice to be able to periodically 
'toss' sort fields like this so they don't hog memory permanently while the 
IndexReader is still in memory.  (sorting by subject is used, just not often, so 
it's a good candidate for tossing)

Because we have multiple large IndexReaders open concurrently, it'd be nice to 
be able to scan periodically and kick out any unneeded ones.

It's nice to be able to inspect and print these out, but even better if one can 
make changes based on what one finds.






[jira] Commented: (LUCENE-1749) FieldCache introspection API

2009-07-21 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12733955#action_12733955
 ] 

Hoss Man commented on LUCENE-1749:
--

note to self: of the contribs, TestRemoteSort had two failed tests (not 
horribly surprising) and PatternAnalyzerTest generated an Error (?!?!) ...

{code}
java.lang.IllegalStateException: termText
    at org.apache.lucene.index.memory.PatternAnalyzerTest.assertEquals(PatternAnalyzerTest.java:213)
    at org.apache.lucene.index.memory.PatternAnalyzerTest.run(PatternAnalyzerTest.java:148)
    at org.apache.lucene.index.memory.PatternAnalyzerTest.testMany(PatternAnalyzerTest.java:87)
{code}

 FieldCache introspection API
 

 Key: LUCENE-1749
 URL: https://issues.apache.org/jira/browse/LUCENE-1749
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Search
Reporter: Hoss Man
Priority: Minor
 Fix For: 2.9

 Attachments: fieldcache-introspection.patch, LUCENE-1749.patch, 
 LUCENE-1749.patch





[jira] Commented: (LUCENE-1749) FieldCache introspection API

2009-07-16 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12732110#action_12732110
 ] 

Hoss Man commented on LUCENE-1749:
--


The motivation for this issue is all of the changes coming in 2.9 in how Lucene 
internally uses the FieldCache API -- the biggest change being per Segment 
sorting, but there may be others not immediately obvious.

While these changes are backwards compatible from an API and functionality 
perspective, they could have some pretty serious performance impacts for 
existing apps that also use the FieldCache directly: after upgrading, the 
apps may suddenly seem slower to start (because of redundant FieldCache 
initialization) and require 2X as much RAM as they did before.  This could lead 
people to assume Lucene has suddenly become a major memory hog.  
SOLR- and SOLR-1247 are some quick examples of the types of problems that 
apps could encounter.

Currently the only way for a user to even notice the problem is to do memory 
profiling, and the FieldCache data structure isn't the easiest to understand.  
It would be a lot nicer to have some methods for doing this inspection 
programmatically, so users could write automated tests for incorrect/redundant 
usage.
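
To illustrate the kind of automated check this would enable, here is a minimal sketch that flags one of the cases from the issue description: the same reader/field cached under more than one type. CachedEntry and CacheSanity.findTypeConflicts are hypothetical names invented for this example; they are not part of Lucene, and the string reader ids stand in for real reader objects.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical description of one cache entry; not a Lucene class. */
final class CachedEntry {
    final String readerId;
    final String field;
    final String type; // e.g. "int", "String", "StringIndex"

    CachedEntry(String readerId, String field, String type) {
        this.readerId = readerId;
        this.field = field;
        this.type = type;
    }
}

final class CacheSanity {
    private CacheSanity() {}

    /** Returns one message per reader/field pair that is cached under more than one type. */
    static List<String> findTypeConflicts(List<CachedEntry> entries) {
        // Collect the distinct types seen for each reader/field pair, in insertion order.
        Map<String, Set<String>> typesByKey = new LinkedHashMap<>();
        for (CachedEntry e : entries) {
            String key = e.readerId + "/" + e.field;
            Set<String> types = typesByKey.get(key);
            if (types == null) {
                types = new LinkedHashSet<>();
                typesByKey.put(key, types);
            }
            types.add(e.type);
        }
        // Any pair with more than one type means the field was loaded redundantly.
        List<String> problems = new ArrayList<>();
        for (Map.Entry<String, Set<String>> e : typesByKey.entrySet()) {
            if (e.getValue().size() > 1) {
                problems.add(e.getKey() + " cached as " + e.getValue());
            }
        }
        return problems;
    }
}
```

A unit test in an app could build the entry list from whatever introspection the cache exposes and fail if findTypeConflicts returns anything. The reader/subreader case would need a second pass that also knows which readers contain which.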

 FieldCache introspection API
 

 Key: LUCENE-1749
 URL: https://issues.apache.org/jira/browse/LUCENE-1749
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Search
Reporter: Hoss Man
Priority: Minor




[jira] Commented: (LUCENE-1749) FieldCache introspection API

2009-07-16 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12732123#action_12732123
 ] 

Mark Miller commented on LUCENE-1749:
-

nice - would be great if it could estimate ram usage as well.

 FieldCache introspection API
 

 Key: LUCENE-1749
 URL: https://issues.apache.org/jira/browse/LUCENE-1749
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Search
Reporter: Hoss Man
Priority: Minor
 Fix For: 2.9

 Attachments: fieldcache-introspection.patch





[jira] Commented: (LUCENE-1749) FieldCache introspection API

2009-07-16 Thread Michael McCandless (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12732157#action_12732157
 ] 

Michael McCandless commented on LUCENE-1749:


+1 -- this'd be great to get into 2.9.

 FieldCache introspection API
 

 Key: LUCENE-1749
 URL: https://issues.apache.org/jira/browse/LUCENE-1749
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Search
Reporter: Hoss Man
Priority: Minor
 Fix For: 2.9

 Attachments: fieldcache-introspection.patch





[jira] Commented: (LUCENE-1749) FieldCache introspection API

2009-07-16 Thread Uwe Schindler (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12732166#action_12732166
 ] 

Uwe Schindler commented on LUCENE-1749:
---

Looks good as a start, one question about a comment:

What do you mean by:
 * :TODO: is the int sort type still needed? ... doesn't seem to be used 
anywhere, code just tests custom for SortComparator vs Parser.

I do not understand: do you want to remove the IntCache? How is it different 
from the other ones?

Uwe

 FieldCache introspection API
 

 Key: LUCENE-1749
 URL: https://issues.apache.org/jira/browse/LUCENE-1749
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Search
Reporter: Hoss Man
Priority: Minor
 Fix For: 2.9

 Attachments: fieldcache-introspection.patch





[jira] Commented: (LUCENE-1749) FieldCache introspection API

2009-07-16 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12732190#action_12732190
 ] 

Hoss Man commented on LUCENE-1749:
--

bq. :TODO: is the int sort type still needed? ... doesn't seem to be used 
anywhere, code just tests custom for SortComparator vs Parser.

sorry ... badly placed quotes ... that was in reference to Entry.type. 

Until I changed getStrings, getStringIndex, and getAuto to construct Entry 
objects as part of my refactoring, the type attribute (and the constructor 
that takes a type argument) didn't seem to be used anywhere (as far as I could 
tell).

My guess: maybe some previous changes refactored logic that switched on type up 
into the SortFields, so the FieldCache no longer needs to care about it?


 FieldCache introspection API
 

 Key: LUCENE-1749
 URL: https://issues.apache.org/jira/browse/LUCENE-1749
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Search
Reporter: Hoss Man
Priority: Minor
 Fix For: 2.9

 Attachments: fieldcache-introspection.patch





[jira] Commented: (LUCENE-1749) FieldCache introspection API

2009-07-16 Thread Mark Miller (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12732297#action_12732297
 ] 

Mark Miller commented on LUCENE-1749:
-

We would probably want to provide an alternate toString that includes the RAM 
guess, and a default that skips it. I haven't tested performance, but it might 
take a while to check a gigantic string array.
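
Roughly what I have in mind, as a sketch only: the default toString stays cheap, and an overload opts in to the array walk. StringArrayEntry and the byte constants are assumptions for illustration (real per-object overhead depends on the JVM), and this is not the class from the patch.

```java
/** Hypothetical cache entry wrapper; not a Lucene class. */
final class StringArrayEntry {
    // Rough per-object costs, assumed for illustration only; real JVM overheads vary.
    private static final long REF_BYTES = 8;
    private static final long STRING_OVERHEAD_BYTES = 40;
    private static final long ARRAY_HEADER_BYTES = 16;

    private final String field;
    private final String[] values;

    StringArrayEntry(String field, String[] values) {
        this.field = field;
        this.values = values;
    }

    /**
     * Walks every element, so the cost grows linearly with the array length.
     * A String instance referenced from several slots is counted once per slot.
     */
    long estimateRamBytes() {
        long total = ARRAY_HEADER_BYTES + REF_BYTES * values.length;
        for (String s : values) {
            if (s != null) {
                total += STRING_OVERHEAD_BYTES + 2L * s.length(); // assume 2 bytes per char
            }
        }
        return total;
    }

    /** Cheap default: reports the field and length without walking the array. */
    @Override
    public String toString() {
        return toString(false);
    }

    /** Pass true to append the (potentially slow) RAM estimate. */
    public String toString(boolean includeRamEstimate) {
        StringBuilder sb = new StringBuilder();
        sb.append(field).append(" String[").append(values.length).append("]");
        if (includeRamEstimate) {
            sb.append(" ~").append(estimateRamBytes()).append(" bytes");
        }
        return sb.toString();
    }
}
```

Logging and debuggers that call toString() implicitly then never trigger the walk; only callers that explicitly ask for the estimate pay for it.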

 FieldCache introspection API
 

 Key: LUCENE-1749
 URL: https://issues.apache.org/jira/browse/LUCENE-1749
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Search
Reporter: Hoss Man
Priority: Minor
 Fix For: 2.9

 Attachments: fieldcache-introspection.patch, LUCENE-1749.patch

