[jira] Commented: (LUCENE-2368) stopword files should be versioned; acessor for default(s) should take a Version property

2010-04-05 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12853691#action_12853691
 ] 

Hoss Man commented on LUCENE-2368:
--

bq. I wonder if we should just break now (by renaming these 3) and version all 
the files so it's clean.

I didn't realize we even had those.

The other option is to *not* rename any of the files, but to clearly document
what the naming convention is going forward. As I mentioned in the comment I
just added (with more details beyond the summary description), the names don't
have to match Lucene Version semantics; they just have to be unique going
forward.  Specifically: we should never modify the contents of an existing
file; we should just add a new file and "deprecate" the old one.

But the naming convention could easily be...

stopwords_esperanto.txt
stopwords_esperanto_2.txt
stopwords_esperanto_3.txt
...
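
A minimal sketch of how the sequential naming above could be computed, assuming
the first revision keeps the unsuffixed name. The class and method names here
(StopwordFileNames, fileName) are hypothetical and not part of Lucene.

```java
/** Hypothetical helper; illustrates the naming convention only. */
final class StopwordFileNames {
    private StopwordFileNames() {}

    /**
     * Revision 1 keeps the original, unsuffixed file name; later revisions
     * get a numeric suffix, so a published file never has to be edited.
     */
    static String fileName(String language, int revision) {
        if (revision < 1) {
            throw new IllegalArgumentException("revision must be >= 1");
        }
        String base = "stopwords_" + language;
        return revision == 1 ? base + ".txt" : base + "_" + revision + ".txt";
    }
}
```

Because old names are never reused, client code that pinned an older revision
keeps resolving to the same contents.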



> stopword files should be versioned; acessor for default(s) should take a 
> Version property
> -
>
> Key: LUCENE-2368
> URL: https://issues.apache.org/jira/browse/LUCENE-2368
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Analysis
>Reporter: Hoss Man
> Fix For: 2.3.3
>
>
> The existing language specific stopword files on the trunk have no version 
> info in their filenames -- this will make it awkward/confusing to update them 
> as time goes on.  Likewise, many classes have a "getDefaultStopSet()" which 
> makes these methods (when called by client code) suffer from the same API 
> back-compat issues that the Analyzers themselves did before we added Version.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org



[jira] Commented: (LUCENE-2368) stopword files should be versioned; acessor for default(s) should take a Version property

2010-04-05 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12853688#action_12853688
 ] 

Hoss Man commented on LUCENE-2368:
--

This is something I brought up with Robert on IRC a few days ago, and forgot to
file an issue for...

* We should make all the language-specific stopword files have something in
their name that identifies them, so we can add newer versions of them over time
under distinct names.  The simplest convention moving forward would probably
be to name each file after the first Lucene version it was added in (e.g.
"russian_stop_3_3.txt"), but there is no reason why the names have to directly
correspond to the Lucene Version -- they could just as easily be completely
sequential (e.g. "russian_stop_001.txt" or "russian_stop_AAA.txt").

* All of the static "getDefaultStopSet()" methods in the various Analyzers
should be changed to take a Version param which picks the appropriate file (or
statically compiled set) based on the param.  Any Analyzer that already has
Version-based stopword switching logic in its constructor should instead just
delegate to the getDefaultStopSet() method.
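
The second point could look roughly like the sketch below. It is not the real
Lucene API: SketchVersion stands in for Lucene's Version constants,
VersionedStopwords is a hypothetical holder class, and the word lists are made
up purely for illustration.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical stand-in for Lucene's Version constants; not the real class. */
enum SketchVersion { V_3_0, V_3_1 }

/** Hypothetical holder showing a Version-aware default stop set accessor. */
final class VersionedStopwords {
    // Each set is frozen once published; a newer list gets a new name
    // instead of modifying an old one. The words are illustrative only.
    private static final Set<String> STOP_3_0 = freeze("a", "the");
    private static final Set<String> STOP_3_1 = freeze("a", "an", "the");

    private VersionedStopwords() {}

    private static Set<String> freeze(String... words) {
        return Collections.unmodifiableSet(new HashSet<String>(Arrays.asList(words)));
    }

    /** Returns the stop set that was the default as of the requested version. */
    static Set<String> getDefaultStopSet(SketchVersion matchVersion) {
        return matchVersion.compareTo(SketchVersion.V_3_1) >= 0 ? STOP_3_1 : STOP_3_0;
    }
}
```

An Analyzer constructor that takes a version could then call
getDefaultStopSet(matchVersion) rather than carrying its own switching logic.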



> stopword files should be versioned; acessor for default(s) should take a 
> Version property
> -
>
> Key: LUCENE-2368
> URL: https://issues.apache.org/jira/browse/LUCENE-2368
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Analysis
>Reporter: Hoss Man
> Fix For: 2.3.3
>
>
> The existing language specific stopword files on the trunk have no version 
> info in their filenames -- this will make it awkward/confusing to update them 
> as time goes on.  Likewise, many classes have a "getDefaultStopSet()" which 
> makes these methods (when called by client code) suffer from the same API 
> back-compat issues that the Analyzers themselves did before we added Version.




[jira] Created: (LUCENE-2368) stopword files should be versioned; acessor for default(s) should take a Version property

2010-04-05 Thread Hoss Man (JIRA)
stopword files should be versioned; acessor for default(s) should take a 
Version property
-

 Key: LUCENE-2368
 URL: https://issues.apache.org/jira/browse/LUCENE-2368
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Analysis
Reporter: Hoss Man
 Fix For: 2.3.3


The existing language specific stopword files on the trunk have no version info 
in their filenames -- this will make it awkward/confusing to update them as 
time goes on.  Likewise, many classes have a "getDefaultStopSet()" which makes 
these methods (when called by client code) suffer from the same API back-compat 
issues that the Analyzers themselves did before we added Version.




[jira] Commented: (LUCENE-2323) reorganize contrib modules

2010-03-23 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12848853#action_12848853
 ] 

Hoss Man commented on LUCENE-2323:
--

bq. If no one objects, (especially including Hoss Man)

I really have no opinion; I was just trying to chime in with my memories of
the past discussions -- I don't necessarily think one way or another is more
good/bad or right/wrong.

Go with your gut.

> reorganize contrib modules
> --
>
> Key: LUCENE-2323
> URL: https://issues.apache.org/jira/browse/LUCENE-2323
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: contrib/*
>Reporter: Robert Muir
>Assignee: Robert Muir
> Attachments: LUCENE-2323.patch
>
>
> it would be nice to reorganize contrib modules, so that they are bundled 
> together by functionality.
> For example:
> * the wikipedia contrib is a tokenizer, i think really belongs in 
> contrib/analyzers
> * there are two highlighters, i think could be one highlighters package.
> * there are many queryparsers and queries in different places in contrib




[jira] Commented: (LUCENE-2323) reorganize contrib modules

2010-03-17 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12846712#action_12846712
 ] 

Hoss Man commented on LUCENE-2323:
--

bq. I didn't know this was the goal, if what you say is true, then I must say i 
completely misunderstood, I completely disagree, and I'm completely off-base 
with this issue.

I'm not saying it is a goal, or should be a goal, just that I seem to remember
that this was the direction that had support the last time there was a big
"reorg the contribs" discussion.  (I could be remembering wrong; it could be
that *I* thought it was a really great idea at the time so it stuck with me,
and now I'm just more ambivalent.)

A quick skim suggests this is the most recent thread I'm thinking of...

http://old.nabble.com/New-flexible-query-parser-to22549684.html#a22637326
("kitchen sink" was the search term I was looking for)

...but I don't think that was the first time it came up.

> reorganize contrib modules
> --
>
> Key: LUCENE-2323
> URL: https://issues.apache.org/jira/browse/LUCENE-2323
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: contrib/*
>Reporter: Robert Muir
>
> it would be nice to reorganize contrib modules, so that they are bundled 
> together by functionality.
> For example:
> * the wikipedia contrib is a tokenizer, i think really belongs in 
> contrib/analyzers
> * there are two highlighters, i think could be one highlighters package.
> * there are many queryparsers and queries in different places in contrib




[jira] Commented: (LUCENE-2323) reorganize contrib modules

2010-03-17 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12846707#action_12846707
 ] 

Hoss Man commented on LUCENE-2323:
--

bq. Perhaps I want to refactor some code among our 7 queryparsers or 2 
highlighters or whatever, the only way I can do this is to shove stuff (shared 
code) into core, I think this is bad.

Agreed ... IIRC the idea in this discussion was to have a lot more, smaller
"modules", with much better defined/advertised dependencies, so that modules
X, Y, and Z might all depend on modules A and B (which would hold the common
refactored code you speak of), and the "core" module is special in that it must
never depend on anything else.

Like I said: I personally don't have a very strong opinion about this. I think
people who are really concerned about jar sizes can compile their own after
pruning the classes they don't care about -- but that's definitely harder when
those classes are all in one atomic source tree, where you might not notice
that someone's refactoring introduced a common dependency that wasn't there
before.


> reorganize contrib modules
> --
>
> Key: LUCENE-2323
> URL: https://issues.apache.org/jira/browse/LUCENE-2323
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: contrib/*
>Reporter: Robert Muir
>
> it would be nice to reorganize contrib modules, so that they are bundled 
> together by functionality.
> For example:
> * the wikipedia contrib is a tokenizer, i think really belongs in 
> contrib/analyzers
> * there are two highlighters, i think could be one highlighters package.
> * there are many queryparsers and queries in different places in contrib




[jira] Commented: (LUCENE-2323) reorganize contrib modules

2010-03-17 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12846700#action_12846700
 ] 

Hoss Man commented on LUCENE-2323:
--

I personally don't have a strong opinion on this, but I wanted to point it out
for completeness:

The last time I remember a big discussion about reorganizing contribs, there
seemed to be a strong sentiment that we should be striving for more "small"
contribs/modules -- specifically in terms of artifact size/complexity.  I think
one specific example was that some people might want a few language-specific
analyzers, but not all of them -- and if they have no direct dependencies on
each other (just core) we should try to build/distribute them as (tiny)
individual jars -- and possibly in (big) bundled jars as well.

So while it might make a lot of sense to organize some existing contribs into
logical "groups" which might get built up into big bundled jars, there are
likely going to be people who still want to consume the existing jars (or even
more granular jars).

Looking at the specific suggestions Robert made: it makes sense to logically
organize all the query parsers under a common directory, but how many users are
actually using more than one, and are we doing them a disservice if we only
ship them in one big jar?  Ditto for the highlighters (does anyone besides Solr
use *both* highlighters in a single application?)

> reorganize contrib modules
> --
>
> Key: LUCENE-2323
> URL: https://issues.apache.org/jira/browse/LUCENE-2323
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: contrib/*
>Reporter: Robert Muir
>
> it would be nice to reorganize contrib modules, so that they are bundled 
> together by functionality.
> For example:
> * the wikipedia contrib is a tokenizer, i think really belongs in 
> contrib/analyzers
> * there are two highlighters, i think could be one highlighters package.
> * there are many queryparsers and queries in different places in contrib




[jira] Commented: (LUCENE-2209) add @experimental javadocs tag

2010-01-14 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12800517#action_12800517
 ] 

Hoss Man commented on LUCENE-2209:
--

bq. p.s. hossman i only commented because you are out to get me

I'm deeply hurt that you think I am out to get you -- it's just that there are
some things I feel very passionate about.

It just so happens that undermining everything you do, and contradicting
everything you say, are the two things I'm most passionate about in the whole
wide world ... but that doesn't mean I'm out to get you.



> add @experimental javadocs tag
> --
>
> Key: LUCENE-2209
> URL: https://issues.apache.org/jira/browse/LUCENE-2209
> Project: Lucene - Java
>  Issue Type: Task
>  Components: Javadocs
>Reporter: Robert Muir
> Attachments: LUCENE-2209.patch, LUCENE-2209.patch
>
>
> There are a lot of things marked experimental, api subject to change, etc. in 
> lucene.
> this patch simply adds a @experimental tag to common-build.xml so that we can 
> use it, for more consistency.




[jira] Commented: (LUCENE-2209) add @experimental javadocs tag

2010-01-14 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12800304#action_12800304
 ] 

Hoss Man commented on LUCENE-2209:
--

Small suggestion...

@todo is a pretty widespread and long-used custom javadoc tag, so most people
don't worry about it ... but for any other custom tags that projects use, it's
strongly suggested that they always have a "." in their name.  The Javadoc
compatibility contract is that future versions of javadoc won't add tags that
have periods in their name, so it's the way to avoid collisions (you should
actually see a warning from javadoc about using a tag without a "." in its name
when you declare these).

So I would suggest @lucene.internal, @lucene.expert, @lucene.experimental,
etc...
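
For illustration, a dotted tag would sit in a doc comment like this. The class
below is hypothetical and exists only to carry the tags. On the tool side,
javadoc's -tag option (for example -tag lucene.experimental:a:"Experimental:")
is one way to declare such a tag; the heading text in that example is made up.

```java
/**
 * Hypothetical class, used only to show where a dotted custom tag goes.
 *
 * @lucene.experimental
 */
final class ExperimentalThing {
    private ExperimentalThing() {}

    /**
     * Returns a fixed marker string; present only so the sketch has a member
     * carrying a second dotted tag.
     *
     * @lucene.internal
     */
    static String status() {
        return "experimental";
    }
}
```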

> add @experimental javadocs tag
> --
>
> Key: LUCENE-2209
> URL: https://issues.apache.org/jira/browse/LUCENE-2209
> Project: Lucene - Java
>  Issue Type: Task
>  Components: Javadocs
>Reporter: Robert Muir
> Attachments: LUCENE-2209.patch
>
>
> There are a lot of things marked experimental, api subject to change, etc. in 
> lucene.
> this patch simply adds a @experimental tag to common-build.xml so that we can 
> use it, for more consistency.




[jira] Commented: (LUCENE-2164) Make CMS smarter about thread priorities

2009-12-15 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12790979#action_12790979
 ] 

Hoss Man commented on LUCENE-2164:
--

bq. I also increased the default max thread count from 1 to 2.

Random thought from the peanut gallery: do we want to go down the "Ergonomics"
route and make the default number of threads vary based on
Runtime.getRuntime().availableProcessors()?

(i.e. 1 on a single-processor box, NUM_PROCESSORS/CONSTANT on multiprocessor
boxes)
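
A rough sketch of that idea, assuming a divisor of 2 for the unspecified
CONSTANT. The class and method names are hypothetical; this is not how CMS
actually picks its default.

```java
/** Hypothetical helper; sketches a processor-based default thread count. */
final class MergeThreadErgonomics {
    private MergeThreadErgonomics() {}

    /** Integer division by the divisor, but never fewer than one thread. */
    static int defaultMaxThreadCount(int availableProcessors, int divisor) {
        if (divisor < 1) {
            throw new IllegalArgumentException("divisor must be >= 1");
        }
        return Math.max(1, availableProcessors / divisor);
    }

    /** Uses the JVM's reported processor count and an assumed divisor of 2. */
    static int defaultMaxThreadCount() {
        return defaultMaxThreadCount(Runtime.getRuntime().availableProcessors(), 2);
    }
}
```

Taking the processor count as a parameter keeps the rule testable on any
machine; only the no-argument overload depends on the host.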

> Make CMS smarter about thread priorities
> 
>
> Key: LUCENE-2164
> URL: https://issues.apache.org/jira/browse/LUCENE-2164
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Index
>Reporter: Michael McCandless
>Assignee: Michael McCandless
>Priority: Minor
> Fix For: 3.1
>
> Attachments: LUCENE-2164.patch
>
>
> Spinoff from LUCENE-2161...
> The hard throttling CMS does (blocking the incoming thread that wants
> to launch a new merge) can be devastating when it strikes during NRT
> reopen.
> It can easily happen if a huge merge is off and running, but then a
> tiny merge is needed to clean up recently created segments due to
> frequent reopens.
> I think a small change to CMS, whereby it assigns a higher thread
> priority to tiny merges than big merges, should allow us to increase
> the max merge thread count again, and greatly reduce the chance that
> NRT's reopen would hit this.




[jira] Created: (LUCENE-2092) BooleanQuery.hashCode and equals ignore isCoordDisabled

2009-11-23 Thread Hoss Man (JIRA)
BooleanQuery.hashCode and equals ignore isCoordDisabled
---

 Key: LUCENE-2092
 URL: https://issues.apache.org/jira/browse/LUCENE-2092
 Project: Lucene - Java
  Issue Type: Bug
  Components: Query/Scoring
Affects Versions: 2.9.1, 2.9, 2.4.1, 2.4, 2.3.2, 2.3.1, 2.3, 2.2, 2.1, 
2.0.0, 1.9
Reporter: Hoss Man


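BooleanQuery.isCoordDisabled() is not considered by BooleanQuery's hashCode()
or equals() methods ... this can cause serious badness when caching
BooleanQueries: two queries that differ only in that flag compare as equal, so
a cache can return the entry stored for the other one.

The bug traces back to at least 1.9.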
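
To make the failure mode concrete, here is a simplified stand-in (not Lucene's
BooleanQuery) whose equals() and hashCode() do take the flag into account. The
class name SketchBooleanQuery and its fields are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

/** Hypothetical, simplified query class; not the real BooleanQuery. */
final class SketchBooleanQuery {
    private final List<String> clauses;
    private final boolean coordDisabled;

    SketchBooleanQuery(boolean coordDisabled, String... clauses) {
        this.coordDisabled = coordDisabled;
        this.clauses = new ArrayList<String>(Arrays.asList(clauses));
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof SketchBooleanQuery)) return false;
        SketchBooleanQuery other = (SketchBooleanQuery) o;
        // Comparing the flag is the step the issue reports as missing.
        return coordDisabled == other.coordDisabled && clauses.equals(other.clauses);
    }

    @Override
    public int hashCode() {
        // The flag is hashed as well, keeping hashCode consistent with equals.
        return Objects.hash(clauses, coordDisabled);
    }
}
```

If the flag were left out of both methods, two such objects with the same
clauses would be interchangeable as HashMap keys regardless of the flag.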




[jira] Assigned: (LUCENE-2037) Allow Junit4 tests in our environment.

2009-11-13 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man reassigned LUCENE-2037:


Assignee: Erick Erickson

It's all yours Erick.

> Allow Junit4 tests in our environment.
> --
>
> Key: LUCENE-2037
> URL: https://issues.apache.org/jira/browse/LUCENE-2037
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Other
>Affects Versions: 3.1
> Environment: Development
>Reporter: Erick Erickson
>Assignee: Erick Erickson
>Priority: Minor
> Fix For: 3.1
>
>   Original Estimate: 8h
>  Remaining Estimate: 8h
>
> Now that we're dropping Java 1.4 compatibility for 3.0, we can incorporate 
> Junit4 in testing. Junit3 and junit4 tests can coexist, so no tests should 
> have to be rewritten. We should start this for the 3.1 release so we can get 
> a clean 3.0 out smoothly.
> It's probably worthwhile to convert a small set of tests as an exemplar.




[jira] Updated: (LUCENE-1974) BooleanQuery can not find all matches in special condition

2009-10-13 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated LUCENE-1974:
-

Attachment: LUCENE-1974.test.patch

tweaked test so that it can be applied to 2.4.1 (by removing readOnly param 
from IndexSearcher constructor)

verified this test passes against 2.4.1 ... it's a new bug in 2.9.0

> BooleanQuery can not find all matches in special condition
> --
>
> Key: LUCENE-1974
> URL: https://issues.apache.org/jira/browse/LUCENE-1974
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: Query/Scoring
>Affects Versions: 2.9
>Reporter: tangfulin
> Attachments: BooleanQueryTest.java, LUCENE-1974.test.patch, 
> LUCENE-1974.test.patch
>
>
> query: (name:tang*)
> doc=5137 score=1.0  doc:Document>
> doc=11377 score=1.0  doc:Document>
> query: name:tang* name:notexistnames
> doc=5137 score=0.048133932  doc:Document>
> It is two queries on the same index, one is just a prefix query in a
> boolean query, and the other is a prefix query plus a term query in a
> boolean query, all with Occur.SHOULD .
> what I wonder is why the latter query can not find the doc=11377 doc ?
> the problem can be reproduced by the code in the attachment .




[jira] Updated: (LUCENE-1974) BooleanQuery can not find all matches in special condition

2009-10-13 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated LUCENE-1974:
-

Attachment: LUCENE-1974.test.patch

This is the same as the previously attached test, but I've simplified it (to
me) and revamped it to be a patch that can be applied to 2.9.0.

I can confirm that it fails for me (against 2.9.0) and seems to suggest a weird
hit collection bug somewhere in the BooleanScorer or Prefix scoring code.

(A prefix query works, a boolean query containing term queries works, but a
boolean query containing a prefix query fails to find all the expected matches.)

Unless I'm missing something really silly, this suggests a pretty heinous bug
somewhere in the core scoring code.

> BooleanQuery can not find all matches in special condition
> --
>
> Key: LUCENE-1974
> URL: https://issues.apache.org/jira/browse/LUCENE-1974
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: Query/Scoring
>Affects Versions: 2.9
>Reporter: tangfulin
> Attachments: BooleanQueryTest.java, LUCENE-1974.test.patch
>
>
> query: (name:tang*)
> doc=5137 score=1.0  doc:Document>
> doc=11377 score=1.0  doc:Document>
> query: name:tang* name:notexistnames
> doc=5137 score=0.048133932  doc:Document>
> It is two queries on the same index, one is just a prefix query in a
> boolean query, and the other is a prefix query plus a term query in a
> boolean query, all with Occur.SHOULD .
> what I wonder is why the latter query can not find the doc=11377 doc ?
> the problem can be reproduced by the code in the attachment .




[jira] Commented: (LUCENE-1897) Links to javadoc from the site pages do not work in the src dist because it does not include the javadoc under docs.

2009-09-11 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12754486#action_12754486
 ] 

Hoss Man commented on LUCENE-1897:
--

bq. If the README said, you will find this jar at /lucene/jar and then I go to 
/lucene/jar and nothing is there, I would personally find that a bit confusing. 
But it wouldnt bother me too much I guess.

That's almost exactly what our README file says actually...

{noformat}
...
FILES

lucene-core-XX.jar
  The compiled lucene library.

lucene-demos-XX.jar
  The compiled simple example code.

luceneweb.war
  The compiled simple example Web Application.


contrib/*
  Contributed code which extends and enhances Lucene, but is not
  part of the core library.  Of special note are the JAR files in the
  analyzers and snowball directory which contain various analyzers that
  people may find useful in place of the StandardAnalyzer.

docs/index.html
  The contents of the Lucene website.

docs/api/index.html
  The Javadoc Lucene API documentation.  This includes the core
  library, the demo, as well as all of the contrib modules.
...
{noformat}

...if anything, it's somewhat odd that we include the generated documentation
in our source release (instead of just the forrest docs), but that's only
because the generated docs are also committed into SVN (which, now that I think
about it, is also really weird since we separated the version-specific docs
from the site docs).



> Links to javadoc from the site pages do not work in the src dist because it 
> does not include the javadoc under docs.
> 
>
> Key: LUCENE-1897
> URL: https://issues.apache.org/jira/browse/LUCENE-1897
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: Javadocs
>Reporter: Mark Miller
>Priority: Minor
> Fix For: 2.9
>
>
> Links to javadoc from the site pages do not work in the src dist because it 
> does not include the javadoc under docs.




[jira] Commented: (LUCENE-1897) Links to javadoc from the site pages do not work in the src dist because it does not include the javadoc under docs.

2009-09-11 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12754379#action_12754379
 ] 

Hoss Man commented on LUCENE-1897:
--

Honestly: I don't think this is a bug.  Javadocs are generated, just like jar
files ... if we had a reference to a jar file in a README or an HTML page we
wouldn't consider it a bug that people couldn't find that jar file in a source
release; it's understood that they have to *build* the source release for the
generated artifacts to be where they expect.

As long as running "ant javadoc" in a source release puts the javadocs in the
right spot so that the links start working, we're doing the right thing.

> Links to javadoc from the site pages do not work in the src dist because it 
> does not include the javadoc under docs.
> 
>
> Key: LUCENE-1897
> URL: https://issues.apache.org/jira/browse/LUCENE-1897
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: Javadocs
>Reporter: Mark Miller
>Priority: Minor
> Fix For: 2.9
>
>
> Links to javadoc from the site pages do not work in the src dist because it 
> does not include the javadoc under docs.




[jira] Resolved: (LUCENE-1816) example code in overview.html uses deprecated syntax

2009-09-08 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1816?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man resolved LUCENE-1816.
--

Resolution: Duplicate
  Assignee: Mark Miller

miller fixed this in r807755 

> example code in overview.html uses deprecated syntax
> 
>
> Key: LUCENE-1816
> URL: https://issues.apache.org/jira/browse/LUCENE-1816
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: Javadocs
>Affects Versions: 2.9
>Reporter: Daniel Naber
>Assignee: Mark Miller
>Priority: Minor
> Attachments: overview.diff
>
>
> The examples should use non-deprecated syntax only. I'm attaching a patch, 
> but other parts of that page might also be out-of-date, which I didn't check 
> now.




[jira] Assigned: (LUCENE-1902) Changes.html not explicitly included in release

2009-09-08 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man reassigned LUCENE-1902:


Assignee: Hoss Man

> Changes.html not explicitly included in release
> ---
>
> Key: LUCENE-1902
> URL: https://issues.apache.org/jira/browse/LUCENE-1902
> Project: Lucene - Java
>  Issue Type: Bug
>Reporter: Hoss Man
>Assignee: Hoss Man
>Priority: Minor
> Fix For: 2.9
>
> Attachments: LUCENE-1902.patch, LUCENE-1902.patch
>
>
> None of the release related ant targets explicitly call changes-to-html ... 
> this seems like an oversight.  (currently it's only called as part of the 
> nightly target)




[jira] Commented: (LUCENE-1902) Changes.html not explicitly included in release

2009-09-08 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12752815#action_12752815
 ] 

Hoss Man commented on LUCENE-1902:
--

bq. Patch looks good so far mark ... go for it.

scratch that mark ... you've got enough on your plate, i'll finish this

> Changes.html not explicitly included in release
> ---
>
> Key: LUCENE-1902
> URL: https://issues.apache.org/jira/browse/LUCENE-1902
> Project: Lucene - Java
>  Issue Type: Bug
>Reporter: Hoss Man
>Priority: Minor
> Fix For: 2.9
>
> Attachments: LUCENE-1902.patch, LUCENE-1902.patch
>
>
> None of the release related ant targets explicitly call changes-to-html ... 
> this seems like an oversight.  (currently it's only called as part of the 
> nightly target)




[jira] Updated: (LUCENE-1903) Incorrect ShingleFilter behavior when outputUnigrams == false

2009-09-08 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1903?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated LUCENE-1903:
-

Fix Version/s: 2.9

Since this is code that worked in 2.4.1, it should probably block 2.9 (at least
until the scope is understood -- a conscious choice could be made to release
with it as a known bug).

> Incorrect ShingleFilter behavior when outputUnigrams == false
> -
>
> Key: LUCENE-1903
> URL: https://issues.apache.org/jira/browse/LUCENE-1903
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: contrib/analyzers
>Affects Versions: 2.9
>Reporter: Chris Harris
> Fix For: 2.9
>
> Attachments: LUCENE-1903_testcases.patch, 
> LUCENE-1903_testcases_lucene2_4_1_version.patch, 
> TEST-org.apache.lucene.analysis.shingle.ShingleFilterTest.xml
>
>
> ShingleFilter isn't working as expected when outputUnigrams == false. In 
> particular, it is outputting unigrams at least some of the time when 
> outputUnigrams==false.
> I'll attach a patch to ShingleFilterTest.java that adds some test cases that 
> demonstrate the problem.
> I haven't checked this, but I hypothesize that the behavior for 
> outputUnigrams == false got changed when the class was upgraded to the new 
> TokenStream API?
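
As a reference point for the behavior the report expects, here is a minimal plain-Java sketch of shingle generation. It is not ShingleFilter's implementation: the class name ShingleSketch, the method shingles(), the space separator, and the choice of shingle sizes from 2 up to maxShingleSize are all assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not Lucene code.
class ShingleSketch {

    // Builds word n-grams ("shingles") of size 2..maxShingleSize from tokens.
    // Single tokens (unigrams) are emitted only when outputUnigrams is true.
    static List<String> shingles(List<String> tokens, int maxShingleSize, boolean outputUnigrams) {
        List<String> out = new ArrayList<String>();
        for (int start = 0; start < tokens.size(); start++) {
            if (outputUnigrams) {
                out.add(tokens.get(start));
            }
            StringBuilder shingle = new StringBuilder(tokens.get(start));
            for (int size = 2; size <= maxShingleSize && start + size <= tokens.size(); size++) {
                shingle.append(' ').append(tokens.get(start + size - 1));
                out.add(shingle.toString());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList("please", "divide", "this");
        // With outputUnigrams == false, only the two-word shingles appear:
        // [please divide, divide this]
        System.out.println(shingles(tokens, 2, false));
        // With outputUnigrams == true, the single tokens are interleaved:
        // [please, please divide, divide, divide this, this]
        System.out.println(shingles(tokens, 2, true));
    }
}
```

Under these assumptions, any single-word token in the output when outputUnigrams is false would be the kind of result the report describes as incorrect.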




[jira] Commented: (LUCENE-1896) Modify confusing javadoc for queryNorm

2009-09-08 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12752787#action_12752787
 ] 

Hoss Man commented on LUCENE-1896:
--

argue was a poor choice of word, it implies someone is arguing back.  what i 
should have said was that i don't *advocate* adding a mention of it to the docs.

> Modify confusing javadoc for queryNorm
> --
>
> Key: LUCENE-1896
> URL: https://issues.apache.org/jira/browse/LUCENE-1896
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Javadocs
>Reporter: Jiri Kuhn
>Priority: Minor
> Fix For: 2.9
>
>
> See http://markmail.org/message/arai6silfiktwcer
> The javadoc confuses me as well.




[jira] Resolved: (LUCENE-387) Contrib: Main memory based SynonymMap and SynonymTokenFilter

2009-09-08 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man resolved LUCENE-387.
-

   Resolution: Fixed
Fix Version/s: 1.9
 Assignee: hoschek  (was: Lucene Developers)

committed long, long ago into the contrib/memory package

> Contrib: Main memory based SynonymMap and SynonymTokenFilter
> 
>
> Key: LUCENE-387
> URL: https://issues.apache.org/jira/browse/LUCENE-387
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: Analysis
>Affects Versions: 1.4
> Environment: Operating System: other
> Platform: Other
>Reporter: hoschek
>Assignee: hoschek
>Priority: Minor
> Fix For: 1.9
>
> Attachments: AnalyzerUtil.java, AnalyzerUtil.java, SynonymMap.java, 
> SynonymMap.java, SynonymMap.java, SynonymMap.java, SynonymMap.java, 
> SynonymTokenFilter.java, SynonymTokenFilter.java, SynonymTokenFilter.java, 
> SynonymTokenFilter.java
>
>
> - Contrib: Main memory based SynonymMap and SynonymTokenFilter
> - applies to SVN trunk as well as 1.4.3




[jira] Created: (LUCENE-1904) move wordnet based synonym code out of contrib/memory and into contrib/wordnet (or somewhere else)

2009-09-08 Thread Hoss Man (JIRA)
move wordnet based synonym code out of contrib/memory and into contrib/wordnet 
(or somewhere else)
--

 Key: LUCENE-1904
 URL: https://issues.apache.org/jira/browse/LUCENE-1904
 Project: Lucene - Java
  Issue Type: Improvement
  Components: contrib/*
Reporter: Hoss Man
Priority: Minor


see LUCENE-387 ... some synonym related code has been living in contrib/memory 
for a very long time ... it should be refactored out.




[jira] Commented: (LUCENE-1902) Changes.html not explicitly included in release

2009-09-08 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12752781#action_12752781
 ] 

Hoss Man commented on LUCENE-1902:
--

agreed ... the only reason i had "Main" in my patch is because i thought maybe 
the "Core" link in the javadocs menu was causing the problem so i tried a 
unique name.

> Changes.html not explicitly included in release
> ---
>
> Key: LUCENE-1902
> URL: https://issues.apache.org/jira/browse/LUCENE-1902
> Project: Lucene - Java
>  Issue Type: Bug
>Reporter: Hoss Man
>Priority: Minor
> Fix For: 2.9
>
> Attachments: LUCENE-1902.patch, LUCENE-1902.patch
>
>
> None of the release related ant targets explicitly call changes-to-html ... 
> this seems like an oversight.  (currently it's only called as part of the 
> nightly target)




[jira] Commented: (LUCENE-1896) Modify confusing javadoc for queryNorm

2009-09-08 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12752780#action_12752780
 ] 

Hoss Man commented on LUCENE-1896:
--

you could have a queryNorm implementation that always returned 0.01f ... but if 
you were dealing with weights that were all really large, it might not be 
enough ... and if you were dealing with weights that were *extremely* small, 
it might actually be counterproductive.  that's why the default isn't 
arbitrary -- it's a function of the weight.

i never said it was a *reason* why queryNorm was there ... i just said it was 
an advantage i've observed in having it.

I also didn't argue in favor of adding anything about that to the javadocs -- i 
mentioned it only to explain one type of benefit that can arise from having "a 
uniform normalization factor computed from the sumOfSquareWeights for the query 
which is then applied to each of the clauses of the query"
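
To make the trade-off concrete, here is a minimal plain-Java sketch with no Lucene dependency. It assumes the default queryNorm is 1/sqrt(sumOfSquaredWeights), which is the formula DefaultSimilarity uses; the class name QueryNormComparison, its helper methods, and the sample weights are hypothetical and chosen only for illustration.

```java
// Hypothetical illustration only: compares a constant queryNorm with one
// derived from the clause weights. Not Lucene code.
class QueryNormComparison {

    // Assumed default formula: 1 / sqrt(sum of squared clause weights).
    static float weightBasedNorm(float[] weights) {
        float sumOfSquaredWeights = 0f;
        for (float w : weights) {
            sumOfSquaredWeights += w * w;
        }
        return (float) (1.0 / Math.sqrt(sumOfSquaredWeights));
    }

    // The alternative discussed above: a fixed factor that ignores the weights.
    static float constantNorm(float[] weights) {
        return 0.01f;
    }

    // Applies one norm uniformly to every clause weight.
    static float[] normalize(float[] weights, float norm) {
        float[] out = new float[weights.length];
        for (int i = 0; i < weights.length; i++) {
            out[i] = weights[i] * norm;
        }
        return out;
    }

    public static void main(String[] args) {
        float[] large = {3e10f, 4e10f};
        float[] small = {3e-10f, 4e-10f};
        for (float[] weights : new float[][] {large, small}) {
            float[] byWeight = normalize(weights, weightBasedNorm(weights));
            float[] byConstant = normalize(weights, constantNorm(weights));
            System.out.println("weight-based: " + byWeight[0] + ", " + byWeight[1]
                    + "   constant: " + byConstant[0] + ", " + byConstant[1]);
        }
    }
}
```

With the weight-based norm, both the very large and the very small inputs come out at roughly 0.6 and 0.8, while the constant factor leaves the results at whatever scale the inputs had.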

> Modify confusing javadoc for queryNorm
> --
>
> Key: LUCENE-1896
> URL: https://issues.apache.org/jira/browse/LUCENE-1896
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Javadocs
>Reporter: Jiri Kuhn
>Priority: Minor
> Fix For: 2.9
>
>
> See http://markmail.org/message/arai6silfiktwcer
> The javadoc confuses me as well.




[jira] Commented: (LUCENE-1902) Changes.html not explicitly included in release

2009-09-08 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12752768#action_12752768
 ] 

Hoss Man commented on LUCENE-1902:
--

"cli.xconf" ?!?!?!?!

how many years have i been using forrest w/o knowing about this file?!?!?!

Patch looks good so far mark ... go for it.

> Changes.html not explicitly included in release
> ---
>
> Key: LUCENE-1902
> URL: https://issues.apache.org/jira/browse/LUCENE-1902
> Project: Lucene - Java
>  Issue Type: Bug
>Reporter: Hoss Man
>Priority: Minor
> Fix For: 2.9
>
> Attachments: LUCENE-1902.patch, LUCENE-1902.patch
>
>
> None of the release related ant targets explicitly call changes-to-html ... 
> this seems like an oversight.  (currently it's only called as part of the 
> nightly target)




[jira] Commented: (LUCENE-387) Contrib: Main memory based SynonymMap and SynonymTokenFilter

2009-09-08 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12752766#action_12752766
 ] 

Hoss Man commented on LUCENE-387:
-

Yeah ... i'm really not sure what the deal is with this ... the code was 
committed, but why is it in the memory contrib?!?!?

> Contrib: Main memory based SynonymMap and SynonymTokenFilter
> 
>
> Key: LUCENE-387
> URL: https://issues.apache.org/jira/browse/LUCENE-387
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: Analysis
>Affects Versions: 1.4
> Environment: Operating System: other
> Platform: Other
>Reporter: hoschek
>Assignee: Lucene Developers
>Priority: Minor
> Attachments: AnalyzerUtil.java, AnalyzerUtil.java, SynonymMap.java, 
> SynonymMap.java, SynonymMap.java, SynonymMap.java, SynonymMap.java, 
> SynonymTokenFilter.java, SynonymTokenFilter.java, SynonymTokenFilter.java, 
> SynonymTokenFilter.java
>
>
> - Contrib: Main memory based SynonymMap and SynonymTokenFilter
> - applies to SVN trunk as well as 1.4.3




[jira] Commented: (LUCENE-1896) Modify confusing javadoc for queryNorm

2009-09-08 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12752763#action_12752763
 ] 

Hoss Man commented on LUCENE-1896:
--

as i mentioned in the linked thread, the biggest advantage i know of for 
queryNorm is that it is a reduction factor applied to the constituent parts of 
a complex score prior to multiplication -- so it helps prevent loss of 
information due to floating point accuracy that could arise otherwise.

but then again: that's what the *default* queryNorm does ... an alternate 
queryNorm could do something else (like be a no-op)

Since the target audience of the Similarity javadocs are mainly people who are 
interested in customizing the scoring, perhaps it should be something like...

{quote}
The queryNorm is a uniform normalization factor computed from the 
sumOfSquareWeights for the query which is then applied to each of the clauses 
of the query.  In some cases this can be useful for attempting to keep scores 
from simple queries semi-comparable.  The Default implementation is...
{quote} 
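
For readers following along, the default computation can be sketched in plain Java. This assumes the default queryNorm is 1/sqrt(sumOfSquaredWeights), as in DefaultSimilarity; the class DefaultQueryNormSketch, the helper sumOfSquares(), and the sample clause weights are hypothetical names and values used only for illustration.

```java
// Hypothetical illustration only; mirrors the assumed default formula
// without depending on Lucene.
class DefaultQueryNormSketch {

    // Assumed default: 1 / sqrt(sumOfSquaredWeights).
    static float queryNorm(float sumOfSquaredWeights) {
        return (float) (1.0 / Math.sqrt(sumOfSquaredWeights));
    }

    // Sum of the squared weights of all clauses in the query.
    static float sumOfSquares(float[] clauseWeights) {
        float sum = 0f;
        for (float w : clauseWeights) {
            sum += w * w;
        }
        return sum;
    }

    public static void main(String[] args) {
        float[] clauseWeights = {2.0f, 1.0f, 2.0f};
        float norm = queryNorm(sumOfSquares(clauseWeights));

        // Apply the same factor to every clause, then check the length
        // of the normalized weight vector.
        float normalizedSumOfSquares = 0f;
        for (float w : clauseWeights) {
            float normalized = w * norm;
            normalizedSumOfSquares += normalized * normalized;
        }
        System.out.println("queryNorm = " + norm);
        System.out.println("sum of squared normalized weights = " + normalizedSumOfSquares);
    }
}
```

Because the one factor is applied to every clause, the normalized weights always have a squared sum close to 1, which keeps the per-clause terms in a small, predictable range before they are multiplied into a score.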

> Modify confusing javadoc for queryNorm
> --
>
> Key: LUCENE-1896
> URL: https://issues.apache.org/jira/browse/LUCENE-1896
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Javadocs
>Reporter: Jiri Kuhn
>Priority: Minor
> Fix For: 2.9
>
>
> See http://markmail.org/message/arai6silfiktwcer
> The javadoc confuses me as well.




[jira] Updated: (LUCENE-1902) Changes.html not explicitly included in release

2009-09-08 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1902?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated LUCENE-1902:
-

Attachment: LUCENE-1902.patch

I'm sick of dealing with forrest.

this patch does the following things...
* calls changes-to-html from the package target
* adds a link to Changes.html and Contrib-Changes.html in the left nav of the 
version specific docs
* fixes a hardcoded link to the public site when referring to the contrib 
changes in an existing page.

it has the side effect of causing forrest to freak out that it doesn't know 
how to create Changes.html ... never mind that i've configured it to be an 
external page, never mind that there are 100 other external pages configured 
the same way that work just fine (all of the contrib javadoc pages) ... forrest 
refuses to play nice with me.

I've taken this as far as i can ... someone else can pick it up if they choose.

> Changes.html not explicitly included in release
> ---
>
> Key: LUCENE-1902
> URL: https://issues.apache.org/jira/browse/LUCENE-1902
> Project: Lucene - Java
>  Issue Type: Bug
>Reporter: Hoss Man
>Priority: Minor
> Fix For: 2.9
>
> Attachments: LUCENE-1902.patch
>
>
> None of the release related ant targets explicitly call changes-to-html ... 
> this seems like an oversight.  (currently it's only called as part of the 
> nightly target)




[jira] Created: (LUCENE-1902) Changes.html not explicitly included in release

2009-09-08 Thread Hoss Man (JIRA)
Changes.html not explicitly included in release
---

 Key: LUCENE-1902
 URL: https://issues.apache.org/jira/browse/LUCENE-1902
 Project: Lucene - Java
  Issue Type: Bug
Reporter: Hoss Man
Priority: Minor
 Fix For: 2.9


None of the release related ant targets explicitly call changes-to-html ... 
this seems like an oversight.  (currently it's only called as part of the 
nightly target)






[jira] Updated: (LUCENE-1887) o.a.l.messages should be moved to it's own contrib

2009-09-08 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated LUCENE-1887:
-

Component/s: contrib/*
Description: 
contrib/queryParser contains an org.apache.lucene.messages package containing 
some generalized code that (claims in its javadocs) is not specific to the 
queryParser.

If this is truly general purpose code, it should probably be moved out of the 
queryParser contrib -- either into its own contrib, or into the core (it's 
very small)

*EDIT:* alternate suggestion to rename package to fall under the 
o.a.l.queryParser namespace retracted due to comments in favor of (eventually) 
promoting to its own contrib

  was:
contrib/queryParser contains an org.apache.lucene.messages package containing 
some generalized code that (claims in its javadocs) is not specific to the 
queryParser.

If this is truly general purpose code, it should probably be moved out of the 
queryParser contrib -- either into its own contrib, or into the core (it's 
very small)

Alternately: if the code isn't super reusable, the package name should probably 
be changed to be a subpackage of org.apache.lucene.queryParser ... it tends to 
be very confusing when all of the code in a contrib doesn't fall into a clear 
organizational namespace.

I've marked this issue for 2.9 so we at least think about it prior to release 
... it is a brand new public API, so this is our best chance to change it if we 
want ... but it is by no means a blocker for 2.9

Summary: o.a.l.messages should be moved to it's own contrib  (was: 
o.a.l.messages should be moved/renamed)

> o.a.l.messages should be moved to it's own contrib
> --
>
> Key: LUCENE-1887
> URL: https://issues.apache.org/jira/browse/LUCENE-1887
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: contrib/*
>Reporter: Hoss Man
>Priority: Minor
>
> contrib/queryParser contains an org.apache.lucene.messages package containing 
> some generalized code that (claims in its javadocs) is not specific to the 
> queryParser.
> If this is truly general purpose code, it should probably be moved out of 
> the queryParser contrib -- either into its own contrib, or into the core 
> (it's very small)
> *EDIT:* alternate suggestion to rename package to fall under the 
> o.a.l.queryParser namespace retracted due to comments in favor of 
> (eventually) promoting to its own contrib




[jira] Updated: (LUCENE-1887) o.a.l.messages should be moved/renamed

2009-09-08 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1887?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated LUCENE-1887:
-

Fix Version/s: (was: 2.9)

bq. These classes have not relation with the queryparser code, the queryparser 
only uses it.

that seems like a pretty strong argument to promote it into its own contrib 
... no other contrib is going to start depending on queryParser just to get 
access to a messages class -- And if we wait until 3.x to move it to its own 
contrib, we make a lot of headaches for any users who start(ed) using the 
queryParser contrib in 2.9, because all of a sudden their code will stop 
working at runtime because the classes can't be found.

(it's an easy problem to fix: tell them to use the new jar as well, but it 
reflects badly on the project when people encounter annoyances like that when 
upgrading)

That said: i'm not going to argue that hard if no one more closely involved in 
the contrib thinks it's worth moving .. removing 2.9 fix-for.

> o.a.l.messages should be moved/renamed
> --
>
> Key: LUCENE-1887
> URL: https://issues.apache.org/jira/browse/LUCENE-1887
> Project: Lucene - Java
>  Issue Type: Improvement
>Reporter: Hoss Man
>Priority: Minor
>
> contrib/queryParser contains an org.apache.lucene.messages package containing 
> some generalized code that (claims in its javadocs) is not specific to the 
> queryParser.
> If this is truly general purpose code, it should probably be moved out of 
> the queryParser contrib -- either into its own contrib, or into the core 
> (it's very small)
> Alternately: if the code isn't super reusable, the package name should 
> probably be changed to be a subpackage of org.apache.lucene.queryParser ... 
> it tends to be very confusing when all of the code in a contrib doesn't fall 
> into a clear organizational namespace.
> I've marked this issue for 2.9 so we at least think about it prior to release 
> ... it is a brand new public API, so this is our best chance to change it if 
> we want ... but it is by no means a blocker for 2.9




[jira] Resolved: (LUCENE-1884) javadocs cleanup

2009-09-03 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man resolved LUCENE-1884.
--

Resolution: Fixed
  Assignee: Robert Muir

i'm the last person in the world that should be reviewing a spell correction 
patch -- but nothing jumped out at me as being bad about the patch...

Committed revision 811070.

> javadocs cleanup
> 
>
> Key: LUCENE-1884
> URL: https://issues.apache.org/jira/browse/LUCENE-1884
> Project: Lucene - Java
>  Issue Type: Task
>  Components: Javadocs
>Reporter: Robert Muir
>Assignee: Robert Muir
>Priority: Minor
> Fix For: 2.9
>
> Attachments: LUCENE-1884.patch
>
>
> basic cleanup in core/contrib: typos, apache license header as javadoc, 
> missing periods that screw up package summary, etc.




[jira] Resolved: (LUCENE-1886) Improve Javadoc

2009-09-03 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1886?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man resolved LUCENE-1886.
--

Resolution: Fixed
  Assignee: Hoss Man

Committed revision 811060.

Thanks Bernd

> Improve Javadoc
> ---
>
> Key: LUCENE-1886
> URL: https://issues.apache.org/jira/browse/LUCENE-1886
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Analysis
>Reporter: Bernd Fondermann
>Assignee: Hoss Man
>Priority: Trivial
> Fix For: 2.9
>
> Attachments: javadoc.patch
>
>





[jira] Created: (LUCENE-1887) o.a.l.messages should be moved/renamed

2009-09-02 Thread Hoss Man (JIRA)
o.a.l.messages should be moved/renamed
--

 Key: LUCENE-1887
 URL: https://issues.apache.org/jira/browse/LUCENE-1887
 Project: Lucene - Java
  Issue Type: Improvement
Reporter: Hoss Man
Priority: Minor
 Fix For: 2.9


contrib/queryParser contains an org.apache.lucene.messages package containing 
some generalized code that (claims in its javadocs) is not specific to the 
queryParser.

If this is truly general purpose code, it should probably be moved out of the 
queryParser contrib -- either into its own contrib, or into the core (it's 
very small)

Alternately: if the code isn't super reusable, the package name should probably 
be changed to be a subpackage of org.apache.lucene.queryParser ... it tends to 
be very confusing when all of the code in a contrib doesn't fall into a clear 
organizational namespace.

I've marked this issue for 2.9 so we at least think about it prior to release 
... it is a brand new public API, so this is our best chance to change it if we 
want ... but it is by no means a blocker for 2.9




[jira] Resolved: (LUCENE-1881) Non-stored fields are not copied in writer.addDocument()?

2009-09-01 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man resolved LUCENE-1881.
--

Resolution: Invalid

> Non-stored fields are not copied in writer.addDocument()?
> -
>
> Key: LUCENE-1881
> URL: https://issues.apache.org/jira/browse/LUCENE-1881
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: Store
>Affects Versions: 2.4.1
> Environment: Linux
>Reporter: Wai Wong
>Assignee: Hoss Man
>Priority: Critical
>
> We would like to modify stored documents' properties.  The method is to use 
> IndexReader to open all files, modify some fields, and copy the document 
> via addDocument() of IndexWriter to another index.  But all fields that are 
> created using Field.Store.NO are no longer available for searching.
> Sample code in jsp is attached:
> <%@ page language="java" 
> import="org.apache.lucene.analysis.standard.StandardAnalyzer;"%>
> <%@ page language="java" import="org.apache.lucene.document.*;"%>
> <%@ page language="java" import="org.apache.lucene.index.*;"%>
> <%@ page language="java" import="org.apache.lucene.search.*;"%>
> <%@ page contentType="text/html; charset=utf8" %>
> <%
> // create for testing
> IndexWriter writer = new IndexWriter("/opt/wwwroot/351/Index/test", new 
> StandardAnalyzer(), true, IndexWriter.MaxFieldLength.LIMITED);
> Document doc = new Document();
> doc.add(new Field("A", "1234", Field.Store.NO , 
> Field.Index.NOT_ANALYZED));
> doc.add(new Field("B", "abcd", Field.Store.NO , 
> Field.Index.NOT_ANALYZED));
> writer.addDocument(doc);
> writer.close();
> // check ok
> Query q = new TermQuery(new Term("A", "1234"));
> Searcher s = new IndexSearcher("/opt/wwwroot/351/Index/test");
> Hits h = s.search(q);
> out.println("# of document found is " + h.length());// it is ok
> // update the document to change or remove a field
> IndexReader r = IndexReader.open("/opt/wwwroot/351/Index/test");
> doc = r.document(0);
> r.deleteDocument(0);
> r.close();
> doc.removeField("B");
> writer = new IndexWriter("/opt/wwwroot/351/Index/test1", new 
> StandardAnalyzer(), true, IndexWriter.MaxFieldLength.LIMITED);
> writer.addDocument(doc);
> writer.optimize();
> writer.close();
> // test again
> s = new IndexSearcher("/opt/wwwroot/351/Index/test1");
> h = s.search(q);
> out.println("# of document found is now " + h.length());
> r = IndexReader.open("/opt/wwwroot/351/Index/test1");
> out.println(" max Doc is " + r.maxDoc());
> %>




[jira] Issue Comment Edited: (LUCENE-1881) Non-stored fields are not copied in writer.addDocument()?

2009-09-01 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12750211#action_12750211
 ] 

Hoss Man edited comment on LUCENE-1881 at 9/1/09 5:57 PM:
--

*REDACTED* ... comment put in wrong issue

  was (Author: hossman):
Committed revision 810324.

i went ahead and fixed this using the "display the '1' bucket" approach.
  
> Non-stored fields are not copied in writer.addDocument()?
> -
>
> Key: LUCENE-1881
> URL: https://issues.apache.org/jira/browse/LUCENE-1881
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: Store
>Affects Versions: 2.4.1
> Environment: Linux
>Reporter: Wai Wong
>Assignee: Hoss Man
>Priority: Critical
>
> We would like to modify stored documents' properties.  The method is to use 
> IndexReader to open all files, modify some fields, and copy the document 
> via addDocument() of IndexWriter to another index.  But all fields that are 
> created using Field.Store.NO are no longer available for searching.
> Sample code in jsp is attached:
> <%@ page language="java" 
> import="org.apache.lucene.analysis.standard.StandardAnalyzer;"%>
> <%@ page language="java" import="org.apache.lucene.document.*;"%>
> <%@ page language="java" import="org.apache.lucene.index.*;"%>
> <%@ page language="java" import="org.apache.lucene.search.*;"%>
> <%@ page contentType="text/html; charset=utf8" %>
> <%
> // create for testing
> IndexWriter writer = new IndexWriter("/opt/wwwroot/351/Index/test", new 
> StandardAnalyzer(), true, IndexWriter.MaxFieldLength.LIMITED);
> Document doc = new Document();
> doc.add(new Field("A", "1234", Field.Store.NO , 
> Field.Index.NOT_ANALYZED));
> doc.add(new Field("B", "abcd", Field.Store.NO , 
> Field.Index.NOT_ANALYZED));
> writer.addDocument(doc);
> writer.close();
> // check ok
> Query q = new TermQuery(new Term("A", "1234"));
> Searcher s = new IndexSearcher("/opt/wwwroot/351/Index/test");
> Hits h = s.search(q);
> out.println("# of document found is " + h.length());// it is ok
> // update the document to change or remove a field
> IndexReader r = IndexReader.open("/opt/wwwroot/351/Index/test");
> doc = r.document(0);
> r.deleteDocument(0);
> r.close();
> doc.removeField("B");
> writer = new IndexWriter("/opt/wwwroot/351/Index/test1", new 
> StandardAnalyzer(), true, IndexWriter.MaxFieldLength.LIMITED);
> writer.addDocument(doc);
> writer.optimize();
> writer.close();
> // test again
> s = new IndexSearcher("/opt/wwwroot/351/Index/test1");
> h = s.search(q);
> out.println("# of document found is now " + h.length());
> r = IndexReader.open("/opt/wwwroot/351/Index/test1");
> out.println(" max Doc is " + r.maxDoc());
> %>




[jira] Reopened: (LUCENE-1881) Non-stored fields are not copied in writer.addDocument()?

2009-09-01 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man reopened LUCENE-1881:
--


reopening to mark with correct resolution

> Non-stored fields are not copied in writer.addDocument()?
> -
>
> Key: LUCENE-1881
> URL: https://issues.apache.org/jira/browse/LUCENE-1881
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: Store
>Affects Versions: 2.4.1
> Environment: Linux
>Reporter: Wai Wong
>Assignee: Hoss Man
>Priority: Critical
>
> We would like to modify stored documents' properties.  The method is to use 
> IndexReader to open all files, modify some fields, and copy the document 
> via addDocument() of IndexWriter to another index.  But all fields that are 
> created using Field.Store.NO are no longer available for searching.
> Sample code in jsp is attached:
> <%@ page language="java" 
> import="org.apache.lucene.analysis.standard.StandardAnalyzer;"%>
> <%@ page language="java" import="org.apache.lucene.document.*;"%>
> <%@ page language="java" import="org.apache.lucene.index.*;"%>
> <%@ page language="java" import="org.apache.lucene.search.*;"%>
> <%@ page contentType="text/html; charset=utf8" %>
> <%
> // create for testing
> IndexWriter writer = new IndexWriter("/opt/wwwroot/351/Index/test", new 
> StandardAnalyzer(), true, IndexWriter.MaxFieldLength.LIMITED);
> Document doc = new Document();
> doc.add(new Field("A", "1234", Field.Store.NO , 
> Field.Index.NOT_ANALYZED));
> doc.add(new Field("B", "abcd", Field.Store.NO , 
> Field.Index.NOT_ANALYZED));
> writer.addDocument(doc);
> writer.close();
> // check ok
> Query q = new TermQuery(new Term("A", "1234"));
> Searcher s = new IndexSearcher("/opt/wwwroot/351/Index/test");
> Hits h = s.search(q);
> out.println("# of document found is " + h.length());// it is ok
> // update the document to change or remove a field
> IndexReader r = IndexReader.open("/opt/wwwroot/351/Index/test");
> doc = r.document(0);
> r.deleteDocument(0);
> r.close();
> doc.removeField("B");
> writer = new IndexWriter("/opt/wwwroot/351/Index/test1", new 
> StandardAnalyzer(), true, IndexWriter.MaxFieldLength.LIMITED);
> writer.addDocument(doc);
> writer.optimize();
> writer.close();
> // test again
> s = new IndexSearcher("/opt/wwwroot/351/Index/test1");
> h = s.search(q);
> out.println("# of document found is now " + h.length());
> r = IndexReader.open("/opt/wwwroot/351/Index/test1");
> out.println(" max Doc is " + r.maxDoc());
> %>

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org



[jira] Resolved: (LUCENE-1881) Non-stored fields are not copied in writer.addDocument()?

2009-09-01 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1881?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man resolved LUCENE-1881.
--

Resolution: Fixed
  Assignee: Hoss Man

Committed revision 810324.

i went ahead and fixed this using the "display the '1' bucket" approach.




[jira] Commented: (LUCENE-1881) Non-stored fields are not copied in writer.addDocument()?

2009-09-01 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12750210#action_12750210
 ] 

Hoss Man commented on LUCENE-1881:
--

there is no bug in addDocument.

the behavior observed is a basic tenet of retrieving documents from an 
IndexReader...

http://lucene.apache.org/java/2_4_1/api/core/org/apache/lucene/index/IndexReader.html#document(int)
bq. Returns the stored fields of the nth Document in this index. 

See also...
http://lucene.apache.org/java/2_4_1/api/core/org/apache/lucene/document/Document.html
bq. ...Note that fields which are not stored are not available in documents 
retrieved from the index, e.g. with ScoreDoc.doc, Searcher.doc(int) or 
IndexReader.document(int). 
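
To make the distinction concrete, here is a toy model in plain Java. It is not Lucene code: ToyIndex, ToyField and countHits are names invented for this sketch, and the only thing it imitates is that document(int) hands back stored values, while searching relies on a separate structure that is built when a document is added.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model only (hypothetical, not Lucene): shows why a retrieved document loses non-stored fields. */
class ToyIndex {

    /** A field value plus two independent flags: keep the value for retrieval, and/or make it searchable. */
    static final class ToyField {
        final String name;
        final String value;
        final boolean stored;
        final boolean indexed;

        ToyField(String name, String value, boolean stored, boolean indexed) {
            this.name = name;
            this.value = value;
            this.stored = stored;
            this.indexed = indexed;
        }
    }

    // docId -> stored values only; this is all that document(int) can return
    private final List<Map<String, String>> storedFields = new ArrayList<>();
    // "field:value" -> docIds; this is what a search consults
    private final Map<String, List<Integer>> postings = new HashMap<>();

    int addDocument(List<ToyField> fields) {
        int docId = storedFields.size();
        Map<String, String> stored = new LinkedHashMap<>();
        for (ToyField f : fields) {
            if (f.stored) {
                stored.put(f.name, f.value);
            }
            if (f.indexed) {
                postings.computeIfAbsent(f.name + ":" + f.value, k -> new ArrayList<>()).add(docId);
            }
        }
        storedFields.add(stored);
        return docId;
    }

    /** Returns the stored fields of the given document; fields that were only indexed are not recoverable. */
    List<ToyField> document(int docId) {
        List<ToyField> out = new ArrayList<>();
        for (Map.Entry<String, String> e : storedFields.get(docId).entrySet()) {
            // assumption of this toy: retrieved fields are re-indexed when added again
            out.add(new ToyField(e.getKey(), e.getValue(), true, true));
        }
        return out;
    }

    int countHits(String field, String value) {
        List<Integer> ids = postings.get(field + ":" + value);
        return ids == null ? 0 : ids.size();
    }

    public static void main(String[] args) {
        ToyIndex source = new ToyIndex();
        List<ToyField> doc = new ArrayList<>();
        doc.add(new ToyField("A", "1234", false, true)); // comparable to Field.Store.NO, but indexed
        int id = source.addDocument(doc);
        System.out.println("hits in source: " + source.countHits("A", "1234"));

        ToyIndex copy = new ToyIndex();
        copy.addDocument(source.document(id)); // the retrieved document no longer contains "A"
        System.out.println("hits in copy: " + copy.countHits("A", "1234"));
    }
}
```

In this model the field is searchable in the first index and absent from the copy, which is the same pattern as in the report. The usual way around it is to rebuild such fields from the original source data before calling addDocument(), rather than relying on the retrieved Document.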








[jira] Resolved: (LUCENE-1862) duplicate package.html files in queryParser and analsysis.cn packages

2009-09-01 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man resolved LUCENE-1862.
--

Resolution: Fixed
  Assignee: Hoss Man

this is done ... if people want to make other improvements to the javadocs for 
either package, let's open separate issues.

> duplicate package.html files in queryParser and analsysis.cn packages
> -
>
> Key: LUCENE-1862
> URL: https://issues.apache.org/jira/browse/LUCENE-1862
> Project: Lucene - Java
>  Issue Type: Bug
>Reporter: Hoss Man
>Assignee: Hoss Man
>Priority: Minor
> Fix For: 2.9
>
> Attachments: LUCENE-1862-queryParser.patch, LUCENE-1862.patch
>
>
> These files conflict with each other when building the javadocs. there can be 
> only one (of each) ...
> {code}
> hoss...@brunner:~/lucene/java$ find src contrib -name package.html | perl 
> -ple 's{.*src/java/}{}' | sort | uniq -c | grep -v " 1 "
>2 org/apache/lucene/analysis/cn/package.html
>2 org/apache/lucene/queryParser/package.html
> hoss...@brunner:~/lucene/java$ find src contrib -path 
> \*queryParser/package.html
> src/java/org/apache/lucene/queryParser/package.html
> contrib/queryparser/src/java/org/apache/lucene/queryParser/package.html
> hoss...@brunner:~/lucene/java$ find src contrib -path \*cn/package.html
> contrib/analyzers/common/src/java/org/apache/lucene/analysis/cn/package.html
> contrib/analyzers/smartcn/src/java/org/apache/lucene/analysis/cn/package.html
> {code}




[jira] Commented: (LUCENE-1862) duplicate package.html files in queryParser and analsysis.cn packages

2009-09-01 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12750150#action_12750150
 ] 

Hoss Man commented on LUCENE-1862:
--

LUCENE-1862-queryParser.patch ...

Committed revision 810286




[jira] Updated: (LUCENE-1862) duplicate package.html files in queryParser and analsysis.cn packages

2009-09-01 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated LUCENE-1862:
-

Attachment: LUCENE-1862-queryParser.patch

patch fixing the duplicate package.html files for queryParser by moving the 
contrib version into the contrib's overview.html (the package one was never 
used in the contrib specific docs)

this patch also makes some other misc improvements to the docs, and tweaks the 
build.xml so that the appropriate subpackages are listed in the correct section.




[jira] Commented: (LUCENE-1862) duplicate package.html files in queryParser and analsysis.cn packages

2009-09-01 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12750122#action_12750122
 ] 

Hoss Man commented on LUCENE-1862:
--

FYI: the queryParser contrib doesn't even have any classes living in the 
org.apache.lucene.queryParser package, so 
contrib/queryparser/src/java/org/apache/lucene/queryParser/package.html is 
never even used when building the contrib-queryParser specific javadocs ... 
it's just a possible candidate when building the "all" javadocs.

the best solution seems to be moving that content into 
contrib/queryParser/src/java/overview.html ... i'll work up a patch.




[jira] Resolved: (LUCENE-1882) move SmartChineseAnalyzer into the smartcn package

2009-09-01 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1882?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man resolved LUCENE-1882.
--

Resolution: Fixed
  Assignee: Robert Muir

Committed revision 810247.

thanks robert.

> move SmartChineseAnalyzer into the smartcn package
> --
>
> Key: LUCENE-1882
> URL: https://issues.apache.org/jira/browse/LUCENE-1882
> Project: Lucene - Java
>  Issue Type: Bug
>Reporter: Hoss Man
>Assignee: Robert Muir
> Fix For: 2.9
>
> Attachments: LUCENE-1882.patch, LUCENE-1882.patch
>
>
> an offshoot of LUCENE-1862, 
> org.apache.lucene.analysis.cn.SmartChineseAnalyzer should become 
> org.apache.lucene.analysis.cn.smartcn.SmartChineseAnalyzer




[jira] Commented: (LUCENE-1882) move SmartChineseAnalyzer into the smartcn package

2009-09-01 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12750109#action_12750109
 ] 

Hoss Man commented on LUCENE-1882:
--

bq. I am curious if you have trouble with this patch to package.html now that 
you have corrected the EOL issue?

it might have just been the content type in the html header ... but i don't 
think so, i think it had to do with the line endings of the patch, vs the 
updated line endings once i fixed the eol-style ... plus i was trying a bunch 
of different things to fix the line endings ... i don't know what it was 
exactly, but this new patch seems to work fine (even with the non-ascii glyphs) 
so i think we're good.




[jira] Commented: (LUCENE-1862) duplicate package.html files in queryParser and analsysis.cn packages

2009-09-01 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12750077#action_12750077
 ] 

Hoss Man commented on LUCENE-1862:
--

Robert: i opened LUCENE-1882 to track the issue of moving SmartChineseAnalyzer 
into the correct package, and committed your suggested changes under that issue.

i know for a fact that part of your patch didn't make it in -- the characters 
kept getting corrupted in one of the package.html files, and i couldn't figure 
out an obvious reason/solution so i just committed what there was at that point 
so we'd at least have all the files in the right places to move forward from 
there.

please take a look at the current state of the package.html files and let us 
know what still needs to be done to make them "good"




[jira] Commented: (LUCENE-1882) move SmartChineseAnalyzer into the smartcn package

2009-09-01 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12750075#action_12750075
 ] 

Hoss Man commented on LUCENE-1882:
--

svn commit -m "LUCENE-1882: move SmartChineseAnalyzer to the 'correct' package 
... this commit is based on a sequence of svn commands and a patch provided by 
Robert Muir in LUCENE-1862"
...
Committed revision 810208.

Robert: could you please verify whether the move looks good to you?




[jira] Created: (LUCENE-1882) move SmartChineseAnalyzer into the smartcn package

2009-09-01 Thread Hoss Man (JIRA)
move SmartChineseAnalyzer into the smartcn package
--

 Key: LUCENE-1882
 URL: https://issues.apache.org/jira/browse/LUCENE-1882
 Project: Lucene - Java
  Issue Type: Bug
Reporter: Hoss Man
 Fix For: 2.9


an offshoot of LUCENE-1862, org.apache.lucene.analysis.cn.SmartChineseAnalyzer 
should become org.apache.lucene.analysis.cn.smartcn.SmartChineseAnalyzer




[jira] Commented: (LUCENE-1862) duplicate package.html files in queryParser and analsysis.cn packages

2009-09-01 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12750032#action_12750032
 ] 

Hoss Man commented on LUCENE-1862:
--

bq. Anyone have any thoughts here? Any time I think about it, I just end up 
thinking its best to leave it ... the javadoc itself (package descriptions) 
still appears to come out correctly.

Uh ... not really.  what you get is non-deterministic behavior, where *one* of 
the package.html files for each package gets picked, and the other one isn't 
used. (this can look particularly confusing with something like queryParser, 
where you'll find one description in the "core" docs, a different version in 
the "contrib" docs, and it's a crap shoot as to which one of those will show up 
in the "all" docs.)







[jira] Updated: (LUCENE-1865) Add a ton of missing license headers throughout test/demo/contrib

2009-08-27 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1865?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated LUCENE-1865:
-

Attachment: LUCENE-1865-part2.patch

untested patch fixing the files i mentioned

> Add a ton of missing license headers throughout test/demo/contrib
> -
>
> Key: LUCENE-1865
> URL: https://issues.apache.org/jira/browse/LUCENE-1865
> Project: Lucene - Java
>  Issue Type: Task
>Reporter: Mark Miller
>Assignee: Mark Miller
>Priority: Minor
> Fix For: 2.9
>
> Attachments: LUCENE-1865-part2.patch, LUCENE-1865.patch
>
>





[jira] Commented: (LUCENE-1865) Add a ton of missing license headers throughout test/demo/contrib

2009-08-27 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12748611#action_12748611
 ] 

Hoss Man commented on LUCENE-1865:
--

as of r808636 it still seems like we're missing boilerplate from several files 
that fit the "source file" criteria...
http://www.apache.org/dev/release.html#which-files-contain-license

This list is a subset of the rat report on the uncompressed src.zip (from trunk 
r808636)...
{code}
lucene-2.9/contrib/ant/src/java/org/apache/lucene/ant/antlib.xml
lucene-2.9/contrib/db/bdb/build.xml
lucene-2.9/contrib/db/bdb-je/build.xml
lucene-2.9/contrib/lucli/run.sh
lucene-2.9/contrib/xml-query-parser/src/demo/WebContent/index.jsp
lucene-2.9/contrib/xml-query-parser/src/demo/java/org/apache/lucene/xmlparser/webdemo/FormBasedXmlQueryDemo.java
lucene-2.9/src/jsp/configuration.jsp
lucene-2.9/src/jsp/footer.jsp
lucene-2.9/src/jsp/header.jsp
lucene-2.9/src/jsp/index.jsp
lucene-2.9/src/jsp/results.jsp
{code}




[jira] Updated: (LUCENE-1866) better RAT reporting

2009-08-27 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated LUCENE-1866:
-

Attachment: LUCENE-1866.patch

per discussion on the user list, we should be looking at a RAT report for *all* 
the sources in the common case, and prior to a release we should sanity check 
_exactly_ what is in the release.

patch makes a start at this ... as mentioned on mail, the rat ant task has some 
annoying limitations so i went back and forth between using it vs using  
to run rat.Report directly.

at the moment this works fine, but it's pissing me off that i can't find an 
obvious way to either just get a summary report (skipping all the heads of the 
files that don't have recognized licenses) or to have the report go to a file 
*and* to the ant output (because when the report is really bad, all the 
important summary info scrolls out of the terminal buffer unless it's 
configured really huge)


I'm clearly really tired right now, so i'm going to drop this and let other 
people pick it up if they want.

> better RAT reporting
> 
>
> Key: LUCENE-1866
> URL: https://issues.apache.org/jira/browse/LUCENE-1866
> Project: Lucene - Java
>  Issue Type: Bug
>Reporter: Hoss Man
> Attachments: LUCENE-1866.patch
>
>
> the "ant rat-sources" target currently only analyzes src/java ... we can do 
> better than this.




[jira] Created: (LUCENE-1866) better RAT reporting

2009-08-27 Thread Hoss Man (JIRA)
better RAT reporting


 Key: LUCENE-1866
 URL: https://issues.apache.org/jira/browse/LUCENE-1866
 Project: Lucene - Java
  Issue Type: Bug
Reporter: Hoss Man


the "ant rat-sources" target currently only analyzes src/java ... we can do 
better than this.




[jira] Resolved: (LUCENE-1863) SnowballAnalyzer has a link to net.sf (a package that is empty and needs to be removed).

2009-08-27 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man resolved LUCENE-1863.
--

Resolution: Fixed

mark fixed in r808472

> SnowballAnalyzer has a link to net.sf (a package that is empty and needs to 
> be removed).
> 
>
> Key: LUCENE-1863
> URL: https://issues.apache.org/jira/browse/LUCENE-1863
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: Javadocs
>Reporter: Mark Miller
>Assignee: Mark Miller
>Priority: Trivial
> Fix For: 2.9
>
>
> need to remove net.sf and point to org.tartarus.snowball.ext. Doesn't work 
> as a link though, so I'll also remove the @link to lose the javadoc error and 
> broken link.




[jira] Created: (LUCENE-1864) bogus javadocs for FieldValueHitQuery.fillFields

2009-08-27 Thread Hoss Man (JIRA)
bogus javadocs for FieldValueHitQuery.fillFields


 Key: LUCENE-1864
 URL: https://issues.apache.org/jira/browse/LUCENE-1864
 Project: Lucene - Java
  Issue Type: Bug
Reporter: Hoss Man
 Fix For: 2.9


FieldValueHitQuery.fillFields has javadocs that seem to be left over from a 
completely different method...

{code}
  /**
   * Given a FieldDoc object, stores the values used to sort the given document.
   * These values are not the raw values out of the index, but the internal
   * representation of them. This is so the given search hit can be collated by
   * a MultiSearcher with other search hits.
   * 
   * @param doc
   *  The FieldDoc to store sort values into.
   * @return The same FieldDoc passed in.
   * @see Searchable#search(Weight,Filter,int,Sort)
   */
  FieldDoc fillFields(final Entry entry) {
    final int n = comparators.length;
    final Comparable[] fields = new Comparable[n];
    for (int i = 0; i < n; ++i) {
      fields[i] = comparators[i].value(entry.slot);
    }
    //if (maxscore > 1.0f) doc.score /= maxscore;   // normalize scores
    return new FieldDoc(entry.docID, entry.score, fields);
  }

{code}
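
As an illustration of javadocs that would match this signature, here is a self-contained sketch. Entry, FieldDoc and SlotComparator below are simplified stand-ins written for this example, not the Lucene classes, and the comment wording is only a suggestion, not the text that was eventually committed.

```java
/** Hypothetical stand-ins so the sketch compiles on its own; none of this is the Lucene source. */
class FillFieldsSketch {

    /** Stand-in for a queue entry: a comparator slot plus the hit's doc id and score. */
    static final class Entry {
        final int slot;
        final int docID;
        final float score;

        Entry(int slot, int docID, float score) {
            this.slot = slot;
            this.docID = docID;
            this.score = score;
        }
    }

    /** Stand-in for the result object that carries the sort values. */
    static final class FieldDoc {
        final int doc;
        final float score;
        final Comparable<?>[] fields;

        FieldDoc(int doc, float score, Comparable<?>[] fields) {
            this.doc = doc;
            this.score = score;
            this.fields = fields;
        }
    }

    /** Stand-in for a per-field comparator that can report the value held in a slot. */
    interface SlotComparator {
        Comparable<?> value(int slot);
    }

    private final SlotComparator[] comparators;

    FillFieldsSketch(SlotComparator[] comparators) {
        this.comparators = comparators;
    }

    /**
     * Given a queue Entry, creates a new FieldDoc holding the values that were
     * used to sort the entry's document, one value per comparator.
     *
     * @param entry the queue entry whose slot, doc id and score are read
     * @return a newly created FieldDoc; nothing passed in is modified or returned
     */
    FieldDoc fillFields(final Entry entry) {
        final int n = comparators.length;
        final Comparable<?>[] fields = new Comparable<?>[n];
        for (int i = 0; i < n; ++i) {
            fields[i] = comparators[i].value(entry.slot);
        }
        return new FieldDoc(entry.docID, entry.score, fields);
    }
}
```

The two points the sketch tries to get right are the ones the existing comment gets wrong: the parameter is an Entry rather than a FieldDoc, and the return value is a new object rather than "the same FieldDoc passed in".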




[jira] Created: (LUCENE-1862) duplicate package.html files in queryParser and analsysis.cn packages

2009-08-27 Thread Hoss Man (JIRA)
duplicate package.html files in queryParser and analsysis.cn packages
-

 Key: LUCENE-1862
 URL: https://issues.apache.org/jira/browse/LUCENE-1862
 Project: Lucene - Java
  Issue Type: Bug
Reporter: Hoss Man
 Fix For: 2.9


These files conflict with each other when building the javadocs. There can be 
only one (of each) ...

{code}
hoss...@brunner:~/lucene/java$ find src contrib -name package.html | perl -ple 's{.*src/java/}{}' | sort | uniq -c | grep -v " 1 "
   2 org/apache/lucene/analysis/cn/package.html
   2 org/apache/lucene/queryParser/package.html
hoss...@brunner:~/lucene/java$ find src contrib -path \*queryParser/package.html
src/java/org/apache/lucene/queryParser/package.html
contrib/queryparser/src/java/org/apache/lucene/queryParser/package.html
hoss...@brunner:~/lucene/java$ find src contrib -path \*cn/package.html
contrib/analyzers/common/src/java/org/apache/lucene/analysis/cn/package.html
contrib/analyzers/smartcn/src/java/org/apache/lucene/analysis/cn/package.html
{code}
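The same check could be scripted in Java, for example as part of a build sanity test. The sketch below is an assumption-laden equivalent of the shell pipeline above: the class and method names are made up for illustration, and it treats everything after the last "src/java/" in a path as the package-relative name, as the perl substitution does.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class DuplicatePackageHtml {
    private static final String MARKER = "src/java/";

    /** Strips everything up to and including the last "src/java/". */
    static String packageRelative(String path) {
        String normalized = path.replace('\\', '/');
        int i = normalized.lastIndexOf(MARKER);
        return i < 0 ? normalized : normalized.substring(i + MARKER.length());
    }

    /** Groups paths by package-relative name and keeps only names seen more than once. */
    static Map<String, List<String>> duplicates(Collection<String> paths) {
        Map<String, List<String>> grouped = new TreeMap<>();
        for (String p : paths) {
            grouped.computeIfAbsent(packageRelative(p), k -> new ArrayList<>()).add(p);
        }
        grouped.values().removeIf(files -> files.size() < 2);
        return grouped;
    }

    /** Collects every package.html under the given roots (e.g. "src" and "contrib"). */
    static List<String> findPackageHtml(List<Path> roots) throws IOException {
        List<String> found = new ArrayList<>();
        for (Path root : roots) {
            try (Stream<Path> walk = Files.walk(root)) {
                found.addAll(walk
                    .filter(p -> p.getFileName() != null
                              && p.getFileName().toString().equals("package.html"))
                    .map(Path::toString)
                    .collect(Collectors.toList()));
            }
        }
        return found;
    }
}
```

Feeding the result of findPackageHtml into duplicates would list each conflicting package.html together with the source trees it lives in.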






[jira] Resolved: (LUCENE-1861) Add contrib libs to classpath for javadoc

2009-08-27 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man resolved LUCENE-1861.
--

   Resolution: Fixed
Fix Version/s: 2.9
 Assignee: Hoss Man

Committed revision 808419.

> Add contrib libs to classpath for javadoc
> -
>
> Key: LUCENE-1861
> URL: https://issues.apache.org/jira/browse/LUCENE-1861
> Project: Lucene - Java
>  Issue Type: Wish
>  Components: Build
>Reporter: Mark Miller
>Assignee: Hoss Man
>Priority: Minor
> Fix For: 2.9
>
>
> I don't know Ant well enough to just do this easily, so I've labeled a wish - 
> would be nice to get rid of all the errors/warnings that not finding these 
> classes generates when building javadoc.




[jira] Commented: (LUCENE-1798) FieldCacheSanityChecker called directly by FieldCache.get*

2009-08-25 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12747570#action_12747570
 ] 

Hoss Man commented on LUCENE-1798:
--

bq. I suppose we could switch to InsanityMonitor but then provide a 
PrintStreamInsanityMonitor impl... still seems kinda overkill though.

...that was what i had in mind, but you're right -- it is overkill.  a 
PrintStream is a nice quick and easy way to get this info -- if they really 
want robust data structures they can use the sanity checker directly (possibly 
even from a mock PrintStream)

bq. I agree we could make LuceneTestCase.tearDown more robust if tap into this, 
though the simple infoStream could also be used for that?

... sure, because if *anything* gets written to that stream, it indicates a bug 
... unless they expect it, in which case they can catch an exception and ignore it.

but the LuceneTestCase changes are less urgent ... i was mainly worried about 
making sure we were happy with the API.  You've convinced me.
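A minimal sketch of the "mock PrintStream" idea, using only the JDK: a stream that records whatever is written so a test can fail if anything shows up. The CapturingInfoStream name is hypothetical, and how the stream gets handed to the FieldCache is left out because that setter is exactly the API under discussion.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;

final class CapturingInfoStream {
    private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    // autoflush so println() output reaches the buffer right away
    private final PrintStream stream = new PrintStream(buffer, true);

    /** The stream to hand to whatever component logs its warnings. */
    PrintStream stream() {
        return stream;
    }

    /** True if anything at all was written, which a test can treat as a failure. */
    boolean anythingWritten() {
        stream.flush();
        return buffer.size() > 0;
    }

    /** Everything written so far, decoded with the platform default charset. */
    String contents() {
        stream.flush();
        return buffer.toString();
    }
}
```

A tearDown method could then assert that anythingWritten() is false and include contents() in the failure message.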

> FieldCacheSanityChecker called directly by FieldCache.get*
> --
>
> Key: LUCENE-1798
> URL: https://issues.apache.org/jira/browse/LUCENE-1798
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Search
>Reporter: Hoss Man
>Assignee: Michael McCandless
> Fix For: 2.9
>
> Attachments: LUCENE-1798.patch, LUCENE-1798.patch
>
>
> As suggested by McCandless in LUCENE-1749, we can make FieldCacheImpl a 
> client of the FieldCacheSanityChecker and have it sanity check itself each 
> time it creates a new cache entry, and log a warning if it thinks there is a 
> problem.  (although we'd probably only want to do this if the caller has set 
> some sort of infoStream/warningStream type property on the FieldCache object.)




[jira] Commented: (LUCENE-1798) FieldCacheSanityChecker called directly by FieldCache.get*

2009-08-25 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12747538#action_12747538
 ] 

Hoss Man commented on LUCENE-1798:
--

Michael: reading the patch you committed, i only have two concerns...

1) a PrintStream doesn't really seem like the ideal callback API for this 
situation ... with IndexWriter it makes some sense because we want to be able to 
log all sorts of misc info that will be unstructured, but in the field cache 
checking case we already have a fairly robust data structure (Insanity) that we 
can provide ... so instead of a setInfoStream(PrintStream) method, why not have 
a callback interface that takes Insanity objects (and the Entry that triggered 
the problem)?

{code}
  /** 
   * If non-null, this monitor will be notified any time an entry is created 
   * that is not sane according to {@link FieldCacheSanityChecker}.
   * @param monitor The Monitor to notify; if it throws a RuntimeException then 
   *        the cache method will throw a RuntimeException.
   */
  public void setInsanityMonitor(InsanityMonitor monitor)
  ...
  public interface InsanityMonitor {
    public void notify(CacheEntry e, Insanity[] i);
  }
{code}

2) it seems like we should change LuceneTestCase to use this new hook instead 
of just calling the FieldCacheSanityChecker in tearDown() ... that way we can 
be sure we're checking all FieldCache usages (the current approach risks 
IndexReader weak refs getting gc'ed after they go out of scope in the test and 
before the checker runs in tearDown)
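To make the two points concrete, here is a self-contained sketch of how such a callback could behave, with a fail-fast monitor of the kind a test case might install. Everything in it is a stand-in: CacheEntry, Insanity and ToyCache are hypothetical simplifications rather than the real Lucene classes, and the callback method is named onInsanity here only to avoid confusion with Object.notify().

```java
final class InsanityMonitorSketch {
    /** Hypothetical stand-in for a FieldCache entry. */
    static final class CacheEntry {
        final String field;

        CacheEntry(String field) {
            this.field = field;
        }
    }

    /** Hypothetical stand-in for a sanity checker report. */
    static final class Insanity {
        final String message;

        Insanity(String message) {
            this.message = message;
        }
    }

    /** Callback invoked when a newly created entry is judged insane. */
    interface InsanityMonitor {
        void onInsanity(CacheEntry entry, Insanity[] problems);
    }

    /** A monitor a test could install: any report becomes an exception. */
    static final class FailFastMonitor implements InsanityMonitor {
        @Override
        public void onInsanity(CacheEntry entry, Insanity[] problems) {
            throw new IllegalStateException(
                problems.length + " insanity report(s) for field '" + entry.field + "'");
        }
    }

    /** Toy cache: the 'insane' flag stands in for a real sanity check. */
    static final class ToyCache {
        private InsanityMonitor monitor;

        void setInsanityMonitor(InsanityMonitor monitor) {
            this.monitor = monitor;
        }

        CacheEntry create(String field, boolean insane) {
            CacheEntry entry = new CacheEntry(field);
            if (insane && monitor != null) {
                // a RuntimeException from the monitor propagates to the caller
                monitor.onInsanity(entry, new Insanity[] { new Insanity("toy problem") });
            }
            return entry;
        }
    }
}
```

Because the monitor runs at entry creation time, the check does not depend on the reader still being reachable when tearDown() runs.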


thoughts?


> FieldCacheSanityChecker called directly by FieldCache.get*
> --
>
> Key: LUCENE-1798
> URL: https://issues.apache.org/jira/browse/LUCENE-1798
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Search
>Reporter: Hoss Man
>Assignee: Michael McCandless
> Fix For: 2.9
>
> Attachments: LUCENE-1798.patch, LUCENE-1798.patch
>
>
> As suggested by McCandless in LUCENE-1749, we can make FieldCacheImpl a 
> client of the FieldCacheSanityChecker and have it sanity check itself each 
> time it creates a new cache entry, and log a warning if it thinks there is a 
> problem.  (although we'd probably only want to do this if the caller has set 
> some sort of infoStream/warningStream type property on the FieldCache object.)




[jira] Commented: (LUCENE-1798) FieldCacheSanityChecker called directly by FieldCache.get*

2009-08-23 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12746719#action_12746719
 ] 

Hoss Man commented on LUCENE-1798:
--

i haven't looked at the patch, but i don't think you need two calls to the 
sanity checker. 

Why not just a single call after the val has been created, and log if any of the 
Insanity objects contain the new val?
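As a rough illustration of that single-call approach: run the checker once after the value exists, then keep only the reports that mention the new value. The Insanity class below is a hypothetical stand-in that simply lists the cache values a report implicates; values are compared by identity.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class SingleCheckSketch {
    /** Hypothetical stand-in for an Insanity report: the cache values it implicates. */
    static final class Insanity {
        final List<Object> values;

        Insanity(Object... values) {
            this.values = Arrays.asList(values);
        }
    }

    /** Keeps only the reports that implicate newVal, compared by identity. */
    static List<Insanity> involving(Object newVal, List<Insanity> all) {
        List<Insanity> hits = new ArrayList<>();
        for (Insanity report : all) {
            for (Object v : report.values) {
                if (v == newVal) {
                    hits.add(report);
                    break; // one match is enough for this report
                }
            }
        }
        return hits;
    }
}
```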

> FieldCacheSanityChecker called directly by FieldCache.get*
> --
>
> Key: LUCENE-1798
> URL: https://issues.apache.org/jira/browse/LUCENE-1798
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Search
>Reporter: Hoss Man
>Assignee: Michael McCandless
> Fix For: 2.9
>
> Attachments: LUCENE-1798.patch
>
>
> As suggested by McCandless in LUCENE-1749, we can make FieldCacheImpl a 
> client of the FieldCacheSanityChecker and have it sanity check itself each 
> time it creates a new cache entry, and log a warning if it thinks there is a 
> problem.  (although we'd probably only want to do this if the caller has set 
> some sort of infoStream/warningStream type property on the FieldCache object.)




[jira] Commented: (LUCENE-718) possible build.xml addition to ensure 1.4 class compatibility

2009-08-18 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12744901#action_12744901
 ] 

Hoss Man commented on LUCENE-718:
-

you realize we're going to have this same problem with 1.6 right?

> possible build.xml addition to ensure 1.4 class compatibility
> -
>
> Key: LUCENE-718
> URL: https://issues.apache.org/jira/browse/LUCENE-718
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Other
>Reporter: Hoss Man
> Attachments: check.bootclasspath.patch, check.bootclasspath.patch
>
>
> As encountered recently, setting the "source" and "target" values for the 
> java compiler doesn't actually test that the classes/methods are 1.4 
> compatible -- just that the language syntax/features are...
> http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6333296
> ...i've come up with one possible solution that really feels like a hack, 
> but i wanted to throw it out here for comment. in a nutshell:
>    1) we support a new optional javac.bootclasspath property indicating which 
>       path the compiler should use.
>    2) people compiling with 1.4 can ignore that property
>    3) anyone who has a 1.5 compiler by default can set this property to 
>       point at a 1.4 copy of the rt.jar -- which is not included (users would 
>       need to install it themselves)
>    4) as part of the "init" target the build file will attempt to compile a 
>       java class that is syntactically correct in java 1.4, but utilizes a 
>       method only available in 1.5 ... if this class compiles cleanly, the 
>       task will fail.
>    5) java 1.5 users that aren't concerned about submitting compatible 
>       patches back to the community and don't want to hassle with a 1.4 
>       version of rt.jar can set a "javac.trustbootclasspath" and go about 
>       their merry way.
> The main idea here being that if someone has both JVMs installed and 
> accidentally uses the wrong one to test something before submitting a patch or 
> committing, their build will either fail with a helpful message, or compile 
> against the correct set of core classes anyway if they've done a small amount 
> of setup.
> Caveats to committing this:
>    a) it's a hack, so i don't want to commit unless multiple people like it
>    b) at the moment, all "successful" ant executions print a confusing 
>       compiler error right off the bat; it would be better if we could 
>       suppress that somehow.
>    c) the BUILD.txt should be updated accordingly.
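For illustration, a probe class of the kind step 4 describes might look like the sketch below. The class name is made up, and the attached patch may well use a different method. The sketch relies on String.contains(CharSequence), which was added in Java 5: the source is valid 1.4 syntax, but it only compiles when the bootclasspath contains a 1.5 or later class library.

```java
final class BootClasspathProbe {
    /**
     * Valid Java 1.4 syntax, but String.contains(CharSequence) only exists
     * in the Java 5+ class library, so compiling against a 1.4 rt.jar fails.
     */
    static boolean usesJava5OnlyMethod() {
        return "lucene".contains("cen");
    }
}
```

If this compiles, the compiler is not using a 1.4 class library, which is the situation the build check is meant to flag. The same trick would need a newer probe method for each later JDK boundary.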




[jira] Commented: (LUCENE-1821) Weight.scorer() not passed doc offset for "sub reader"

2009-08-18 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12744824#action_12744824
 ] 

Hoss Man commented on LUCENE-1821:
--

bq. you are caching from external id to ord - its really not something I think 
we intend to support. The fact that we don't support it is why we were able to 
make this change. The FieldCache is the caching mechanism that Lucene supports 
with internal ids - and it supports it per segment.

I think Tim's got a valid point though about wanting an ordinal value across 
the entire index ... he's not using external ids, he's using the internal 
lucene docIds, and wants to know the ordinal value of a field for each doc 
across the entire index -- as he said, he's essentially using a 
FieldCache.StringIndex he just doesn't care about the String[] part.

Solr had/has the same problem with some of the function queries that wanted 
ordinal values (or the min/max field value for the whole index) that i think 
yonik just punted on and fetched the outermost field cache anyway ... we just 
weren't using it inside the Weight class, so we didn't encounter the specified 
problem Tim did.
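The arithmetic being asked for is small. Assuming each segment's starting offset (its "doc base") is available, a segment-relative doc id maps to a whole-index doc id by addition, and a whole-index ordinal cache can then be indexed directly. The names below are hypothetical, and the int[] ordinal array is a simplified stand-in for the ordinal part of a FieldCache.StringIndex.

```java
final class GlobalDocIds {
    /** Doc base of each segment: the sum of maxDoc over all earlier segments. */
    static int[] docBases(int[] maxDocs) {
        int[] bases = new int[maxDocs.length];
        int next = 0;
        for (int i = 0; i < maxDocs.length; i++) {
            bases[i] = next;
            next += maxDocs[i];
        }
        return bases;
    }

    /** Whole-index doc id for a doc id that is relative to one segment. */
    static int globalDoc(int docBase, int segmentDoc) {
        return docBase + segmentDoc;
    }

    /** Looks up an ordinal in a cache that was built over the entire index. */
    static int ordinal(int[] ordByGlobalDoc, int docBase, int segmentDoc) {
        return ordByGlobalDoc[globalDoc(docBase, segmentDoc)];
    }
}
```

The open question in the issue is only how a Scorer gets hold of docBase; the lookup itself is the one-liner above.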


> Weight.scorer() not passed doc offset for "sub reader"
> --
>
> Key: LUCENE-1821
> URL: https://issues.apache.org/jira/browse/LUCENE-1821
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: Search
>Affects Versions: 2.9
>Reporter: Tim Smith
>
> Now that searching is done on a per segment basis, there is no way for a 
> Scorer to know the "actual" doc id for the documents it matches (only the 
> relative doc offset into the segment)
> If using caches in your scorer that are based on the "entire" index (all 
> segments), there is now no way to index into them properly from inside a 
> Scorer because the scorer is not passed the needed offset to calculate the 
> "real" docid
> suggest having Weight.scorer() method also take an integer for the doc offset
> Abstract Weight class should have a constructor that takes this offset as 
> well as a method to get the offset
> All Weights that have "sub" weights must pass this offset down to created 
> "sub" weights




[jira] Resolved: (LUCENE-1791) Enhance QueryUtils and CheckHIts to wrap everything they check in MultiReader/MultiSearcher

2009-08-18 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man resolved LUCENE-1791.
--

Resolution: Fixed

Geezz... you expect me to commit *and* resolve ... i thought this open source 
stuff was all about volunteering and cooperation, how come i gotta do _both_ 
?!?!

> Enhance QueryUtils and CheckHIts to wrap everything they check in 
> MultiReader/MultiSearcher
> ---
>
> Key: LUCENE-1791
> URL: https://issues.apache.org/jira/browse/LUCENE-1791
> Project: Lucene - Java
>  Issue Type: Test
>Reporter: Hoss Man
>Assignee: Hoss Man
> Fix For: 2.9
>
> Attachments: LUCENE-1791.patch, LUCENE-1791.patch, LUCENE-1791.patch, 
> LUCENE-1791.patch, LUCENE-1791.patch, LUCENE-1791.patch, LUCENE-1791.patch, 
> LUCENE-1791.patch, LUCENE-1791.patch
>
>
> methods in CheckHits & QueryUtils are in a good position to take any Searcher 
> they are given and not only test it, but also test MultiReader & 
> MultiSearcher constructs built around them




[jira] Assigned: (LUCENE-1791) Enhance QueryUtils and CheckHIts to wrap everything they check in MultiReader/MultiSearcher

2009-08-15 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man reassigned LUCENE-1791:


Assignee: Hoss Man

> Enhance QueryUtils and CheckHIts to wrap everything they check in 
> MultiReader/MultiSearcher
> ---
>
> Key: LUCENE-1791
> URL: https://issues.apache.org/jira/browse/LUCENE-1791
> Project: Lucene - Java
>  Issue Type: Test
>Reporter: Hoss Man
>Assignee: Hoss Man
> Fix For: 2.9
>
> Attachments: LUCENE-1791.patch, LUCENE-1791.patch, LUCENE-1791.patch, 
> LUCENE-1791.patch, LUCENE-1791.patch, LUCENE-1791.patch, LUCENE-1791.patch, 
> LUCENE-1791.patch
>
>
> methods in CheckHits & QueryUtils are in a good position to take any Searcher 
> they are given and not only test it, but also test MultiReader & 
> MultiSearcher constructs built around them




[jira] Updated: (LUCENE-1791) Enhance QueryUtils and CheckHIts to wrap everything they check in MultiReader/MultiSearcher

2009-08-15 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated LUCENE-1791:
-

Attachment: LUCENE-1791.patch

i put the doc "ids" into a KEY field and refactored ItemizedFilter to be a 
trivial subclass of FieldCacheTermsFilter.

I also added more wrap permutations to address some of the possible edge cases 
Simon pointed out (good catch Simon) but didn't introduce any randomization for 
the reasons mentioned before (even with the change to not rely on consistent 
docIds in ItemizedFilter, we can't allow deletions before the wrapped 
searcher/reader because CheckHits does its magic based on docIds). 

(hmm... i suppose the wrap functions could return some metadata about what 
offset the old ids have in the new search/reader and CheckHits could use that 
... hmmm ... seems kludgy so i'm not going to worry about it)

I think we're good to go here unless anyone has any objections

> Enhance QueryUtils and CheckHIts to wrap everything they check in 
> MultiReader/MultiSearcher
> ---
>
> Key: LUCENE-1791
> URL: https://issues.apache.org/jira/browse/LUCENE-1791
> Project: Lucene - Java
>  Issue Type: Test
>Reporter: Hoss Man
> Fix For: 2.9
>
> Attachments: LUCENE-1791.patch, LUCENE-1791.patch, LUCENE-1791.patch, 
> LUCENE-1791.patch, LUCENE-1791.patch, LUCENE-1791.patch, LUCENE-1791.patch, 
> LUCENE-1791.patch
>
>
> methods in CheckHits & QueryUtils are in a good position to take any Searcher 
> they are given and not only test it, but also test MultiReader & 
> MultiSearcher constructs built around them




[jira] Commented: (LUCENE-1791) Enhance QueryUtils and CheckHIts to wrap everything they check in MultiReader/MultiSearcher

2009-08-13 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12743039#action_12743039
 ] 

Hoss Man commented on LUCENE-1791:
--

bq. The filter is applied per sub-reader

Doh! .. right, that's the part i was forgetting -- filters are another thing 
that happen per reader now.

sweet.  i'll review the patch for real tomorrow or sat, and maybe rip out the 
ItemizedFilter and replace it with something more sensical (it's a bad example 
that people might stumble upon, that seems even more confusing now, and i feel 
responsible since i'm the one that wrote it for those explanation tests anyway)


> Enhance QueryUtils and CheckHIts to wrap everything they check in 
> MultiReader/MultiSearcher
> ---
>
> Key: LUCENE-1791
> URL: https://issues.apache.org/jira/browse/LUCENE-1791
> Project: Lucene - Java
>  Issue Type: Test
>Reporter: Hoss Man
> Fix For: 2.9
>
> Attachments: LUCENE-1791.patch, LUCENE-1791.patch, LUCENE-1791.patch, 
> LUCENE-1791.patch, LUCENE-1791.patch, LUCENE-1791.patch, LUCENE-1791.patch
>
>
> methods in CheckHits & QueryUtils are in a good position to take any Searcher 
> they are given and not only test it, but also test MultiReader & 
> MultiSearcher constructs built around them




[jira] Commented: (LUCENE-1791) Enhance QueryUtils and CheckHIts to wrap everything they check in MultiReader/MultiSearcher

2009-08-13 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12743032#action_12743032
 ] 

Hoss Man commented on LUCENE-1791:
--

bq. Okay, I've got all tests passing.

Sweet! ... Even the random boolean query test? (i see changes to ReqExclScorer 
in your patch, but i don't really see how that could solve the problem)

bq. I rigged up a smarted ItemizedFilter that works for these tests.

I'm clearly missing something ... why does ItemizedFilter need to change?  I 
know it's an abomination and you would never want to use something like that in 
real code, but in this test the docIds passed to it on construction shouldn't 
change -- even when we wrap the reader/searcher in other searchers, i 
specifically only put empty indexes (with no deletions) in front of the original 
index when building the multireader/searcher so the docIds will be the same.  
why doesn't the existing implementation work?

bq. Would it make sense to randomized reader / searcher indexes

Simon: randomized wrapping would be nice, but i didn't try to do that for two 
reasons:
   * the utility has no way to get the seed from LuceneTestCase, and i didn't 
want to introduce randomness unless it was predictable randomness that could be 
logged.
   * i needed to be sure we didn't add an index with deletions prior to the 
original index (but based on Mark's comments, it looks like that's not really 
an issue since we have to deal with it anyway)
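A sketch of what "predictable randomness" could look like if the seed were made available to the utility: every random choice comes from a Random built from a caller-supplied seed, so logging the seed is enough to replay a failure. The class and method names are hypothetical, and the sketch only randomizes how many empty readers are placed before and after the original one.

```java
import java.util.Random;

final class SeededWrapChoice {
    /**
     * Picks how many empty readers to put before and after the original one.
     * The same seed always yields the same pair, so a logged seed can be replayed.
     */
    static int[] emptyReaderCounts(long seed, int max) {
        Random random = new Random(seed);
        int before = random.nextInt(max + 1); // 0..max inclusive
        int after = random.nextInt(max + 1);
        return new int[] { before, after };
    }
}
```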

> Enhance QueryUtils and CheckHIts to wrap everything they check in 
> MultiReader/MultiSearcher
> ---
>
> Key: LUCENE-1791
> URL: https://issues.apache.org/jira/browse/LUCENE-1791
> Project: Lucene - Java
>  Issue Type: Test
>Reporter: Hoss Man
> Fix For: 2.9
>
> Attachments: LUCENE-1791.patch, LUCENE-1791.patch, LUCENE-1791.patch, 
> LUCENE-1791.patch, LUCENE-1791.patch, LUCENE-1791.patch
>
>
> methods in CheckHits & QueryUtils are in a good position to take any Searcher 
> they are given and not only test it, but also test MultiReader & 
> MultiSearcher constructs built around them




[jira] Updated: (LUCENE-1791) Enhance QueryUtils and CheckHIts to wrap everything they check in MultiReader/MultiSearcher

2009-08-13 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated LUCENE-1791:
-

Attachment: LUCENE-1791.patch

one other thing i experimented with earlier today was making the "wrapped" 
MultiReaders and MultiSearchers a little more complicated (deeper structures, 
and adding some deleted docs to the other subreaders)

at first i ran into problems because the deleted docs would throw off the 
docIds of real docs (and some tests expected certain docs to have specific IDs) 
but even when the deleted docs only appear "after" the real docs, there are 
still some test failures with this latest patch.

TestSimpleExplanations and TestComplexExplanations are the ones i've been 
looking at but there may be others (my laptop gets too hot if i try to run the 
full test suite too often)

my head hurts trying to figure out why the deeper nested structures and deleted 
docs might be causing these new failures ... i'll try to look at it tomorrow.


> Enhance QueryUtils and CheckHIts to wrap everything they check in 
> MultiReader/MultiSearcher
> ---
>
> Key: LUCENE-1791
> URL: https://issues.apache.org/jira/browse/LUCENE-1791
> Project: Lucene - Java
>  Issue Type: Test
>Reporter: Hoss Man
> Fix For: 2.9
>
> Attachments: LUCENE-1791.patch, LUCENE-1791.patch, LUCENE-1791.patch, 
> LUCENE-1791.patch
>
>
> methods in CheckHits & QueryUtils are in a good position to take any Searcher 
> they are given and not only test it, but also test MultiReader & 
> MultiSearcher constructs built around them




[jira] Updated: (LUCENE-1791) Enhance QueryUtils and CheckHIts to wrap everything they check in MultiReader/MultiSearcher

2009-08-12 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated LUCENE-1791:
-

Attachment: LUCENE-1791.patch

I figured out the problem with TestComplexExplanations ... the test uses a 
searcher with a Custom Similarity, and the new code wasn't setting the same 
Similarity on the new Searcher & MultiSearcher being created.

> Enhance QueryUtils and CheckHIts to wrap everything they check in 
> MultiReader/MultiSearcher
> ---
>
> Key: LUCENE-1791
> URL: https://issues.apache.org/jira/browse/LUCENE-1791
> Project: Lucene - Java
>  Issue Type: Test
>Reporter: Hoss Man
> Fix For: 2.9
>
> Attachments: LUCENE-1791.patch, LUCENE-1791.patch, LUCENE-1791.patch
>
>
> methods in CheckHits & QueryUtils are in a good position to take any Searcher 
> they are given and not only test it, but also test MultiReader & 
> MultiSearcher constructs built around them




[jira] Commented: (LUCENE-1791) Enhance QueryUtils and CheckHIts to wrap everything they check in MultiReader/MultiSearcher

2009-08-12 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12742614#action_12742614
 ] 

Hoss Man commented on LUCENE-1791:
--

FYI: with Mark's updated patch, we're back to just the NaN failures from 
TestComplexExplanations.  (and possibly TestBoolean2.testRandomQueries, but i 
can't confirm that)

> Enhance QueryUtils and CheckHIts to wrap everything they check in 
> MultiReader/MultiSearcher
> ---
>
> Key: LUCENE-1791
> URL: https://issues.apache.org/jira/browse/LUCENE-1791
> Project: Lucene - Java
>  Issue Type: Test
>Reporter: Hoss Man
> Fix For: 2.9
>
> Attachments: LUCENE-1791.patch, LUCENE-1791.patch
>
>
> methods in CheckHits & QueryUtils are in a good position to take any Searcher 
> they are given and not only test it, but also test MultiReader & 
> MultiSearcher constructs built around them




[jira] Commented: (LUCENE-1791) Enhance QueryUtils and CheckHIts to wrap everything they check in MultiReader/MultiSearcher

2009-08-12 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12742606#action_12742606
 ] 

Hoss Man commented on LUCENE-1791:
--

midair collision (x2) ... i think i see what you mean in your revised patch ... 
the tests don't need changed, it's just the test utility methods that were 
trying to recurse the readers that weren't doing the entire job.

> Enhance QueryUtils and CheckHIts to wrap everything they check in 
> MultiReader/MultiSearcher
> ---
>
> Key: LUCENE-1791
> URL: https://issues.apache.org/jira/browse/LUCENE-1791
> Project: Lucene - Java
>  Issue Type: Test
>Reporter: Hoss Man
> Fix For: 2.9
>
> Attachments: LUCENE-1791.patch, LUCENE-1791.patch
>
>
> methods in CheckHits & QueryUtils are in a good position to take any Searcher 
> they are given and not only test it, but also test MultiReader & 
> MultiSearcher constructs built around them




[jira] Commented: (LUCENE-1791) Enhance QueryUtils and CheckHIts to wrap everything they check in MultiReader/MultiSearcher

2009-08-12 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12742603#action_12742603
 ] 

Hoss Man commented on LUCENE-1791:
--

{quote}
Well that explains half the output anyway - even if thats fixed there is still 
a fail. Its because the tests doesn't expand fully into the subreaders - just 
needed the top level before - with this test, we need to recursively grab them.
{quote}
You lost me there... are you saying the _tests_ need to be changed? ... why?  

For this patch to trigger an error in an existing test, that test must either 
be using CheckHits or QueryUtils to execute a query against a searcher and 
validate the results are ok ... why would the test be responsible for any 
subreader expansion in this case?

> Enhance QueryUtils and CheckHIts to wrap everything they check in 
> MultiReader/MultiSearcher
> ---
>
> Key: LUCENE-1791
> URL: https://issues.apache.org/jira/browse/LUCENE-1791
> Project: Lucene - Java
>  Issue Type: Test
>Reporter: Hoss Man
> Fix For: 2.9
>
> Attachments: LUCENE-1791.patch, LUCENE-1791.patch
>
>
> methods in CheckHits & QueryUtils are in a good position to take any Searcher 
> they are given and not only test it, but also test MultiReader & 
> MultiSearcher constructs built around them




[jira] Commented: (LUCENE-1791) Enhance QueryUtils and CheckHIts to wrap everything they check in MultiReader/MultiSearcher

2009-08-12 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12742600#action_12742600
 ] 

Hoss Man commented on LUCENE-1791:
--

bq. I'm guess the NAN failures are not a problem - looks like they fail because 
NAN != NAN?

right -- but why would the scores be NaN when wrapped in a MultiReader? when 
it's *not* wrapped in a MultiReader the test passes, so the scores must not be 
NaN in that case.
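As a reminder of why NaN scores break the comparison: in Java, NaN is never == to anything, including itself, while Float.compare treats NaN as equal to itself. The sketch below (hypothetical names) also shows a delta comparison that reports NaN explicitly instead of just never matching.

```java
final class NaNScores {
    /** Plain == comparison: false whenever either side is NaN, even if both are. */
    static boolean equalByOperator(float a, float b) {
        return a == b;
    }

    /** Delta comparison that fails loudly on NaN instead of silently never matching. */
    static boolean closeEnough(float expected, float actual, float delta) {
        if (Float.isNaN(expected) || Float.isNaN(actual)) {
            throw new IllegalArgumentException(
                "NaN score: expected=" + expected + " actual=" + actual);
        }
        return Math.abs(expected - actual) <= delta;
    }
}
```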

bq. I don't think the fieldcache insanity is multi-reader related [...] same 
stuff now, doubled entry.

The sanity checker ignores when two CacheEntries differ only by parser 
(precisely because of the null/default parser issue) and the resulting 
value object is the same.  but it does include all related CacheEntry objects 
in an Insanity object so that you have them all for debugging.

Looking at TestCustomScoreQuery.testCustomScoreByte (for example)...

{code}
*** BEGIN org.apache.lucene.search.function.TestCustomScoreQuery.testCustomScoreByte: Insane FieldCache usage(s) ***
SUBREADER: Found caches for decendents of org.apache.lucene.index.directoryrea...@88d2ae+iii

'org.apache.lucene.index.directoryrea...@88d2ae'=>'iii',byte,null=>[B#841343 (size =~ 33 bytes)

'org.apache.lucene.index.directoryrea...@88d2ae'=>'iii',byte,org.apache.lucene.search.FieldCache.DEFAULT_BYTE_PARSER=>[B#841343 (size =~ 33 bytes)

'org.apache.lucene.index.compoundfilereader$csindexin...@77daaa'=>'iii',byte,org.apache.lucene.search.FieldCache.DEFAULT_BYTE_PARSER=>[B#981898 (size =~ 33 bytes)

'org.apache.lucene.index.compoundfilereader$csindexin...@77daaa'=>'iii',byte,null=>[B#981898 (size =~ 33 bytes)

*** END org.apache.lucene.search.function.TestCustomScoreQuery.testCustomScoreByte: Insane FieldCache usage(s) ***

{code}

The insanity type is "SUBREADER", so it's specifically identified a problem with 
that type of relationship.  There are 4 CacheEntries listed in the error, all 
for the same field, but from two different readers.  If you note the value 
identity hashcodes (just before the size estimate), each reader has only one 
value cached for that field (with different parsers), which is why there isn't a 
separate error about the multiple values.  As the first line of the 
Insanity.toString() states, what it found is that directoryrea...@88d2ae and at 
least one of its descendants both have cached entries for the same field.
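
To make the SUBREADER relationship concrete, here is a small sketch under simplifying assumptions: readers are plain string keys, and the parent/child relationships are passed in as a map. None of these types are Lucene classes; they only model the check described above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model of the SUBREADER check; these types are not Lucene classes.
final class SubreaderCheckSketch {
    // A cache entry reduced to the two parts this check needs.
    static final class Entry {
        final String readerKey;
        final String field;
        Entry(String readerKey, String field) {
            this.readerKey = readerKey;
            this.field = field;
        }
    }

    // Returns "reader+field" for every reader that shares a cached field with a descendant.
    static List<String> findSubreaderInsanity(List<Entry> entries,
                                              Map<String, List<String>> children) {
        Map<String, Set<String>> fieldsByReader = new HashMap<>();
        for (Entry e : entries) {
            fieldsByReader.computeIfAbsent(e.readerKey, k -> new HashSet<>()).add(e.field);
        }
        List<String> problems = new ArrayList<>();
        for (Map.Entry<String, Set<String>> parent : fieldsByReader.entrySet()) {
            for (String field : parent.getValue()) {
                if (descendantHasField(parent.getKey(), field, children, fieldsByReader)) {
                    problems.add(parent.getKey() + "+" + field);
                }
            }
        }
        return problems;
    }

    // Depth-first walk over the (assumed acyclic) reader tree.
    private static boolean descendantHasField(String reader, String field,
                                              Map<String, List<String>> children,
                                              Map<String, Set<String>> fieldsByReader) {
        for (String child : children.getOrDefault(reader, Collections.<String>emptyList())) {
            Set<String> cached = fieldsByReader.get(child);
            if (cached != null && cached.contains(field)) {
                return true;
            }
            if (descendantHasField(child, field, children, fieldsByReader)) {
                return true;
            }
        }
        return false;
    }
}
```

The result string mirrors the "reader+field" form seen in the SUBREADER line of the log, but the format here is only illustrative.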



> Enhance QueryUtils and CheckHIts to wrap everything they check in 
> MultiReader/MultiSearcher
> ---
>
> Key: LUCENE-1791
> URL: https://issues.apache.org/jira/browse/LUCENE-1791
> Project: Lucene - Java
>  Issue Type: Test
>Reporter: Hoss Man
> Fix For: 2.9
>
> Attachments: LUCENE-1791.patch
>
>
> methods in CheckHits & QueryUtils are in a good position to take any Searcher 
> they are given and not only test it, but also test MultiReader & 
> MultiSearcher constructs built around them




[jira] Commented: (LUCENE-1791) Enhance QueryUtils and CheckHIts to wrap everything they check in MultiReader/MultiSearcher

2009-08-12 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1791?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12742552#action_12742552
 ] 

Hoss Man commented on LUCENE-1791:
--

bq. (grr it uses it's own random so no seed was logged)

Correction: it does log the seed, i was just looking for stderr when i should 
have been looking for stdout...

{code}
failed query: +field:w2 field:w3 field:xx field:w4 field:w2
NOTE: random seed of testcase 'testRandomQueries' was: 5695251427490718890
{code}
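
As a reminder of why the logged seed matters, a tiny sketch of replaying a randomized run from an explicit seed. SeedReplayDemo and randomTermIndexes are hypothetical names; this is not how TestBoolean2 builds its queries.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical illustration; not code from TestBoolean2.
final class SeedReplayDemo {
    // Builds a pseudo-random sequence of term indexes from an explicit seed.
    static int[] randomTermIndexes(long seed, int count, int numTerms) {
        Random rnd = new Random(seed);
        int[] picks = new int[count];
        for (int i = 0; i < count; i++) {
            picks[i] = rnd.nextInt(numTerms);
        }
        return picks;
    }

    public static void main(String[] args) {
        long seed = 5695251427490718890L; // the seed reported in the log above
        int[] first = randomTermIndexes(seed, 5, 4);
        int[] replay = randomTermIndexes(seed, 5, 4);
        System.out.println(Arrays.equals(first, replay)); // true
    }
}
```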

> Enhance QueryUtils and CheckHIts to wrap everything they check in 
> MultiReader/MultiSearcher
> ---
>
> Key: LUCENE-1791
> URL: https://issues.apache.org/jira/browse/LUCENE-1791
> Project: Lucene - Java
>  Issue Type: Test
>Reporter: Hoss Man
> Fix For: 2.9
>
> Attachments: LUCENE-1791.patch
>
>
> methods in CheckHits & QueryUtils are in a good position to take any Searcher 
> they are given and not only test it, but also test MultiReader & 
> MultiSearcher constructs built around them




[jira] Updated: (LUCENE-1791) Enhance QueryUtils and CheckHIts to wrap everything they check in MultiReader/MultiSearcher

2009-08-12 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated LUCENE-1791:
-

Fix Version/s: 2.9

I just retried this patch against the trunk now that the 
FieldCacheSanityChecker and some other patches have been committed.  In 
addition to the possible false-negatives from TestComplexExplanations (NaN 
score) this is now surfacing FieldCache sanity failures from 
TestCustomScoreQuery, TestFieldScoreQuery, and TestOrdValues (suggesting that 
there are code paths where those query types don't correctly use the subreaders 
to get the FieldCache) as well as checkFirstSkipTo() failures for 
TestSpansAdvanced2 and an ArrayIndexOutOfBoundsException from 
TestBoolean2.testRandomQueries  (grr it uses its own random so no seed was 
logged)

I don't pretend this patch is perfect, but i can't imagine these are all 
false-negatives.  

We should get to the bottom of this before 2.9.  I'll start trying to figure it 
out on the train tonight.

> Enhance QueryUtils and CheckHIts to wrap everything they check in 
> MultiReader/MultiSearcher
> ---
>
> Key: LUCENE-1791
> URL: https://issues.apache.org/jira/browse/LUCENE-1791
> Project: Lucene - Java
>  Issue Type: Test
>Reporter: Hoss Man
> Fix For: 2.9
>
> Attachments: LUCENE-1791.patch
>
>
> methods in CheckHits & QueryUtils are in a good position to take any Searcher 
> they are given and not only test it, but also test MultiReader & 
> MultiSearcher constructs built around them




[jira] Resolved: (LUCENE-1749) FieldCache introspection API

2009-08-12 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man resolved LUCENE-1749.
--

Resolution: Fixed
  Assignee: Hoss Man

Committed revision 803676.


> FieldCache introspection API
> 
>
> Key: LUCENE-1749
> URL: https://issues.apache.org/jira/browse/LUCENE-1749
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Search
>Reporter: Hoss Man
>Assignee: Hoss Man
>Priority: Minor
> Fix For: 2.9
>
> Attachments: fieldcache-introspection.patch, 
> LUCENE-1749-hossfork.patch, LUCENE-1749.patch, LUCENE-1749.patch, 
> LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, 
> LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, 
> LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, 
> LUCENE-1749.patch, LUCENE-1749.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the 
> FieldCache to provide info about what is in the FieldCache at any given 
> moment.  We should also provide utility methods for sanity checking that the 
> FieldCache doesn't contain anything "odd"...
>* entries for the same reader/field with different types/parsers
>* entries for the same field/type/parser in a reader and its subreader(s)
>* etc...




[jira] Updated: (LUCENE-1749) FieldCache introspection API

2009-08-12 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated LUCENE-1749:
-

Attachment: LUCENE-1749.patch

one last update: the Locale.US asserts in TestRemoteSort had the same problem 
as TestSort, they were supposed to be moved, but instead they were just copied 
(not sure how i missed that before)

> FieldCache introspection API
> 
>
> Key: LUCENE-1749
> URL: https://issues.apache.org/jira/browse/LUCENE-1749
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Search
>Reporter: Hoss Man
>Priority: Minor
> Fix For: 2.9
>
> Attachments: fieldcache-introspection.patch, 
> LUCENE-1749-hossfork.patch, LUCENE-1749.patch, LUCENE-1749.patch, 
> LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, 
> LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, 
> LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, 
> LUCENE-1749.patch, LUCENE-1749.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the 
> FieldCache to provide info about what is in the FieldCache at any given 
> moment.  We should also provide utility methods for sanity checking that the 
> FieldCache doesn't contain anything "odd"...
>* entries for the same reader/field with different types/parsers
>* entries for the same field/type/parser in a reader and its subreader(s)
>* etc...




[jira] Updated: (LUCENE-1749) FieldCache introspection API

2009-08-12 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated LUCENE-1749:
-

Attachment: LUCENE-1749.patch

updated patch to trunk (QueryWeight->Weight) and tweaked some FieldCacheImpl 
methods to use the non-deprecated Entry constructors (forgot that part before)

I'll commit as soon as my test run is finished.

> FieldCache introspection API
> 
>
> Key: LUCENE-1749
> URL: https://issues.apache.org/jira/browse/LUCENE-1749
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Search
>Reporter: Hoss Man
>Priority: Minor
> Fix For: 2.9
>
> Attachments: fieldcache-introspection.patch, 
> LUCENE-1749-hossfork.patch, LUCENE-1749.patch, LUCENE-1749.patch, 
> LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, 
> LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, 
> LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, 
> LUCENE-1749.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the 
> FieldCache to provide info about what is in the FieldCache at any given 
> moment.  We should also provide utility methods for sanity checking that the 
> FieldCache doesn't contain anything "odd"...
>* entries for the same reader/field with different types/parsers
>* entries for the same field/type/parser in a reader and its subreader(s)
>* etc...




[jira] Commented: (LUCENE-1789) getDocValues should provide a MultiReader DocValues abstraction

2009-08-12 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12742460#action_12742460
 ] 

Hoss Man commented on LUCENE-1789:
--

Cool... i don't suppose you have time to work on a patch? 

(what's the emoticon for fingers crossed?)

> getDocValues should provide a MultiReader DocValues abstraction
> ---
>
> Key: LUCENE-1789
> URL: https://issues.apache.org/jira/browse/LUCENE-1789
> Project: Lucene - Java
>  Issue Type: Improvement
>Reporter: Hoss Man
>Priority: Minor
> Fix For: 2.9
>
>
> When scoring a ValueSourceQuery, the scoring code calls 
> ValueSource.getValues(reader) on *each* leaf level subreader -- so DocValues 
> instances are backed by the individual FieldCache entries of the subreaders 
> -- but if client code were to inadvertently call getValues() on a 
> MultiReader (or DirectoryReader) they would wind up using the "outer" 
> FieldCache.
> Since getValues(IndexReader) returns DocValues, we have an advantage here 
> that we don't have with the FieldCache API (which is required to provide 
> direct array access). getValues(IndexReader) could be implemented so that *IF* 
> a caller inadvertently passes in a reader with non-null subReaders, getValues 
> could generate a DocValues instance for each of the subReaders, and then wrap 
> them in a composite "MultiDocValues".




[jira] Updated: (LUCENE-1749) FieldCache introspection API

2009-08-10 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated LUCENE-1749:
-

Attachment: LUCENE-1749.patch

slightly revised patch based on java-...@lucene discussion...

the sortFieldType and Locale portions of Cache.Entry are never used by 
FieldCache -- just by a deprecated class that abuses the Entry api out of 
laziness... so the CacheEntry debugging abstraction shouldn't expose them (but 
i left in code to manifest them in the toString() if they are atypical just in 
case).  Also added some deprecation notices so we remember to remove them once 
they are no longer needed.



> FieldCache introspection API
> 
>
> Key: LUCENE-1749
> URL: https://issues.apache.org/jira/browse/LUCENE-1749
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Search
>Reporter: Hoss Man
>Priority: Minor
> Fix For: 2.9
>
> Attachments: fieldcache-introspection.patch, 
> LUCENE-1749-hossfork.patch, LUCENE-1749.patch, LUCENE-1749.patch, 
> LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, 
> LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, 
> LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the 
> FieldCache to provide info about what is in the FieldCache at any given 
> moment.  We should also provide utility methods for sanity checking that the 
> FieldCache doesn't contain anything "odd"...
>* entries for the same reader/field with different types/parsers
>* entries for the same field/type/parser in a reader and its subreader(s)
>* etc...




[jira] Commented: (LUCENE-1798) FieldCacheSanityChecker called directly by FieldCache.get*

2009-08-10 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12741480#action_12741480
 ] 

Hoss Man commented on LUCENE-1798:
--

https://issues.apache.org/jira/browse/LUCENE-1749?focusedCommentId=12741479#action_12741479
 
{quote}
FieldCacheImpl.Cache.get could use the FieldCacheSanityChecker to inspect 
itself immediately after calling createValue, and could even test if any of the 
Insanity instances returned are related to the current call (by comparing the 
CacheEntry with the Entry it's using) ... it could even log a useful stack 
trace since the sanity check would be happening in the same call stack as at 
least one of the CacheEntries in the Insanity object.
{quote}

> FieldCacheSanityChecker called directly by FieldCache.get*
> --
>
> Key: LUCENE-1798
> URL: https://issues.apache.org/jira/browse/LUCENE-1798
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Search
>Reporter: Hoss Man
>
> As suggested by McCandless in LUCENE-1749, we can make FieldCacheImpl a 
> client of the FieldCacheSanityChecker and have it sanity check itself each 
> time it creates a new cache entry, and log a warning if it thinks there is a 
> problem.  (although we'd probably only want to do this if the caller has set 
> some sort of infoStream/warningStream type property on the FieldCache object.)




[jira] Commented: (LUCENE-1749) FieldCache introspection API

2009-08-10 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12741479#action_12741479
 ] 

Hoss Man commented on LUCENE-1749:
--

bq. Maybe we should simply print a warning, eg to System.err, on detecting that 
2X RAM usage has occurred, pointing people to the sanity checker? We could eg 
do it once only so we don't spam the stderr logs

I'm not really comfortable dumping anything to System.err without user 
requesting it ... but this is a really interesting idea.  (I suppose we could 
add an infoStream type idea to FieldCache to expose this)

FieldCacheImpl.Cache.get could use the FieldCacheSanityChecker to inspect 
itself immediately after calling createValue, and could even test if any of the 
Insanity instances returned are related to the current call (by comparing the 
CacheEntry with the Entry it's using) ... it could even log a useful stack 
trace since the sanity check would be happening in the same call stack as at 
least one of the CacheEntries in the Insanity object.
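
A rough sketch of that idea, assuming a toy cache keyed by reader name and field. SelfCheckingCache, its parentOf map, and setInfoStream are all hypothetical; this is not the FieldCacheImpl API, only an illustration of checking right after the value is created and staying silent unless a stream was requested.

```java
import java.io.PrintStream;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical sketch; not the FieldCacheImpl API.
final class SelfCheckingCache {
    private final Map<String, Map<String, Object>> byReader = new HashMap<>();
    private final Map<String, String> parentOf; // child reader -> parent reader (assumed acyclic)
    private PrintStream infoStream;             // null means "stay silent"
    private int warnings;

    SelfCheckingCache(Map<String, String> parentOf) {
        this.parentOf = parentOf;
    }

    void setInfoStream(PrintStream stream) {
        this.infoStream = stream;
    }

    int warningCount() {
        return warnings;
    }

    Object get(String reader, String field, Supplier<Object> createValue) {
        Map<String, Object> fields = byReader.computeIfAbsent(reader, k -> new HashMap<>());
        Object value = fields.get(field);
        if (value == null) {
            value = createValue.get();
            fields.put(field, value);
            if (infoStream != null) {
                sanityCheck(reader, field); // runs in the call stack that created the entry
            }
        }
        return value;
    }

    // Warn if an ancestor or descendant reader already caches the same field.
    private void sanityCheck(String reader, String field) {
        for (Map.Entry<String, Map<String, Object>> other : byReader.entrySet()) {
            String otherReader = other.getKey();
            if (otherReader.equals(reader) || !other.getValue().containsKey(field)) {
                continue;
            }
            if (isAncestor(otherReader, reader) || isAncestor(reader, otherReader)) {
                warnings++;
                infoStream.println("WARNING: field '" + field + "' is cached for both "
                    + reader + " and related reader " + otherReader);
                new Throwable("cache entry created here").printStackTrace(infoStream);
            }
        }
    }

    private boolean isAncestor(String candidate, String reader) {
        for (String p = parentOf.get(reader); p != null; p = parentOf.get(p)) {
            if (p.equals(candidate)) {
                return true;
            }
        }
        return false;
    }
}
```

Scanning every entry on each miss is the simplest thing that works for a sketch; a real implementation would presumably want something cheaper.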

I've opened LUCENE-1798 to track implementing something like this once the 
FieldCacheSanityChecker gets committed.

> FieldCache introspection API
> 
>
> Key: LUCENE-1749
> URL: https://issues.apache.org/jira/browse/LUCENE-1749
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Search
>Reporter: Hoss Man
>Priority: Minor
> Fix For: 2.9
>
> Attachments: fieldcache-introspection.patch, 
> LUCENE-1749-hossfork.patch, LUCENE-1749.patch, LUCENE-1749.patch, 
> LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, 
> LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, 
> LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the 
> FieldCache to provide info about what is in the FieldCache at any given 
> moment.  We should also provide utility methods for sanity checking that the 
> FieldCache doesn't contain anything "odd"...
>* entries for the same reader/field with different types/parsers
>* entries for the same field/type/parser in a reader and its subreader(s)
>* etc...




[jira] Created: (LUCENE-1798) FieldCacheSanityChecker called directly by FieldCache.get*

2009-08-10 Thread Hoss Man (JIRA)
FieldCacheSanityChecker called directly by FieldCache.get*
--

 Key: LUCENE-1798
 URL: https://issues.apache.org/jira/browse/LUCENE-1798
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Search
Reporter: Hoss Man


As suggested by McCandless in LUCENE-1749, we can make FieldCacheImpl a client 
of the FieldCacheSanityChecker and have it sanity check itself each time it 
creates a new cache entry, and log a warning if it thinks there is a problem.  
(although we'd probably only want to do this if the caller has set some sort of 
infoStream/warningStream type property on the FieldCache object.)




[jira] Commented: (LUCENE-1789) getDocValues should provide a MultiReader DocValues abstraction

2009-08-10 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12741464#action_12741464
 ] 

Hoss Man commented on LUCENE-1789:
--

bq. How about this: we add a new param to the ctors of the value sources, 
called (say) acceptMultiReader. It has 3 values:

...that would work ... but i feel like there may be a cleaner API possible 
here...

What if we just added a new MultiValueSource wrapper class, that acted as a 
proxy around another ValueSource so that the only non-transparent behavior is 
that MultiValueSource.getDocValues returns an instance of the new 
MultiDocValues we've been talking about.

If you use something like FloatFieldSource directly in your code, you get what 
you ask for: the FieldCache is fetched against the exact reader you supply (ie: 
YES_BURN_MEMORY).  If you want to use a FieldSource directly in your code, and 
you want to get good cache reuse, and you don't want to worry about the 
subreaders yourself, you wrap your FieldSource in a new 
MultiValueSource(myFieldSource)  (YES_BURN_TIME)
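
A minimal sketch of what such a wrapper might look like, using hypothetical stand-in interfaces (Reader, Values, Source) rather than Lucene's IndexReader, DocValues and ValueSource. It only illustrates the proxy idea: leaf readers pass straight through, composite readers are split so the delegate only ever sees leaves.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-ins for IndexReader, DocValues and ValueSource; not Lucene's classes.
final class MultiValueSourceSketch {
    interface Reader { int maxDoc(); List<Reader> subReaders(); } // empty list for a leaf
    interface Values { float floatVal(int doc); }
    interface Source { Values getValues(Reader reader); }

    static Reader leaf(final int maxDoc) {
        return new Reader() {
            public int maxDoc() { return maxDoc; }
            public List<Reader> subReaders() { return Collections.emptyList(); }
        };
    }

    static Reader composite(final Reader... subs) {
        int total = 0;
        for (Reader r : subs) total += r.maxDoc();
        final int maxDoc = total;
        return new Reader() {
            public int maxDoc() { return maxDoc; }
            public List<Reader> subReaders() { return Arrays.asList(subs); }
        };
    }

    // Proxy: behaves like the wrapped source, except composite readers are split into leaves.
    static Source wrap(final Source delegate) {
        return reader -> {
            if (reader.subReaders().isEmpty()) {
                return delegate.getValues(reader); // leaf: fully transparent
            }
            final List<Reader> leaves = new ArrayList<>();
            collectLeaves(reader, leaves);
            final List<Values> perLeaf = new ArrayList<>();
            final int[] starts = new int[leaves.size()];
            int base = 0;
            for (int i = 0; i < leaves.size(); i++) {
                starts[i] = base; // first top-level docId that belongs to leaf i
                base += leaves.get(i).maxDoc();
                perLeaf.add(delegate.getValues(leaves.get(i)));
            }
            return doc -> {
                int i = starts.length - 1;
                while (starts[i] > doc) i--; // linear scan back from the last leaf
                return perLeaf.get(i).floatVal(doc - starts[i]);
            };
        };
    }

    private static void collectLeaves(Reader r, List<Reader> out) {
        if (r.subReaders().isEmpty()) {
            out.add(r);
            return;
        }
        for (Reader sub : r.subReaders()) collectLeaves(sub, out);
    }
}
```

In this sketch the caller opts in explicitly with wrap(source), which matches the trade-off described above: code that uses a source directly keeps today's behavior, and code that wraps it pays an extra lookup per call in exchange for per-leaf cache reuse.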

The only thing this wouldn't get us is an obvious warning to developers on 
upgrading (like the deprecation warnings that would come from your suggested 
API) ... but since nothing about backwards compatibility is actually breaking 
here, that doesn't seem like the end of the world -- we can document it in 
CHANGES.txt (we're going to need a nice big section there about all the 
FieldCache usage changes anyway) drawing their attention to the new 
MultiValueSource they should consider using.

My thinking is this: anybody who is constructing new ValueSources directly is 
pretty deep into the code, odds are if they're using that type of code, they 
might be mucking with the FieldCache directly in other ways as well -- we can't 
solve all their problems, but we can give them helper code to make the 
transition easier.


> getDocValues should provide a MultiReader DocValues abstraction
> ---
>
> Key: LUCENE-1789
> URL: https://issues.apache.org/jira/browse/LUCENE-1789
> Project: Lucene - Java
>  Issue Type: Improvement
>Reporter: Hoss Man
>Priority: Minor
> Fix For: 2.9
>
>
> When scoring a ValueSourceQuery, the scoring code calls 
> ValueSource.getValues(reader) on *each* leaf level subreader -- so DocValues 
> instances are backed by the individual FieldCache entries of the subreaders 
> -- but if client code were to inadvertently call getValues() on a 
> MultiReader (or DirectoryReader) they would wind up using the "outer" 
> FieldCache.
> Since getValues(IndexReader) returns DocValues, we have an advantage here 
> that we don't have with the FieldCache API (which is required to provide 
> direct array access). getValues(IndexReader) could be implemented so that *IF* 
> a caller inadvertently passes in a reader with non-null subReaders, getValues 
> could generate a DocValues instance for each of the subReaders, and then wrap 
> them in a composite "MultiDocValues".




[jira] Commented: (LUCENE-1789) getDocValues should provide a MultiReader DocValues abstraction

2009-08-07 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740574#action_12740574
 ] 

Hoss Man commented on LUCENE-1789:
--

{quote}
While client code that has relied on this in the past will nicely
continue to function properly, if we make this change, its performance
is going to silently take a [possibly sizable] hit.
{quote}

Correct: a change like this could cause 2.9 to introduce a _time_ based 
performance hit from the added method call to resolve the sub(reader|docvalue) 
on each method call ... but if we don't have a change like this, 2.9 could 
introduce a _memory_ based performance hit from the other FieldCache changes, as 
client code accessing DocValues for the top level reader will create a 
duplication of the whole array.

Incidentally: I'm willing to believe you that the time based perf hit would be 
high, but my instinct is that it wouldn't be that bad: the DocValues API 
already introduces at least one method call per doc lookup (two depending on 
datatype).  Adding a second method call to delegate to a sub-DocValues instance 
doesn't seem that bad (especially since a new MultiDocValues class could get the 
subReader list and compute the docId offsets on init, and then reuse them on 
each method call)
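
To illustrate the "compute the docId offsets on init" part, a small sketch assuming a hypothetical DocValuesLike interface in place of Lucene's DocValues. The offsets are built once in the constructor, and each lookup is a binary search plus one delegated call.

```java
import java.util.Arrays;

// Hypothetical sketch of the proposed composite; DocValuesLike stands in for Lucene's DocValues.
final class MultiDocValuesSketch {
    interface DocValuesLike { float floatVal(int doc); }

    private final DocValuesLike[] subs;
    private final int[] starts; // starts[i] = first top-level docId that belongs to sub i

    MultiDocValuesSketch(DocValuesLike[] subs, int[] maxDocs) {
        this.subs = subs.clone();
        this.starts = new int[subs.length];
        int base = 0;
        for (int i = 0; i < subs.length; i++) { // computed once, reused on every lookup
            starts[i] = base;
            base += maxDocs[i];
        }
    }

    // Index of the sub instance that owns the given top-level docId.
    int subIndex(int doc) {
        int i = Arrays.binarySearch(starts, doc);
        if (i < 0) {
            i = -i - 2; // not an exact start: take the sub just before the insertion point
        } else {
            // exact hit: empty subs share a start value, so move to the last one with it
            while (i + 1 < starts.length && starts[i + 1] == doc) i++;
        }
        return i;
    }

    float floatVal(int doc) {
        int i = subIndex(doc);
        return subs[i].floatVal(doc - starts[i]); // rebase to the sub's local docId
    }
}
```

Callers are assumed to pass docIds in the range 0 to (sum of maxDocs) - 1; the sketch does no bounds checking.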

bq. In the core I think we should always switch "up high".

(In case there is any confusion: i wasn't suggesting that we stop using "up high" 
switching on DocValues in code included in the Lucene dist, i was suggesting 
that if someone uses DocValues directly in their code (against a top level 
reader) then we help them out by giving them the "down low" switching ... so 
"expected" usages wouldn't pay the added time based hit, just "unexpected" 
usages (which would be saved from the memory hit))

{quote}
Maybe it'd be best if we could somehow allow this "down low" switching
for 2.9, but 1) warn that you'll see a performance hit right off, 2)
deprecate it, and 3) and somehow state that in 3.0 you'll have to send
only a SegmentReader to this API, instead.
{quote}

that would get into really sticky territory for people writing custom 
IndexReaders (or using FilterIndexReader)

bq. But, if we make the proposed change here, the app could in fact just keep 
working off the top-level values (eg if the ctor in their class is pulling 
these values), thinking everything is fine when in fact there is a sizable, 
silent perf hit.

I agree ... but unless i'm missing something about the code on the trunk, that 
situation already exists: the developer might switch to using the Collector 
API, but nothing about the current trunk will prevent/warn him that this...

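{code}
// ValueSource is abstract, so use a concrete source such as FloatFieldSource
ValueSource vs = new FloatFieldSource("aFieldIAlsoSortOn");
// placeholder for however the app obtains its (possibly composite) reader
IndexReader r = getCurrentReaderThatCouldBeAMultiReader();
DocValues vals = vs.getValues(r);
{code}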

...could have a sizable, silent, _memory_ perf hit in 2.9

(ValueSource.getValues has a javadoc indicating that caching will be done on 
the IndexReader passed in, but your comment suggests that if 2.9 were released 
today (with the current trunk) people upgrading would have some obvious way of 
noticing that they need to pass a sub reader to getValues)






> getDocValues should provide a MultiReader DocValues abstraction
> ---
>
> Key: LUCENE-1789
> URL: https://issues.apache.org/jira/browse/LUCENE-1789
> Project: Lucene - Java
>  Issue Type: Improvement
>Reporter: Hoss Man
>Priority: Minor
> Fix For: 2.9
>
>
> When scoring a ValueSourceQuery, the scoring code calls 
> ValueSource.getValues(reader) on *each* leaf level subreader -- so DocValues 
> instances are backed by the individual FieldCache entries of the subreaders 
> -- but if client code were to inadvertently call getValues() on a 
> MultiReader (or DirectoryReader) they would wind up using the "outer" 
> FieldCache.
> Since getValues(IndexReader) returns DocValues, we have an advantage here 
> that we don't have with the FieldCache API (which is required to provide 
> direct array access). getValues(IndexReader) could be implemented so that *IF* 
> a caller inadvertently passes in a reader with non-null subReaders, getValues 
> could generate a DocValues instance for each of the subReaders, and then wrap 
> them in a composite "MultiDocValues".




[jira] Issue Comment Edited: (LUCENE-1789) getDocValues should provide a MultiReader DocValues abstraction

2009-08-07 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740328#action_12740328
 ] 

Hoss Man edited comment on LUCENE-1789 at 8/7/09 7:16 AM:
--

This idea originated in LUCENE-1749, see these comments...

https://issues.apache.org/jira/browse/LUCENE-1749?focusedCommentId=12740155#action_12740155
https://issues.apache.org/jira/browse/LUCENE-1749?focusedCommentId=12740256#action_12740256
https://issues.apache.org/jira/browse/LUCENE-1749?focusedCommentId=12740278#action_12740278


I've marked this for 2.9 for now; i think it's a "nice to have" in 2.9, 
because unlike general FieldCache usage, the API is abstract enough we can 
protect our users from mistakes; but i don't personally think it's critical 
that we do this if no one else wants to take a stab at it.

(EDIT: shorter versions of URLs to prevent horizontal scroll)

> getDocValues should provide a MultiReader DocValues abstraction
> ---
>
> Key: LUCENE-1789
> URL: https://issues.apache.org/jira/browse/LUCENE-1789
> Project: Lucene - Java
>  Issue Type: Improvement
>Reporter: Hoss Man
>Priority: Minor
> Fix For: 2.9
>
>
> When scoring a ValueSourceQuery, the scoring code calls 
> ValueSource.getValues(reader) on *each* leaf level subreader -- so DocValues 
> instances are backed by the individual FieldCache entries of the subreaders 
> -- but if client code were to inadvertently call getValues() on a 
> MultiReader (or DirectoryReader) they would wind up using the "outer" 
> FieldCache.
> Since getValues(IndexReader) returns DocValues, we have an advantage here 
> that we don't have with the FieldCache API (which is required to provide 
> direct array access). getValues(IndexReader) could be implemented so that *IF* 
> a caller inadvertently passes in a reader with non-null subReaders, getValues 
> could generate a DocValues instance for each of the subReaders, and then wrap 
> them in a composite "MultiDocValues".




[jira] Updated: (LUCENE-1791) Enhance QueryUtils and CheckHIts to wrap everything they check in MultiReader/MultiSearcher

2009-08-06 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1791?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated LUCENE-1791:
-

Attachment: LUCENE-1791.patch

Patch showing what i have in mind.

Current patch causes 14 failures in TestComplexExplanations, all of which stem 
from calling checkFirstSkipTo() on IndexSearcher created by 
wrapUnderlyingReader().  

The failure is QueryUtils.java:305: "unstable skipTo(0) score! expected: 
but was:" which seems like it could easily be a bug introduced by the 
patch, except...

   * Why do only 14 of the 22 tests in TestComplexExplanations fail?
   * Why wouldn't the original IndexSearcher have produced a NaN score?


> Enhance QueryUtils and CheckHIts to wrap everything they check in 
> MultiReader/MultiSearcher
> ---
>
> Key: LUCENE-1791
> URL: https://issues.apache.org/jira/browse/LUCENE-1791
> Project: Lucene - Java
>  Issue Type: Test
>Reporter: Hoss Man
> Attachments: LUCENE-1791.patch
>
>
> methods in CheckHits & QueryUtils are in a good position to take any Searcher 
> they are given and not only test it, but also test MultiReader & 
> MultiSearcher constructs built around them




[jira] Created: (LUCENE-1791) Enhance QueryUtils and CheckHIts to wrap everything they check in MultiReader/MultiSearcher

2009-08-06 Thread Hoss Man (JIRA)
Enhance QueryUtils and CheckHIts to wrap everything they check in 
MultiReader/MultiSearcher
---

 Key: LUCENE-1791
 URL: https://issues.apache.org/jira/browse/LUCENE-1791
 Project: Lucene - Java
  Issue Type: Test
Reporter: Hoss Man


methods in CheckHits & QueryUtils are in a good position to take any Searcher 
they are given and not only test it, but also test MultiReader & MultiSearcher 
constructs built around them




[jira] Updated: (LUCENE-1771) Using explain may double ram reqs for fieldcaches when using ValueSourceQuery/CustomScoreQuery or for ConstantScoreQuerys that use a caching Filter.

2009-08-06 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1771?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated LUCENE-1771:
-

Attachment: LUCENE-1771.patch

FWIW: the last patch was giving me compile errors because BoostingNearQuery 
still referenced the removed QueryWeight class.

I think this is the correct fix.

> Using explain may double ram reqs for fieldcaches when using 
> ValueSourceQuery/CustomScoreQuery or for ConstantScoreQuerys that use a 
> caching Filter.
> 
>
> Key: LUCENE-1771
> URL: https://issues.apache.org/jira/browse/LUCENE-1771
> Project: Lucene - Java
>  Issue Type: Bug
>  Components: Search
>Reporter: Mark Miller
>Assignee: Mark Miller
> Fix For: 2.9
>
> Attachments: LUCENE-1771.patch, LUCENE-1771.patch, LUCENE-1771.patch, 
> LUCENE-1771.patch, LUCENE-1771.patch, LUCENE-1771.patch
>
>





[jira] Commented: (LUCENE-1789) getDocValues should provide a MultiReader DocValues abstraction

2009-08-06 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740328#action_12740328
 ] 

Hoss Man commented on LUCENE-1789:
--

This idea originated in LUCENE-1749, see these comments...

https://issues.apache.org/jira/browse/LUCENE-1749?focusedCommentId=12740155&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12740155
https://issues.apache.org/jira/browse/LUCENE-1749?focusedCommentId=12740256&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12740256
https://issues.apache.org/jira/browse/LUCENE-1749?focusedCommentId=12740278&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12740278


I've marked this for 2.9 for now. I think it's a "nice to have" in 2.9,
because unlike general FieldCache usage, the API is abstract enough that we can
protect our users from mistakes; but I don't personally think it's critical
that we do this if no one else wants to take a stab at it.

> getDocValues should provide a MultiReader DocValues abstraction
> ---
>
> Key: LUCENE-1789
> URL: https://issues.apache.org/jira/browse/LUCENE-1789
> Project: Lucene - Java
>  Issue Type: Improvement
>Reporter: Hoss Man
>Priority: Minor
> Fix For: 2.9
>
>
> When scoring a ValueSourceQuery, the scoring code calls
> ValueSource.getValues(reader) on *each* leaf level subreader -- so DocValues
> instances are backed by the individual FieldCache entries of the subreaders
> -- but if client code were to inadvertently call getValues() on a
> MultiReader (or DirectoryReader) it would wind up using the "outer"
> FieldCache.
> Since getValues(IndexReader) returns DocValues, we have an advantage here
> that we don't have with the FieldCache API (which is required to provide direct
> array access). getValues(IndexReader) could be implemented so that *IF*
> a caller inadvertently passes in a reader with non-null subReaders, getValues
> could generate a DocValues instance for each of the subReaders, and then wrap
> them in a composite "MultiDocValues".




[jira] Created: (LUCENE-1789) getDocValues should provide a MultiReader DocValues abstraction

2009-08-06 Thread Hoss Man (JIRA)
getDocValues should provide a MultiReader DocValues abstraction
---

 Key: LUCENE-1789
 URL: https://issues.apache.org/jira/browse/LUCENE-1789
 Project: Lucene - Java
  Issue Type: Improvement
Reporter: Hoss Man
Priority: Minor
 Fix For: 2.9


When scoring a ValueSourceQuery, the scoring code calls
ValueSource.getValues(reader) on *each* leaf level subreader -- so DocValues
instances are backed by the individual FieldCache entries of the subreaders --
but if client code were to inadvertently call getValues() on a MultiReader
(or DirectoryReader) it would wind up using the "outer" FieldCache.

Since getValues(IndexReader) returns DocValues, we have an advantage here that
we don't have with the FieldCache API (which is required to provide direct array
access). getValues(IndexReader) could be implemented so that *IF* a caller
inadvertently passes in a reader with non-null subReaders, getValues could
generate a DocValues instance for each of the subReaders, and then wrap them in
a composite "MultiDocValues".
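
A rough sketch of what such a composite could look like, under stated
assumptions: this is not Lucene code. SimpleDocValues, ArrayDocValues and
MultiDocValuesSketch are hypothetical stand-ins, and the sketch assumes each
subreader owns a contiguous range of top-level doc ids.

```java
/** Hypothetical stand-in for a per-reader DocValues: one float per local doc id. */
interface SimpleDocValues {
    float floatVal(int doc);
    int maxDoc();
}

/** Values backed by a plain array, standing in for one subreader's cache entry. */
final class ArrayDocValues implements SimpleDocValues {
    private final float[] vals;
    ArrayDocValues(float... vals) { this.vals = vals; }
    @Override public float floatVal(int doc) { return vals[doc]; }
    @Override public int maxDoc() { return vals.length; }
}

/** Composite that routes a top-level doc id to the subreader that owns it. */
final class MultiDocValuesSketch implements SimpleDocValues {
    private final SimpleDocValues[] subs;
    // starts[i] is the first top-level doc id of subs[i]; the last slot is the total.
    private final int[] starts;

    MultiDocValuesSketch(SimpleDocValues... subs) {
        this.subs = subs;
        this.starts = new int[subs.length + 1];
        for (int i = 0; i < subs.length; i++) {
            starts[i + 1] = starts[i] + subs[i].maxDoc();
        }
    }

    @Override public int maxDoc() { return starts[subs.length]; }

    @Override public float floatVal(int doc) {
        if (doc < 0 || doc >= maxDoc()) {
            throw new IndexOutOfBoundsException("doc=" + doc);
        }
        // Linear scan keeps the sketch short; it also skips empty subreaders.
        for (int i = 0; i < subs.length; i++) {
            if (doc < starts[i + 1]) {
                return subs[i].floatVal(doc - starts[i]);
            }
        }
        throw new AssertionError("unreachable");
    }
}
```

The point of the design is that each per-subreader instance is built (and
cached) only against its own subreader, so nothing is ever loaded against the
outer reader.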







[jira] Updated: (LUCENE-1749) FieldCache introspection API

2009-08-06 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated LUCENE-1749:
-

Attachment: LUCENE-1749.patch


bq. the interesting thing is that the CacheEntry.toString() doesn't show the
Locale.US was used when getting the String[] FieldCache

I'm an idiot ... the Locale isn't used like a FieldCache Parser ... the same
String[] is used regardless of the Locale, so it's never part of the CacheKey.
The output is correct.

Revised patch fixes TestSort as Mark pointed out, and updates some javadocs
where I misleadingly suggested different Locales might trigger
InsanityType.VALUEMISMATCH
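
To illustrate the point about Locales, here is a sketch only. LocaleSortSketch
and its loader are hypothetical, not the FieldCache code. It assumes the cache
is keyed by field alone and that the Locale merely selects the comparator
applied to the shared String[].

```java
import java.text.Collator;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

final class LocaleSortSketch {
    // Keyed by field only: the Locale never participates in the key.
    private final Map<String, String[]> stringCache = new HashMap<>();

    /** Hypothetical loader: returns the one cached array for this field. */
    String[] getStrings(String field) {
        return stringCache.computeIfAbsent(field, f -> new String[] {"b", "a", "c"});
    }

    /** The Locale only picks the comparator; the cached array itself is untouched. */
    String[] sortedCopy(String field, Locale locale) {
        String[] copy = getStrings(field).clone();
        Arrays.sort(copy, Collator.getInstance(locale));
        return copy;
    }

    int cacheSize() { return stringCache.size(); }
}
```

Sorting the same field under two different Locales leaves a single cache entry,
which is why a Locale difference alone would not show up as a value mismatch.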

> FieldCache introspection API
> 
>
> Key: LUCENE-1749
> URL: https://issues.apache.org/jira/browse/LUCENE-1749
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Search
>Reporter: Hoss Man
>Priority: Minor
> Fix For: 2.9
>
> Attachments: fieldcache-introspection.patch, 
> LUCENE-1749-hossfork.patch, LUCENE-1749.patch, LUCENE-1749.patch, 
> LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, 
> LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, 
> LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the 
> FieldCache to provide info about what is in the FieldCache at any given 
> moment.  We should also provide utility methods for sanity checking that the 
> FieldCache doesn't contain anything "odd"...
>* entries for the same reader/field with different types/parsers
>* entries for the same field/type/parser in a reader and its subreader(s)
>* etc...




[jira] Commented: (LUCENE-1749) FieldCache introspection API

2009-08-06 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740311#action_12740311
 ] 

Hoss Man commented on LUCENE-1749:
--

bq. I think that TestCustomScoreQuery, TestFieldScoreQuery, and TestOrdValues 
all fail because the fix for them is now in another issue.

ah ... are you talking about LUCENE-1771 ? (the jira dependency seems backwards
in that case)

bq. In TestSort you had moved the local sorting to the bottom in the multi sort
test - I kept that, but I also kept them higher up. So that's the fail - they
just have to be removed.

yeah ... I just caught that and was starting to reply ... the interesting thing
is that the CacheEntry.toString() doesn't show the Locale.US was used when
getting the String[] FieldCache ... I'm currently trying to figure out why
(because that could confuse people as well)

> FieldCache introspection API
> 
>
> Key: LUCENE-1749
> URL: https://issues.apache.org/jira/browse/LUCENE-1749
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Search
>Reporter: Hoss Man
>Priority: Minor
> Fix For: 2.9
>
> Attachments: fieldcache-introspection.patch, 
> LUCENE-1749-hossfork.patch, LUCENE-1749.patch, LUCENE-1749.patch, 
> LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, 
> LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, 
> LUCENE-1749.patch, LUCENE-1749.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the 
> FieldCache to provide info about what is in the FieldCache at any given 
> moment.  We should also provide utility methods for sanity checking that the 
> FieldCache doesn't contain anything "odd"...
>* entries for the same reader/field with different types/parsers
>* entries for the same field/type/parser in a reader and its subreader(s)
>* etc...




[jira] Updated: (LUCENE-1749) FieldCache introspection API

2009-08-06 Thread Hoss Man (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hoss Man updated LUCENE-1749:
-

Attachment: LUCENE-1749.patch

checkpoint: no functional change from Mark's previous patch, just improved all
the javadocs, including an explanation of the SanityChecker's purpose and
experimental/expert warnings where appropriate.

> FieldCache introspection API
> 
>
> Key: LUCENE-1749
> URL: https://issues.apache.org/jira/browse/LUCENE-1749
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Search
>Reporter: Hoss Man
>Priority: Minor
> Fix For: 2.9
>
> Attachments: fieldcache-introspection.patch, 
> LUCENE-1749-hossfork.patch, LUCENE-1749.patch, LUCENE-1749.patch, 
> LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, 
> LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, 
> LUCENE-1749.patch, LUCENE-1749.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the 
> FieldCache to provide info about what is in the FieldCache at any given 
> moment.  We should also provide utility methods for sanity checking that the 
> FieldCache doesn't contain anything "odd"...
>* entries for the same reader/field with different types/parsers
>* entries for the same field/type/parser in a reader and its subreader(s)
>* etc...




[jira] Commented: (LUCENE-1749) FieldCache introspection API

2009-08-06 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740265#action_12740265
 ] 

Hoss Man commented on LUCENE-1749:
--

H...

actually Mark, testing out your latest patch against the trunk I'm seeing
(FieldCache sanity) failures from TestCustomScoreQuery, TestFieldScoreQuery,
TestOrdValues, and TestSort ... have you seen these?  did some other recent
change on the trunk trigger these?

> FieldCache introspection API
> 
>
> Key: LUCENE-1749
> URL: https://issues.apache.org/jira/browse/LUCENE-1749
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Search
>Reporter: Hoss Man
>Priority: Minor
> Fix For: 2.9
>
> Attachments: fieldcache-introspection.patch, 
> LUCENE-1749-hossfork.patch, LUCENE-1749.patch, LUCENE-1749.patch, 
> LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, 
> LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, 
> LUCENE-1749.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the 
> FieldCache to provide info about what is in the FieldCache at any given 
> moment.  We should also provide utility methods for sanity checking that the 
> FieldCache doesn't contain anything "odd"...
>* entries for the same reader/field with different types/parsers
>* entries for the same field/type/parser in a reader and its subreader(s)
>* etc...




[jira] Commented: (LUCENE-1749) FieldCache introspection API

2009-08-06 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12740256#action_12740256
 ] 

Hoss Man commented on LUCENE-1749:
--

Mark: I'll start working on improving the docs (and other things from my 
previous todo list)

bq. P.S. I'm not sure we want to go with the way I have changed the tests here.

Are you talking about TestOrdValues and TestFieldScoreQuery ?

if we expect OrdValues and FieldScoreQuery to use subReader based field caches, 
then the test seems to be doing the correct thing (in your patch) .. inspecting 
the fieldcaches per subreader.  Is there a code path where we expect those 
methods to get called on a MultiReader?

(Actually: that seems like a worthwhile improvement to make to these classes:
a MultiDocValues impl that all of the getValues(IndexReader) methods use when
passed a MultiReader ... it uses getSequentialSubReaders to construct DocValues
instances for each so you don't get FieldCache explosions if code inadvertently
passes the wrong reader to getValues.  What do you think? ... new issue?)

> FieldCache introspection API
> 
>
> Key: LUCENE-1749
> URL: https://issues.apache.org/jira/browse/LUCENE-1749
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Search
>Reporter: Hoss Man
>Priority: Minor
> Fix For: 2.9
>
> Attachments: fieldcache-introspection.patch, 
> LUCENE-1749-hossfork.patch, LUCENE-1749.patch, LUCENE-1749.patch, 
> LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, 
> LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, 
> LUCENE-1749.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the 
> FieldCache to provide info about what is in the FieldCache at any given 
> moment.  We should also provide utility methods for sanity checking that the 
> FieldCache doesn't contain anything "odd"...
>* entries for the same reader/field with different types/parsers
>* entries for the same field/type/parser in a reader and its subreader(s)
>* etc...




[jira] Commented: (LUCENE-1749) FieldCache introspection API

2009-08-04 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12739093#action_12739093
 ] 

Hoss Man commented on LUCENE-1749:
--

bq. Hoss/Mark do you want to fold it in to the patch, here? Or I can open a new 
issue?

as i alluded to above, i'm in favor of individual issues for each "bug" 
uncovered by this issue so they can be tracked separately.

> FieldCache introspection API
> 
>
> Key: LUCENE-1749
> URL: https://issues.apache.org/jira/browse/LUCENE-1749
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Search
>Reporter: Hoss Man
>Priority: Minor
> Fix For: 2.9
>
> Attachments: fieldcache-introspection.patch, 
> LUCENE-1749-hossfork.patch, LUCENE-1749.patch, LUCENE-1749.patch, 
> LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, 
> LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the 
> FieldCache to provide info about what is in the FieldCache at any given 
> moment.  We should also provide utility methods for sanity checking that the 
> FieldCache doesn't contain anything "odd"...
>* entries for the same reader/field with different types/parsers
>* entries for the same field/type/parser in a reader and its subreader(s)
>* etc...




[jira] Commented: (LUCENE-1749) FieldCache introspection API

2009-08-04 Thread Hoss Man (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12739092#action_12739092
 ] 

Hoss Man commented on LUCENE-1749:
--

General Comments on Mark's latest patch...
* the changes that I understand all seem good ... some of the details in
reader/searcher/query internals elude me but it sounds like Yonik & McCandless
have their eyes on them so I trust the three of you have it covered.
* we still need to fill in some empty/sparse javadocs, but that can be done 
after an initial commit.
* is it a bug that AverageGuessMemoryModel.getSize() will NPE on a
non-primitive class ... or should/will the docs for that API say it only works
on primitives?
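
One way an NPE like this can arise is sketched below. This is an assumption
about the cause, not a reading of the patch: a size table keyed by primitive
class, where a missing key yields null that is then auto-unboxed.
SizeTableSketch and its byte sizes are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

final class SizeTableSketch {
    private static final Map<Class<?>, Integer> SIZES = new HashMap<>();
    static {
        SIZES.put(boolean.class, 1);
        SIZES.put(byte.class, 1);
        SIZES.put(char.class, 2);
        SIZES.put(short.class, 2);
        SIZES.put(int.class, 4);
        SIZES.put(float.class, 4);
        SIZES.put(long.class, 8);
        SIZES.put(double.class, 8);
    }

    /** Unguarded lookup: unboxing the null returned for an unknown class throws NPE. */
    static int unguardedSize(Class<?> clazz) {
        return SIZES.get(clazz);
    }

    /** Guarded lookup: rejects non-primitive classes with a descriptive message. */
    static int guardedSize(Class<?> clazz) {
        Integer size = SIZES.get(clazz);
        if (size == null) {
            throw new IllegalArgumentException("not a primitive class: " + clazz.getName());
        }
        return size;
    }
}
```

Either the guarded form or a javadoc note restricting the method to primitives
would answer the question above; which one is preferable is a call for the
patch authors.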

Big Questions I Still Have
* does anyone have any reservations about the new APIs introduced?
  * FieldCache.CreationPlaceholder (promoted from FieldCacheImpl)
  * FieldCache.CacheEntry
  * FieldCache.getCacheEntries()
  * FieldCache.purgeAllCaches()
  * FieldCacheSanityChecker
  * RamUsageEstimator
* does anyone have any reservations about the refactoring done in 
FieldCacheImpl to make this new API possible? (ie: did i break the thread 
safety in a way i'm not noticing?)
* is the FieldCacheImpl.Entry.type (the "SortField" int type) still needed by 
FieldCacheImpl.Entry? ... nothing seems to use it so it would be nice to 
eliminate it and simplify the CacheEntry API as well.  (i suspect it got 
refactored into obsolescence when the Sorting got moved into the subreaders)
* The sanity checking ignores CreationPlaceholder -- largely because of the way 
the numeric caches first try one parser, and then if they get an NFE try a 
different parser -- but this leaves the CreationPlaceholder in the cache.  It's 
not a big object, so i assume it was implemented this way on purpose and the 
sanity checker is doing the correct thing by ignoring it, but i wanted to make 
sure people are aware/ok with this behavior.
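
To make the placeholder behavior concrete, here is a minimal sketch of a check
that skips placeholder values. Everything in it (SanitySketch,
CacheEntrySketch, PLACEHOLDER) is hypothetical and much simpler than the
FieldCacheSanityChecker in the patch; it only flags a reader/field pair cached
under more than one type.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class SanitySketch {
    /** Hypothetical marker left in the cache when a parse attempt was abandoned. */
    static final Object PLACEHOLDER = new Object();

    /** Hypothetical, simplified cache entry. */
    static final class CacheEntrySketch {
        final String readerKey;
        final String field;
        final Class<?> type;
        final Object value;

        CacheEntrySketch(String readerKey, String field, Class<?> type, Object value) {
            this.readerKey = readerKey;
            this.field = field;
            this.type = type;
            this.value = value;
        }
    }

    /** Returns "reader/field" keys that are cached under more than one type. */
    static List<String> findTypeMismatches(List<CacheEntrySketch> entries) {
        Map<String, Set<Class<?>>> typesByReaderField = new LinkedHashMap<>();
        for (CacheEntrySketch e : entries) {
            if (e.value == PLACEHOLDER) {
                continue; // leftover from a failed parser attempt, not real data
            }
            String key = e.readerKey + "/" + e.field;
            typesByReaderField.computeIfAbsent(key, k -> new LinkedHashSet<>()).add(e.type);
        }
        List<String> problems = new ArrayList<>();
        for (Map.Entry<String, Set<Class<?>>> me : typesByReaderField.entrySet()) {
            if (me.getValue().size() > 1) {
                problems.add(me.getKey());
            }
        }
        return problems;
    }
}
```

With the skip in place, an int[] entry plus a leftover float[] placeholder for
the same field reports nothing, while two real arrays of different types do.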
 
Lastly: This patch feels unnecessarily large at this point.  Several of the
bugs/improvements we've uncovered here don't seem like they belong in this
patch, and should be tracked in separate Jira issues, which can be committed
independently and enumerated in CHANGES.txt
  * new ReaderUtil and the usage in DirectoryReader, MultiReader, MultiSearcher 
& IndexSearcher
  * explain subreader bug fixes in ConstantScoreQuery, QueryWeight, 
ValueSourceQuery, CustomScoreQuery, etc...
...I think this issue (and this patch) should be reduced to just the new sanity
checking API, and *tests* that have been changed to be more sane (where the
underlying code was already fine)

Mark: would you mind splitting up the latest patch you have (you mentioned some 
additional minor tweaks) and opening new issues for these peripheral changes 
and then attaching back what's left for this patch.  Then I'll take the conch 
back and work on the missing javadocs.

(I'll happily commit once i get at least one thumbs up from someone on the "Big 
Questions" above ... we can always tweak the javadocs further in subsequent 
commits)



> FieldCache introspection API
> 
>
> Key: LUCENE-1749
> URL: https://issues.apache.org/jira/browse/LUCENE-1749
> Project: Lucene - Java
>  Issue Type: Improvement
>  Components: Search
>Reporter: Hoss Man
>Priority: Minor
> Fix For: 2.9
>
> Attachments: fieldcache-introspection.patch, 
> LUCENE-1749-hossfork.patch, LUCENE-1749.patch, LUCENE-1749.patch, 
> LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, 
> LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch, LUCENE-1749.patch
>
>
> FieldCache should expose an Expert level API for runtime introspection of the 
> FieldCache to provide info about what is in the FieldCache at any given 
> moment.  We should also provide utility methods for sanity checking that the 
> FieldCache doesn't contain anything "odd"...
>* entries for the same reader/field with different types/parsers
>* entries for the same field/type/parser in a reader and its subreader(s)
>* etc...



