Re: Proposal about Version API relaxation

2010-04-15 Thread Erick Erickson
Coming in late to the discussion, and without really understanding the
underlying Lucene issues, but...

The size of the problem of reindexing is under-appreciated, I think.
Somewhere in my company is the original data I indexed, but the effort it
would take to resurrect it is O(unknown). An unfortunate reality of
commercial products is that they often receive very little love for extended
periods of time until, all of a sudden, more work is required. There ensues
an extended period of re-orientation, even if the people who originally
worked on the project are still around.

*Assuming* the data is available to reindex (and there are many reasons
besides poor practice on the part of the company that it may not be),
remembering/finding out exactly which of the various backups you made
of the original data is the one that's actually in your product can be
highly non-trivial. This is compounded by the fact that the product manager
will be adamant: "Do NOT surprise our customers."

So I can be in a spot of saying "I *think* I have the original data set, and
I *think* I have the original code used to index it, and if I get a new
version of Lucene I *think* I can recreate the index and I *think* that the
user will see the expected change. After all that effort is completed, I
*think* we'll see the expected changes, but we won't know until we try it."
That puts me in a very precarious position.

This assumes that I have a reasonable chance of getting the original data.
But say I've been indexing data from a live feed. I sure as hell hope I
stored the data somewhere, because going back to the source and saying
"please resend me 10 years' worth of data that I have in my index"
is...er...hard. Or say that the original provider has gone out of business,
or the licensing arrangement specifies a one-time transmission of data that
may not be retained in its original form, or...

The point of this long diatribe is that there are many reasons why
reindexing is impossible and/or impractical. Making any decision that
requires reindexing for a new version is locking a user into a version,
potentially forever. We should not underestimate how painful that can be and
should never think that "just reindex" is acceptable in all situations.
It's not. Period.

Be very clear that some number of Lucene users will absolutely not be able
to reindex. We may still make a decision that requires this, but let's make
it without deluding ourselves that it's a possible solution for everyone.

So an upgrade tool seems like a reasonable compromise. I agree that being
hampered in what we can develop in Lucene by having to accommodate
reading old indexes slows new features etc. It's always nice to be
able to work without dealing with pesky legacy issues <G>. Perhaps
splitting out the indexing upgrades into a separate program lets us
accommodate both concerns.

FWIW
Erick

On Thu, Apr 15, 2010 at 9:42 AM, Danil ŢORIN torin...@gmail.com wrote:

 True. Just need the tool.

 On Thu, Apr 15, 2010 at 16:39, Earwin Burrfoot ear...@gmail.com wrote:
 
  On Thu, Apr 15, 2010 at 17:17, Yonik Seeley yo...@lucidimagination.com
 wrote:
   Seamless online upgrades have their place too... say you are upgrading
   one server at a time in a cluster.
 
  Nothing here that can't be solved with an upgrade tool. Down one
  server, upgrade index, upgrade software, up.
 
  --
  Kirill Zakharenko/Кирилл Захаренко (ear...@gmail.com)
  Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
  ICQ: 104465785
 
  -
  To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
  For additional commands, e-mail: java-dev-h...@lucene.apache.org
 





Re: Proposal about Version API relaxation

2010-04-15 Thread Erick Erickson
'Cause some exec finally noticed the product was losing market share.
Or got a wild hair strategically placed. My point is only that
we should be clear that some number of Lucene users *will* be in such
a position.

I'm actually fine with a decision that we're not going to support such
a scenario, but let's be clear that that's the decision we're making.

And corporate competence aside, there's still licensing that may prevent
me from archiving the raw data.

Erick

On Thu, Apr 15, 2010 at 10:20 AM, Earwin Burrfoot ear...@gmail.com wrote:

 I think the need to upgrade to latest and greatest lucene for poor
 corporate users that lost all their data is somewhat overblown.
 Why the heck do you need to upgrade if your app rotted in neglect for
 years??


Re: [jira] Account password

2010-04-13 Thread Erick Erickson
Ah, good. That means the very long e-mail that came to my regular account
about someone hacking the JIRA server is bogus too, I assume.

Erick

On Tue, Apr 13, 2010 at 5:58 PM, Uwe Schindler u...@thetaphi.de wrote:

 LOL!

 This user is assigned to very old bugzilla issues :-)

 -
 Uwe Schindler
 H.-H.-Meier-Allee 63, D-28213 Bremen
 http://www.thetaphi.de
 eMail: u...@thetaphi.de

  -Original Message-
  From: j...@apache.org [mailto:j...@apache.org]
  Sent: Tuesday, April 13, 2010 10:54 PM
  To: java-dev@lucene.apache.org
  Subject: [jira] Account password
 
 
You (or someone else) has reset your password.
 
  -
 
  Your password has been changed to: MCwqNr
 
  You can change your password here:
 
 https://issues.apache.org/jira/secure/ViewProfile.jspa
 
  Here are the details of your account:
  -
  Username: java-dev@lucene.apache.org
 Email: java-dev@lucene.apache.org
 Full Name: Lucene Developers
  Password: MCwqNr
  (You can always retrieve these via the Forgot Password link on the
  signup page)
  --
  This message is automatically generated by JIRA.
  -
  If you think it was sent incorrectly contact one of the administrators:
  https://issues.apache.org/jira/secure/Administrators.jspa
  -
  For more information on JIRA, see:
  http://www.atlassian.com/software/jira
 
 
 







Re: [jira] Account password

2010-04-13 Thread Erick Erickson
Oops, that'll teach me to just skim things, won't it?

Erick

On Tue, Apr 13, 2010 at 6:14 PM, Andi Vajda va...@osafoundation.org wrote:


 On Tue, 13 Apr 2010, Erick Erickson wrote:

  Ah, good. That means the very long e-mail that came to my regular
  account about someone hacking the JIRA server is bogus too, I assume.


 Err, no, it's real. You should change your password.

 Andi..





Re: [jira] Created: (LUCENE-2376) java.lang.OutOfMemoryError:Java heap space

2010-04-08 Thread Erick Erickson
What kind of JVM settings are you using? Lots of people index lots of
documents without running into this. Can you provide more specifics about
your indexing settings?
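
When answering a question like this, it can help to report the heap ceiling
the JVM actually ended up with, since the value in effect is not always the
one people expect from their startup scripts. A minimal sketch using only
java.lang.Runtime; the class name HeapReport is made up for illustration and
is not part of Lucene:

```java
// Hypothetical helper (not part of Lucene): reports the heap limits of the running JVM.
final class HeapReport {
    private static final long MB = 1024L * 1024L;

    /** Maximum heap the JVM will attempt to use (roughly the -Xmx value), in megabytes. */
    static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / MB;
    }

    /** Heap currently in use: reserved heap minus the free part of it, in megabytes. */
    static long usedHeapMb() {
        Runtime rt = Runtime.getRuntime();
        return (rt.totalMemory() - rt.freeMemory()) / MB;
    }

    public static void main(String[] args) {
        System.out.println("max heap (MB):  " + maxHeapMb());
        System.out.println("used heap (MB): " + usedHeapMb());
    }
}
```

Running this with the same JVM options as the indexing process gives a number
to include alongside the IndexWriter settings in the report.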

On Tue, Apr 6, 2010 at 10:51 PM, Shivender Devarakonda (JIRA) 
j...@apache.org wrote:

 java.lang.OutOfMemoryError:Java heap space
 --

 Key: LUCENE-2376
 URL: https://issues.apache.org/jira/browse/LUCENE-2376
 Project: Lucene - Java
  Issue Type: Bug
  Components: Index
Affects Versions: 2.9.1
 Environment: Windows
Reporter: Shivender Devarakonda


 I see an OutOfMemory error in our product; it happens when we have some
 data objects on which we built the index. The following OutOfMemory error
 occurs after we call IndexWriter.optimize():


 4/06/10 02:03:42.160 PM PDT [ERROR] [Lucene Merge Thread #12]  In thread
 Lucene Merge Thread #12 and the message is
 org.apache.lucene.index.MergePolicy$MergeException:
 java.lang.OutOfMemoryError: Java heap space
 4/06/10 02:03:42.207 PM PDT [VERBOSE] [Lucene Merge Thread #12] [Manager]
 Uncaught Exception in thread Lucene Merge Thread #12
 org.apache.lucene.index.MergePolicy$MergeException:
 java.lang.OutOfMemoryError: Java heap space
     at org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:351)
     at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:315)
 Caused by: java.lang.OutOfMemoryError: Java heap space
     at java.util.HashMap.resize(HashMap.java:462)
     at java.util.HashMap.addEntry(HashMap.java:755)
     at java.util.HashMap.put(HashMap.java:385)
     at org.apache.lucene.index.FieldInfos.addInternal(FieldInfos.java:256)
     at org.apache.lucene.index.FieldInfos.read(FieldInfos.java:366)
     at org.apache.lucene.index.FieldInfos.<init>(FieldInfos.java:71)
     at org.apache.lucene.index.SegmentReader$CoreReaders.<init>(SegmentReader.java:116)
     at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:638)
     at org.apache.lucene.index.SegmentReader.get(SegmentReader.java:608)
     at org.apache.lucene.index.IndexWriter$ReaderPool.get(IndexWriter.java:686)
     at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:4979)
     at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:4614)
     at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:235)
     at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:291)
 4/06/10 02:03:42.895 PM PDT [ERROR]  this writer hit an OutOfMemoryError;
 cannot complete optimize


 --
 This message is automatically generated by JIRA.
 -
 You can reply to this email to add a comment to the issue online.






Re: lucene and solr trunk

2010-03-16 Thread Erick Erickson
My snap impression is that moving Lucene to a sub-tree
under SOLR would introduce some confusion in the minds
of new folks looking at the code. *We* all know that Lucene
stands by itself, but putting it under solr makes that less
obvious. I claim that there would be questions like "so can
I just use Lucene without SOLR?".

That said, the questions about release management, branching,
tagging, etc. take complete precedence over minor
confusion when the answer is "just go to directory X and
checkout if you want Lucene only".

FWIW
Erick



On Tue, Mar 16, 2010 at 8:30 AM, Robert Muir rcm...@gmail.com wrote:

 On Tue, Mar 16, 2010 at 3:43 AM, Simon Willnauer
 simon.willna...@googlemail.com wrote:

  One more thing which I wonder about even more is that this whole
  merging happens so quickly for reasons I don't see right now. I don't
  want to keep anybody from making progress but it appears like a rush
  to me.


 By the way, the serious changes we applied to the branch, most of them
 have been sitting in JIRA over 3 months not doing much: SOLR-1659

 if you follow the linked issues, you can see all the stuff that got
 put in the branch... the branch was helpful for me, as I could help
 Mark with the ton of little things, like TokenStreams embedded
 inside JSP files :)

 As it's just a branch, if you want to go look at those patches
 (especially anything I did) and provide technical feedback, that would
 be great!

 But I think it's a mistake to say things are rushed when the work has
 been done for months.

 --
 Robert Muir
 rcm...@gmail.com





Re: How can I use QueryScorer() to find only perfect matches??

2010-03-15 Thread Erick Erickson
Try +contents:term +contents:query. By misplacing the
'+' you're getting the default OR operator and the '+'
is probably being thrown away by the analyzer.

Luke will help here a lot.
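
To see why the placement of '+' matters, here is a toy model of the two
readings in plain Java. It is not Lucene code: Occur, Clause, and matches are
stand-ins invented for illustration, and only the boolean logic is modeled
(no scoring, no analysis). A document matches when every MUST clause matches;
if there are no MUST clauses, at least one SHOULD clause has to match.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy stand-in for boolean query matching; none of these types are Lucene classes.
final class BooleanModel {
    enum Occur { MUST, SHOULD }

    static final class Clause {
        final String term;
        final Occur occur;
        Clause(String term, Occur occur) { this.term = term; this.occur = occur; }
    }

    /** True if a document (modeled as a set of terms) satisfies the clauses. */
    static boolean matches(Set<String> docTerms, List<Clause> clauses) {
        boolean sawMust = false;
        boolean anyShould = false;
        for (Clause c : clauses) {
            boolean hit = docTerms.contains(c.term);
            if (c.occur == Occur.MUST) {
                sawMust = true;
                if (!hit) return false;   // a missing required term rejects the doc
            } else if (hit) {
                anyShould = true;
            }
        }
        // With no required clauses, at least one optional clause has to match.
        return sawMust || anyShould;
    }

    public static void main(String[] args) {
        Set<String> onlyTerm = new HashSet<String>(Arrays.asList("term", "vector"));
        Set<String> both = new HashSet<String>(Arrays.asList("term", "query"));

        // contents:term contents:query   (default OR: both clauses optional)
        List<Clause> optional = Arrays.asList(
                new Clause("term", Occur.SHOULD), new Clause("query", Occur.SHOULD));
        // +contents:term +contents:query (both clauses required)
        List<Clause> required = Arrays.asList(
                new Clause("term", Occur.MUST), new Clause("query", Occur.MUST));

        System.out.println(matches(onlyTerm, optional)); // true
        System.out.println(matches(onlyTerm, required)); // false
        System.out.println(matches(both, required));     // true
    }
}
```

The first call is the situation described in the question: with both clauses
optional, a document containing only one of the two words still matches.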

HTH
Erick

On Mon, Mar 15, 2010 at 9:46 AM, christian stadler stadler.christ...@web.de
 wrote:

 Hi there,

 I have an issue with the QueryScorer(query) method at the moment and I need
 some assistance.
 I was indexing my e-book lucene in action and based on this index-db I
 started to play around with some boolean queries like:
 (contents:+term contents:+query)
 As a result I'm expecting as a perfect match for the phrase term query
 four
 hits.

 But when I run my sample to highlight this phrase in the context then I get
 a
 lot more results. It also finds all the matches for term and query
 independently.

 I think the problem is the QueryScorer() which softens the former exact
 boolean
 query.
 Then I was trying the following:
 private static Highlighter GetHits(Query query, Formatter formatter)
 {
     string field = "contents";
     BooleanQuery termsQuery = new BooleanQuery();

     // extract the weighted terms for this field and require every one of them
     WeightedTerm[] terms = QueryTermExtractor.GetTerms(query, true, field);
     foreach (WeightedTerm term in terms)
     {
         TermQuery termQuery = new TermQuery(new Term(field, term.GetTerm()));
         termsQuery.Add(termQuery, BooleanClause.Occur.MUST);
     }

     // create query scorer based on term queries (field specific)
     QueryScorer scorer = new QueryScorer(termsQuery);

     Highlighter highlighter = new Highlighter(formatter, scorer);
     highlighter.SetTextFragmenter(new SimpleFragmenter(20));

     return highlighter;
 }
 to rewrite the query and set the term attribute from SHOULD to MUST

 But the result was the same.
 Do you have any example how I can use the QueryScorer() in exactly the same
 way
 as to mimic a BooleanSearch??

 thanks in advance
 Christian








Re: [jira] Commented: (LUCENE-2308) Separately specify a field's type

2010-03-12 Thread Erick Erickson
Congrats Chris!

I vote for thinkAboutNotIncludingNormsMaybe(true|false) <G>.

Seriously, double negatives are ugly IMO; +1 for changing.

Erick

On Fri, Mar 12, 2010 at 12:56 PM, Chris Male (JIRA) j...@apache.org wrote:


[
 https://issues.apache.org/jira/browse/LUCENE-2308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12844587#action_12844587]

 Chris Male commented on LUCENE-2308:
 

 I agree entirely.  This is definitely the moment to remove any ambiguity or
 confusion in this API.  I'll make sure to incorporate this idea.

  Separately specify a field's type
  -
 
  Key: LUCENE-2308
  URL: https://issues.apache.org/jira/browse/LUCENE-2308
  Project: Lucene - Java
   Issue Type: Improvement
   Components: Index
 Reporter: Michael McCandless
 
  This came up from discussions on IRC.  I'm summarizing here...
  Today when you make a Field to add to a document you can set things
  index or not, stored or not, analyzed or not, details like omitTfAP,
  omitNorms, index term vectors (separately controlling
  offsets/positions), etc.
  I think we should factor these out into a new class (FieldType?).
  Then you could re-use this FieldType instance across multiple fields.
  The Field instance would still hold the actual value.
  We could then do per-field analyzers by adding a setAnalyzer on the
  FieldType, instead of the separate PerFieldAnalyzerWrapper (likewise
  for per-field codecs (with flex), where we now have
  PerFieldCodecWrapper).
  This would NOT be a schema!  It's just refactoring what we already
  specify today.  EG it's not serialized into the index.
  This has been discussed before, and I know Michael Busch opened a more
  ambitious (I think?) issue.  I think this is a good first baby step.  We
  could consider a hierarchy of FieldType (NumericFieldType, etc.) but maybe
  hold off on that for starters...

 --
 This message is automatically generated by JIRA.
 -
 You can reply to this email to add a comment to the issue online.






Re: [jira] Commented: (LUCENE-2280) IndexWriter.optimize() throws NullPointerException

2010-03-08 Thread Erick Erickson
Quick side note: The recommended upgrade path is to upgrade to 2.9.latest,
fix all of the deprecation warnings, *then* upgrade to 3.0. The 2.9.x -> 3.0
upgrade just removed all the deprecated stuff.

FWIW
Erick

On Mon, Mar 8, 2010 at 8:51 AM, Ritesh Nigam (JIRA) j...@apache.org wrote:


[
 https://issues.apache.org/jira/browse/LUCENE-2280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12842657#action_12842657]

 Ritesh Nigam commented on LUCENE-2280:
 --

 I checked the documentation of IndexWriter in 2.3.2; the commit() API is not
 available in this version (I think it was introduced in 2.4). I am not
 explicitly setting autoCommit, so it should take the default value, which I
 believe is true.

 One more thing: I am catching any exception hit during indexing or
 optimizing, and then in a finally block I am closing the IndexWriter by
 calling close(), which should take care of the commit internally? Please
 tell me if there is any equivalent method which I can use in place of
 commit().

 I have not upgraded to a newer version of Lucene, but I will probably try
 version 3.0.0 in the future.

  IndexWriter.optimize() throws NullPointerException
  --
 
  Key: LUCENE-2280
  URL: https://issues.apache.org/jira/browse/LUCENE-2280
  Project: Lucene - Java
   Issue Type: Bug
   Components: Index
 Affects Versions: 2.3.2
  Environment: Win 2003, lucene version 2.3.2, IBM JRE 1.6
 Reporter: Ritesh Nigam
  Attachments: lucene.jar
 
 
  I am using the Lucene 2.3.2 search APIs for my application. I am indexing
  a 45GB database, which creates an index file of approximately 200MB. After
  finishing the indexing, while running optimize(), I can see a
  NullPointerException thrown in my log and the index file is getting
  corrupted. The log says:
  
  Caused by:
  java.lang.NullPointerException
      at org.apache.lucene.store.BufferedIndexOutput.writeBytes(BufferedIndexOutput.java:49)
      at org.apache.lucene.store.IndexOutput.writeBytes(IndexOutput.java:40)
      at org.apache.lucene.index.SegmentMerger.mergeNorms(SegmentMerger.java:566)
      at org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:135)
      at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:3273)
      at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:2968)
      at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:240)
  
  This is happening quite frequently, although I am not able to reproduce
  it on demand. I saw an issue logged which is somewhat related to mine
  (http://mail-archives.apache.org/mod_mbox/lucene-solr-user/200809.mbox/%3c6e4a40db-5efc-42da-a857-d59f4ec34...@mikemccandless.com%3e),
  but the only difference here is that I am not using Store.Compress for my
  fields; I am using Store.NO instead. Please note that I am using the IBM
  JRE for my application.
  Is this an issue with Lucene? If yes, in which version is it fixed?

 --
 This message is automatically generated by JIRA.
 -
 You can reply to this email to add a comment to the issue online.






Re: Lucene Filter

2010-03-02 Thread Erick Erickson
The very first thing I'd recommend is to get a copy of Luke
(google Lucene, Luke) and examine your index to see if
what you *think* is in there is *actually* in there.

One popular learning experience is to do something
like
Document doc = new Document();
while (more docs to add) {
   add field
   add field
   add doc
}

Problem is that the document simply accumulates. The first
"add doc" puts your first document in the index. The second
puts the contents of both the first and second doc in the
second doc of the index. The third puts the contents of 3
documents in for the third doc, etc.

Cure this by moving the "new Document" inside the while loop.
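
The accumulation effect is easy to reproduce without Lucene. The sketch below
uses a made-up Doc class (an illustrative assumption, not Lucene API) and
counts how many fields each "indexed" document would carry when the document
is created outside versus inside the loop.

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java stand-in for the pattern above; Doc is not a Lucene class.
final class AccumulationDemo {
    static final class Doc {
        final List<String> fields = new ArrayList<String>();
        void add(String field) { fields.add(field); }
    }

    /** Buggy: one Doc is reused, so every "add doc" sees all earlier fields too. */
    static List<Integer> fieldCountsReusingDoc(String[] values) {
        List<Integer> counts = new ArrayList<Integer>();
        Doc doc = new Doc();                  // created once, outside the loop
        for (String value : values) {
            doc.add(value);
            counts.add(doc.fields.size());    // size of the doc handed to the index
        }
        return counts;
    }

    /** Fixed: a fresh Doc per iteration, so each doc holds only its own field. */
    static List<Integer> fieldCountsFreshDoc(String[] values) {
        List<Integer> counts = new ArrayList<Integer>();
        for (String value : values) {
            Doc doc = new Doc();              // moved inside the loop
            doc.add(value);
            counts.add(doc.fields.size());
        }
        return counts;
    }

    public static void main(String[] args) {
        String[] values = { "a", "b", "c" };
        System.out.println(fieldCountsReusingDoc(values)); // [1, 2, 3]
        System.out.println(fieldCountsFreshDoc(values));   // [1, 1, 1]
    }
}
```

In the reused-document version the third document carries the fields of all
three, which is exactly the kind of thing Luke makes visible in a real index.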

If this doesn't help, please show your indexing and
searching code

HTH
Erick

On Tue, Mar 2, 2010 at 9:35 AM, Dyutiman dyutiman.chaudh...@gmail.com wrote:


 Hi,
 I am new in this forum and new to Lucene also. I m getting some issue while
 trying to filter my Lucene result.

 While creating the index I am creating a field called "sentiment"; the
 possible values are 'positive', 'negative' and 'neutral'. I am indexing
 this field like
 doc.add(new Field("sentiment", sentiment, Field.Store.YES,
 Field.Index.NOT_ANALYZED_NO_NORMS));

 Now I want to search within my index but get only positive sentiment
 results for the searched string.
 For this I am doing something like this:

 QueryParser qp = new QueryParser(Version.LUCENE_CURRENT, "contents",
 analyzer);
 Query query = qp.parse(searchString);
 Filter filter = new TermRangeFilter("sentiment", "positive", "positive",
 true, true);
 topDocs = searcher.search(query, filter, 20);

 But I am getting results mixed with all 3 sentiments. I tried other filters
 also but the result is same.
 Anybody got any solutions for me please help..

 thanks
 Dyutiman

 --
 View this message in context:
 http://old.nabble.com/Lucene-Filter-tp27756577p27756577.html
 Sent from the Lucene - Java Developer mailing list archive at Nabble.com.






Re: Lucene Filter

2010-03-02 Thread Erick Erickson
Taking a quick glance at the code, I don't see anything
obviously wrong as far as the problem you describe goes.

What happens if you just add a required clause to your query
string rather than use a Filter? Something like
"+sentiment:positive"? If you do that, query.toString() is
your friend <G>...

Erick


On Tue, Mar 2, 2010 at 10:04 AM, Dyutiman dyutiman.chaudh...@gmail.com wrote:


 Thanks Erick for your quick reply.
 I am going to try Luke and examine my index. In the mean time let me tell
 you that I am indexing the documents every time creating the new document.
 Let me attach the code I am using here.

 thanks
 Dyutiman
 http://old.nabble.com/file/p27756896/IndexUtil.java (IndexUtil.java)









Re: Adding .classpath.tmpl

2010-02-28 Thread Erick Erickson
Tangentially related, but the link on the "how to contribute" page to the
IntelliJ code style file was broken; it reached over into the SOLR Wiki... I
stole the one from SOLR and added it as an attachment, and the "how to
contribute" page now links to it.

Erick

On Sun, Feb 28, 2010 at 5:14 AM, Shai Erera ser...@gmail.com wrote:

 I've read BUILD.txt and it doesn't look like it'll fit there. That file
 discusses how to build Lucene using Ant and JDK. The word IDE is not
 mentioned, nor Eclipse.

 BTW, there is a typo in the file: "before returning to this README" - not
 sure if the word README is intended to be like that, or a leftover from when
 this was once in README?

 Shai


 On Sun, Feb 28, 2010 at 12:11 PM, Uwe Schindler u...@thetaphi.de wrote:

  Maybe this change is better in BUILD.txt? I am not sure.



 -

 Uwe Schindler

 H.-H.-Meier-Allee 63, D-28213 Bremen

 http://www.thetaphi.de

 eMail: u...@thetaphi.de



 *From:* Shai Erera [mailto:ser...@gmail.com]
 *Sent:* Sunday, February 28, 2010 10:55 AM

 *To:* java-dev@lucene.apache.org
 *Subject:* Re: Adding .classpath.tmpl



 Index: README.txt
 ===
 --- README.txt(revision 917047)
 +++ README.txt(working copy)
 @@ -28,8 +28,6 @@
part of the core library.  Of special note are the JAR files in the
 analyzers directory which
contain various analyzers that people may find useful in place of the
 StandardAnalyzer.

 -
 -
  docs/index.html
The contents of the Lucene website.

 @@ -42,3 +40,10 @@

  src/demo
Some example code.
 +
 +SET UP THE ENVIRONMENT
 +
 +Checkout the HowToContribute wiki page
 +(http://wiki.apache.org/lucene-java/HowToContribute) which includes useful
 +information on how to contribute code to Lucene, as well as how to set up your
 +environment quickly (code formatting rules and setting the classpath quickly).
 \ No newline at end of file

 Is this ok?

 Shai

 On Sun, Feb 28, 2010 at 11:07 AM, Uwe Schindler u...@thetaphi.de wrote:

 I think we can add this to the README.txt! Do you have a patch?



 -

 Uwe Schindler

 H.-H.-Meier-Allee 63, D-28213 Bremen

 http://www.thetaphi.de

 eMail: u...@thetaphi.de



 *From:* Shai Erera [mailto:ser...@gmail.com]
 *Sent:* Sunday, February 28, 2010 6:30 AM
 *To:* java-dev@lucene.apache.org
 *Subject:* Re: Adding .classpath.tmpl



 I uploaded the file to
 http://wiki.apache.org/lucene-java/HowToContribute (bottom of the page). But
 I don't see any good spot to stuff it in the
 README. There is no pointer to the HowToContribute page at all, nor to the
 code formatting styles ... what do you think - create such a section at the
 bottom of README, or leave it out?

 On Fri, Feb 26, 2010 at 2:58 PM, Shai Erera ser...@gmail.com wrote:

 Thanks for your response. I will update the Wiki with the file. After I do
 that, I'll add some text to the README file. I'll need one of you to help me
 commit it though.



 Thanks again,

 Shai



 On Thu, Feb 25, 2010 at 6:21 PM, Mark Miller markrmil...@gmail.com
 wrote:

 +1 - I'd prefer this stay out of svn as well - I'd rather it go on the
 wiki too - perhaps in the same place that you can find the formatting file
 for eclipse and intellij.

 --
 - Mark

 http://www.lucidimagination.com





 On 02/25/2010 11:10 AM, Grant Ingersoll wrote:

 To me, this is stuff that can go on the wiki or somewhere else, otherwise
 over time, there will be others to add in, etc.  We could simply add a
 pointer to the wiki page in the README.

 On Feb 24, 2010, at 11:55 PM, Shai Erera wrote:



 Hi

 I always find it annoying, when I check out the code to a new project in
 Eclipse, that I need to put everything that I care about in the classpath
 and add the dependent libraries. On another project I'm involved with, we
 did that process once, adding all the source code and the libraries to the
 classpath, and created a .classpath.tmpl. Now when people check out the code,
 they can copy the content of that file to their .classpath file, and setting
 up the project is reduced from a couple of minutes to a few seconds.

 I don't want to check-in .classpath because not everyone wants all the
 code in their classpath.

 I attached such a file to the mail. Note that the only dependency which will
 break on other machines is the ant.jar dependency, which on my Windows machine
 is located under c:\ant. That jar is required to compile contrib/ant from
 Eclipse. Not sure how to resolve that, besides removing that line
 from the file and documenting separately that that's what you need to do if you
 want to add contrib/ant ...

 The file is sorted by name, putting the core stuff at the top - so it's
 easy for people to selectively add the interesting packages.

 I don't know if an issue is required; if so, I can create one and move
 the discussion there.

 Shai
 lucene.classpath.tmpl
 -
 To unsubscribe, e-mail: 

Re: [jira] Commented: (LUCENE-2037) Allow Junit4 tests in our environment.

2010-02-26 Thread Erick Erickson
I won't be able to look at this till tonight, I'll see what I can see.


On Fri, Feb 26, 2010 at 9:02 AM, Uwe Schindler (JIRA) j...@apache.org wrote:


[
 https://issues.apache.org/jira/browse/LUCENE-2037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12838872#action_12838872]

 Uwe Schindler commented on LUCENE-2037:
 ---

 Committed revision: 916685

  Allow Junit4 tests in our environment.
  --
 
  Key: LUCENE-2037
  URL: https://issues.apache.org/jira/browse/LUCENE-2037
  Project: Lucene - Java
   Issue Type: Improvement
   Components: Other
 Affects Versions: 3.1
  Environment: Development
 Reporter: Erick Erickson
 Assignee: Michael McCandless
 Priority: Minor
  Fix For: 3.1
 
  Attachments: junit-4.7.jar, LUCENE-2037-getName.patch,
 LUCENE-2037.patch, LUCENE-2037.patch, LUCENE-2037.patch,
 LUCENE-2037_remove_testwatchman.patch, LUCENE-2037_revised_2.patch
 
Original Estimate: 8h
   Remaining Estimate: 8h
 
  Now that we're dropping Java 1.4 compatibility for 3.0, we can
 incorporate Junit4 in testing. Junit3 and junit4 tests can coexist, so no
 tests should have to be rewritten. We should start this for the 3.1 release
 so we can get a clean 3.0 out smoothly.
  It's probably worthwhile to convert a small set of tests as an exemplar.

 --
 This message is automatically generated by JIRA.
 -
 You can reply to this email to add a comment to the issue online.


 -
 To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: java-dev-h...@lucene.apache.org




Re: svn commit: r916685 - in /lucene/java/trunk/src/test/org/apache/lucene/util: InterceptTestCaseEvents.java LuceneTestCaseJ4.java

2010-02-26 Thread Erick Erickson
Nice simplification!

On Fri, Feb 26, 2010 at 9:02 AM, uschind...@apache.org wrote:

 Author: uschindler
 Date: Fri Feb 26 14:02:08 2010
 New Revision: 916685

 URL: http://svn.apache.org/viewvc?rev=916685&view=rev
 Log:
 LUCENE-2037: Add support for LuceneTestCase.getName() for backwards
 compatibility when reporting failed tests. Also removed The
 InterceptTestCaseEvents class and added as anonymous class (simplified, no
 reflection)

 Removed:

  
 lucene/java/trunk/src/test/org/apache/lucene/util/InterceptTestCaseEvents.java
 Modified:
lucene/java/trunk/src/test/org/apache/lucene/util/LuceneTestCaseJ4.java

 Modified:
 lucene/java/trunk/src/test/org/apache/lucene/util/LuceneTestCaseJ4.java
 URL:
 http://svn.apache.org/viewvc/lucene/java/trunk/src/test/org/apache/lucene/util/LuceneTestCaseJ4.java?rev=916685&r1=916684&r2=916685&view=diff

 ==
 --- lucene/java/trunk/src/test/org/apache/lucene/util/LuceneTestCaseJ4.java
 (original)
 +++ lucene/java/trunk/src/test/org/apache/lucene/util/LuceneTestCaseJ4.java
 Fri Feb 26 14:02:08 2010
 @@ -25,6 +25,8 @@
  import org.junit.After;
  import org.junit.Before;
  import org.junit.Rule;
 +import org.junit.rules.TestWatchman;
 +import org.junit.runners.model.FrameworkMethod;

  import java.io.PrintStream;
  import java.util.Arrays;
 @@ -98,14 +100,21 @@
   // Think of this as start/end/success/failed
   // events.
   @Rule
 -  public InterceptTestCaseEvents intercept = new
 InterceptTestCaseEvents(this);
 +  public final TestWatchman intercept = new TestWatchman() {

 -  public LuceneTestCaseJ4() {
 -  }
 +@Override
 +public void failed(Throwable e, FrameworkMethod method) {
 +  reportAdditionalFailureInfo();
 +  super.failed(e, method);
 +}

 -  public LuceneTestCaseJ4(String name) {
 -this.name = name;
 -  }
 +@Override
 +public void starting(FrameworkMethod method) {
 +  LuceneTestCaseJ4.this.name = method.getName();
 +  super.starting(method);
 +}
 +
 +  };

   @Before
   public void setUp() throws Exception {
 @@ -291,6 +300,6 @@
   // static members
   private static final Random seedRnd = new Random();

 -  private String name = "";
 +  private String name = "<unknown>";

  }





Re: Uwe's question

2010-02-26 Thread Erick Erickson
You can use Junit4 whenever you want right now. Just derive from
LuceneTestCaseJ4 rather than LuceneTestCase. And annotate
each test with @Test and you should be fine.

Junit4 does allow you to mix-n-match 3/4 tests
*on a whole class basis*. That is, all of the tests in a class must
be either 3-style (deriving from TestCase and named appropriately)
or 4-style (annotated, with whatever Junit4 features you'd like).
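
To make the 3-style versus 4-style distinction concrete, here is a minimal sketch in plain Java with no JUnit on the classpath. The `Check` annotation and the `Discovery` class are hypothetical stand-ins, not JUnit's API; they only illustrate how a runner might find tests by name prefix versus by annotation.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for JUnit 4's @Test; not the real annotation.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface Check {}

// 3-style: methods are found by the "test" name prefix.
class OldStyleExample {
  public void testAdd() {}
  public void _testDisabled() {}  // mangled name, so it is silently skipped
  public void helper() {}
}

// 4-style: methods are found by annotation; names are free-form.
class NewStyleExample {
  @Check public void addsTwoNumbers() {}
  public void helper() {}
}

class Discovery {
  // Name-based discovery, roughly what a 3-style runner does.
  static List<String> byName(Class<?> c) {
    List<String> found = new ArrayList<String>();
    for (Method m : c.getDeclaredMethods()) {
      if (m.getName().startsWith("test") && m.getParameterTypes().length == 0) {
        found.add(m.getName());
      }
    }
    Collections.sort(found);
    return found;
  }

  // Annotation-based discovery, roughly what a 4-style runner does.
  static List<String> byAnnotation(Class<?> c) {
    List<String> found = new ArrayList<String>();
    for (Method m : c.getDeclaredMethods()) {
      if (m.isAnnotationPresent(Check.class)) {
        found.add(m.getName());
      }
    }
    Collections.sort(found);
    return found;
  }

  public static void main(String[] args) {
    System.out.println("by name:       " + byName(OldStyleExample.class));
    System.out.println("by annotation: " + byAnnotation(NewStyleExample.class));
  }
}
```

Note that the annotated class yields nothing under name-based discovery, which is one way to see why a single class has to pick one style.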

The consensus seems to be that converting old tests to
Junit4 just to get them all using Junit4 isn't a good use
of time, and at least introduces the possibility that it would
mess things up. Upgrading old tests to Junit4 to improve
them, especially to speed them up (@BeforeClass and
@AfterClass can help) *is* a good use of time.

I might convert an old-style test case if I was
working in it, but that's probably a personal preference.

I've never tried to learn a command-line invocation of a test
case for a single test method; I've always just used the IDE
to run individual methods.

Erick

On Fri, Feb 26, 2010 at 11:31 AM, Jason Rutherglen 
jason.rutherg...@gmail.com wrote:

 Let's go to JUnit 4 if possible...

 Does it provide method level testing?  (i.e. one doesn't need to
 execute every test method just to check the results of one method)

 On Thu, Feb 25, 2010 at 8:15 PM, Shai Erera ser...@gmail.com wrote:
  Ok this seems a discussion related to JUnit 4, so I'll port what I've
 said
  about it from the other thread (doing the code cleanup):
 
  {quote}
  Erik, I'm totally with you on JUnit 4. I think the @Test annotation is
  really not a big deal (it's actually very easy to migrate all the current
  tests to JUnit 4 with the added import using some script). Even manually
  it shouldn't be such a big deal.
 
  @Ignore is a perfect other advantage of JUnit4. I've found some tests
 which
  were prefixed with _, i.e. _testXYZ just to disable them. Nobody knows
 about
  them until he looks at the code (and pays attention). @Ignore would have
  been better.
 
  And there are lots of other advantages, like the @Before and @After (not
  only class). Another problem I've found in the tests is that not all
  extended LuceneTestCase, and usually their setUp and tearDown
  implementations were wrong - not calling super first/last. When I moved
 them
  to extend LuceneTestCase they broke (I fixed them, don't worry). However,
  that could never happen if the super's methods were tagged w/
 @Before/After,
  because JUnit would take care running them before/after their
 sub-classes'
  @Before/After. So that's another win for JUnit4.
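
  The setUp problem described above can be sketched in plain Java. All class names here are hypothetical and nothing below is JUnit or Lucene code; the second half only mimics the ordering that @Before is documented to give (superclass hooks first, unless overridden).

```java
// Hypothetical 3-style base class: subclasses must remember super.setUp().
class BaseCase {
  protected boolean baseReady = false;

  protected void setUp() {
    baseReady = true;  // stands in for base-class preparation
  }

  // What a runner would do before each test method.
  final boolean runSetUp() {
    setUp();
    return baseReady;
  }
}

// The bug: setUp() is overridden but super.setUp() is never called.
class ForgetfulCase extends BaseCase {
  @Override
  protected void setUp() {
    // own initialization only; the base state is never prepared
  }
}

// Alternative: the driver always runs the base hook first and then the
// subclass hook, so a subclass cannot skip the base preparation.
abstract class HookedCase {
  protected boolean baseReady = false;
  protected boolean subReady = false;

  private void baseBefore() {
    baseReady = true;
  }

  protected abstract void before();

  final boolean runSetUp() {
    baseBefore();
    before();
    return baseReady && subReady;
  }
}

class SafeCase extends HookedCase {
  @Override
  protected void before() {
    subReady = true;
  }
}

class SetUpOrderDemo {
  public static void main(String[] args) {
    System.out.println("forgetful base ready: " + new ForgetfulCase().runSetUp());
    System.out.println("hooked base ready:    " + new SafeCase().runSetUp());
  }
}
```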
 
  And of course the @Before/AfterClass are really great !
  {quote}
 
  I think the @Before/After annotations can be a real win for our tests.
 
  My two cents,
  Shai
 
  On Fri, Feb 26, 2010 at 4:57 AM, Erick Erickson erickerick...@gmail.com
 
  wrote:
 
  Well, Things got busy (tm). Uwe's point is valid; unless there's
  demonstrable gain, moving things to Junit4 just for fun is wasted
 motion,
  indeed dangerous. I was focusing on LocalizedTestCase to understand the
  place of runBare etc. in the scheme of things since when I created
  LuceneTestCaseJ4 that was something I wanted to figure out to make it a
  replacement for LuceneTestCase.
 
  I can't point to a compelling reason to shake up the code, the only
  improvement it would have is having a demonstration of using the Junit4
  @RunWith annotation for future reference.
 
  So, I've no compelling reason to push that patch forward. If y'all think
  it's worth it I'll be happy to crank that patch back up again, it'll
 take a
  few days though. It does affect several files, and if the main value
 here
  is an exemplar of the @RunWith annotation, perhaps there's a better
 place to
  put that in.
 
  Erick
 
  On Thu, Feb 25, 2010 at 9:06 PM, Robert Muir rcm...@gmail.com wrote:
 
 
 
  LocalizedTestCase called runBare in LuceneTestCase which reported the
  seed value if an exception was thrown. I couldn't find a good way to
 access
  runBare or analogs in Junit4, but the interceptor pattern worked as
 well.
  The interceptor is called by the Junit framework on test events, so
 there
  aren't references to it in the Lucene test code. There are other
 places that
  call runBare, so I assumed that if anyone wanted to use Junit4 with
 those
  classes it would be a good thing to allow.
 
  I didn't forget about your patch Erick, in my opinion there is nothing
  wrong with it. I hope it's not discouraging you, the problem is a few of
 us
  have spent countless hours trying to debug this hard-to-reproduce Thai
 test
  failure problem.
 
  It failed in the existing tests, too, with Junit 3 on hudson (one
 time!).
  At this point, i start to wonder if it could be related to stuff like
 this:
  http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6683975
 
  I don't think we should let this stop progress with the tests, if you
  think we should move LocalizedTestCase to junit 4 lets do it.
 
  --
  Robert Muir
  rcm

Re: Uwe's question

2010-02-26 Thread Erick Erickson
I poked around a little and didn't find any joy. But the *really clumsy* way
of doing this would be to add the @Ignore annotation to any test in the
class that you didn't want to run, then just run the class.

Or, equivalently, comment out the @Test annotation. I'd prefer adding the
@Ignore though so there'd be some chance of noticing if it was inadvertently
checked in.
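
A rough sketch of why I'd rather mark a test ignored than comment out its annotation, in plain Java without JUnit. The `Probe` and `Skip` annotations and the `SkipReport` runner are made up for illustration; the point is only that a skipped test is still counted, while an un-annotated one vanishes from the report.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;

// Hypothetical stand-ins for @Test and @Ignore; not JUnit's annotations.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface Probe {}

@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface Skip {
  String value() default "";
}

class SampleChecks {
  @Probe public void fast() {}

  @Probe @Skip("too slow for a quick run") public void slow() {}

  // @Probe   (commented out: invisible to the runner and to the report)
  public void forgotten() {}
}

class SkipReport {
  final int ran;
  final int skipped;

  SkipReport(int ran, int skipped) {
    this.ran = ran;
    this.skipped = skipped;
  }

  // Runs every @Probe method unless it is also marked @Skip, and counts both.
  static SkipReport run(Class<?> c) {
    int ran = 0;
    int skipped = 0;
    try {
      Object instance = c.getDeclaredConstructor().newInstance();
      for (Method m : c.getDeclaredMethods()) {
        if (!m.isAnnotationPresent(Probe.class)) {
          continue;  // not a test at all, so nothing is reported
        }
        if (m.isAnnotationPresent(Skip.class)) {
          skipped++;  // still visible in the summary
          continue;
        }
        m.invoke(instance);
        ran++;
      }
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException("could not run " + c.getName(), e);
    }
    return new SkipReport(ran, skipped);
  }

  public static void main(String[] args) {
    SkipReport r = run(SampleChecks.class);
    System.out.println("ran=" + r.ran + " skipped=" + r.skipped);
  }
}
```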

FWIW
Erick

On Fri, Feb 26, 2010 at 3:31 PM, Jason Rutherglen 
jason.rutherg...@gmail.com wrote:

  I've never tried to learn a command-line invocation of a test
  case for a single test method, I've always just used the IDE
  to run individual methods

 Right, I've been doing bunches of Solr dev which for me only works
 from the command line... I'm open to suggestions though!


[jira] Commented: (LUCENE-2037) Allow Junit4 tests in our environment.

2010-02-26 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12839126#action_12839126
 ] 

Erick Erickson commented on LUCENE-2037:


Uwe:

You were asking about getName in LuceneTestCaseJ4. It appears that you've taken 
care of this, is there still anything to do? There's no longer a c'tor that 
takes the test name.

But I did some poking around and came up with the following from someplace on 
the web. 

The only two places I could find that used getName were TestFieldScoreQuery and 
TestOrdValues. This bit of code works if you put it in these classes.

  // Needs: import org.junit.Rule; import org.junit.rules.TestName;
  private String testName() {
    // was getName() from LuceneTestCaseJ4...
    return getClass().getName() + "." + name.getMethodName();
  }

  @Rule
  public final TestName name = new TestName();

See:
http://kentbeck.github.com/junit/javadoc/4.7/org/junit/rules/TestName.html
Note that this site is better than anything I could find at junit.org.

Once I found that, I thought "gee, if I put that in the base class, it would 
be available to everyone", which is exactly what you made 
LuceneTestCaseJ4.getName() do <G>. But at least I found Kent Beck's version of 
the docs, which is a plus...
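
For reference, the idea of a base class whose watcher records the running method's name can be sketched without JUnit. `Watcher`, `NamedCase` and `WatcherDemo` are hypothetical names, only loosely modeled on the anonymous TestWatchman in the commit; the tiny reflective runner stands in for the framework calling back on test events.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

// Hypothetical callback type; a runner calls these on test events.
abstract class Watcher {
  void starting(String methodName) {}

  void failed(Throwable e, String methodName) {}
}

class NamedCase {
  String name = "<unknown>";
  final List<String> failures = new ArrayList<String>();

  // Anonymous subclass: the test case itself needs no reflection.
  final Watcher intercept = new Watcher() {
    @Override
    void starting(String methodName) {
      name = methodName;  // remember which method is running
    }

    @Override
    void failed(Throwable e, String methodName) {
      failures.add(getName() + ": " + e.getMessage());
    }
  };

  String getName() {
    return name;
  }

  public void testPasses() {}

  public void testFails() {
    throw new IllegalStateException("boom in " + getName());
  }
}

class WatcherDemo {
  // Stand-in for the framework: fire events around a single method call.
  static void runOne(NamedCase test, String methodName) {
    test.intercept.starting(methodName);
    try {
      Method m = NamedCase.class.getMethod(methodName);
      m.invoke(test);
    } catch (InvocationTargetException e) {
      test.intercept.failed(e.getCause(), methodName);
    } catch (ReflectiveOperationException e) {
      throw new IllegalArgumentException("no such test method: " + methodName, e);
    }
  }

  public static void main(String[] args) {
    NamedCase t = new NamedCase();
    runOne(t, "testPasses");
    runOne(t, "testFails");
    System.out.println(t.failures);
  }
}
```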

So I guess there's nothing to do as far as getName is concerned. If there 
is, let me know.

Erick





Re: Uwe's question

2010-02-26 Thread Erick Erickson
Here are some tantalizing hints. I'll look at this tomorrow if someone
hasn't beaten me to it, but there are *better things I can be doing late
Friday night than messing around with stupid tests* <G>.

From: http://junit.sourceforge.net/doc/cookbook/cookbook.htm

Once you have tests, you'll want to run them. JUnit provides tools to define
the suite to be run and to display its results. To run tests and see the
results on the console, run this from a Java program:

org.junit.runner.JUnitCore.runClasses(TestClass1.class, ...);

 or this from the command line, with both your test class and junit on the
classpath:

java org.junit.runner.JUnitCore TestClass1.class [...other test classes...]

 From: http://kentbeck.github.com/junit/javadoc/4.7/index.html
See JunitCore and Request, especially Request.method.

From:
http://old.nabble.com/How-to-run-individual-test-case-within-a-test-class-from-command-line%28JUnit-4.x%29-td20003338.html

new JUnitCore().run(Request.method(class, methodName));

I think the still-remaining clumsy part of this is specifying the test class
file in the classpath. I can imagine that this could be part of a shell
script, but is it worth the effort if things run from the IDE?
Alternatively, a small Java program taking two arguments might do the trick.
But as I said, it's late and even *sleeping* would be better than this
<G>.
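
For what it's worth, the "small Java program taking two arguments" could look roughly like the sketch below. It uses plain reflection rather than JUnit so that it stands alone; `RunSingleMethod` and `SampleTarget` are made-up names. With JUnit 4 on the classpath, the reflective call would presumably be replaced by the JUnitCore/Request.method line quoted above, which I haven't tried.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

// Example target; stands in for a real test class.
class SampleTarget {
  static int calls = 0;

  public void testOne() {
    calls++;
  }

  public void testTwo() {
    throw new IllegalStateException("testTwo failed");
  }
}

class RunSingleMethod {
  // Returns null on success, otherwise the throwable that stopped the run.
  static Throwable run(String className, String methodName) {
    try {
      Class<?> cls = Class.forName(className);
      Object instance = cls.getDeclaredConstructor().newInstance();
      Method method = cls.getMethod(methodName);
      method.invoke(instance);
      return null;
    } catch (InvocationTargetException e) {
      return e.getCause();  // the method (or constructor) itself threw
    } catch (ReflectiveOperationException e) {
      return e;  // class or method not found, or not accessible
    }
  }

  public static void main(String[] args) {
    if (args.length != 2) {
      System.err.println("usage: java RunSingleMethod <class> <method>");
      return;
    }
    Throwable failure = run(args[0], args[1]);
    System.out.println(failure == null ? "OK" : "FAILED: " + failure);
  }
}
```

The test class still has to be on the classpath, so this doesn't remove the clumsy part; it only avoids running every method in the class.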

Sggghhh. Manning has a MEAP for JUnit In Action (hereinafter JUIA) that
covers up through Junit 4.5. Anybody dare me to spring for the $30 and see
what wisdom is in there? I'm frustrated enough with the sparse documentation
that it sure seems worth it.

Erick

P.S. no Double-Dog-Dares allowed.

On Fri, Feb 26, 2010 at 6:14 PM, Yonik Seeley yo...@lucidimagination.com wrote:

 On Fri, Feb 26, 2010 at 3:31 PM, Jason Rutherglen
 jason.rutherg...@gmail.com wrote:
  I've never tried to learn a command-line invocation of a test
  case for a single test method, I've always just used the IDE
  to run individual methods
 
  Right, I've been doing bunches of Solr dev which for me only works
  from the command line... I'm open to suggestions though!

 Should work from the IDE provided you've set the working directory to
 src/test/test-files
 But I'd love a way to run a single method from the command line too.

 -Yonik
 http://www.lucidimagination.com





Re: Stored fields access

2010-02-25 Thread Erick Erickson
Does LazyLoading address this? I'm assuming your issue is
that the default behavior loads the entire document regardless
of whether you actually want all the fields.

Erick

On Thu, Feb 25, 2010 at 7:52 AM, Earwin Burrfoot ear...@gmail.com wrote:

 I'm thinking, should Lucene introduce new interface to read stored
 document fields?

 Current 'Document document(int n)' mechanism is barely usable due to
 overhead involved. While I believe underlying index structure works
 pretty fast (if it fits in memory, as is the case for most
 performance-concerned installations), there's no adequate access to it
 and people are forced to introduce contraptions like LinkedIn's
 payload-assisted luceneId-appId mapping or similar caches we employ.

 What I am thinking about is something along the lines of existing
 iterators like TermDocs/TermPositions. Iterate over docs, then iterate
 over fields stored for each, extract data, ???, profit.
 Comments?

 --
 Kirill Zakharenko/Кирилл Захаренко (ear...@gmail.com)
 Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
 ICQ: 104465785





Re: [jira] Updated: (LUCENE-2285) Code cleanup from all sorts of (trivial) warnings

2010-02-25 Thread Erick Erickson
I'm so glad somebody else gets bugged by all the trivial warnings; all along
I thought it was a personal problem <G>.

As I remember, I deprecated LuceneTestCase entirely to encourage people
to migrate to the Junit4 variant (LuceneTestCaseJ4). So removing those
deprecations should be approached with some caution. Of course this
may have changed in the interim.

Erick

On Thu, Feb 25, 2010 at 10:01 AM, Shai Erera (JIRA) j...@apache.org wrote:


 [
 https://issues.apache.org/jira/browse/LUCENE-2285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel]

 Shai Erera updated LUCENE-2285:
 ---

Attachment: LUCENE-2285.patch

 Quite a large patch. I've started off with 3832 compiler warnings based on
 my eclipse settings and we're now down to 510. All tests pass, including
 core, contrib and tag. I've also fixed a bunch of javadocs warnings, and
 ant javadocs now passes cleanly. I did not do any formatting to the code,
 in order to preserve the patch as clear and focused as possible, even though
 it's a very large one ...

 It touches a lot of files. So the sooner someone can help me commit it the
 better (before these files change).

  Code cleanup from all sorts of (trivial) warnings
  -
 
  Key: LUCENE-2285
  URL: https://issues.apache.org/jira/browse/LUCENE-2285
  Project: Lucene - Java
   Issue Type: Improvement
 Reporter: Shai Erera
 Priority: Minor
  Fix For: 3.1
 
  Attachments: LUCENE-2285.patch
 
 
  I would like to do some code cleanup and remove all sorts of trivial
 warnings, like unnecessary casts, problems w/ javadocs, unused variables,
 redundant null checks, unnecessary semicolon etc. These are all very trivial
 and should not pose any problem.
  I'll create another issue for getting rid of deprecated code usage, like
 LuceneTestCase and all sorts of deprecated constructors. That's also trivial
 because it only affects Lucene code, but it's a different type of change.
  Another issue I'd like to create is about introducing more generics in
 the code, where it's missing today - not changing existing API. There are
 many places in the code like that.
  So, with your permission, I'll start with the trivial ones first, and then
 move on to the others.





Re: [jira] Commented: (LUCENE-2285) Code cleanup from all sorts of (trivial) warnings

2010-02-25 Thread Erick Erickson
Junit4:

Well, simply disliking the @Test annotation seems like a poor reason to stay
with Junit3, although I admit it's a pain in the neck to change. Which is
why I didn't try to change all of them. The current system lends itself to
the practice of mangling the test name as a way of not running it, which far
too easily allows the test case to be forever ignored. One concrete
advantage of annotations in Junit4 is the ability to add another stupid
annotation @Ignore, which then gets reported and thus doesn't get lost.

As I remember, the last place we left localization was that Mike (?) saw
some intermittent problem that I couldn't reproduce. I could dust off that
code and see what the current state of affairs is since this has come up
again. The other problem was that the implementation I used led to
*increased* test run times. The localization tests basically spun through
all the Locales available and ran all the tests in the class against them.
The current system only runs *some* of the tests in a test class through the
localization process. This can be addressed by, at worst, splitting the test
class up, but in my proof-of-concept that seemed like too much detail...

My purpose in deprecating LuceneTestCase was to explicitly encourage
migration to Junit4, the deprecation warnings being the goad. I vote against
removing it.

FWIW
Erick

On Thu, Feb 25, 2010 at 10:54 AM, Uwe Schindler (JIRA) j...@apache.org wrote:


[
 https://issues.apache.org/jira/browse/LUCENE-2285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12838384#action_12838384]

 Uwe Schindler commented on LUCENE-2285:
 ---

 Hi Shai,

 I applied the patch to my checkout, so it will not get out of date. As
 mentioned before, I have to review each change, as on my first diagonal
 look-around I found a removed cast in TestCharArraySet/Map that is important
 to call the right method, without the cast the test would pass, but the
 affected method is never called. I also do not want to remove some casts in
 NumericRange and other parts, where the casts were added for more clarity
 in code. Especially at some places without the cast it is not clear what
 javac will do, so the cast is for more security even if not needed.
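
 A small sketch of the kind of cast described above, where a cast that looks redundant decides which overload is called. `OverloadedSet` is a hypothetical class with two `contains` overloads; it is not Lucene's CharArraySet, and the overloads there may differ.

```java
// Hypothetical class with two overloads that both accept a String argument.
class OverloadedSet {
  String lastCalled = "none";

  boolean contains(Object o) {
    lastCalled = "Object";
    return true;
  }

  boolean contains(CharSequence cs) {
    lastCalled = "CharSequence";
    return true;
  }
}

class CastDemo {
  // With the cast, the compiler selects contains(Object).
  static String withCast() {
    OverloadedSet set = new OverloadedSet();
    set.contains((Object) "foo");
    return set.lastCalled;
  }

  // Without it, the most specific overload wins: contains(CharSequence).
  static String withoutCast() {
    OverloadedSet set = new OverloadedSet();
    set.contains("foo");
    return set.lastCalled;
  }

  public static void main(String[] args) {
    System.out.println("with cast:    " + withCast());
    System.out.println("without cast: " + withoutCast());
  }
}
```

 Both calls return true, so a test that only checks the result passes either way; only the method actually exercised changes.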

 So please excuse my complaints, but two people looking over such a large
 patch is really needed.

 Thanks for the work! Uwe

  Code cleanup from all sorts of (trivial) warnings
  -
 
  Key: LUCENE-2285
  URL: https://issues.apache.org/jira/browse/LUCENE-2285
  Project: Lucene - Java
   Issue Type: Improvement
 Reporter: Shai Erera
 Assignee: Uwe Schindler
 Priority: Minor
  Fix For: 3.1
 
  Attachments: LUCENE-2285.patch
 
 




Re: Stored fields access

2010-02-25 Thread Erick Erickson
OK, never mind <G>

Erick

On Thu, Feb 25, 2010 at 1:48 PM, Earwin Burrfoot ear...@gmail.com wrote:

 My issue is with extra objects created in the process. Field selection
 can be handled with, well, FieldSelector.


 --
 Kirill Zakharenko/Кирилл Захаренко (ear...@gmail.com)
 Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
 ICQ: 104465785





Re: [jira] Commented: (LUCENE-2285) Code cleanup from all sorts of (trivial) warnings

2010-02-25 Thread Erick Erickson
I don't have my heart set on keeping the deprecation, so taking it off works
for me. I'd also agree that we either need a concerted effort to convert
completely or should leave it un-deprecated, so feel free.

Let's move the junit4 stuff off to another discussion.

Erick


On Thu, Feb 25, 2010 at 1:27 PM, Shai Erera ser...@gmail.com wrote:

 Erick, I'm totally with you on JUnit 4. I think the @Test annotation is
 really not a big deal (it's actually very easy to migrate all the current
 tests to JUnit 4 with the added import using some script). Even manually it
 shouldn't be such a big deal.

 @Ignore is another clear advantage of JUnit4. I've found some tests which
 were prefixed with _, i.e. _testXYZ just to disable them. Nobody knows about
 it until he looks at the code (and pays attention). @Ignore would have been
 better.

 And there are lots of other advantages, like the @Before and @After (not
 only class). Another problem I've found in the tests is that not all
 extended LuceneTestCase, and usually their setUp and tearDown
 implementations were wrong - not calling super first/last. When I moved them
 to extend LuceneTestCase they broke (I fixed them, don't worry). However,
 that could never happen if the super's methods were tagged w/ @Before/After,
 because JUnit would take care running them before/after their sub-classes'
 @Before/After. So that's another win for JUnit4.

 And of course the @Before/AfterClass are really great !

 So all in all, I'm a big fan of JUnit4, and if the discussion will start
 again, I'll pay more attention to it and participate (I admit I didn't
 follow it before). As long as it happens on the list and not on some IRC
 channel (!?!?).

 But like Uwe said, that's slightly unrelated to that issue. Because that
 deprecation alone produced  500 warnings (probably even much more), I
 un-deprecated it, and when we make a decision one way or the other, we
 should simply remove it (in case that's the decision). Until then, let's get
 rid of the unnecessary noise, agree?

 Shai


 On Thu, Feb 25, 2010 at 7:15 PM, Uwe Schindler u...@thetaphi.de wrote:

  This discussion is out of the scope of this issue. We can start the
 flamewar again. In IRC we came to the conclusion that our primary intent is
 to make the test runs faster, which we achieved by patching lots of tests to
 not change static defaults and so be able to run all tests in the same JVM
 without forking. More speed improvements can be done by moving read-only
 index creation for search tests into static @BeforeClass and setting
 IndexReaders/-Searchers to NULL in @AfterClass to allow GC of static fields
 holding RAMDirectory and so on.



 The @Test annotation led to more confusion and errors among our developers.
 E.g. we had a test merged back from 3.0 (without Junit4) to trunk or even
 new tests were added, but nobody added @Test to it, leading to the fact that
 the tests were never run. So the most important change to LuceneTestCaseJ4
 would be to emulate the old test* method names as if they have @Test. By
 that you could still disable them as mentioned, but it would reduce the
 burden of these dumb import statements and useless annotations.
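
 As an illustration of the name-based convention described above, here is a
 minimal sketch using plain reflection. It is not how LuceneTestCaseJ4 or
 JUnit is implemented; TestNameDiscoverySketch and Sample are hypothetical
 names, and the check (public, void, no arguments, name starting with "test")
 is only roughly the shape JUnit 3 looks for.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical sketch: finds methods that follow the JUnit 3 style naming convention. */
final class TestNameDiscoverySketch {
    private TestNameDiscoverySketch() {}

    /** Returns the sorted names of public void no-arg methods whose name starts with "test". */
    static List<String> discover(Class<?> testClass) {
        List<String> names = new ArrayList<String>();
        for (Method m : testClass.getMethods()) {
            boolean testShaped = Modifier.isPublic(m.getModifiers())
                && m.getReturnType() == void.class
                && m.getParameterTypes().length == 0;
            if (testShaped && m.getName().startsWith("test")) {
                names.add(m.getName());
            }
        }
        Collections.sort(names);
        return names;
    }

    /** Hypothetical sample class; _testDisabled shows the "prefix with _" trick. */
    static class Sample {
        public void testAdd() {}
        public void testRemove() {}
        public void _testDisabled() {}
        public void helper() {}
    }

    public static void main(String[] args) {
        System.out.println(discover(Sample.class));
    }
}
```

 Note that the renamed method is silently skipped and nothing reports it,
 which is the drawback of the renaming trick raised elsewhere in this thread.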



 By the way, why does LuceneTestCaseJ4 extend TestWatchman while an
 instance field also extends that class? I do not understand the whole magic
 behind, this is totally confusing to me – annotating a field that is never
 used in code by an annotation is stupid and looks totally incorrect (I mean
 the field holding the TestWatchman-subclass). - This is another thing why I
 am against the migration of our already proven tests.



 Because of that we don’t want to deprecate LuceneTestCase and instead only
 transform new tests and such needing @BeforeClass/@AfterClass for more speed
 to the new API.



 -

 Uwe Schindler

 H.-H.-Meier-Allee 63, D-28213 Bremen

 http://www.thetaphi.de

 eMail: u...@thetaphi.de



 *From:* Erick Erickson [mailto:erickerick...@gmail.com]
 *Sent:* Thursday, February 25, 2010 5:27 PM
 *To:* java-dev@lucene.apache.org
 *Subject:* Re: [jira] Commented: (LUCENE-2285) Code cleanup from all
 sorts of (trivial) warnings



 Junit4:



 Well, simply disliking the @Test annotation seems like a poor reason to
 stay with Junit3, although I admit it's a pain in the neck to change. Which
 is why I didn't try to change all of them. The current system lends itself
 to the practice of mangling the test name as a way of not running it, which
 far too easily allows the test case to be forever ignored. One concrete
 advantage of  annotations in Junit4 is the ability to add another stupid
 annotation @Ignore, which then gets reported and thus doesn't get lost.

 As I remember, the last place we left localization was that Mike (?) saw
 some intermittent problem that I couldn't reproduce. I could dust off that
 code and see what the current state of affairs is since this has come up
 again. The other problem was that the implementation I used led to
 *increased* test

Uwe's question

2010-02-25 Thread Erick Erickson
By the way, why does LuceneTestCaseJ4 extend TestWatchman while an
instance field also extends that class?
No good reason, I plead confusion when figuring out how to use it. I've
attached a patch to LUCENE-2037 that removes the LuceneTestCaseJ4 extending
TestWatchman.

I do not understand the whole magic behind, this is totally confusing to
me – annotating a field that is never used in code by an annotation is
stupid and looks totally incorrect (I mean the field holding the
TestWatchman-subclass).

Well, this is to provide the same functionality as LuceneTestCase. I'm
reaching a bit here since I haven't been in that code lately, but...

LocalizedTestCase called runBare in LuceneTestCase which reported the seed
value if an exception was thrown. I couldn't find a good way to access
runBare or analogs in Junit4, but the interceptor pattern worked as well.
The interceptor is called by the Junit framework on test events, so there
aren't references to it in the Lucene test code. There are other places that
call runBare, so I assumed that if anyone wanted to use Junit4 with those
classes it would be a good thing to allow.

I think the interceptor pattern is an elegant way to do something at
discrete points in the test run, although it is a bit opaque.

Most of this was put in when I was trying to move LocalizedTestCase to the
Junit4 world. We didn't do that, but this still needs to be kept if we want
LuceneTestCaseJ4 to be a drop-in replacement for LuceneTestCase.

 - This is another thing why I am against the migration of our already
proven tests.

If you'll recall the discussion at the time, neither am I. I do believe,
though, that if anyone wants to change a test class to use Junit4 it's a
good thing to have something that'll drop in without surprises, which is
what I was trying for.

Erick


[jira] Updated: (LUCENE-2037) Allow Junit4 tests in our environment.

2010-02-25 Thread Erick Erickson (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson updated LUCENE-2037:
---

Attachment: LUCENE-2037_remove_testwatchman.patch

Removed unnecessary derivation from TestWatchman.

Corrected minor typo in comment.

 Allow Junit4 tests in our environment.
 --

 Key: LUCENE-2037
 URL: https://issues.apache.org/jira/browse/LUCENE-2037
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Other
Affects Versions: 3.1
 Environment: Development
Reporter: Erick Erickson
Assignee: Michael McCandless
Priority: Minor
 Fix For: 3.1

 Attachments: junit-4.7.jar, LUCENE-2037.patch, LUCENE-2037.patch, 
 LUCENE-2037.patch, LUCENE-2037_remove_testwatchman.patch, 
 LUCENE-2037_revised_2.patch

   Original Estimate: 8h
  Remaining Estimate: 8h

 Now that we're dropping Java 1.4 compatibility for 3.0, we can incorporate 
 Junit4 in testing. Junit3 and junit4 tests can coexist, so no tests should 
 have to be rewritten. We should start this for the 3.1 release so we can get 
 a clean 3.0 out smoothly.
 It's probably worthwhile to convert a small set of tests as an exemplar.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.





Re: Uwe's question

2010-02-25 Thread Erick Erickson
Hmmm, didn't reopen the JIRA, should I? Or will it just magically get into
Michael's queue?

On Thu, Feb 25, 2010 at 8:52 PM, Erick Erickson erickerick...@gmail.com wrote:

 By the way, why does LuceneTestCaseJ4 extend TestWatchman while an
 instance field also extends that class?
 No good reason, I plead confusion when figuring out how to use it. I've
 attached a patch to LUCENE-2037 that removes the LuceneTestCaseJ4 extending
 TestWatchman.

 I do not understand the whole magic behind, this is totally confusing to
 me – annotating a field that is never used in code by an annotation is
 stupid and looks totally incorrect (I mean the field holding the
 TestWatchman-subclass).

 Well, this is to provide the same functionality as LuceneTestCase. I'm
 reaching a bit here since I haven't been in that code lately, but...

 LocalizedTestCase called runBare in LuceneTestCase which reported the seed
 value if an exception was thrown. I couldn't find a good way to access
 runBare or analogs in Junit4, but the interceptor pattern worked as well.
 The interceptor is called by the Junit framework on test events, so there
 aren't references to it in the Lucene test code. There are other places that
 call runBare, so I assumed that if anyone wanted to use Junit4 with those
 classes it would be a good thing to allow.

 I think the interceptor pattern is an elegant way to do something at
 discrete points in the test run, although it is a bit opaque.

 Most of this was put in when I was trying to move LocalizedTestCase to the
 Junit4 world. We didn't do that, but this still needs to be kept if we want
 LuceneTestCaseJ4 to be a drop-in replacement for LuceneTestCase.

  - This is another thing why I am against the migration of our already
 proven tests.

 If you'll recall the discussion at the time, neither am I. I do believe,
 though, that if anyone wants to change a test class to use Junit4 it's a
 good thing to have something that'll drop in without surprises, which is
 what I was trying for.

 Erick



Re: Uwe's question

2010-02-25 Thread Erick Erickson
Well, Things got busy (tm). Uwe's point is valid; unless there's
demonstrable gain, moving things to Junit4 just for fun is wasted motion,
indeed dangerous. I was focusing on LocalizedTestCase to understand the
place of runBare etc. in the scheme of things since when I created
LuceneTestCaseJ4 that was something I wanted to figure out to make it a
replacement for LuceneTestCase.

I can't point to a compelling reason to shake up the code; the only
improvement would be a demonstration of using the Junit4
@RunWith annotation for future reference.

So, I've no compelling reason to push that patch forward. If y'all think
it's worth it I'll be happy to crank that patch back up again, though it'll
take a few days. It does affect several files, and if the main value here
is an exemplar of the @RunWith annotation, perhaps there's a better place to
put that in.

Erick

On Thu, Feb 25, 2010 at 9:06 PM, Robert Muir rcm...@gmail.com wrote:




 LocalizedTestCase called runBare in LuceneTestCase which reported the seed
 value if an exception was thrown. I couldn't find a good way to access
 runBare or analogs in Junit4, but the interceptor pattern worked as well.
 The interceptor is called by the Junit framework on test events, so there
 aren't references to it in the Lucene test code. There are other places that
 call runBare, so I assumed that if anyone wanted to use Junit4 with those
 classes it would be a good thing to allow.


 I didn't forget about your patch Erick, in my opinion there is nothing
 wrong with it. I hope it's not discouraging you; the problem is that a few of
 us have spent countless hours trying to debug this hard-to-reproduce Thai
 test failure problem.

 It failed in the existing tests, too, with Junit 3 on hudson (one time!).
 At this point, i start to wonder if it could be related to stuff like this:
 http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6683975

 I don't think we should let this stop progress with the tests; if you think
 we should move LocalizedTestCase to junit 4, let's do it.

 --
 Robert Muir
 rcm...@gmail.com



Re: FileNotFoundException for write.lock

2010-01-23 Thread Erick Erickson
Please repost this over on the users list. This list is for internal
development discussions.

Thanks
Erick

On Sat, Jan 23, 2010 at 9:56 PM, jchang jchangkihat...@gmail.com wrote:


 By the way: this happens with a brand new directory with no files at all in
 it.


 jchang wrote:
 
  When I try to start my service and construct an IndexWriter, I get this:
 
  java.io.FileNotFoundException: no segments* file found in
  org.apache.lucene.store.NIOFSDirectory@
 /home/jchang/IdeaProjects/index-service_trunk/target/testindexA/index/indexablemaildata:
  files: [write.lock]
 
  It is odd.  The problem is not that it is complaining about a lock file.
  There is none there.  It seems to be complaining that there is NOT a lock
  file.  Why?
 

 --
 View this message in context:
 http://old.nabble.com/FileNotFoundException-for-write.lock-tp27291955p27291981.html
 Sent from the Lucene - Java Developer mailing list archive at Nabble.com.






Re: Finding frequency of regex query match in a field

2010-01-15 Thread Erick Erickson
Could I ask you to re-post this on the java user's list? This list is for
*internal* Lucene development discussion.

Thanks
Erick

On Fri, Jan 15, 2010 at 8:28 AM, Altimatic chris.stuckl...@gmail.com wrote:


 Hi All,

 I have an application that has to count the frequency that a specific
 regular expression is matched on a particular field for each document in an
 indexed directory.

 For example.

 Let's say I have 10 documents in the directory and each document has 3
 fields, table, column and data.

 Example Doc(s):
 //***
 Document doc1 = new Document();
 doc1.add(new Field("table", "EMPLOYEE_US", Field.Store.NO,
 Field.Index.ANALYZED));
 doc1.add(new Field("column", "F_NAME", Field.Store.NO,
 Field.Index.ANALYZED));
 doc1.add(new Field("data", "Chris Hank Tony Cody Tom Tina Crystal",
 Field.Store.NO, Field.Index.ANALYZED,
 Field.TermVector.WITH_POSITIONS_OFFSETS));

 Document doc2 = new Document();
 doc2.add(new Field("table", "EMPLOYEE_CA", Field.Store.NO,
 Field.Index.ANALYZED));
 doc2.add(new Field("column", "F_NAME", Field.Store.NO,
 Field.Index.ANALYZED));
 doc2.add(new Field("data", "Bob Billy Tom Toby Charles Krista Madonna",
 Field.Store.NO, Field.Index.ANALYZED,
 Field.TermVector.WITH_POSITIONS_OFFSETS));

 //I know I can create a query to search for a regular expression and that
 //will return each document that contains a match.

 IndexWriter writer = new IndexWriter(directory, new WhitespaceAnalyzer(),
 true, IndexWriter.MaxFieldLength.LIMITED);
 writer.addDocument(doc1);
 writer.addDocument(doc2);
 writer.optimize();
 writer.close();
 searcher = new IndexSearcher(directory);

 RegexQuery query = new RegexQuery(new Term("data", "^T.*"));
 //grab the score docs and go through them to find the documents that
 //contain a match
 ScoreDoc[] hits = searcher.search(query, null, maxNumOfHits).scoreDocs;

 //*


 The code above will tell me that both doc1 and doc2 contain a match for the
 constructed query.

 However I need to know how many times the regular expression was matched in
 each document. ie.

 doc1 = 3
 doc2 = 2

 I hope I am being clear...and thanks in advance.


 Cheers
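
 For the archive, the counting asked for above can be sketched outside of
 Lucene as follows. This is only an illustration, not a Lucene API:
 RegexTokenCounter and countMatches are hypothetical names, the text is split
 on whitespace (roughly what WhitespaceAnalyzer would produce), and it assumes
 the field text is still at hand. The example above indexes with
 Field.Store.NO, so the text would have to come from the original source or
 be rebuilt from the term vectors.

```java
import java.util.regex.Pattern;

/** Hypothetical helper, not part of Lucene: counts whitespace-separated tokens matching a regex. */
final class RegexTokenCounter {
    private RegexTokenCounter() {}

    /** Returns how many tokens of fieldText are matched in full by regex. */
    static int countMatches(String fieldText, String regex) {
        Pattern pattern = Pattern.compile(regex);
        int count = 0;
        // Split on runs of whitespace and test each token on its own.
        for (String token : fieldText.trim().split("\\s+")) {
            if (!token.isEmpty() && pattern.matcher(token).matches()) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // The two "data" fields from the question.
        System.out.println(countMatches("Chris Hank Tony Cody Tom Tina Crystal", "^T.*"));     // prints 3
        System.out.println(countMatches("Bob Billy Tom Toby Charles Krista Madonna", "^T.*")); // prints 2
    }
}
```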

 --
 View this message in context:
 http://old.nabble.com/Finding-frequency-of-regex-query-match-in-a-field-tp27175040p27175040.html
 Sent from the Lucene - Java Developer mailing list archive at Nabble.com.






Re: svn commit: r894224 - in /lucene/java/trunk/contrib: benchmark/src/java/org/apache/lucene/benchmark/byTask/tasks/ highlighter/src/java/org/apache/lucene/search/highlight/ instantiated/src/java/o

2009-12-28 Thread Erick Erickson
I once knew of a math prof in the early days of electronic
book submissions who had a helpful person change all the
"iff"s into "if", thinking they were all typos... in all the proofs
in a math text... As his fellow faculty member who was relaying
the story added, "putting them back was non-trivial".

Erick

On Mon, Dec 28, 2009 at 3:02 PM, Robert Muir rcm...@gmail.com wrote:

 Simon, are we sure these are spelling issues? I think "iff" stands
 for 'if and only if' in these cases.

 http://en.wikipedia.org/wiki/If_and_only_if

 On Mon, Dec 28, 2009 at 1:52 PM,  sim...@apache.org wrote:
  Author: simonw
  Date: Mon Dec 28 18:52:19 2009
  New Revision: 894224
 
  URL: http://svn.apache.org/viewvc?rev=894224&view=rev
  Log:
  fixed trivial spelling issues in javadoc
 
  Modified:
 
  
 lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/tasks/PerfTask.java
 
  
 lucene/java/trunk/contrib/highlighter/src/java/org/apache/lucene/search/highlight/WeightedSpanTerm.java
 
  
 lucene/java/trunk/contrib/instantiated/src/java/org/apache/lucene/store/instantiated/InstantiatedTermPositions.java
 
  Modified:
 lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/tasks/PerfTask.java
  URL:
 http://svn.apache.org/viewvc/lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/tasks/PerfTask.java?rev=894224&r1=894223&r2=894224&view=diff
 
 ==
  ---
 lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/tasks/PerfTask.java
 (original)
  +++
 lucene/java/trunk/contrib/benchmark/src/java/org/apache/lucene/benchmark/byTask/tasks/PerfTask.java
 Mon Dec 28 18:52:19 2009
  @@ -287,7 +287,7 @@
 
/**
 * Sub classes that supports parameters must override this method to
 return true.
  -   * @return true iff this task supports command line params.
  +   * @return true if this task supports command line params.
 */
public boolean supportsParams () {
  return false;
 
  Modified:
 lucene/java/trunk/contrib/highlighter/src/java/org/apache/lucene/search/highlight/WeightedSpanTerm.java
  URL:
 http://svn.apache.org/viewvc/lucene/java/trunk/contrib/highlighter/src/java/org/apache/lucene/search/highlight/WeightedSpanTerm.java?rev=894224&r1=894223&r2=894224&view=diff
 
 ==
  ---
 lucene/java/trunk/contrib/highlighter/src/java/org/apache/lucene/search/highlight/WeightedSpanTerm.java
 (original)
  +++
 lucene/java/trunk/contrib/highlighter/src/java/org/apache/lucene/search/highlight/WeightedSpanTerm.java
 Mon Dec 28 18:52:19 2009
  @@ -53,8 +53,8 @@
 * Checks to see if this term is valid at codeposition/code.
 *
 * @param position
  -   *to check against valid term postions
  -   * @return true iff this term is a hit at this position
  +   *to check against valid term positions
  +   * @return true if this term is a hit at this position
 */
public boolean checkPosition(int position) {
  // There would probably be a slight speed improvement if
 PositionSpans
 
  Modified:
 lucene/java/trunk/contrib/instantiated/src/java/org/apache/lucene/store/instantiated/InstantiatedTermPositions.java
  URL:
 http://svn.apache.org/viewvc/lucene/java/trunk/contrib/instantiated/src/java/org/apache/lucene/store/instantiated/InstantiatedTermPositions.java?rev=894224&r1=894223&r2=894224&view=diff
 
 ==
  ---
 lucene/java/trunk/contrib/instantiated/src/java/org/apache/lucene/store/instantiated/InstantiatedTermPositions.java
 (original)
  +++
 lucene/java/trunk/contrib/instantiated/src/java/org/apache/lucene/store/instantiated/InstantiatedTermPositions.java
 Mon Dec 28 18:52:19 2009
  @@ -80,7 +80,7 @@
 
/**
 * Skips entries to the first beyond the current whose document number
 is
  -   * greater than or equal to
 currentTermPositionIndextarget/currentTermPositionIndex. pReturns true
 iff there is such
  +   * greater than or equal to
 currentTermPositionIndextarget/currentTermPositionIndex. pReturns true
 if there is such
 * an entry.  pBehaves as if written: pre
 *   boolean skipTo(int target) {
 * do {
 
 
 



 --
 Robert Muir
 rcm...@gmail.com





Re: svn commit: r890427 - /lucene/java/trunk/src/test/org/apache/lucene/util/LocalizedTestCase.java

2009-12-14 Thread Erick Erickson
Oooh, nice...

On Mon, Dec 14, 2009 at 1:26 PM, rm...@apache.org wrote:

 Author: rmuir
 Date: Mon Dec 14 18:26:26 2009
 New Revision: 890427

 URL: http://svn.apache.org/viewvc?rev=890427&view=rev
 Log:
 LUCENE-2155: add assertion to check if something changes default locale
 behind our back when using LocalizedTestCase

 Modified:
lucene/java/trunk/src/test/org/apache/lucene/util/LocalizedTestCase.java

 Modified:
 lucene/java/trunk/src/test/org/apache/lucene/util/LocalizedTestCase.java
 URL:
 http://svn.apache.org/viewvc/lucene/java/trunk/src/test/org/apache/lucene/util/LocalizedTestCase.java?rev=890427&r1=890426&r2=890427&view=diff

 ==
 ---
 lucene/java/trunk/src/test/org/apache/lucene/util/LocalizedTestCase.java
 (original)
 +++
 lucene/java/trunk/src/test/org/apache/lucene/util/LocalizedTestCase.java Mon
 Dec 14 18:26:26 2009
 @@ -73,6 +73,8 @@

   @Override
   protected void tearDown() throws Exception {
 +assertEquals("default locale unexpectedly changed:", locale, Locale
 +.getDefault());
 Locale.setDefault(defaultLocale);
 super.tearDown();
   }
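
 For readers of the archive, the guard added in this commit can be sketched
 without JUnit as follows. This is a hedged illustration, not the
 LocalizedTestCase code: LocaleGuardSketch and runWithLocale are hypothetical
 names, and an IllegalStateException stands in for the JUnit assertion.

```java
import java.util.Locale;

/** Hypothetical sketch: run code under a given default locale and detect unexpected changes. */
final class LocaleGuardSketch {
    private LocaleGuardSketch() {}

    /**
     * Runs body with the given default locale, then restores the previous default.
     * Throws if the body changed the default locale behind our back.
     */
    static void runWithLocale(Locale locale, Runnable body) {
        Locale saved = Locale.getDefault();
        Locale.setDefault(locale);
        try {
            body.run();
            if (!locale.equals(Locale.getDefault())) {
                throw new IllegalStateException("default locale unexpectedly changed: expected "
                    + locale + " but was " + Locale.getDefault());
            }
        } finally {
            // Always restore, even if the body or the check above threw.
            Locale.setDefault(saved);
        }
    }

    public static void main(String[] args) {
        // Exercise the same (empty) body under every locale this JVM knows about.
        for (Locale locale : Locale.getAvailableLocales()) {
            runWithLocale(locale, () -> { });
        }
        System.out.println("default restored to " + Locale.getDefault());
    }
}
```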





Re: svn commit: r890427 - /lucene/java/trunk/src/test/org/apache/lucene/util/LocalizedTestCase.java

2009-12-14 Thread Erick Erickson
Thanks for letting me know, it's quite a relief to be able to trust the
@Parameterized
stuff.

Just let me know if you need me to regenerate the patch whenever you want to
apply it. Between now and then I'll find something else to do G...

Erick

On Mon, Dec 14, 2009 at 1:59 PM, Robert Muir rcm...@gmail.com wrote:

 yeah i am convinced this is not a problem with your junit 4 patch Erick...
 as Uwe ran into the same trouble I ran into with the existing
 LocalizedTestCase

 however, if you don't mind, I'd like to let it sit with the junit 3 impl a
 little bit longer and see if we get more random-hard-to-reproduce failures.


 On Mon, Dec 14, 2009 at 1:46 PM, Erick Erickson 
 erickerick...@gmail.com wrote:

 Oooh, nice...


 On Mon, Dec 14, 2009 at 1:26 PM, rm...@apache.org wrote:

 Author: rmuir
 Date: Mon Dec 14 18:26:26 2009
 New Revision: 890427

 URL: http://svn.apache.org/viewvc?rev=890427&view=rev
 Log:
 LUCENE-2155: add assertion to check if something changes default locale
 behind our back when using LocalizedTestCase

 Modified:

  lucene/java/trunk/src/test/org/apache/lucene/util/LocalizedTestCase.java

 Modified:
 lucene/java/trunk/src/test/org/apache/lucene/util/LocalizedTestCase.java
 URL:
 http://svn.apache.org/viewvc/lucene/java/trunk/src/test/org/apache/lucene/util/LocalizedTestCase.java?rev=890427&r1=890426&r2=890427&view=diff

 ==
 ---
 lucene/java/trunk/src/test/org/apache/lucene/util/LocalizedTestCase.java
 (original)
 +++
 lucene/java/trunk/src/test/org/apache/lucene/util/LocalizedTestCase.java Mon
 Dec 14 18:26:26 2009
 @@ -73,6 +73,8 @@

   @Override
   protected void tearDown() throws Exception {
 +assertEquals("default locale unexpectedly changed:", locale, Locale
 +.getDefault());
 Locale.setDefault(defaultLocale);
 super.tearDown();
   }






 --
 Robert Muir
 rcm...@gmail.com



Re: [jira] Updated: (LUCENE-2122) Use JUnit4 capabilites for more thorough Locale testing for classes deriving from LocalizedTestCase

2009-12-14 Thread Erick Erickson
"it's possible the problems are not reproducible because
they are a crazy problem with these tests."

Agree absolutely. I was just making sure we considered
the *possibility* that the parameterized version was showing
an underlying Lucene problem rather than assuming the
fault was with Junit4. Having spent way too much of
my programming life being absolutely sure I knew what
part of the system was causing the failure only to find that
the problem was waaay over *there* instead, I'm kinda
sensitive that way G...

I'll eagerly await your results...

Erick

On Sun, Dec 13, 2009 at 5:24 PM, Robert Muir rcm...@gmail.com wrote:

 Erick it might be a gremlin on my computer or my brain...
 also i think i was inadvertently using different JVM's for running ant test
 (sometimes java 5/64bit sometimes 6/32bit). this is because i was doing
 something with forrest and changed my JAVA_HOME in one shell window.

 so i'm going to run 100 ant clean tests with each JVM, logging to a file.
 if these work reliably then I think I will conclude I was doing something
 stupid before... (like forgot to ant clean or something like that)

 this computer is windows, so you are right it might have different locales
 than your mac.

 however, i think we should consider your last comment: it's possible the
 problems are not reproducible because they are a crazy problem with these
 tests.
 for example, i think we should be extra cautious and call Calendar.clear()
 on all our calendars before changing time values and then asserting expected
 results.
 I don't see any obvious problem though, just thinking if something based on
 the 'current time' was affecting the tests, then this might make it hard to
 reproduce.
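
 To illustrate the Calendar.clear() point for readers of the archive: a
 Calendar obtained from getInstance() starts out at the current time, so
 setting only year, month and day leaves the time-of-day fields dependent on
 when the test happens to run. The sketch below is only an illustration;
 CalendarClearSketch and fixedDate are hypothetical names, and the GMT zone
 and Locale.ROOT are assumptions made here to keep the result deterministic.

```java
import java.util.Calendar;
import java.util.Locale;
import java.util.TimeZone;

/** Hypothetical sketch: build calendar values that do not depend on the current time. */
final class CalendarClearSketch {
    private CalendarClearSketch() {}

    /** Builds a date with every unset field cleared, so the result does not depend on "now". */
    static Calendar fixedDate(int year, int month, int day) {
        Calendar cal = Calendar.getInstance(TimeZone.getTimeZone("GMT"), Locale.ROOT);
        cal.clear();               // drop hour/minute/second/millisecond taken from the current time
        cal.set(year, month, day); // month is zero-based, e.g. Calendar.DECEMBER
        return cal;
    }

    public static void main(String[] args) {
        Calendar a = fixedDate(2009, Calendar.DECEMBER, 13);
        Calendar b = fixedDate(2009, Calendar.DECEMBER, 13);
        // Both calendars describe exactly the same instant, whenever this is run.
        System.out.println(a.getTimeInMillis() == b.getTimeInMillis());
    }
}
```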


 On Sat, Dec 12, 2009 at 9:26 PM, Erick Erickson 
 erickerick...@gmail.com wrote:

 H, you can't get either patch to work reliably.
 On the other hand, I can't get either patch to fail.
 I ran the whole ant clean test thing half a dozen times.
 I'll make a script to loop all night tonight and we'll see.
 I also ran just the TestQueryParser around 700 times
 from Ant via a shell script. No problems. No problems
 in IntelliJ. Siiggghhh.

 Anybody else want to try applying either patch and see
 what happens? I'd hate to lose the capabilities of the
 Parameterized tests because of a gremlin that only exists
 on Robert's machine. I'd also hate to introduce cool new
 capabilities that started training us to ignore test failures.
 That's bad. Very bad.

 Robert: What kind of machine are you running on? I'm running
 on a Macbook Pro...

 As it stands, I'm not sure whether parameterized tests are
 the issue or whether the issue is Locale testing. Or whether
 Robert has some peculiar setup. Or, for that matter, whether
 I have some peculiar setup that makes it work by hiding an
 instability. It sure would be nice to figure out where the
 fragility is before relying on Parameterized tests...

 Robert:
 If you have the patience, could you try your patch out and
 capture the failure? I'm especially curious if your patch
 fails on the same language every time. Who knows? On
 your machine, this *could* be hitting an edge case, that's
 actually a flaw in the code somewhere rather than an artifact
 of the test framework. I don't even know if my machine
 is using all of the same Locale's as yours

 I'd have at figuring out what was going on, but I can't make
 it fail. "It works on my machine" doesn't leave me very many
 directions forward

 But I'm so glad that Robert is finding this nonsense
 *before* we get too much farther down this road rather than
 after

 I'll poke around on the internet and see if there's anything there
 that I can see.

 Erick


 On Sat, Dec 12, 2009 at 8:55 AM, Robert Muir (JIRA) j...@apache.org wrote:


 [
 https://issues.apache.org/jira/browse/LUCENE-2122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel]

 Robert Muir updated LUCENE-2122:
 

Assignee: (was: Robert Muir)

 i am unassigning in case someone else can figure this one out, at my wits'
 end here :)
 perhaps it's just something weird about my environment or something

  Use JUnit4 capabilites for more thorough Locale testing for classes
 deriving from LocalizedTestCase
 
 ---
 
  Key: LUCENE-2122
  URL: https://issues.apache.org/jira/browse/LUCENE-2122
  Project: Lucene - Java
   Issue Type: Improvement
   Components: Other
 Affects Versions: 3.1
 Reporter: Erick Erickson
 Priority: Minor
  Fix For: 3.1
 
  Attachments: LUCENE-2122-r2.patch, LUCENE-2122-r3.patch,
 LUCENE-2122-r4.patch, LUCENE-2122.patch, LUCENE-2122.patch
 
 
  Use the @Parameterized capabilities of Junit4 to allow more extensive
 testing of Locales.


Re: [jira] Commented: (LUCENE-2122) Use JUnit4 capabilites for more thorough Locale testing for classes deriving from LocalizedTestCase

2009-12-13 Thread Erick Erickson
Uwe:

Thanks, I'll remember that in the future



On Sun, Dec 13, 2009 at 5:31 AM, Uwe Schindler u...@thetaphi.de wrote:

  Hi Erick,



 sadly, the eMail reply to JIRA issues does not work for mails sent to this
 mailing list (because the list overrides the reply-to header, so JIRA does
 not get the answer). If you answer only on the ML, we lose those comments in
 the issue.



 Uwe



 -
 Uwe Schindler
 H.-H.-Meier-Allee 63, D-28213 Bremen
 http://www.thetaphi.de
 eMail: u...@thetaphi.de
   --

 *From:* Erick Erickson [mailto:erickerick...@gmail.com]
 *Sent:* Sunday, December 13, 2009 4:02 AM
 *To:* java-dev@lucene.apache.org
 *Subject:* Re: [jira] Commented: (LUCENE-2122) Use JUnit4 capabilites for
 more thorough Locale testing for classes deriving from LocalizedTestCase



 Robert:

 The -r4 patch runs for you and you want me to look at your patch compared
 to r4? Sure, I'll do that, but not til tomorrow, I do much better work when
 I'm not tired G.

 I confess I haven't looked at your patch beyond installing it to see if I
 could reproduce the failure (looks like our emails crossed). But it's
 *still* peculiar that it behaves differently between our two machines. OTOH,
 maybe your patch will fail on my machine sometime tonight, my 4 successes
 aren't very statistically significant after all..

 Erick

 On Sat, Dec 12, 2009 at 9:14 PM, Robert Muir (JIRA) j...@apache.org
 wrote:


[
 https://issues.apache.org/jira/browse/LUCENE-2122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12789837#action_12789837]

 Robert Muir commented on LUCENE-2122:
 -

 btw, I left 'ant clean test' running in a loop and just checked it with
 this patch, no problems.
 so perhaps its my own incompetence. Erick can you take a look? Do you see
 some obvious problem?


  Use JUnit4 capabilites for more thorough Locale testing for classes
 deriving from LocalizedTestCase
 
 ---
 
  Key: LUCENE-2122
  URL: https://issues.apache.org/jira/browse/LUCENE-2122
  Project: Lucene - Java
   Issue Type: Improvement
   Components: Other
 Affects Versions: 3.1
 Reporter: Erick Erickson
 Priority: Minor
  Fix For: 3.1
 
  Attachments: LUCENE-2122-r2.patch, LUCENE-2122-r3.patch,
 LUCENE-2122-r4.patch, LUCENE-2122.patch, LUCENE-2122.patch
 
 
  Use the @Parameterized capabilities of Junit4 to allow more extensive
 testing of Locales.






IndexWriter failure

2009-12-13 Thread Erick Erickson
I was running the whole ant-clean-test in a loop last night for
LUCENE-2122 and had this error in IndexWriter occur once
in 30+ runs. I know there has been some work on spurious
failures here lately and thought I'd add this on the chance
it'd help anyone tracking this issue. Didn't see a JIRA...

I updated the trunk yesterday (12-Dec) afternoon sometime


[junit] Testcase:
testMaxBufferedDocsChange(org.apache.lucene.index.TestIndexWriterMergePolicy):
FAILED
[junit] maxMergeDocs=2147483647; numSegments=11; upperBound=10;
mergeFactor=10
[junit] junit.framework.AssertionFailedError: maxMergeDocs=2147483647;
numSegments=11; upperBound=10; mergeFactor=10
[junit] at
org.apache.lucene.index.TestIndexWriterMergePolicy.checkInvariants(TestIndexWriterMergePolicy.java:234)
[junit] at
org.apache.lucene.index.TestIndexWriterMergePolicy.testMaxBufferedDocsChange(TestIndexWriterMergePolicy.java:164)
[junit] at
org.apache.lucene.util.LuceneTestCase.runBare(LuceneTestCase.java:212)
[junit]

FWIW
Erick


Re: [jira] Updated: (LUCENE-2122) Use JUnit4 capabilites for more thorough Locale testing for classes deriving from LocalizedTestCase

2009-12-12 Thread Erick Erickson
Hmmm, you can't get either patch to work reliably.
On the other hand, I can't get either patch to fail.
I ran the whole ant clean test thing half a dozen times.
I'll make a script to loop all night tonight and we'll see.
I also ran just the TestQueryParser around 700 times
from Ant via a shell script. No problems. No problems
in IntelliJ. Siiggghhh.

Anybody else want to try applying either patch and see
what happens? I'd hate to lose the capabilities of the
Parameterized tests because of a gremlin that only exists
on Robert's machine. I'd also hate to introduce cool new
capabilities that started training us to ignore test failures.
That's bad. Very bad.

Robert: What kind of machine are you running on? I'm running
on a Macbook Pro...

As it stands, I'm not sure whether parameterized tests are
the issue or whether the issue is Locale testing. Or whether
Robert has some peculiar setup. Or, for that matter, whether
I have some peculiar setup that makes it work by hiding an
instability. It sure would be nice to figure out where the
fragility is before relying on Parameterized tests...

Robert:
If you have the patience, could you try your patch out and
capture the failure? I'm especially curious if your patch
fails on the same language every time. Who knows? On
your machine, this *could* be hitting an edge case, that's
actually a flaw in the code somewhere rather than an artifact
of the test framework. I don't even know if my machine
is using all of the same Locales as yours.

I'd have at figuring out what was going on, but I can't make
it fail. "It works on my machine" doesn't leave me very many
directions forward.

But I'm so glad that Robert is finding this nonsense
*before* we get too much farther down this road rather than
after

I'll poke around on the internet and see if there's anything there
that I can see.

Erick

On Sat, Dec 12, 2009 at 8:55 AM, Robert Muir (JIRA) j...@apache.org wrote:


 [
 https://issues.apache.org/jira/browse/LUCENE-2122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel]

 Robert Muir updated LUCENE-2122:
 

Assignee: (was: Robert Muir)

 i am unassigning in case someone else can figure this one out, at my wits'
 end here :)
 perhaps it's just something weird about my environment or something

  Use JUnit4 capabilites for more thorough Locale testing for classes
 deriving from LocalizedTestCase
 
 ---
 
  Key: LUCENE-2122
  URL: https://issues.apache.org/jira/browse/LUCENE-2122
  Project: Lucene - Java
   Issue Type: Improvement
   Components: Other
 Affects Versions: 3.1
 Reporter: Erick Erickson
 Priority: Minor
  Fix For: 3.1
 
  Attachments: LUCENE-2122-r2.patch, LUCENE-2122-r3.patch,
 LUCENE-2122-r4.patch, LUCENE-2122.patch, LUCENE-2122.patch
 
 
  Use the @Parameterized capabilities of Junit4 to allow more extensive
 testing of Locales.





Re: [jira] Commented: (LUCENE-2122) Use JUnit4 capabilites for more thorough Locale testing for classes deriving from LocalizedTestCase

2009-12-12 Thread Erick Erickson
Robert:

The -r4 patch runs for you and you want me to look at your patch compared to
r4? Sure, I'll do that, but not til tomorrow, I do much better work when I'm
not tired <G>.

I confess I haven't looked at your patch beyond installing it to see if I
could reproduce the failure (looks like our emails crossed). But it's
*still* peculiar that it behaves differently between our two machines. OTOH,
maybe your patch will fail on my machine sometime tonight, my 4 successes
aren't very statistically significant after all..
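
To put a rough number on "not very statistically significant", here is a small sketch. It assumes each run fails independently with some fixed probability; the 1-in-30 rate used in main is purely a made-up figure for illustration, not a measurement from either machine.

```java
public class FlakyOddsSketch {
    // Probability that `runs` independent runs all pass when each run
    // fails with probability `failureRate` (an assumed, hypothetical rate).
    public static double chanceOfAllPassing(double failureRate, int runs) {
        return Math.pow(1.0 - failureRate, runs);
    }

    public static void main(String[] args) {
        double assumedRate = 1.0 / 30.0; // hypothetical: one failure per 30 runs
        for (int runs : new int[] {4, 30, 700}) {
            System.out.printf("%d clean runs happen by luck with probability %.4f%n",
                    runs, chanceOfAllPassing(assumedRate, runs));
        }
    }
}
```

Under that assumption a handful of clean runs says very little, while hundreds of clean runs would make a failure rate that high quite unlikely.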

Erick

On Sat, Dec 12, 2009 at 9:14 PM, Robert Muir (JIRA) j...@apache.org wrote:


[
 https://issues.apache.org/jira/browse/LUCENE-2122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12789837#action_12789837]

 Robert Muir commented on LUCENE-2122:
 -

 btw, I left 'ant clean test' running in a loop and just checked it with
 this patch, no problems.
 so perhaps its my own incompetence. Erick can you take a look? Do you see
 some obvious problem?


  Use JUnit4 capabilites for more thorough Locale testing for classes
 deriving from LocalizedTestCase
 
 ---
 
  Key: LUCENE-2122
  URL: https://issues.apache.org/jira/browse/LUCENE-2122
  Project: Lucene - Java
   Issue Type: Improvement
   Components: Other
 Affects Versions: 3.1
 Reporter: Erick Erickson
 Priority: Minor
  Fix For: 3.1
 
  Attachments: LUCENE-2122-r2.patch, LUCENE-2122-r3.patch,
 LUCENE-2122-r4.patch, LUCENE-2122.patch, LUCENE-2122.patch
 
 
  Use the @Parameterized capabilities of Junit4 to allow more extensive
 testing of Locales.





Re: [jira] Commented: (LUCENE-2122) Use JUnit4 capabilites for more thorough Locale testing for classes deriving from LocalizedTestCase

2009-12-11 Thread Erick Erickson
So I ran this test suite from Idea a dozen times or so and no problem.

Then I ran it a couple of thousand times through Ant via a shell
script. No problem.

So I'm tending toward thinking it's an Eclipse issue, what do
you think?

Erick


On Thu, Dec 10, 2009 at 4:23 PM, Erick Erickson erickerick...@gmail.com wrote:

 I'll give this a whirl tonight. The reason I was wondering what language
 is to ensure that my machine *also* tests the offending locale.

 A bit of a nit, the flaw in the approach with LocalizedTestCase is
 that *every* test in the class is run against *all* locales..
 To change this, as I understand it, we'd need to break the tests
 out into a separate class...

 Intermittent errors often smell like a race condition, so I'll be
 on the lookout for one.

 But I also wonder if you'd ever get this error running outside
 of Eclipse.

 I really, really, really hate ones like this. Let's say you have a script
 that runs 1,000 times flawlessly from the shell. What does that prove?
 <nasty grin>.

 But maybe if I relentlessly press the test button on that class it'll
 happen
 to me too

 FWIW
 Erick

 On Thu, Dec 10, 2009 at 3:30 PM, Robert Muir rcm...@gmail.com wrote:

 i just right clicked TestQueryParser and said 'run as junit test'

 i could not tell which locales failed, (just testing your original patch,
 no modifications)
 the way they are shown instead is like an array of 135 elements...
 [0]: testCJK[0] (0.000s)
   testSimple[0] (0.001s)
 ...
 [1]: testCJK[1] (0.000s)
 ...
 [135] testCJK[135]

 the only tests that failed were the localized methods like the date stuff,
 where its going to create an 'expected' localized string and then compare
 against that.
 it makes me suspect that somehow there is some race, and the default
 locale is actually changing as the test is running, or something crazy like
 this?!



 On Thu, Dec 10, 2009 at 3:23 PM, Erick Erickson 
 erickerick...@gmail.com wrote:

 Yep, that sure makes me nervous too. I've never seen a failure in
 IntelliJ or from a
 shell window.

 How often do you need to run it to see an error? And what language is it
 using?
 And what test?

 I can try this in my IntelliJ setup and see if I can reproduce it. Note
 I'm running
 on a Macbook Pro...

 I wonder if a repeating script would show an intermittent error

 Erick


 On Thu, Dec 10, 2009 at 3:10 PM, Robert Muir (JIRA) j...@apache.org wrote:


[
 https://issues.apache.org/jira/browse/LUCENE-2122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1274#action_1274]

 Robert Muir commented on LUCENE-2122:
 -

 Hi Erick, I played with this patch some and (not intentionally trying) I
 would get random test failures for TestQueryParser under eclipse... its not
 really something I am able to repeat though.

 maybe some race condition (I do not know how eclipse executes
 parameterized tests) ?

 if it is a problem with my IDE that is one thing, just makes me a little
 nervous right now. trying to think what could cause this

  Use JUnit4 capabilites for more thorough Locale testing for classes
 deriving from LocalizedTestCase
 
 ---
 
  Key: LUCENE-2122
  URL:
 https://issues.apache.org/jira/browse/LUCENE-2122
  Project: Lucene - Java
   Issue Type: Improvement
   Components: Other
 Affects Versions: 3.1
 Reporter: Erick Erickson
 Assignee: Robert Muir
 Priority: Minor
  Fix For: 3.1
 
  Attachments: LUCENE-2122-r2.patch, LUCENE-2122-r3.patch,
 LUCENE-2122-r4.patch, LUCENE-2122.patch
 
 
  Use the @Parameterized capabilities of Junit4 to allow more extensive
 testing of Locales.






 --
 Robert Muir
 rcm...@gmail.com





Re: Lucene Analyzer that can handle C++ vs C#

2009-12-11 Thread Erick Erickson
This type of question is not appropriate on the developers list, this
list is devoted to development. Please please post this kind of
question on the user's list.

As it happens, this very topic is being discussed under the thread
"Recover special terms from StandardTokenizer", which should give
you some ideas.
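
As a rough illustration of the problem in the quoted question, here is a sketch in plain Java. It deliberately does not use Lucene's analyzer API; the class and method names are made up for this example. It splits on whitespace so that + and # survive, then trims trailing sentence punctuation so that "C++." at the end of a sentence becomes the same term as "C++".

```java
import java.util.ArrayList;
import java.util.List;

public class TermSplitSketch {
    // Splits on whitespace, then strips trailing sentence punctuation
    // while leaving '+' and '#' intact.
    public static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String raw : text.trim().split("\\s+")) {
            String token = raw.replaceAll("[.,;:!?]+$", "");
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("We search for C++, C# and C. I like C++."));
    }
}
```

A real solution would put the same idea into a custom tokenizer or filter; this only shows which characters need to be kept and which trimmed.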

Erick

On Fri, Dec 11, 2009 at 11:19 AM, maxSchlein m_schl...@hotmail.com wrote:


 Can someone please point me in the right direction.

 We are creating an application that needs to be able to search on C++ and
 get
 back doc's that have C++ in it.  The StandardAnalyzer does not seem to
 index
 the +, so a search for C++ will bring back docs that contain, C++, C,
 C#, etc.  The WhiteSpaceAnalyzer will index the +, but if we have the
 term C++. that is, if C++ is at the end of a sentence, it will index
 C++. so a search for C++ will not return the doc.  I have heard of
 maybe
 a CustomAnalyzer; however, it seems like there would actually need to be a
 CustomFilter/CustomTokenizer, I looked at:
  - StandardAnalyzer.java
  - StandardFilter.java
  - StandardTokenizer.java
  - StandardTokenizerImpl.java
  - StandardTokenizerImpl.jflex

 I would guess that the StandardTokenizer is where the changes would need to
 be made to allow the + character, but I am unclear as to how.

 Any and all help is greatly appreciated.
 --
 View this message in context:
 http://old.nabble.com/Lucene-Analyzer-that-can-handle-C%2B%2B-vs-C--tp26747079p26747079.html
 Sent from the Lucene - Java Developer mailing list archive at Nabble.com.






Re: [jira] Commented: (LUCENE-2133) [PATCH] IndexCache: Refactoring of FieldCache, FieldComparator, SortField

2009-12-10 Thread Erick Erickson
Mike:

Which of these do you think this patch *should* address before committing?
Just the last two?
As many as Christian has energy for <G>?

On Thu, Dec 10, 2009 at 12:24 PM, Michael McCandless (JIRA) j...@apache.org
 wrote:


[
 https://issues.apache.org/jira/browse/LUCENE-2133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12788798#action_12788798]

 Michael McCandless commented on LUCENE-2133:
 

 This patch is a good step forward -- it associates the cache directly
 with IndexReader, where it belongs; it cleanly decouples cache from
 reader (vs the hack we have today with IndexReader.getFieldCacheKey),
 so that eg cloned readers can share the same cache; it also preserves
 back compat, which is quite a stunning accomplishment :)

 But... there are many more things I don't like about FieldCache, that
 I'm not sure (?) the patch addresses:

  * Uninversion to derive eg an int[] is horribly slow, compared to
say loading the pre-encoded binary ints from disk, created during
indexing.  Ie, I think, if we are going to overhaul FieldCache
API, we should somehow make LUCENE-1231 feasible.

  * There's no pluggability to customize where the int[] comes from
for a given field -- most you can do is provide your own IntParser
that the uninverter uses.  EG the fact that the patch had to
move FieldCacheRange/TermsFilter down, is strange -- these
filters (and in general any future cache consumers) should live
in oal.search, but simply pull from a different int[] source,
somehow.

  * Error checking of single-value-per-field (for StringIndex) is
dangerous, today -- it's intermittent, and, it's an unchecked
exception.  We should probably just remove the exception, or,
maybe make it checked.  Actually I'll go open a new issue for
this.  We should simply fix this.

  * Single-value-per-field limitation (though, that's a nice to have,
future improvement)

  * Even accepting the single-value-per-field limitation, we should
allow multiple values per field during uninversion, w/
customizable logic about which value is kept as the single one
(there is an issue open for this I think).  This really should be
some sort of added extensibility to whatever class drives
uninversion...

  * The terror of accidentally asking for the array at the top-level
of Multi/DirReader.  I think this shouldn't even be allowed, at
least not easily, ie Dir/MultiReader.getIndexCache should throw
UOE.  If we really wanted to, we could provide sugar methods in
maybe ReaderUtil to glom N int[]'s into a new int[].  But it
should be named something scary :) Then we wouldn't need any
insanity checking.

  * No control over caching policy (cannot evict things)

  * If we make field cache flexible enough, we could maybe fold norms
 and deleted docs into it (would be a separate future issue to
actually do so...).

 Some other questions about the patch:

  * Consumers of the cache API (the sort comparator,
FieldCacheTerms/RangeFilter, and any other future users of the
field cache) shouldn't have to move down into fields sub-package?

  * It's a little strange that the term vectors and fields reader also
got pulled into the cache?


  [PATCH] IndexCache: Refactoring of FieldCache, FieldComparator, SortField
  -
 
  Key: LUCENE-2133
  URL: https://issues.apache.org/jira/browse/LUCENE-2133
  Project: Lucene - Java
   Issue Type: Improvement
   Components: Search
 Affects Versions: 2.9.1, 3.0
 Reporter: Christian Kohlschütter
  Attachments: LUCENE-2133-complete.patch, LUCENE-2133.patch,
 LUCENE-2133.patch, LUCENE-2133.patch
 
 
  Hi all,
  up to the current version Lucene contains a conceptual flaw, that is the
 FieldCache. The FieldCache is a singleton which is supposed to cache certain
 information for every IndexReader that is currently open
  The FieldCache is flawed because it is incorrect to assume that:
  1. one IndexReader instance equals one index. In fact, there can be many
 clones (of SegmentReader) or decorators (FilterIndexReader) which all access
 the very same data.
  2. the cache information remains valid for the lifetime of an
 IndexReader. In fact, some IndexReaders may be reopen()'ed and thus they may
 contain completely different information.
  3. all IndexReaders need the same type of cache. In fact, because of the
 limitations imposed by the singleton construct there was no implementation
 other than FieldCacheImpl.
  Furthermore, FieldCacheImpl and FieldComparator are bloated by several
 static inner-classes that could be moved to package level.
  There have been a few attempts to improve FieldCache, namely LUCENE-831,
 LUCENE-1579 and LUCENE-1749, but the overall situation remains the 

Re: [jira] Commented: (LUCENE-2122) Use JUnit4 capabilites for more thorough Locale testing for classes deriving from LocalizedTestCase

2009-12-10 Thread Erick Erickson
Yep, that sure makes me nervous too. I've never seen a failure in IntelliJ
or from a
shell window.

How often do you need to run it to see an error? And what language is it
using?
And what test?

I can try this in my IntelliJ setup and see if I can reproduce it. Note I'm
running
on a Macbook Pro...

I wonder if a repeating script would show an intermittent error

Erick

On Thu, Dec 10, 2009 at 3:10 PM, Robert Muir (JIRA) j...@apache.org wrote:


[
 https://issues.apache.org/jira/browse/LUCENE-2122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1274#action_1274]

 Robert Muir commented on LUCENE-2122:
 -

 Hi Erick, I played with this patch some and (not intentionally trying) I
 would get random test failures for TestQueryParser under eclipse... its not
 really something I am able to repeat though.

 maybe some race condition (I do not know how eclipse executes parameterized
 tests) ?

 if it is a problem with my IDE that is one thing, just makes me a little
 nervous right now. trying to think what could cause this

  Use JUnit4 capabilites for more thorough Locale testing for classes
 deriving from LocalizedTestCase
 
 ---
 
  Key: LUCENE-2122
  URL: https://issues.apache.org/jira/browse/LUCENE-2122
  Project: Lucene - Java
   Issue Type: Improvement
   Components: Other
 Affects Versions: 3.1
 Reporter: Erick Erickson
 Assignee: Robert Muir
 Priority: Minor
  Fix For: 3.1
 
  Attachments: LUCENE-2122-r2.patch, LUCENE-2122-r3.patch,
 LUCENE-2122-r4.patch, LUCENE-2122.patch
 
 
  Use the @Parameterized capabilities of Junit4 to allow more extensive
 testing of Locales.





Re: [jira] Commented: (LUCENE-2122) Use JUnit4 capabilites for more thorough Locale testing for classes deriving from LocalizedTestCase

2009-12-10 Thread Erick Erickson
I'll give this a whirl tonight. The reason I was wondering what language
is to ensure that my machine *also* tests the offending locale.

A bit of a nit, the flaw in the approach with LocalizedTestCase is
that *every* test in the class is run against *all* locales..
To change this, as I understand it, we'd need to break the tests
out into a separate class...

Intermittent errors often smell like a race condition, so I'll be
on the lookout for one.

But I also wonder if you'd ever get this error running outside
of Eclipse.

I really, really, really hate ones like this. Let's say you have a script
that runs 1,000 times flawlessly from the shell. What does that prove?
<nasty grin>.

But maybe if I relentlessly press the test button on that class it'll happen
to me too

FWIW
Erick

On Thu, Dec 10, 2009 at 3:30 PM, Robert Muir rcm...@gmail.com wrote:

 i just right clicked TestQueryParser and said 'run as junit test'

 i could not tell which locales failed, (just testing your original patch,
 no modifications)
 the way they are shown instead is like an array of 135 elements...
 [0]: testCJK[0] (0.000s)
   testSimple[0] (0.001s)
 ...
 [1]: testCJK[1] (0.000s)
 ...
 [135] testCJK[135]

 the only tests that failed were the localized methods like the date stuff,
 where its going to create an 'expected' localized string and then compare
 against that.
 it makes me suspect that somehow there is some race, and the default locale
 is actually changing as the test is running, or something crazy like this?!



 On Thu, Dec 10, 2009 at 3:23 PM, Erick Erickson 
 erickerick...@gmail.com wrote:

 Yep, that sure makes me nervous too. I've never seen a failure in IntelliJ
 or from a
 shell window.

 How often do you need to run it to see an error? And what language is it
 using?
 And what test?

 I can try this in my IntelliJ setup and see if I can reproduce it. Note
 I'm running
 on a Macbook Pro...

 I wonder if a repeating script would show an intermittent error

 Erick


 On Thu, Dec 10, 2009 at 3:10 PM, Robert Muir (JIRA) j...@apache.org wrote:


[
 https://issues.apache.org/jira/browse/LUCENE-2122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=1274#action_1274]

 Robert Muir commented on LUCENE-2122:
 -

 Hi Erick, I played with this patch some and (not intentionally trying) I
 would get random test failures for TestQueryParser under eclipse... its not
 really something I am able to repeat though.

 maybe some race condition (I do not know how eclipse executes
 parameterized tests) ?

 if it is a problem with my IDE that is one thing, just makes me a little
 nervous right now. trying to think what could cause this

  Use JUnit4 capabilites for more thorough Locale testing for classes
 deriving from LocalizedTestCase
 
 ---
 
  Key: LUCENE-2122
  URL: https://issues.apache.org/jira/browse/LUCENE-2122
  Project: Lucene - Java
   Issue Type: Improvement
   Components: Other
 Affects Versions: 3.1
 Reporter: Erick Erickson
 Assignee: Robert Muir
 Priority: Minor
  Fix For: 3.1
 
  Attachments: LUCENE-2122-r2.patch, LUCENE-2122-r3.patch,
 LUCENE-2122-r4.patch, LUCENE-2122.patch
 
 
  Use the @Parameterized capabilities of Junit4 to allow more extensive
 testing of Locales.






 --
 Robert Muir
 rcm...@gmail.com



Patch for LUCENE-2122 ready to go

2009-12-09 Thread Erick Erickson
Does someone with commit rights want to pick this up? I've incorporated the
changes suggested by Robert (Thanks!) and think it's ready to go.

Erick


Re: [jira] Commented: (LUCENE-2122) Use JUnit4 capabilites for more thorough Locale testing for classes deriving from LocalizedTestCase

2009-12-09 Thread Erick Erickson
Sh. I'll look at it again tonight

On Wed, Dec 9, 2009 at 9:13 AM, Robert Muir (JIRA) j...@apache.org wrote:


[
 https://issues.apache.org/jira/browse/LUCENE-2122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12788100#action_12788100]

 Robert Muir commented on LUCENE-2122:
 -

 Hi Erick, in the Date tools test I think you can delete the public static
 Collection<Locale[]> data(), I think you might have accidentally included
 it?


  Use JUnit4 capabilites for more thorough Locale testing for classes
 deriving from LocalizedTestCase
 
 ---
 
  Key: LUCENE-2122
  URL: https://issues.apache.org/jira/browse/LUCENE-2122
  Project: Lucene - Java
   Issue Type: Improvement
   Components: Other
 Affects Versions: 3.1
 Reporter: Erick Erickson
 Assignee: Erick Erickson
 Priority: Minor
  Fix For: 3.1
 
  Attachments: LUCENE-2122-r2.patch, LUCENE-2122-r3.patch,
 LUCENE-2122.patch
 
 
  Use the @Parameterized capabilities of Junit4 to allow more extensive
 testing of Locales.





[jira] Updated: (LUCENE-2122) Use JUnit4 capabilites for more thorough Locale testing for classes deriving from LocalizedTestCase

2009-12-09 Thread Erick Erickson (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson updated LUCENE-2122:
---

Attachment: LUCENE-2122-r4.patch

OK, I plead advanced senility or some other excuse for the last patch.

Robert:
Thanks so much for looking this over, I have no clue what I was thinking with 
the TestDateTools. Or the other classes that derive from LocalizedTestCase.

The @Parameterized and @RunWith only needed to be in LocalizedTestCase and all 
the inheriting classes just rely on the base class to collect the different 
locales.

Anyway, this one should be much better

Erick

 Use JUnit4 capabilites for more thorough Locale testing for classes deriving 
 from LocalizedTestCase
 ---

 Key: LUCENE-2122
 URL: https://issues.apache.org/jira/browse/LUCENE-2122
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Other
Affects Versions: 3.1
Reporter: Erick Erickson
Assignee: Robert Muir
Priority: Minor
 Fix For: 3.1

 Attachments: LUCENE-2122-r2.patch, LUCENE-2122-r3.patch, 
 LUCENE-2122-r4.patch, LUCENE-2122.patch


 Use the @Parameterized capabilities of Junit4 to allow more extensive testing 
 of Locales.




Re: [jira] Commented: (LUCENE-2122) Use JUnit4 capabilites for more thorough Locale testing for classes deriving from LocalizedTestCase

2009-12-09 Thread Erick Erickson
It's embarrassing that I had to poke around for 1/2 hour to find *code that
I had written recently*. siiiggghhh. Maybe this time it'll stick

In LuceneTestCaseJ4, we added an @Rule-annotated class
InterceptTestCaseEvents whose methods get called whenever an event
happens, things like succeeded, failed, started, etc.. The failed method
looks for a method in the failing class called reportAdditionalFailureInfo.
So by adding something like the below to LocalizedTestCase you can print any
information you have available whenever things fail. It gets printed in
addition to the usual information Junit prints. Warning: I tested this
*very* lightly, at least it worked in the one case I tried..

  @Override
  public void reportAdditionalFailureInfo() {
    System.out.println("Failing locale is " +
        _currentLocale.getDisplayName(_origDefault));
    // The call to super is UNTESTED and probably not necessary in this
    // context. Left as an exercise for the reader <G>.
    super.reportAdditionalFailureInfo();
  }

Currently this only does extra work for failed cases, but it would be
trivial to extend for start, end, succeeded whenever there's a need.
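
For anyone who wants to see the "look for a method on the failing class" idea in isolation, here is a minimal sketch using plain reflection. It is not the InterceptTestCaseEvents code from the patch; FailureHookSketch, SampleTest and callFailureHook are hypothetical names made up for this example, and only the hook name reportAdditionalFailureInfo comes from the description above.

```java
import java.lang.reflect.Method;
import java.util.Locale;

public class FailureHookSketch {
    // Hypothetical test class that exposes the optional hook.
    public static class SampleTest {
        public String lastReport = "";

        public void reportAdditionalFailureInfo() {
            lastReport = "Failing locale is "
                    + Locale.FRANCE.getDisplayName(Locale.ENGLISH);
            System.out.println(lastReport);
        }
    }

    // Looks up the hook by name and invokes it if present.
    // Returns true when the hook was found and called.
    public static boolean callFailureHook(Object testInstance) {
        try {
            Method hook = testInstance.getClass()
                    .getMethod("reportAdditionalFailureInfo");
            hook.invoke(testInstance);
            return true;
        } catch (NoSuchMethodException e) {
            return false; // the class did not opt in, nothing extra to print
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("failure hook could not be invoked", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(callFailureHook(new SampleTest()));
        System.out.println(callFailureHook(new Object()));
    }
}
```

The lookup by name keeps the hook optional: classes that don't define it just get the usual JUnit output.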

Your second question seems quite doable, just by putting the default locale
in the list before getting into the loop as the first entry. I'm not sure
removing the default language is worth the effort, so it gets run twice. But
if you're writing the code, do whatever you want.
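
Something like the following is what I have in mind for the ordering, as an untested-in-Lucene sketch. LocaleOrderSketch and localesDefaultFirst are made-up names; in the real patch this would live wherever the parameter list is built. The default locale goes in first and is not removed from the rest of the list, so it runs twice.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class LocaleOrderSketch {
    // Default locale first, followed by every available locale.
    // The default is intentionally not de-duplicated.
    public static List<Locale> localesDefaultFirst() {
        List<Locale> ordered = new ArrayList<>();
        ordered.add(Locale.getDefault());
        ordered.addAll(Arrays.asList(Locale.getAvailableLocales()));
        return ordered;
    }

    public static void main(String[] args) {
        List<Locale> locales = localesDefaultFirst();
        System.out.println("First locale tested: " + locales.get(0));
        System.out.println("Total runs per test: " + locales.size());
    }
}
```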

Gotta get some sleep <G>...

Erick

On Wed, Dec 9, 2009 at 9:45 PM, Robert Muir (JIRA) j...@apache.org wrote:


[
 https://issues.apache.org/jira/browse/LUCENE-2122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12788455#action_12788455]

 Robert Muir commented on LUCENE-2122:
 -

 thanks Erick, i will play around with the patch some, generally just
 double-check the locale stuff is doing what we want, looks like it will.

 i havent tested yet, but looking at the code i have a few questions (i can
 try to add these to the patch just curious what you think):
 1. if a test fails under some locale, say th_TH, will junit 4 attempt to
 print this parameter out in some way so I know that it failed? If not do you
 know of a hack?
 2. i am thinking about reordering the locale array so that it tests the
 default one first. if you are trying to do some test-driven dev it might be
 strange if the test fails under a different locale first. I think this one
 is obvious, I will play with it to see how it behaves now.


  Use JUnit4 capabilites for more thorough Locale testing for classes
 deriving from LocalizedTestCase
 
 ---
 
  Key: LUCENE-2122
  URL: https://issues.apache.org/jira/browse/LUCENE-2122
  Project: Lucene - Java
   Issue Type: Improvement
   Components: Other
 Affects Versions: 3.1
 Reporter: Erick Erickson
 Assignee: Robert Muir
 Priority: Minor
  Fix For: 3.1
 
  Attachments: LUCENE-2122-r2.patch, LUCENE-2122-r3.patch,
 LUCENE-2122-r4.patch, LUCENE-2122.patch
 
 
  Use the @Parameterized capabilities of Junit4 to allow more extensive
 testing of Locales.





[jira] Updated: (LUCENE-2122) Use JUnit4 capabilites for more thorough Locale testing for classes deriving from LocalizedTestCase

2009-12-07 Thread Erick Erickson (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson updated LUCENE-2122:
---

Attachment: LUCENE-2122-r3.patch

Made LocalizedTestCase abstract...

 Use JUnit4 capabilites for more thorough Locale testing for classes deriving 
 from LocalizedTestCase
 ---

 Key: LUCENE-2122
 URL: https://issues.apache.org/jira/browse/LUCENE-2122
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Other
Affects Versions: 3.1
Reporter: Erick Erickson
Assignee: Erick Erickson
Priority: Minor
 Fix For: 3.1

 Attachments: LUCENE-2122-r2.patch, LUCENE-2122-r3.patch, 
 LUCENE-2122.patch


 Use the @Parameterized capabilities of Junit4 to allow more extensive testing 
 of Locales.




[jira] Created: (LUCENE-2122) Use JUnit4 capabilites for more thorough Locale testing for classes deriving from LocalizedTestCase

2009-12-06 Thread Erick Erickson (JIRA)
Use JUnit4 capabilites for more thorough Locale testing for classes deriving 
from LocalizedTestCase
---

 Key: LUCENE-2122
 URL: https://issues.apache.org/jira/browse/LUCENE-2122
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Other
Affects Versions: 3.1
Reporter: Erick Erickson
Assignee: Erick Erickson
Priority: Minor
 Fix For: 3.1


Use the @Parameterized capabilities of Junit4 to allow more extensive testing 
of Locales.




[jira] Updated: (LUCENE-2122) Use JUnit4 capabilites for more thorough Locale testing for classes deriving from LocalizedTestCase

2009-12-06 Thread Erick Erickson (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson updated LUCENE-2122:
---

Attachment: LUCENE-2122.patch

All tests pass. This modifies all test classes (core and contrib) that derive 
from LocalizedTestCase. LocalizedTestCase now tests all test methods in all 
derived classes against all available Locales.

If we want some of the tests to NOT run against all locales, we'd need to 
refactor them into their own test class
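
For readers without the patch handy, the general idea of running a check under every locale can be sketched without JUnit. This is not the patch itself, which uses the Parameterized runner; AllLocalesSketch and runUnderAllLocales are hypothetical names. The sketch sets each available locale as the JVM default, runs the check, and restores the original default afterwards.

```java
import java.util.Locale;

public class AllLocalesSketch {
    // Runs `check` once per available locale with that locale as the
    // JVM default. Returns the number of locales under which it failed.
    public static int runUnderAllLocales(Runnable check) {
        Locale original = Locale.getDefault();
        int failures = 0;
        try {
            for (Locale locale : Locale.getAvailableLocales()) {
                Locale.setDefault(locale);
                try {
                    check.run();
                } catch (RuntimeException | AssertionError e) {
                    failures++;
                    System.out.println("Failing locale is "
                            + locale.getDisplayName(original));
                }
            }
        } finally {
            Locale.setDefault(original); // always restore the default
        }
        return failures;
    }

    public static void main(String[] args) {
        // A locale-sensitive check: toLowerCase() without an explicit
        // locale uses the default (Turkish has a dotless i, for example).
        int failures = runUnderAllLocales(() -> {
            if (!"TITLE".toLowerCase().equals("title")) {
                throw new AssertionError("lowercasing changed under this locale");
            }
        });
        System.out.println(failures + " locale(s) failed");
    }
}
```

Note that Locale.setDefault changes JVM-wide state, so anything else running in the same JVM at the same time sees the change too.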

 Use JUnit4 capabilites for more thorough Locale testing for classes deriving 
 from LocalizedTestCase
 ---

 Key: LUCENE-2122
 URL: https://issues.apache.org/jira/browse/LUCENE-2122
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Other
Affects Versions: 3.1
Reporter: Erick Erickson
Assignee: Erick Erickson
Priority: Minor
 Fix For: 3.1

 Attachments: LUCENE-2122.patch


 Use the @Parameterized capabilities of Junit4 to allow more extensive testing 
 of Locales.




Re: [jira] Resolved: (LUCENE-2119) If you pass Integer.MAX_VALUE as 2nd param to search(Query, int) you hit unexpected NegativeArraySizeException

2009-12-06 Thread Erick Erickson
This may be a silly question, and I admit that I haven't looked at the code,
but was there a good reason to +1 it in the first place, or was that just
paranoia to prevent off-by-one errors? If there *was* a valid reason, might it
make sense to +1 min(nDocs, maxDoc())?
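
To make the min() suggestion concrete, a small sketch with hypothetical names
(QueueSizing, heapArrayLength). It is not the actual IndexSearcher or
PriorityQueue code, and it assumes the queue keeps one spare slot.

```java
// Hypothetical helper, not Lucene code: sizes the backing array of a queue
// that keeps one spare slot, without letting the "+1" overflow silently.
final class QueueSizing {
    static int heapArrayLength(int requested, int maxDoc) {
        if (requested < 0 || maxDoc < 0) {
            throw new IllegalArgumentException("sizes must be non-negative");
        }
        // Never size the queue for more hits than the index could return.
        int capped = Math.min(requested, maxDoc);
        // addExact throws ArithmeticException instead of wrapping around.
        return Math.addExact(capped, 1);
    }
}
```

With this, a request for Integer.MAX_VALUE hits against a 1,000-document index
asks for 1,001 slots, whereas an unguarded Integer.MAX_VALUE + 1 wraps to
Integer.MIN_VALUE.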


Erick

On Sun, Dec 6, 2009 at 6:43 AM, Michael McCandless (JIRA)
j...@apache.org wrote:


 [
 https://issues.apache.org/jira/browse/LUCENE-2119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel]

 Michael McCandless resolved LUCENE-2119.
 

Resolution: Fixed

 Thanks Paul!

  If you pass Integer.MAX_VALUE as 2nd param to search(Query, int) you hit
 unexpected NegativeArraySizeException
 
 --
 
  Key: LUCENE-2119
  URL: https://issues.apache.org/jira/browse/LUCENE-2119
  Project: Lucene - Java
   Issue Type: Bug
   Components: Search
 Reporter: Michael McCandless
 Assignee: Michael McCandless
 Priority: Minor
  Fix For: 3.1
 
  Attachments: LUCENE-2119.patch
 
 
  Note that this is a nonsense value to pass in, since our PQ impl
 allocates the array up front.
  It's because PQ takes 1+ this value (which overflows to a negative
 number), and attempts to allocate that.  We should bounds check it, and
 drop PQ size by one in this case.
  Better, maybe: in IndexSearcher, if that n is ever > maxDoc(), set it to
 maxDoc().
  This trips users up fairly often because they assume our PQ doesn't
 statically pre-allocate (a reasonable assumption...).





Re: (LUCENE-2119) If you pass Integer.MAX_VALUE as 2nd param to search(Query, int) you hit unexpected NegativeArraySizeException

2009-12-06 Thread Erick Erickson
Should have mentioned in my first message that all I
was really after was prompting folks who actually know
something about the code in question to avoid
the mistake I've made, oh, several thousand times:

"There's no reason for that to be there, I'll just take
it out" <G>

Erick

On Sun, Dec 6, 2009 at 6:45 PM, Michael McCandless 
luc...@mikemccandless.com wrote:

 On Sun, Dec 6, 2009 at 5:51 PM, Uwe Schindler u...@thetaphi.de wrote:
  On Sun, Dec 06, 2009 at 05:31:53PM -0500, Erick Erickson wrote:
   This may be a silly question, and I admit that I haven't looked at the
   code, but was there a good reason to +1 it in the first place or was that
   just paranoia to prevent off-by-one errors?
 
  IIRC, this implementation of the priority queue algo leaves open slot 0
  to simplify internal calculations.  It was that way when I ported 1.4.3,
  and I doubt that's changed.
 
  That's still the same, because calculations in heaps are simpler when
  1-based. Because of that, heap[0] is unused.
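
A minimal sketch of why a 1-based layout is convenient, assuming a plain int
min-heap (OneBasedMinHeap is a hypothetical class, not Lucene's PriorityQueue):
the parent of slot i is i / 2 and its children are 2i and 2i + 1, with no
extra offsets, at the cost of one unused slot and a "+1" in the allocation.

```java
// Hypothetical 1-based binary min-heap of ints; slot 0 is deliberately unused.
final class OneBasedMinHeap {
    private final int[] heap;
    private int size;

    OneBasedMinHeap(int maxSize) {
        // The "+1" under discussion: overflows if maxSize is Integer.MAX_VALUE.
        heap = new int[maxSize + 1];
    }

    void add(int value) {
        if (size == heap.length - 1) throw new IllegalStateException("heap is full");
        int i = ++size;
        // Sift up: the parent of slot i is slot i / 2.
        while (i > 1 && heap[i / 2] > value) {
            heap[i] = heap[i / 2];
            i /= 2;
        }
        heap[i] = value;
    }

    int pop() {
        if (size == 0) throw new IllegalStateException("heap is empty");
        int top = heap[1];
        int last = heap[size--];
        int i = 1;
        // Sift down: the children of slot i are slots 2i and 2i + 1.
        while (2 * i <= size) {
            int child = 2 * i;
            if (child < size && heap[child + 1] < heap[child]) child++;
            if (last <= heap[child]) break;
            heap[i] = heap[child];
            i = child;
        }
        heap[i] = last;
        return top;
    }
}
```

Constructing this sketch with Integer.MAX_VALUE reproduces the reported
symptom: the array length wraps negative and the allocation throws
NegativeArraySizeException.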

 Thanks for raising this Erick... it's a good question.  Technically,
 removing the +1 would be a bug if anyone ever inserted 2B items into
 the PQ, but I think this is exceptionally unlikely to occur in
 practice.

   If there *was* a valid reason, might it make sense to
   +1 min(nDocs, maxDoc())?
 
  I think the patch is fine.  It's really only needed to provide a more
  accurate error message in the event somebody specifies that they want
  Integer.MAX_VALUE elements, not realizing that they will be allocated up
  front rather than lazily -- they'll get an OOME rather than a
  NegativeArraySizeException.
 
  The new patch is more intelligent, it will not allocate such a big queue
  as far as I have seen. It takes the numDocs() of index reader/searcher
  into account.

 Hmm actually it takes maxDoc() into account, but it should in fact use
 numDocs().  I'll fix.

 Mike





Re: [jira] Commented: (LUCENE-2122) Use JUnit4 capabilites for more thorough Locale testing for classes deriving from LocalizedTestCase

2009-12-06 Thread Erick Erickson
I just made a comment on how many times
I've made the "that looks unnecessary, I'll
take it out" mistake. Now I get to add one to
that total.

I'll attach a revised patch momentarily with this
change.

Thanks for pointing this out!

Erick

On Sun, Dec 6, 2009 at 8:00 PM, Robert Muir (JIRA) j...@apache.org wrote:


[
 https://issues.apache.org/jira/browse/LUCENE-2122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786749#action_12786749 ]

 Robert Muir commented on LUCENE-2122:
 -

 Hi Erick, I am a little nervous about the change to
 LocalizedTestCase.tearDown() here.

 I think we must restore the user's default Locale, since it's a
 JRE-system-wide global thing and we are changing it on the fly here.

 This was stashed away here before:
 {code}
  /**
   * Before changing the default Locale, save the default Locale here so
   * that it can be restored.
   */
  private final Locale defaultLocale = Locale.getDefault();
 {code}

 and restored in tearDown()... otherwise strange things could happen, such
 as your IDE going bonkers after running the tests! (But maybe I am
 missing something.)

  Use JUnit4 capabilites for more thorough Locale testing for classes
 deriving from LocalizedTestCase
 
 ---
 
  Key: LUCENE-2122
  URL: https://issues.apache.org/jira/browse/LUCENE-2122
  Project: Lucene - Java
   Issue Type: Improvement
   Components: Other
 Affects Versions: 3.1
 Reporter: Erick Erickson
 Assignee: Erick Erickson
 Priority: Minor
  Fix For: 3.1
 
  Attachments: LUCENE-2122.patch
 
 
  Use the @Parameterized capabilities of Junit4 to allow more extensive
 testing of Locales.





[jira] Updated: (LUCENE-2122) Use JUnit4 capabilites for more thorough Locale testing for classes deriving from LocalizedTestCase

2009-12-06 Thread Erick Erickson (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson updated LUCENE-2122:
---

Attachment: LUCENE-2122-r2.patch

Restoring original default Locale after test class has been run.

Thanks Robert!

 Use JUnit4 capabilites for more thorough Locale testing for classes deriving 
 from LocalizedTestCase
 ---

 Key: LUCENE-2122
 URL: https://issues.apache.org/jira/browse/LUCENE-2122
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Other
Affects Versions: 3.1
Reporter: Erick Erickson
Assignee: Erick Erickson
Priority: Minor
 Fix For: 3.1

 Attachments: LUCENE-2122-r2.patch, LUCENE-2122.patch


 Use the @Parameterized capabilities of Junit4 to allow more extensive testing 
 of Locales.




Re: [jira] Commented: (LUCENE-2122) Use JUnit4 capabilites for more thorough Locale testing for classes deriving from LocalizedTestCase

2009-12-06 Thread Erick Erickson
Hmmm, you're probably right. There's no earthly
reason for a test writer to create an instance
of LocalizedTestCase; by its nature it has no use
except as a superclass, even though it has
no abstract methods. So making it abstract will
serve to flag that fact to anyone who tries
to instantiate it.
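
As a quick illustration (hypothetical class names, not the real
LocalizedTestCase): a Java class can be declared abstract without having any
abstract methods, and the compiler will then reject "new" on it while
subclasses instantiate normally.

```java
// Hypothetical stand-in for a test base class; not the real LocalizedTestCase.
abstract class LocaleAwareBase {
    // No abstract methods here: the keyword only says
    // "subclass me, do not instantiate me directly".
    String describe() {
        return getClass().getSimpleName();
    }
}

// A concrete subclass needs no extra code to become instantiable.
final class ConcreteLocaleTest extends LocaleAwareBase {
}
```

Writing "new LocaleAwareBase()" anywhere would be a compile-time error, which
is exactly the flag to would-be instantiators described above.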

I'll change this too. Hold off on applying this patch;
I'll wait for a bit to gather more comments and put
them all together in an r3 version.

Erick

On Sun, Dec 6, 2009 at 9:02 PM, Robert Muir (JIRA) j...@apache.org wrote:


[
 https://issues.apache.org/jira/browse/LUCENE-2122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786757#action_12786757 ]

 Robert Muir commented on LUCENE-2122:
 -

 Erick do you think LocalizedTestCase should be abstract?

  Use JUnit4 capabilites for more thorough Locale testing for classes
 deriving from LocalizedTestCase
 
 ---
 
  Key: LUCENE-2122
  URL: https://issues.apache.org/jira/browse/LUCENE-2122
  Project: Lucene - Java
   Issue Type: Improvement
   Components: Other
 Affects Versions: 3.1
 Reporter: Erick Erickson
 Assignee: Erick Erickson
 Priority: Minor
  Fix For: 3.1
 
  Attachments: LUCENE-2122-r2.patch, LUCENE-2122.patch
 
 
  Use the @Parameterized capabilities of Junit4 to allow more extensive
 testing of Locales.





Re: [jira] Commented: (LUCENE-2122) Use JUnit4 capabilites for more thorough Locale testing for classes deriving from LocalizedTestCase

2009-12-06 Thread Erick Erickson
Well, under any circumstances, the line
Locale.setDefault(Locale.getDefault());
was just plain silly. At the cost of setting the
default locale exactly once per test *class*
(I used @BeforeClass/@AfterClass), I'd
far rather err on the side of paranoia than
cause someone to spend *hours* figuring
it out...
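
The save-then-restore pattern discussed here can be sketched in plain Java as
below. DefaultLocaleGuard is a hypothetical name, and the sketch uses
try-with-resources rather than the @BeforeClass/@AfterClass hooks in the patch.
The key point is that the original default is captured before anything changes
it; calling Locale.setDefault(Locale.getDefault()) afterwards restores nothing.

```java
import java.util.Locale;

// Hypothetical guard, not the patch's code: installs a temporary default
// Locale and puts the previous one back when closed.
final class DefaultLocaleGuard implements AutoCloseable {
    // Field initializers run before the constructor body, so this captures
    // the default that was in effect before the switch below.
    private final Locale saved = Locale.getDefault();

    DefaultLocaleGuard(Locale temporary) {
        Locale.setDefault(temporary);
    }

    @Override
    public void close() {
        Locale.setDefault(saved);
    }
}
```

Usage would look like try (DefaultLocaleGuard g = new DefaultLocaleGuard(Locale.JAPAN)) { ... },
with the original default back in place once the block exits, even on an
exception.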


On Sun, Dec 6, 2009 at 8:41 PM, Robert Muir rcm...@gmail.com wrote:

 Erick, btw I may not be right about this... certainly if you are invoking
 each test in its own JVM it should be no problem... it's just some paranoia.

 Also, this same changing of a JRE-system-wide variable would prevent these
 tests from being parallelized in the same JVM, in case that matters... (they
 should run in their own JVM sequentially)

 LocalizedTestCase is nasty, I admit, but it works and prevents hours of
 changing variables and running ant test under different locales... just one
 of those things

 thanks for tackling this one


 On Sun, Dec 6, 2009 at 8:30 PM, Erick Erickson erickerick...@gmail.com wrote:

 I just made a comment on how many times
 I've made the "that looks unnecessary, I'll
 take it out" mistake. Now I get to add one to
 that total.

 I'll attach a revised patch momentarily with this
 change.

 Thanks for pointing this out!

 Erick


 On Sun, Dec 6, 2009 at 8:00 PM, Robert Muir (JIRA) j...@apache.org wrote:


[
 https://issues.apache.org/jira/browse/LUCENE-2122?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786749#action_12786749 ]

 Robert Muir commented on LUCENE-2122:
 -

 Hi Erick, I am a little nervous about the change to
 LocalizedTestCase.tearDown() here.

 I think we must restore the user's default Locale, since it's a
 JRE-system-wide global thing and we are changing it on the fly here.

 This was stashed away here before:
 {code}
  /**
   * Before changing the default Locale, save the default Locale here so
   * that it can be restored.
   */
  private final Locale defaultLocale = Locale.getDefault();
 {code}

 and restored in tearDown()... otherwise strange things could happen, such
 as your IDE going bonkers after running the tests! (But maybe I am
 missing something.)

  Use JUnit4 capabilites for more thorough Locale testing for classes
 deriving from LocalizedTestCase
 
 ---
 
  Key: LUCENE-2122
  URL: https://issues.apache.org/jira/browse/LUCENE-2122
  Project: Lucene - Java
   Issue Type: Improvement
   Components: Other
 Affects Versions: 3.1
 Reporter: Erick Erickson
 Assignee: Erick Erickson
 Priority: Minor
  Fix For: 3.1
 
  Attachments: LUCENE-2122.patch
 
 
  Use the @Parameterized capabilities of Junit4 to allow more extensive
 testing of Locales.






 --
 Robert Muir
 rcm...@gmail.com



[jira] Assigned: (LUCENE-2096) Investigate parallelizing Ant junit tests

2009-12-05 Thread Erick Erickson (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson reassigned LUCENE-2096:
--

Assignee: (was: Erick Erickson)

Maybe for later

 Investigate parallelizing Ant junit tests
 -

 Key: LUCENE-2096
 URL: https://issues.apache.org/jira/browse/LUCENE-2096
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Build
Reporter: Erick Erickson
Priority: Minor

 Ant Contrib has a ForEach construct that may speed up running all of the 
 Junit tests by parallelizing them with a configurable number of threads. I 
 envision this in several stages. First, see if ForEach works for us with 
 hard-coded lists, distribute this for testing then make the changes for 
 real. I intend to hard-code the list for the first pass, ordered by the time 
 they take. This won't do for check-in, but will give us a fast 
 proof-of-concept.
 This approach will be most useful for multi-core machines.
 In particular, we need to see whether the parallel tasks are isolated enough 
 from each other to prevent mutual interference.
 All this assumes the fragmentary reference I found is still available...
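
 A rough sketch of the "configurable number of threads, slowest first" idea,
 written against java.util.concurrent rather than Ant Contrib's ForEach, so it
 says nothing about how ForEach itself behaves. ParallelTestSketch, Job, and
 the timing estimates are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch, not an Ant task: runs jobs on a fixed-size pool,
// submitting the ones expected to take longest first.
final class ParallelTestSketch {
    static final class Job {
        final String name;
        final long expectedMillis;      // hard-coded estimate from an earlier run
        final Callable<Boolean> body;   // returns true when the test passes

        Job(String name, long expectedMillis, Callable<Boolean> body) {
            this.name = name;
            this.expectedMillis = expectedMillis;
            this.body = body;
        }
    }

    // Returns the names of jobs that failed or threw, in submission order.
    static List<String> runAll(List<Job> jobs, int threads) {
        List<Job> ordered = new ArrayList<>(jobs);
        ordered.sort(Comparator.comparingLong((Job j) -> j.expectedMillis).reversed());

        ExecutorService pool = Executors.newFixedThreadPool(threads);
        List<String> failed = new ArrayList<>();
        try {
            List<Future<Boolean>> results = new ArrayList<>();
            for (Job job : ordered) {
                results.add(pool.submit(job.body));
            }
            for (int i = 0; i < ordered.size(); i++) {
                boolean ok;
                try {
                    ok = Boolean.TRUE.equals(results.get(i).get());
                } catch (ExecutionException e) {
                    ok = false;         // the job threw: count it as a failure
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException("interrupted while waiting", e);
                }
                if (!ok) failed.add(ordered.get(i).name);
            }
        } finally {
            pool.shutdownNow();
        }
        return failed;
    }
}
```

 Starting the slowest jobs first keeps one long test from being left to run
 alone at the end. The sketch does nothing about isolation between jobs, which
 is the open question raised above.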




[jira] Updated: (LUCENE-2037) Allow Junit4 tests in our environment.

2009-12-05 Thread Erick Erickson (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson updated LUCENE-2037:
---

Attachment: LUCENE-2037.patch

Had enough time this morning to reconcile this with Kay Kay's changes.

All tests pass.

JUnit 3.x is no longer necessary: running with the JUnit 4.7 jar runs JUnit 3 
style tests as well as annotated JUnit 4 style tests.

It's preferable (but not necessary) to import from org.junit rather than 
junit.framework.

 Allow Junit4 tests in our environment.
 --

 Key: LUCENE-2037
 URL: https://issues.apache.org/jira/browse/LUCENE-2037
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Other
Affects Versions: 3.1
 Environment: Development
Reporter: Erick Erickson
Assignee: Michael McCandless
Priority: Minor
 Fix For: 3.1

 Attachments: junit-4.7.jar, LUCENE-2037.patch, LUCENE-2037.patch, 
 LUCENE-2037.patch, LUCENE-2037_revised_2.patch

   Original Estimate: 8h
  Remaining Estimate: 8h

 Now that we're dropping Java 1.4 compatibility for 3.0, we can incorporate 
 Junit4 in testing. Junit3 and junit4 tests can coexist, so no tests should 
 have to be rewritten. We should start this for the 3.1 release so we can get 
 a clean 3.0 out smoothly.
 It's probably worthwhile to convert a small set of tests as an exemplar.




Re: (LUCENE-2037) Allow Junit4 tests in our environment.

2009-12-04 Thread Erick Erickson
Sure, but it won't be until late Saturday at the earliest, more likely
Sunday. Got a busy Fri/Sat.

Erick

On Fri, Dec 4, 2009 at 3:34 PM, Michael McCandless 
luc...@mikemccandless.com wrote:

 Thanks Kay Kay!  Erick can you have a look / iterate?  Thanks.

 Mike

 On Fri, Dec 4, 2009 at 3:30 PM, Kay Kay kaykay.uni...@gmail.com wrote:
  Erick / Mike -
   With 2065 committed to trunk now, I created another patch for 2037 and
  attached it to the ticket.
  3 classes remain pending, though, due to conflicts that I listed with
  the patch. But we can probably revisit them subsequently.  Please review
  them to serve as a starting point for the same.
 
 

Re: [jira] Commented: (LUCENE-2037) Allow Junit4 tests in our environment.

2009-12-03 Thread Erick Erickson
I generified the search/function files in patch 2037. I don't really think
there's a conflict, just commit my patch and have at generifying the rest.

I know, I know. I did two things at once. So sue me. Honest, I'll try not to
do this very often <G>...

Mike:
You really want to generify the whole shootin' match, or do you want
to partition them? I'll be happy to take a set of them. Or would that make
things too complicated to apply?

Erick

On Thu, Dec 3, 2009 at 3:15 PM, Michael McCandless (JIRA)
j...@apache.org wrote:


[
 https://issues.apache.org/jira/browse/LUCENE-2037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12785479#action_12785479 ]

 Michael McCandless commented on LUCENE-2037:
 

 bq.  but there is another patch - LUCENE-2065 to port the existing tests to
 Java 5 generics

 Ahh thanks for the reminder -- I can take this one as well, but, there will
 be conflicts b/w the two patches, I think.  Should we do the generics first
 (simpler change, but touches many files), and then the junit4 upgrade?

  Allow Junit4 tests in our environment.
  --
 
  Key: LUCENE-2037
  URL: https://issues.apache.org/jira/browse/LUCENE-2037
  Project: Lucene - Java
   Issue Type: Improvement
   Components: Other
 Affects Versions: 3.1
  Environment: Development
 Reporter: Erick Erickson
 Assignee: Michael McCandless
 Priority: Minor
  Fix For: 3.1
 
  Attachments: junit-4.7.jar, LUCENE-2037.patch, LUCENE-2037.patch
 
Original Estimate: 8h
   Remaining Estimate: 8h
 
  Now that we're dropping Java 1.4 compatibility for 3.0, we can
 incorporate Junit4 in testing. Junit3 and junit4 tests can coexist, so no
 tests should have to be rewritten. We should start this for the 3.1 release
 so we can get a clean 3.0 out smoothly.
  It's probably worthwhile to convert a small set of tests as an exemplar.





Re: (LUCENE-2037) Allow Junit4 tests in our environment.

2009-12-03 Thread Erick Erickson
I didn't realize 2065 had already been down this path, thought
you were volunteering to change all the code starting from
scratch. Your approach sounds like a fine plan.

Note that I'm not entirely sure that I cleaned up *everything*, but we
need to get to a known state before tackling the rest, so I'll wait for
these two patches to be applied before looking back at it...

Not to mention the Localized test thing.

Erick


On Thu, Dec 3, 2009 at 5:57 PM, Michael McCandless 
luc...@mikemccandless.com wrote:

 On Thu, Dec 3, 2009 at 5:48 PM, Erick Erickson erickerick...@gmail.com
 wrote:
  I generified the search/function files in patch 2037. I don't really
  think there's a conflict, just commit my patch and have at generifying
  the rest.

 OK so then we'll start with 2037, then take 2065's patch, hopefully
 updated to current trunk, but minus search/function sources.

  I know, I know. I did two things at once. So sue me. Honest, I'll try not
  to do this very often <G>...

 In fact I prefer this.  I used to think we shouldn't do that but I
 flip-flopped and now think in practice you just have to clean code
 while you're there, otherwise it won't get cleaned.

  Mike:
  You really want to generify the whole shootin' match, or do you
  want to partition them? I'll be happy to take a set of them. Or would
  that make things too complicated to apply?

 2065 already has done a lot here (adding generics to the tests)... I
 think we start from that and take it from there?

 Mike





Re: (LUCENE-2037) Allow Junit4 tests in our environment.

2009-12-03 Thread Erick Erickson
That's up to Mike, whichever way he finds easiest, I'll deal.

Erick

On Thu, Dec 3, 2009 at 8:43 PM, Kay Kay kaykay.uni...@gmail.com wrote:

 I created LUCENE-2065 while working on 1257, the original generics-related
 ticket, and since we were running out of time for 3.0, I guess we could
 not get src/test converted in.

 In any case, if you were committing this one (2037) to trunk, maybe I
 can wait before creating the patch again.








Re: (LUCENE-2037) Allow Junit4 tests in our environment.

2009-12-03 Thread Erick Erickson
Mike:

I should be able to create a new 2037 patch pretty easily if you
want to apply 2065 first. Let me know.

Erick

On Thu, Dec 3, 2009 at 9:05 PM, Kay Kay kaykay.uni...@gmail.com wrote:

 Mike -
 I have attached another patch to LUCENE-2065, in sync with the trunk now.








LUCENE-2037 (Junit4 capabilities)

2009-12-02 Thread Erick Erickson
Is anyone thinking about committing this patch? And/or what do I need to
do/should have done to indicate it's ready for review?

Poor lonely patch, sitting out there all alone and neglected <G>...

Erick


[jira] Commented: (LUCENE-2096) Investigate parallelizing Ant junit tests

2009-11-29 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2096?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12783436#action_12783436
 ] 

Erick Erickson commented on LUCENE-2096:


Parallelizing tests is proving trickier than I'd hoped. Part of the problem is 
my not-wonderful Ant skills...

But what I've found so far with trying to use ForEach is that stuff gets in the 
way. In particular, I'm pretty sure the sequential tag in the test-macro body 
defeats any parallelizing attempts by ForEach. Taking it out isn't 
straightforward.

In some of my experiments, I got tests to fire off in parallel, but then 
started running into wonky errors that were so strange I can't remember them 
now; some had to do with what looked like file contention for some 
temporary test files.

Googling around, I think I remember posts by Jason Ruthgren trying to do 
something similar in Solr (?). Jason: if I'm remembering right, did you find any 
joy?

Then we'd have to rework how success and failure are handled because there's 
contention for that file as well.

Now I'm wondering if the scary Python script gets us more bang for the buck. 
I wrote a Groovy script that is probably a near-cousin for experiments, and I'm 
wondering what would happen if we wrote a special testcase-type target that did 
NOT depend upon compile-test or, really, much of anything else, and counted on 
the user to build the system first before using whatever script 
we came up with. We don't really lose functionality by recursively looking for 
Test*.java files because that's what's done internally in the build files 
anyway. So doing that outside or inside the Ant files doesn't seem like a loss.
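
As a sketch of the "look for Test*.java recursively, outside Ant" idea, in Java
rather than the Groovy or Python scripts mentioned here: TestFileFinder and
findTestSources are hypothetical names, and the code only collects file paths;
it does not compile or run anything.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, not part of the Lucene build.
final class TestFileFinder {
    // Walks `root` recursively and returns every regular file whose name
    // matches Test*.java, sorted so that runs are reproducible.
    static List<Path> findTestSources(Path root) throws IOException {
        try (Stream<Path> paths = Files.walk(root)) {
            return paths
                .filter(Files::isRegularFile)
                .filter(p -> {
                    String name = p.getFileName().toString();
                    return name.startsWith("Test") && name.endsWith(".java");
                })
                .sorted()
                .collect(Collectors.toList());
        }
    }
}
```

A script built on something like this still has to give each parallel run its
own temporary directory and result file, given the contention described above.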

I'm putting this in the JIRA issue to preserve it for posterity. Meanwhile, 
I'll appeal to Ant gurus if they want to try whacking the Ant build files, and 
see what the script notion brings...



 Investigate parallelizing Ant junit tests
 -

 Key: LUCENE-2096
 URL: https://issues.apache.org/jira/browse/LUCENE-2096
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Build
Reporter: Erick Erickson
Assignee: Erick Erickson
Priority: Minor

 Ant Contrib has a ForEach construct that may speed up running all of the 
 Junit tests by parallelizing them with a configurable number of threads. I 
 envision this in several stages. First, see if ForEach works for us with 
 hard-coded lists, distribute this for testing then make the changes for 
 real. I intend to hard-code the list for the first pass, ordered by the time 
 they take. This won't do for check-in, but will give us a fast 
 proof-of-concept.
 This approach will be most useful for multi-core machines.
 In particular, we need to see whether the parallel tasks are isolated enough 
 from each other to prevent mutual interference.
 All this assumes the fragmentary reference I found is still available...

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org



[jira] Updated: (LUCENE-2037) Allow Junit4 tests in our environment.

2009-11-29 Thread Erick Erickson (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson updated LUCENE-2037:
---

Attachment: LUCENE-2037.patch

See JIRA comments

 Allow Junit4 tests in our environment.
 --

 Key: LUCENE-2037
 URL: https://issues.apache.org/jira/browse/LUCENE-2037
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Other
Affects Versions: 3.1
 Environment: Development
Reporter: Erick Erickson
Assignee: Erick Erickson
Priority: Minor
 Fix For: 3.1

 Attachments: junit-4.7.jar, LUCENE-2037.patch, LUCENE-2037.patch

   Original Estimate: 8h
  Remaining Estimate: 8h

 Now that we're dropping Java 1.4 compatibility for 3.0, we can incorporate 
 Junit4 in testing. Junit3 and junit4 tests can coexist, so no tests should 
 have to be rewritten. We should start this for the 3.1 release so we can get 
 a clean 3.0 out smoothly.
 It's probably worthwhile to convert a small set of tests as an exemplar.




[jira] Commented: (LUCENE-2037) Allow Junit4 tests in our environment.

2009-11-29 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12783442#action_12783442
 ] 

Erick Erickson commented on LUCENE-2037:


Darn it! I'll get the comments right sometime and not have to retype them after 
making an attachment

Anyway, this patch allows us to use Junit4 constructs as well as Junit3 
constructs. It includes a sibling class to LuceneTestCase called 
LuceneTestCaseJ4 that provides the functionality we used to get from 
LuceneTestCase.

When creating Junit4-style tests, preferentially import from org.junit rather 
than from junit.framework.

Junit-3.8.2.jar may (should?) be removed from the distro; all tests run just 
fine under Junit-4.7.jar, which is attached to this issue. I wrote a little 
script that compares the results of running the tests, and we run exactly the 
same number of TestSuites and each runs exactly the same number of tests, so 
I'm pretty confident about this one. I may be wrong, but I'm not uncertain. 
Single data-points aren't worth much, but on my Macbook Pro, running under 
Junit4 took about a minute longer than Junit3 (about 23 1/2 minutes). Which 
could have been the result of my Time Machine running for all I know.
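
For anyone who wants to repeat the comparison, the check can be sketched roughly as follows. This is not the script I actually used. It assumes the results are available as Ant-style XML reports whose root testsuite element carries name and tests attributes, and both function names are invented for the example.

```python
import xml.etree.ElementTree as ET


def suite_counts(report_xml_texts):
    """Map suite name -> number of tests, from <testsuite> report texts."""
    counts = {}
    for text in report_xml_texts:
        suite = ET.fromstring(text)
        counts[suite.get("name")] = int(suite.get("tests"))
    return counts


def compare_runs(junit3_counts, junit4_counts):
    """Return human-readable differences; an empty list means the runs match."""
    diffs = []
    for name in sorted(set(junit3_counts) | set(junit4_counts)):
        a = junit3_counts.get(name)
        b = junit4_counts.get(name)
        if a != b:
            # A suite missing from one run shows up as None on that side.
            diffs.append("%s: %s vs %s" % (name, a, b))
    return diffs
```

An empty result from compare_runs means both runs saw the same suites with the same number of tests in each, which is the property claimed above.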

All the tests in test...search.function have been converted to use 
LuceneTestCaseJ4 as an exemplar. I've deprecated LuceneTestCase to prompt 
people. When you derive from LuceneTestCaseJ4, you *must* use the @Before, 
@After and @Test annotations to get the functionality you expect, as must *all* 
subclasses. So one gotcha people will surely run across is deriving from J4 and 
failing to put @Test on their test methods.

Converting all the tests was my way of working through the derivation issues. I 
don't particularly see the value in doing a massive conversion just for the 
heck of it, unless someone has a real urge. It's more along the lines of "I'm in 
this test anyway, let's upgrade it and add new ones."

What about new tests? Should we encourage new patches to use Junit4 rather than 
Junit3? If so, how?

I've noticed the convention of putting underscores in front of some tests to 
keep them from running. The Junit4 convention is the @Ignore annotation, which 
will cause the @Ignored tests to be reported (something like 1300 successful, 0 
failures, 23 ignored), which is a nice way to keep these from getting lost in 
the shuffle.

When this gets applied, I can put up the patch for LocalizedTestCase and we can 
give that a whirl





Re: (LUCENE-1844) Speed up junit tests

2009-11-27 Thread Erick Erickson
But then I got to thinking. I admit I've only scratched the
surface of the JUnit4 parallelization stuff. That said, it
seems like the real benefit comes from making use of
multiple cores, we don't get huge speedups just from
running multiple threads at once on a single core. Which
makes sense if you're not doing much in the way of I/O.

This notion was inspired by the scary Python script
comment.

So what if we use Ant's ForEach construct instead? Yet
again this is a fuzzy idea I'm throwing out without much
to back it up. Mostly I'm wondering if anyone's thought about
it before or can shoot it down before it takes wing. Or if
it is worth exploring.

Assuming we structure our test directories so there are only
directories at the root of the test area, could we persuade Ant
to fire off the tests N directories at a time in parallel?
N would default to 1 but could be passed in to the task, something
like -DmaxThreads=4. ForEach actually has a maxThreads
parameter. In fact, we wouldn't even need to have only directories
at the test root, but the individual test files at the root would probably
be inefficiently run.

I suspect that keeping the test directories in balance would be
much less work than trying to parallelize using JUnit4, and be
much less fraught with gremlins. This assumes we get
sufficient isolation by Ant running separate threads, about
which I have absolutely NO information. Like I said, mostly
I'm wondering if anybody's gone down this path before and
has wisdom to offer.

Which *still* doesn't mean we shouldn't do whatever we can
to speed up individual tests, but looking at the timings there's
no obvious low-hanging fruit.

I wonder if we could somehow run the various directories in
time order, longest-to-shortest in the hope that all the threads
would finish up close enough to the same time. I haven't
thought about *how* to make this happen yet though
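
To get a feel for the longest-to-shortest idea, here is a small simulation sketch. It assumes we have a rough duration estimate per directory (the timings would have to be measured; none are given here) and that each idle thread simply takes the next item from the sorted list. The function name and its parameters are made up for the example.

```python
import heapq


def schedule_longest_first(durations, max_threads=1):
    """Simulate workers pulling tasks from a longest-to-shortest queue.

    durations maps task name -> estimated seconds. Returns a tuple of
    (makespan, assignments), where assignments holds one list of task
    names per worker.
    """
    ordered = sorted(durations.items(), key=lambda kv: (-kv[1], kv[0]))
    # Heap of (time at which the worker becomes free, worker index).
    workers = [(0, i) for i in range(max_threads)]
    heapq.heapify(workers)
    assignments = [[] for _ in range(max_threads)]
    for name, seconds in ordered:
        free_at, idx = heapq.heappop(workers)  # earliest idle worker
        assignments[idx].append(name)
        heapq.heappush(workers, (free_at + seconds, idx))
    makespan = max(free_at for free_at, _ in workers)
    return makespan, assignments
```

Comparing the makespan for max_threads=1 against larger values gives a quick estimate of how evenly the threads would finish before touching the build files.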

Anyway, I'll be happy to pursue this if y'all think it has merit,
let me know and I'll open a JIRA and take it on. For the
benefit of those aforementioned *real* people with *real*
machines, who I'll rely upon to help test this notion

Is the poor man's version of this on a dual-core machine
just running test-core and test-contrib in two separate
windows?

Best
Erick

On Thu, Nov 26, 2009 at 10:38 AM, Erick Erickson erickerick...@gmail.com wrote:

 Despite my long rambling, I agree that speeding things up is worthwhile. Just
 not a huge deal for some of us poor peons who are on dinky little 2-core
 machines and feel inadequate even *talking* to people who have *real*
 machines <G>...

 Time to go get ready to eat Turkey

 Erick


 On Thu, Nov 26, 2009 at 9:02 AM, Mark Miller markrmil...@gmail.com wrote:

  right - as soon as you have to start running the tests often enough, any
 decent savings turns into less waiting and more work. Waiting for tests to
 run is time that could be better spent elsewhere. And many of us run the
 tests *a lot* considering how long they take. And we will only keep adding
 more.

 Also, many of us *are* on multicore and should be able to benefit from it.
 I don't dev on anything less than 4 cores these days. It's a life changer :)
 and cheap currently. I'd like 8.

 - Mark

 http://www.lucidimagination.com (mobile)


 On Nov 26, 2009, at 5:24 AM, Michael McCandless 
 luc...@mikemccandless.com wrote:

  I still think there's value to faster tests, even if they don't become
 so fast as to enable fully interactive testing.

 Plus, this is an ongoing goal with time, not a one-time event.  As we
 create tests we should generally try to maximize coverage and minimize
 CPU cost, as long as the effort is smallish.

 Mike

 On Wed, Nov 25, 2009 at 9:32 PM, Erick Erickson erickerick...@gmail.com
 wrote:

 I posted a rather long diatribe outlining why I think speed-ups
 are a false goal for Lucene. Briefly, I'm convinced that as long
 as the tests are run when Hudson builds Lucene, 99% of the
 value of unit tests is realized. I suppose this implies that the
 hard-core committers agree that as long as failed tests
 are caught/corrected within a day things are fine.

 Although coming from a background where unit
 tests are not always required, my viewpoint may be
 suspect <G>.

 er...@nottobeconfusedwithhatcher.com

 On Wed, Nov 25, 2009 at 8:43 PM, Michael McCandless (JIRA)
 j...@apache.org wrote:


   [

 https://issues.apache.org/jira/browse/LUCENE-1844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12782716#action_12782716
 ]

 Michael McCandless commented on LUCENE-1844:
 

 Will we also speed up back-compat tests?

  Speed up junit tests
 

Key: LUCENE-1844
URL: https://issues.apache.org/jira/browse/LUCENE-1844
Project: Lucene - Java
 Issue Type: Improvement
   Reporter: Mark Miller
Attachments

Re: (LUCENE-1844) Speed up junit tests

2009-11-27 Thread Erick Erickson
Also, will ant's ForEach take a set of say 30 things to work on, and
take the # threads to use, and just pull from that queue of 30, in
order?

That's the implication I took from here:
http://ant-contrib.sourceforge.net/tasks/tasks/index.html

Ignorance is bliss. I didn't find ForEach by looking at the Ant
documentation, but by googling "ant parallel". Turns out this
is in Ant Contrib. I don't even know if it's current.

Tell ya' what. I'll take a quick whack at it. I'm a believer
in prototyping if at all possible. So I'll create a really stupid
implementation of this with a hard-coded list of tests to run
and see what happens. If it works for me, I'll pass it along
to whoever wants to give it a spin and we'll get a clue whether
it provides enough of an improvement to pursue seriously.

I'll open a JIRA since at least Mike and I seem to be interested

Erick

On Fri, Nov 27, 2009 at 1:27 PM, Michael McCandless 
luc...@mikemccandless.com wrote:

 On Fri, Nov 27, 2009 at 10:52 AM, Erick Erickson
 erickerick...@gmail.com wrote:
  But then I got to thinking. I admit I've only scratched the
  surface of the JUnit4 parallelization stuff. That said, it
  seems like the real benefit comes from making use of
  multiple cores, we don't get huge speedups just from
  running multiple threads at once on a single core. Which
  makes sense if you're not doing much in the way of I/O.

 Right, it's the multi-core machines that gain the most from this.

  This notion was inspired by the scary Python script
  comment.
 
  So what if we use Ant ForEach construct instead? Yet
  again this is a fuzzy idea I'm throwing out without much
  to back it up. Mostly I'm wondering if anyone's thought about
  it before or can shoot it down before it takes wing. Or if
  it is worth exploring.
 
  Assuming we structure our test directories so there are only
  directories at the root of the test area, could we persuade Ant
  to fire off the tests N directories at a time in parallel?
  N would default to 1 but could be passed in to the task, something
  like -DmaxThreads=4. ForEach actually has a maxThreads
  parameter. In fact, we wouldn't even need to have only directories
  at the test root, but the individual test files at the root would
 probably
  be inefficiently run.
 
  I suspect that keeping the test directories in balance would be
  much less work than trying to parallelize using JUnit4, and be
  much less fraught with gremlins. This assumes we get
  sufficient isolation by Ant running separate threads, about
  which I have absolutely NO information. Like I said, mostly
  I'm wondering if anybody's gone down this path before and
  has wisdom to offer.

 I think this rough idea is a good approach, though I don't know much
 about ant's ForEach.

 One thing the scary Python script does is divide up the index and search
 packages into 2 parts (a and b), by breaking up the tests
 according to 1st letter.  We might be able to take a similar approach,
 so that we're not forced to unnaturally separate tests into subdirs?

 The entire index or search package was too slow to run otherwise (ie,
 I needed to throw concurrency at it).
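
 As an illustration only, splitting a package's tests into parts "a" and "b" by first letter might look like the sketch below. The exact rule the Python script uses is not spelled out here, so the pivot letter and the choice to look at the letter after the "Test" prefix are assumptions, as is the function name.

```python
def split_by_first_letter(test_names, pivot="m"):
    """Split test class names into parts 'a' and 'b' by first letter.

    The letter examined is the one after the 'Test' prefix, if present.
    Names whose letter sorts at or before pivot go to part a, the rest
    to part b. The pivot value is an arbitrary illustrative choice.
    """
    part_a, part_b = [], []
    for name in sorted(test_names):
        stem = name[4:] if name.startswith("Test") else name
        first = stem[:1].lower()
        (part_a if first <= pivot else part_b).append(name)
    return part_a, part_b
```

 In practice the pivot would be tuned so the two halves take roughly equal time, which keeps the tests in their natural packages.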

  Which *still* doesn't mean we shouldn't do whatever we can
  to speed up individual tests, but looking at the timings there's
  no obvious low-hanging fruit

 Yup.  It's definitely an ongoing thing too...

  I wonder if we could somehow run the various directories in
  time order, longest-to-shortest in the hope that all the threads
  would finish up close enough to the same time. I haven't
  thought about *how* to make this happen yet though

 This is very important -- I do the same thing in the python script.

 Also, will ant's ForEach take a set of say 30 things to work on, and
 take the # threads to use, and just pull from that queue of 30, in
 order?

  Anyway, I'll be happy to pursue this if y'all think it has merit,
  let me know and I'll open a JIRA and take it on. For the
  benefit of those aforementioned *real* people with *real*
  machines, who I'll rely upon to help test this notion
 
  Is the poor man's version of this on a dual-core machine
  just running test-core and test-contrib in two separate
  windows?

 I think you could, except, I think they share sub-tasks (eg,
 compile-core) so the two will sometimes stomp on each other.

 The scary python script first uses a single thread to compile
 everything, then runs N threads pulling from the queue.  BUT: I apply
 a temporary patch to the ant build files, so that the N threads do not
 try to, eg, compile-core or jar-core, separately.
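
 The "compile once on a single thread, then let N threads pull from the queue" shape can be sketched as follows. This is not the actual script: the function and parameter names are hypothetical, and run_one stands in for whatever launches one batch of tests.

```python
from concurrent.futures import ThreadPoolExecutor


def run_in_parallel(tasks, run_one, max_threads=1, prepare=None):
    """Run prepare() once, then run_one(task) on max_threads workers.

    Tasks are submitted in the given order, so an idle worker picks up
    the next task in line. Returns a dict of task -> run_one(task).
    """
    if prepare is not None:
        prepare()  # e.g. a single-threaded compile step, done up front
    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        futures = [(task, pool.submit(run_one, task)) for task in tasks]
        return {task: future.result() for task, future in futures}
```

 With max_threads left at 1 this degenerates to a plain sequential run, which matches the proposed default for -DmaxThreads.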

 Also one thing I'd love to try is NOT forking the JVM for each test
 (fork=no in the junit task).  I wonder how much time that'd buy...

 Mike





[jira] Created: (LUCENE-2096) Investigate parallelizing Ant junit tests

2009-11-27 Thread Erick Erickson (JIRA)
Investigate parallelizing Ant junit tests
-

 Key: LUCENE-2096
 URL: https://issues.apache.org/jira/browse/LUCENE-2096
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Build
Reporter: Erick Erickson
Assignee: Erick Erickson
Priority: Minor


Ant Contrib has a ForEach construct that may speed up running all of the 
Junit tests by parallelizing them with a configurable number of threads. I 
envision this in several stages. First, see if ForEach works for us with 
hard-coded lists, distribute this for testing then make the changes for real. 
I intend to hard-code the list for the first pass, ordered by the time they 
take. This won't do for check-in, but will give us a fast proof-of-concept.

This approach will be most useful for multi-core machines.

In particular, we need to see whether the parallel tasks are isolated enough 
from each other to prevent mutual interference.

All this assumes the fragmentary reference I found is still available...





[jira] Updated: (LUCENE-1844) Speed up junit tests

2009-11-26 Thread Erick Erickson (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson updated LUCENE-1844:
---

Attachment: LUCENE-1844-Junit3.patch

Speeds up TestBooleanMinShouldMatch and TestCustomScoreQuery without using 
JUnit4

 Speed up junit tests
 

 Key: LUCENE-1844
 URL: https://issues.apache.org/jira/browse/LUCENE-1844
 Project: Lucene - Java
  Issue Type: Improvement
Reporter: Mark Miller
 Attachments: FastCnstScoreQTest.patch, hi_junit_test_runtimes.png, 
 LUCENE-1844-Junit3.patch, LUCENE-1844.patch


 As Lucene grows, so does the number of JUnit tests. This is obviously a good 
 thing, but it comes with longer and longer test times. Now that we also run 
 back compat tests in a standard test run, this problem is essentially doubled.
 There are some ways this may get better, including running parallel tests. 
 You will need the hardware to fully take advantage, but it should be a nice 
 gain. There is already an issue for this, and Junit 4.6, 4.7 have the 
 beginnings of something we might be able to count on soon. 4.6 was buggy, and 
 4.7 still doesn't come with nice ant integration. Parallel tests will come 
 though.
 Beyond parallel testing, I think we also need to concentrate on keeping our 
 tests lean. We don't want to sacrifice coverage or quality, but I'm sure 
 there is plenty of fat to skim.
 I've started making a list of some of the longer tests - I think with some 
 work we can make our tests much faster - and then with parallelization, I 
 think we could see some really great gains.




[jira] Commented: (LUCENE-1844) Speed up junit tests

2009-11-26 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-1844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12782915#action_12782915
 ] 

Erick Erickson commented on LUCENE-1844:


OK, fire when ready Gridley. Pretty soon I'll understand when to comment and 
how to keep from multiple comments 

This patch does NOT use the Java5 features like generics etc. I've done that 
work and it'll be included in the TestCustomScoreQuery changes for JUnit4.






[jira] Commented: (LUCENE-2037) Allow Junit4 tests in our environment.

2009-11-26 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12782916#action_12782916
 ] 

Erick Erickson commented on LUCENE-2037:


Hold off on this patch until I get a chance to submit a new one; we're 
straightening out the interdependencies between this patch and the LUCENE-1844 patches.

 Allow Junit4 tests in our environment.
 --

 Key: LUCENE-2037
 URL: https://issues.apache.org/jira/browse/LUCENE-2037
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Other
Affects Versions: 3.1
 Environment: Development
Reporter: Erick Erickson
Assignee: Erick Erickson
Priority: Minor
 Fix For: 3.1

 Attachments: junit-4.7.jar, LUCENE-2037.patch

   Original Estimate: 8h
  Remaining Estimate: 8h

 Now that we're dropping Java 1.4 compatibility for 3.0, we can incorporate 
 Junit4 in testing. Junit3 and junit4 tests can coexist, so no tests should 
 have to be rewritten. We should start this for the 3.1 release so we can get 
 a clean 3.0 out smoothly.
 It's probably worthwhile to convert a small set of tests as an exemplar.




Re: (LUCENE-1844) Speed up junit tests

2009-11-26 Thread Erick Erickson
Despite my long rambling, I agree that speeding things up is worthwhile. Just
not a huge deal for some of us poor peons who are on dinky little 2-core
machines and feel inadequate even *talking* to people who have *real*
machines <G>...

Time to go get ready to eat Turkey

Erick

On Thu, Nov 26, 2009 at 9:02 AM, Mark Miller markrmil...@gmail.com wrote:

  right - as soon as you have to start running the tests often enough, any
 decent savings turns into less waiting and more work. Waiting for tests to
 run is time that could be better spent elsewhere. And many of us run the
 tests *a lot* considering how long they take. And we will only keep adding
 more.

 Also, many of us *are* on multicore and should be able to benefit from it.
 I don't dev on anything less than 4 cores these days. It's a life changer :)
 and cheap currently. I'd like 8.

 - Mark

 http://www.lucidimagination.com (mobile)


 On Nov 26, 2009, at 5:24 AM, Michael McCandless luc...@mikemccandless.com
 wrote:

  I still think there's value to faster tests, even if they don't become
 so fast as to enable fully interactive testing.

 Plus, this is an ongoing goal with time, not a one-time event.  As we
 create tests we should generally try to maximize coverage and minimize
 CPU cost, as long as the effort is smallish.

 Mike

 On Wed, Nov 25, 2009 at 9:32 PM, Erick Erickson erickerick...@gmail.com
 wrote:

 I posted a rather long diatribe outlining why I think speed-ups
 are a false goal for Lucene. Briefly, I'm convinced that as long
 as the tests are run when Hudson builds Lucene, 99% of the
 value of unit tests is realized. I suppose this implies that the
 hard-core committers agree that as long as failed tests
 are caught/corrected within a day things are fine.

 Although coming from a background where unit
 tests are not always required, my viewpoint may be
 suspect <G>.

 er...@nottobeconfusedwithhatcher.com

 On Wed, Nov 25, 2009 at 8:43 PM, Michael McCandless (JIRA)
 j...@apache.org wrote:


   [

 https://issues.apache.org/jira/browse/LUCENE-1844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12782716#action_12782716
 ]

 Michael McCandless commented on LUCENE-1844:
 

 Will we also speed up back-compat tests?





Re: Jira emails via Gmail

2009-11-25 Thread Erick Erickson
Agreed, annoying. Haven't found any solution either.

Erick

On Wed, Nov 25, 2009 at 7:51 AM, Uwe Schindler u...@thetaphi.de wrote:

 I would like to have a link to the patch/file/... in the mail about an
 update to the attached files. This is also annoying.

 -
 Uwe Schindler
 H.-H.-Meier-Allee 63, D-28213 Bremen
 http://www.thetaphi.de
 eMail: u...@thetaphi.de

  -Original Message-
  From: Simon Willnauer [mailto:simon.willna...@googlemail.com]
  Sent: Wednesday, November 25, 2009 1:38 PM
  To: java-dev@lucene.apache.org
  Subject: Re: Jira emails via Gmail
 
  I would be very interested in a solution too.
  kind of annoying...
 
  simon
 
  On Wed, Nov 25, 2009 at 11:34 AM, Michael McCandless
  luc...@mikemccandless.com wrote:
   Here's a Jira issue (on Jira!) about the problem:
  
  http://jira.atlassian.com/browse/JRA-12640
  
   But doesn't point to a workaround...
  
   Mike
  
   On Wed, Nov 25, 2009 at 5:20 AM, Michael McCandless
   luc...@mikemccandless.com wrote:
   Sort of off topic, but I wanted to see if anyone else is using Gmail's
   web UI and has solved this...
  
   When an issue is updated, Jira sends out an email... but Gmail doesn't
   group all such emails together... it groups them into separate groups
   (for updated, file attached, edited, etc.), which I'm now
   getting very tired of...
  
   Has anyone found a solution for this?
  
   I could swear I've seen a Python script in the past that logs in via
   IMAP and does something to solve this, but I can't find it right now.
  
   Mike
  
  




Re: [jira] Commented: (LUCENE-1844) Speed up junit tests

2009-11-25 Thread Erick Erickson
They're ready to go, but at Uwe's suggestion, I've been waiting for 3.0 to
get settled before prompting someone to apply this patch. I was going to
generate a new patch for this and for 2037 (junit4 tests) just to make sure
they were easy to apply. But if you're willing, the patches are already
attached to the JIRA issues. Do note that the decision in
TestBooleanMinShouldMatch to stop checking the query after 100 rather than
checking all 1,000 is included in the patch.

Do you want to apply the patches or should I regenerate? It's no big deal to
regenerate them and I'll have a better feel for reconciling any conflicts. I
don't know whether there even *are* any conflicts, but just in case

For my info, though, if I have a more recent patch that *replaces* an
earlier patch, especially one that hasn't yet been applied, is it preferred
to delete the earlier patch when providing a new one?

I'm not pleased with the Junit4 documentation; most of what I've been able
to glean has come from brave souls blogging. Does anyone have a gold mine or
is it as hit-or-miss as I think? There are hints of parallelization
capabilities in Junit4, but I'm having a hard time finding anything in much
depth. The Junit website is pathetic. I can't even find 4.7 javadocs; it
keeps giving me 4.5, as evidenced by no @Rule docs or @Intercept. And no
version information in the javadocs. Or I'm completely missing the
boat. I was thinking about getting the entire project over the weekend
and generating my own if I have the time.

Erick

On Wed, Nov 25, 2009 at 11:49 AM, Michael McCandless (JIRA) j...@apache.org
 wrote:


[
 https://issues.apache.org/jira/browse/LUCENE-1844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12782497#action_12782497]

 Michael McCandless commented on LUCENE-1844:
 

 Is this ready to go in?  I'd really love to see unit tests run faster :)

  Speed up junit tests
  
 
  Key: LUCENE-1844
  URL: https://issues.apache.org/jira/browse/LUCENE-1844
  Project: Lucene - Java
   Issue Type: Improvement
 Reporter: Mark Miller
  Attachments: FastCnstScoreQTest.patch,
 hi_junit_test_runtimes.png, LUCENE-1844.patch
 
 
  As Lucene grows, so does the number of JUnit tests. This is obviously a
 good thing, but it comes with longer and longer test times. Now that we also
 run back compat tests in a standard test run, this problem is essentially
 doubled.
  There are some ways this may get better, including running parallel
 tests. You will need the hardware to fully take advantage, but it should be
 a nice gain. There is already an issue for this, and Junit 4.6, 4.7 have the
 beginnings of something we might be able to count on soon. 4.6 was buggy,
 and 4.7 still doesn't come with nice ant integration. Parallel tests will
 come though.
  Beyond parallel testing, I think we also need to concentrate on keeping
 our tests lean. We don't want to sacrifice coverage or quality, but I'm sure
 there is plenty of fat to skim.
  I've started making a list of some of the longer tests - I think with
 some work we can make our tests much faster - and then with parallelization,
 I think we could see some really great gains.

 --
 This message is automatically generated by JIRA.
 -
 You can reply to this email to add a comment to the issue online.


 -
 To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: java-dev-h...@lucene.apache.org




Re: [jira] Commented: (LUCENE-1844) Speed up junit tests

2009-11-25 Thread Erick Erickson
Hmmm, the patches that I supplied for Junit4 *require* 4.7 anyway, which
I included in the patch... Is this a problem? Or just a document problem?

Erick

On Wed, Nov 25, 2009 at 1:14 PM, Mark Miller markrmil...@gmail.com wrote:

 junit 4 parallelization is still in its infancy. I think the docs for it
 are just in the changes file that it was first released with. That
 version had severe bugs that made it almost unusable - I think that's
 mostly fixed in a newer release. There is also a much better impl of one
 of the key classes (I think they call it computer) written by someone
 else that will eventually go into the code base I think (written by the
 guy(s) that I think found/fixed the initial buggy-ness) - essentially, I
 think it's still unbaked.

 Here are some docs from the release notes of 4.6:

 http://sourceforge.net/project/shownotes.php?release_id=675664group_id=15278

 That's also an issue - it arrived only in 4.6 - so this would need to be
 optional unless we bumped up our requirement from 4 - and it really requires
 at least 4.7 for the fixes (if everything is even fixed).

 You also have to set up which tests run in parallel by hand, essentially.
 No ant task to help with this last I looked. So it will probably end up
 being an alternate way to run the tests initially (at best).

 - Mark

 Erick Erickson wrote:
  They're ready to go, but at Uwe's suggestion, I've been waiting for
  3.0 to get settled before prompting someone to apply this patch. I was
  going to generate a new patch for this and for 2037 (junit4 tests)
  just to make sure they were easy to apply. But if you're willing, the
  patches are already attached to the JIRA issues. Do note that the
  decision in MinBooleanShouldMatch to stop checking the query after 100
  rather than checking all 1,000 is included in the patch
 
  Do you want to apply the patches or should I regenerate? It's no big
  deal to regenerate them and I'll have a better feel for reconciling
  any conflicts. I don't know whether there even *are* any conflicts,
  but just in case
 

Re: (LUCENE-1844) Speed up junit tests

2009-11-25 Thread Erick Erickson
IMHO there are other reasons to upgrade to junit4 besides
parallelization, there are some nice new capabilities. I
suppose the analogous question is why upgrade to
Lucene 2.9?

Especially since it's not a matter of upgrading. Junit3 tests run
just fine under junit4. I've tested after removing the
junit3 jar from lib, no problem. It even seems to run
slightly faster, which makes me wonder...

So really, we have the best of both worlds. No work involved in
using Junit4 with the current tests, but the ability to use the
new features of Junit4. Although I'm sure there'll be
*something* that bites us, I have great faith in Murphy.

Kinda reminds me of the Lucene drop-in replacement policy
<G>...

But on the topic of parallelization: I'm not at all sure
it's worth the effort. As far as I can tell, it really only gets significant
gains when you have more cores to run with. It's not at all clear
to me how much time we spend doing I/O in the tests... very little
I suspect (although I confess I don't know for sure). And if we're
CPU bound anyway, parallelization doesn't help. Anybody know
for sure?

And say we did all the work to parallelize all the tests. And say that
instead of taking 25 minutes on my 3 year old MacBook Pro, we
got it down to 10 minutes. Who cares? 10 minutes is still too long
according to the eXtreme Programming (XP) folks, and I sympathize
with their point of view, even though I did spend some time trying
to trim the test run.

The XP approach to unit testing is to run it almost every time you
change a line of code. OK, I'm exaggerating, but not by too
much with the die-hard XP folks. Even at 10 minutes, we can't
do that.

So, I think the value for Lucene/SOLR comes *not* from running the
tests 15 times an hour. I think the real value comes from not
letting errors hide for days/weeks/months/releases. So I'm quite willing
to let the automated builds catch the unit test failures in unexpected
places in those instances where I don't run all of the tests before
a patch is committed. As long as we fix them as soon as they're
found.

OK, I'm rambling. I'm off for Thanksgiving, and my daughter is
at her in-laws until tomorrow (they're visiting from CA). So sue
me <G>.

Best
Erick

On Wed, Nov 25, 2009 at 5:07 PM, Michael McCandless 
luc...@mikemccandless.com wrote:

 Is the only reason to upgrade to junit 4, to get the parallelization
 possibility (which isn't sounding very compelling!)?

 Ie, making our unit tests lean is fully independent of junit 4?

 Mike

 On Wed, Nov 25, 2009 at 4:17 PM, Uwe Schindler u...@thetaphi.de wrote:
  junit 4 parallelization is still in its infancy. I think the docs for it
  are just in the changes file that it was first released with. That
  version had severe bugs that made it almost unusable - I think that's
  mostly fixed in a newer release. There is also a much better impl of one
  of the key classes (I think they call it computer) written by someone
  else that will eventually go into the code base I think (written by the
  guy(s) that I think found/fixed the initial buggy-ness) - essentially, I
  think it's still unbaked.
 
  There is another problem. Parallelization would only work with tests
  that do not change global defaults. E.g. LocalizedTestCase changes the
  default locale. If another test ran in parallel, it would break.
 
  So only isolated tests can run in parallel. The LocalizedTestCase problem
  cannot be solved in another way. The same would have been true in 2.9 with
  the TokenStream.useOnlyNewAPI switch, but this is no longer the case for
  3.1.
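
  The global-default problem described above can be sketched in plain Java.
  This is only an illustration, not code from LocalizedTestCase:
  DefaultLocaleDemo and its methods are made-up names. It relies on the fact
  that String.toUpperCase() consults the JVM-wide default locale, so a test
  that swaps the default changes results for any code running at the same
  time.

```java
import java.util.Locale;

class DefaultLocaleDemo {
    // Upper-cases with the JVM-wide default locale, as String.toUpperCase() does.
    static String upperWithDefault(String s) {
        return s.toUpperCase();
    }

    // Temporarily swaps the global default locale, then restores the previous one.
    // While the swap is in effect, every other thread sees the new default too.
    static String upperUnder(Locale temporary, String s) {
        Locale saved = Locale.getDefault();
        Locale.setDefault(temporary);
        try {
            return upperWithDefault(s);
        } finally {
            Locale.setDefault(saved);
        }
    }

    public static void main(String[] args) {
        Locale turkish = Locale.forLanguageTag("tr");
        // Under Turkish rules "i" upper-cases to the dotted capital I (U+0130),
        // not to the plain ASCII "I" produced under English rules.
        System.out.println(upperUnder(Locale.ENGLISH, "i").equals("I"));
        System.out.println(upperUnder(turkish, "i").equals("\u0130"));
    }
}
```

  Restoring the saved locale in a finally block keeps sequential tests
  isolated, but it does nothing for a second test running concurrently in the
  same JVM, which is the point being made above.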
 
  Uwe
 
 




Re: [jira] Commented: (LUCENE-1844) Speed up junit tests

2009-11-25 Thread Erick Erickson
I posted a rather long diatribe outlining why I think speed-ups
are a false goal for Lucene. Briefly, I'm convinced that as long
as the tests are run when Hudson builds Lucene, 99% of the
value of unit tests is realized. I suppose this implies that the
hard-core committers agree that as long as failed tests
are caught/corrected within a day things are fine.

Although coming from a background where unit
tests are not always required, my viewpoint may be
suspect <G>.

er...@nottobeconfusedwithhatcher.com

On Wed, Nov 25, 2009 at 8:43 PM, Michael McCandless (JIRA)
j...@apache.org wrote:


[
 https://issues.apache.org/jira/browse/LUCENE-1844?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12782716#action_12782716]

 Michael McCandless commented on LUCENE-1844:
 

 Will we also speed up back-compat tests?





[jira] Commented: (LUCENE-2092) BooleanQuery.hashCode and equals ignore isCoordDisabled

2009-11-23 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2092?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12781716#action_12781716
 ] 

Erick Erickson commented on LUCENE-2092:


Well, if it's been there since 1.9 and this is the first time it's been 
reported, it hasn't caused the world to stop yet. So I don't think it's worth 
the work unless we have to spin another 3.0 for additional reasons.

 BooleanQuery.hashCode and equals ignore isCoordDisabled
 ---

 Key: LUCENE-2092
 URL: https://issues.apache.org/jira/browse/LUCENE-2092
 Project: Lucene - Java
  Issue Type: Bug
  Components: Query/Scoring
Affects Versions: 1.9, 2.0.0, 2.1, 2.2, 2.3, 2.3.1, 2.3.2, 2.4, 2.4.1, 
 2.9, 2.9.1
Reporter: Hoss Man
Assignee: Michael McCandless
 Attachments: LUCENE-2092.patch


 BooleanQuery.isCoordDisabled() is not considered by BooleanQuery's hashCode() 
 or equals() methods ... this can cause serious badness to happen when caching 
 BooleanQueries.
 bug traces back to at least 1.9
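
 A minimal sketch of why this matters for caching, using made-up stand-in
 classes (BuggyQuery, FixedQuery and QueryCacheDemo are hypothetical and are
 not Lucene code): when equals() and hashCode() skip a field that changes
 behaviour, a HashMap-based cache treats two different queries as the same
 key.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical stand-in: equality ignores the coordDisabled flag.
class BuggyQuery {
    final String clause;
    final boolean coordDisabled;

    BuggyQuery(String clause, boolean coordDisabled) {
        this.clause = clause;
        this.coordDisabled = coordDisabled;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof BuggyQuery && clause.equals(((BuggyQuery) o).clause);
    }

    @Override
    public int hashCode() {
        return clause.hashCode();
    }
}

// Hypothetical stand-in: equality and hash both include the flag.
class FixedQuery {
    final String clause;
    final boolean coordDisabled;

    FixedQuery(String clause, boolean coordDisabled) {
        this.clause = clause;
        this.coordDisabled = coordDisabled;
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof FixedQuery)) return false;
        FixedQuery other = (FixedQuery) o;
        return clause.equals(other.clause) && coordDisabled == other.coordDisabled;
    }

    @Override
    public int hashCode() {
        return Objects.hash(clause, coordDisabled);
    }
}

class QueryCacheDemo {
    // Caches a result for each of two queries that differ only in the flag.
    static int buggyCacheSize() {
        Map<BuggyQuery, String> cache = new HashMap<BuggyQuery, String>();
        cache.put(new BuggyQuery("a b", false), "scored with coord");
        cache.put(new BuggyQuery("a b", true), "scored without coord");
        return cache.size(); // the second put overwrote the first entry
    }

    static int fixedCacheSize() {
        Map<FixedQuery, String> cache = new HashMap<FixedQuery, String>();
        cache.put(new FixedQuery("a b", false), "scored with coord");
        cache.put(new FixedQuery("a b", true), "scored without coord");
        return cache.size();
    }

    public static void main(String[] args) {
        System.out.println("buggy cache entries: " + buggyCacheSize());
        System.out.println("fixed cache entries: " + fixedCacheSize());
    }
}
```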




[jira] Commented: (LUCENE-2037) Allow Junit4 tests in our environment.

2009-11-19 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12780368#action_12780368
 ] 

Erick Erickson commented on LUCENE-2037:


Well, last night I changed LocalizedTestCase to do the @RunWith and
@Parameters thing and it works just fine with a minimal change to
subclasses, mainly adding @Test and a c'tor with a Locale parameter. In total,
it probably adds a minute to the test run.

About the cross product of versions and locales: the @Parameters method
returns a list of Object[], where each element of the list is matched
against a c'tor. So if each Object[] in your list holds, say, an (int, float,
int), then as long as you have a matching c'tor with a signature that takes
an (int, float, int) you're good to go. So to handle the m x n case you
mentioned, if your @Parameters method returned a list of Object[], one
Object[] for each (Locale, Version) pair, you'd get all your Locales run
against all your versions.
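
As a rough sketch of that m x n list, here is the body such a method might
have, written in plain Java so it runs without the JUnit jar. The class name
LocaleVersionParameters, the three locales and the version strings "2.9" and
"3.0" are placeholders of mine, not anything from the patch. In a real JUnit4
class, data() would be public static and carry the @Parameters annotation, and
the test class would need a c'tor taking (Locale, String).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

class LocaleVersionParameters {
    // Builds one Object[] per (locale, version) pair: m locales x n versions.
    static List<Object[]> crossProduct(List<Locale> locales, List<String> versions) {
        List<Object[]> params = new ArrayList<Object[]>();
        for (Locale locale : locales) {
            for (String version : versions) {
                params.add(new Object[] { locale, version });
            }
        }
        return params;
    }

    // Stand-in for the static method that would be annotated with @Parameters.
    // The locales and version strings are illustrative placeholders.
    static List<Object[]> data() {
        return crossProduct(
            Arrays.asList(Locale.ENGLISH, Locale.FRENCH, Locale.GERMAN),
            Arrays.asList("2.9", "3.0"));
    }

    public static void main(String[] args) {
        for (Object[] pair : data()) {
            System.out.println(pair[0] + " / " + pair[1]);
        }
    }
}
```

With 152 locales, every extra version multiplies the number of test runs, which
feeds into the question of whether we want this at all.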

Whether we *want* this to happen or not is another question. It's a
worthwhile question whether we really *need* to run all the possible locales
or if there's a subset of locales that would serve.

It's kind of ironic that I have a patch waiting to be applied that cuts down
on the time it takes to run the unit tests and another patch that adds to
the time it takes. Two steps forward, one step back and a jink sideways just
for fun.

Best
Erick




 Allow Junit4 tests in our environment.
 --

 Key: LUCENE-2037
 URL: https://issues.apache.org/jira/browse/LUCENE-2037
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Other
Affects Versions: 3.1
 Environment: Development
Reporter: Erick Erickson
Assignee: Erick Erickson
Priority: Minor
 Fix For: 3.1

 Attachments: junit-4.7.jar, LUCENE-2037.patch

   Original Estimate: 8h
  Remaining Estimate: 8h

 Now that we're dropping Java 1.4 compatibility for 3.0, we can incorporate 
 Junit4 in testing. Junit3 and junit4 tests can coexist, so no tests should 
 have to be rewritten. We should start this for the 3.1 release so we can get 
 a clean 3.0 out smoothly.
 It's probably worthwhile to convert a small set of tests as an exemplar.




[jira] Commented: (LUCENE-2037) Allow Junit4 tests in our environment.

2009-11-18 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12779449#action_12779449
 ] 

Erick Erickson commented on LUCENE-2037:


I was thinking more about TestQueryParser. One of the features of the
current setup is that you specify which tests in a class you want to have
run under all locales. Tests not in that list are run only under the default
locale. Always assuming I'm reading things right...

I don't see a clean way to emulate that part of the behavior without either
refactoring or introducing a check in the tests we don't want to run under
all locales and aborting early.

But I think we're finding different ways to agree here. I'm interpreting
your comments that running all the tests in the class is OK at least for
now...

But I did notice last night that a number of tests in contrib reference
LocalizedTestCase (I have two separate projects, core and contrib so it
wasn't obvious until I ran the ant task). I'll look into those tonight or
tomorrow night.

Erick







[jira] Commented: (LUCENE-2037) Allow Junit4 tests in our environment.

2009-11-18 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12779491#action_12779491
 ] 

Erick Erickson commented on LUCENE-2037:


I think you're misreading this. This is the annotation for the static
method that returns a list of parameters, not for a method that is an actual
test.

The thing that causes the framework to gather the list and run the tests for
each element in the list is the @RunWith annotation on the class AFAIK.

Or I'm misreading it...

Erick







[jira] Commented: (LUCENE-2037) Allow Junit4 tests in our environment.

2009-11-18 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12779492#action_12779492
 ] 

Erick Erickson commented on LUCENE-2037:


Frankly, I don't see how that would work without getting into the guts of
the @RunWith(value = Parameterized.class) Junit4 annotation. As I understand
it, that annotation *on the class* causes the framework to make a call to
the static method that provides a list of parameters (annotated with
@Parameters). The framework then takes the returned list and, *for each
element in the list* calls a constructor with that element and runs all the
tests in the class.
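
To make that flow concrete, here is a toy emulation in plain Java using
reflection. It is NOT JUnit's implementation, and MiniParameterizedRunner,
SampleTest and the parameters() method are names invented for this sketch. It
only mirrors the sequence described above: call the static provider, then for
each Object[] construct the class with that element and run every test method.
As far as I know JUnit4 builds a fresh instance for each test method, so the
sketch does the same.

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class MiniParameterizedRunner {

    // Hypothetical test class: the constructor takes one value per Object[] slot.
    static class SampleTest {
        private final String locale;

        SampleTest(String locale) {
            this.locale = locale;
        }

        // Stands in for the static method JUnit4 would locate via @Parameters.
        static List<Object[]> parameters() {
            List<Object[]> params = new ArrayList<Object[]>();
            params.add(new Object[] { "en" });
            params.add(new Object[] { "tr" });
            return params;
        }

        public String testFirst() { return "testFirst[" + locale + "]"; }

        public String testSecond() { return "testSecond[" + locale + "]"; }
    }

    // Runs every test* method once per parameter set and records what ran.
    @SuppressWarnings("unchecked")
    static List<String> runAll(Class<?> testClass) {
        List<String> log = new ArrayList<String>();
        try {
            Method provider = testClass.getDeclaredMethod("parameters");
            provider.setAccessible(true);
            List<Object[]> parameterSets = (List<Object[]>) provider.invoke(null);

            Constructor<?> ctor = testClass.getDeclaredConstructors()[0];
            ctor.setAccessible(true);

            // Reflection gives no ordering guarantee, so sort by name for a stable log.
            Method[] methods = testClass.getDeclaredMethods();
            Arrays.sort(methods, Comparator.comparing(Method::getName));

            for (Object[] args : parameterSets) {
                for (Method m : methods) {
                    if (!m.getName().startsWith("test")) continue;
                    m.setAccessible(true);
                    Object instance = ctor.newInstance(args); // fresh instance per test
                    log.add((String) m.invoke(instance));
                }
            }
        } catch (ReflectiveOperationException e) {
            throw new RuntimeException(e);
        }
        return log;
    }

    public static void main(String[] args) {
        for (String line : runAll(SampleTest.class)) {
            System.out.println(line);
        }
    }
}
```

Note there is no hook in this loop for "run this method for only some of the
parameter sets", which is why a per-test @AllLocales would have to change the
runner itself.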

So annotating a test with @AllLocales would somehow have to get in there and
change what the framework does. No doubt it's do-able, but until I see more
than 10 seconds difference in running the tests I'm not sure it's worth it.
Nor would I advocate altering the behavior of the framework for back-compat,
I'd far rather refactor the tests into those that run for all locales and
those that don't.

I suppose one could do the inverse, that is, create an annotation
@DefaultLocaleOnly that aborts early if the locale isn't the default, but
again I think the first approach I'd advocate would be to work within the
framework until it was too painful...
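
For what it's worth, the inverse idea might look roughly like this. Everything
here is hypothetical: @DefaultLocaleOnly does not exist in JUnit4 or Lucene,
and LocaleGuard and GuardedTests are invented for the sketch. The runner (or
the first line of each test) would ask the guard whether to proceed for the
locale currently being exercised.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.Locale;

// Hypothetical marker annotation; RUNTIME retention so reflection can see it.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface DefaultLocaleOnly {
}

// Hypothetical test class mixing both kinds of test.
class GuardedTests {
    @DefaultLocaleOnly
    public void testDefaultLocaleBehaviour() { }

    public void testEveryLocale() { }
}

class LocaleGuard {
    // Returns false only for @DefaultLocaleOnly methods run under a non-default locale.
    static boolean shouldRun(Class<?> testClass, String methodName,
                             Locale current, Locale defaultLocale) {
        Method m;
        try {
            m = testClass.getDeclaredMethod(methodName);
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException(e);
        }
        if (!m.isAnnotationPresent(DefaultLocaleOnly.class)) {
            return true;
        }
        return current.equals(defaultLocale);
    }

    public static void main(String[] args) {
        Locale def = Locale.ENGLISH; // pretend English is the default locale
        System.out.println(shouldRun(GuardedTests.class, "testDefaultLocaleBehaviour", Locale.FRENCH, def));
        System.out.println(shouldRun(GuardedTests.class, "testEveryLocale", Locale.FRENCH, def));
    }
}
```

The skipped runs would still be counted as executed tests by the framework
unless the guard were wired into the runner, which is the part I'd rather
avoid.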

FWIW
Erick







[jira] Commented: (LUCENE-2037) Allow Junit4 tests in our environment.

2009-11-18 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12779530#action_12779530
 ] 

Erick Erickson commented on LUCENE-2037:


{quote}
Yes, I do feel we should keep LocalizedTestCase. It is handy, we might use
it in more places to prevent test failures in other locales for new code.
{quote}

Light went off when walking around. I think I can just change the
LocalizedTestCase class and put the @RunWith() and @Parameters *there*.
Which makes waay more sense than what I was doing, which was putting
those in every subclass of the current LocalizedTestCase. Doh! I'll take a
peek tonight. Although last night I was thinking "Gee, this is
repetitive"...

There are only two classes in core that use LocalizedTestCase, but there are
several in contrib too. They'll all require the @Test annotation if I munge
LocalizedTestCase, but that should be the only change necessary then,
assuming we're content to run all the locales past all the test cases in all
derived classes.

Hmmm, why was subclassing invented again? Something about putting common
behavior in one place or some nonsense like that <G>.

Erick







[jira] Commented: (LUCENE-2037) Allow Junit4 tests in our environment.

2009-11-17 Thread Erick Erickson (JIRA)

[ 
https://issues.apache.org/jira/browse/LUCENE-2037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12779251#action_12779251
 ] 

Erick Erickson commented on LUCENE-2037:


Well, it all depends on how you feel about 10 seconds as far as 
LocalizedTestCase is concerned.

JUnit4 is really not built to run some tests in a class under the Parameterized 
runner and some not; it runs all the tests in the class with all the 
parameters. In the case of TestQueryParser, which is the only test class I saw 
that made use of the "include some tests but not others" feature of 
LocalizedTestCase, I hacked in running all the tests with all the locales 
available (152 in my case). Which pushes the number of tests in that one class 
up over 4,000 FWIW.

Running that test case went from around 5 seconds to around 15 seconds on my 2 
year old Macbook Pro, from inside IntelliJ.

I don't think it's worth trying to refactor that class into two classes, one 
that has all the tests run with all the locales and one that has the rest of 
the tests run only with the default locale (which is how I read the code in 
LocalizedTestCase), for 10 seconds' worth of time savings. One could emulate the 
old process of excluding some tests by returning immediately from those tests 
that *weren't* intended to be run with all locales if the current locale wasn't 
the default, but I don't see that as worth the effort, although I could be 
convinced otherwise if people feel strongly.

I'll provide a patch for this if there are no objections later this week, 
perhaps I'll get a chance to look at BaseTokenStreamTestCase before then.

This will make LocalizedTestCase obsolete and I'll remove it in the patch.






Why release 3.0?

2009-11-16 Thread Erick Erickson
One of my specialties is asking obvious questions just to see if everyone's
assumptions are aligned. So with the discussion about branching 3.0 I have to
ask: "Is there going to be any 3.0 release intended for *production*?" And if
not, would we save a lot of work by just not worrying about retrofitting fixes
to a 3.0 branch and carrying on with 3.1 as the first *supported* 3.x release?

Since 3.0 is "upgrade to Java 5 and remove deprecations", I'm not sure *as a
user* I see a good reason to upgrade to 3.0. Getting a beta/snapshot release to
get a head start on cleaning up my code does seem worthwhile, if I have the
spare time. And having a base 3.0 version that's not changing all over the
place would be useful for that.

That said, I'm also not terribly comfortable with a release that's out there
and unsupported.

Apologies if this has already been discussed, but I don't remember it. Although
my memory isn't what it used to be (but some would claim it never was <G>)...

Erick


Re: Why release 3.0?

2009-11-16 Thread Erick Erickson
On Mon, Nov 16, 2009 at 2:03 PM, Uwe Schindler u...@thetaphi.de wrote:

  Hi Erick,



 3.0 is **not** an unsupported or beta release; it is the cleaned-up 2.9.1
 release. You are right, it is not necessary for 2.9.1 users to upgrade (but
 they can), but for new users starting with Lucene, the recommendation is to
 use it and not 2.9.

 3.0 also contains some cleanups needed for 3.1: compressed fields
 are no longer supported, so they must be uncompressed, which is done during
 optimizing/merging in 3.0. Later versions will remove support for older
 index types, so you should really update your indexes, especially because
 flex indexing will possibly remove more support for older indexes (as it
 gets more complex to maintain all the different file formats).

 So 3.0 is recommended for users who are starting new Java 5 projects and want
 a clean API. People needing backwards compatibility can use 2.9.1, but support
 for that version will be cancelled in the future and bugfixes will only go
 into 3.x.

 -
 Uwe Schindler
 H.-H.-Meier-Allee 63, D-28213 Bremen
 http://www.thetaphi.de
 eMail: u...@thetaphi.de

Re: Why release 3.0?

2009-11-16 Thread Erick Erickson
Oops, stupid mouse made me send a blank message.

OK, I withdraw the question since there *are* good reasons to put
3.0 in a prod environment <G>. It's also easier to say "new Lucene
users should start with 3.0" than "new Lucene users should
start with 3.1. Use 3.0 until we release 3.1, but be aware we're not going to
support 3.0". Yccc

Erick

On Mon, Nov 16, 2009 at 2:03 PM, Uwe Schindler u...@thetaphi.de wrote:

  Hi Erick,



 3.0 is **not** unsupported or beta release, it is the cleaned up 2.9.1
 release. You are right, it is not needed for 2.9.1 users to upgrade (but
 they can), but for new users starting with Lucene, the recommendadion is to
 use it and not 2.9.

 3.0 also contains some cleanups needed for 3.1, as the compressed fields
 are no longer supported, so they must be uncompressed, which is done during
 optimizing/merging in 3.0. Later versions will remove support for older
 index types, but you should really update your indexes, especially because
 flex indexing will possibly remove more support for older indexes (as it
 gets more complex to maintain all the different file formats).



 So 3.0 is recommended for users who are starting new Java 5 projects and want a
 clean API. People needing backwards compatibility can use 2.9.1, but support
 for that version will be dropped in the future and bug fixes will only go into
 3.x.

 -
 Uwe Schindler
 H.-H.-Meier-Allee 63, D-28213 Bremen
 http://www.thetaphi.de
 eMail: u...@thetaphi.de
   --

 *From:* Erick Erickson [mailto:erickerick...@gmail.com]
 *Sent:* Monday, November 16, 2009 7:10 PM
 *To:* java-dev@lucene.apache.org
 *Subject:* Why release 3.0?



 One of my specialties is asking obvious questions just to see if
 everyone's assumptions

 are aligned. So with the discussion about branching 3.0 I have to ask "Is
 there going to

 be any 3.0 release intended for *production*?" And if not, would we save a
 lot of work

 by just not worrying about retrofitting fixes to a 3.0 branch and carrying
 on with 3.1

 as the first *supported* 3.x release?



 Since 3.0 is upgrade-to-java5 and remove deprecations, I'm not sure *as a
 user* I see a

 good reason to upgrade to 3.0. Getting a beta/snapshot release to get a
 head start on

 cleaning up my code does seem worthwhile, if I have the spare time. And
 having a base

 3.0 version that's not changing all over the place would be useful for
 that.



 That said, I'm also not terribly comfortable with a release that's out
 there and unsupported.



 Apologies if this has already been discussed, but I don't remember it.
 Although my memory

 isn't what it used to be (but some would claim it never was <G>)...



 Erick







Re: [jira] Commented: (LUCENE-2037) Allow Junit4 tests in our environment.

2009-11-15 Thread Erick Erickson
Good suggestions, it's really helpful to have someone intimately familiar
with the code suggest the next direction. I didn't want to go too far afield
for the proof-of-concept, I mostly wanted to have a place to start.
LuceneTestCaseJ4 should be useful both as a template and a base to build
on. If you wanted to put in a JIRA or two and assign them to me I'd be
happy to take a look. I'm pushing this off on you since you have a better
sense of what's important here.

About reformatting. I'm torn, for all the reasons I'm certain you can
quote.  Of course I'll abide by the sense of the community, but the
community doesn't speak with one voice. Michael McCandless and I had an
exchange on this very topic and he is in the opposite camp. I guess I was
heavily influenced by Martin Fowler's Refactoring book and the eXtreme
Programming folks.

What I'd personally like would be for someone with heavy commit privileges
to reformat the whole thing at once and just get it *done*, as was
apparently discussed at ApacheCon. Eclipse makes this easy. I'd also like to
be wealthy. Look at the bright side, I'm not trying to convince anyone
that my way of formatting is obviously superior because I put braces on
their own line <G>.

Best
Erick

On Sun, Nov 15, 2009 at 6:16 AM, Uwe Schindler (JIRA) j...@apache.org wrote:


[
 https://issues.apache.org/jira/browse/LUCENE-2037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12778086#action_12778086]

 Uwe Schindler commented on LUCENE-2037:
 ---

 One thing that would also be good:
 We have LocalizedTestCase, which can run each test for all available
 Locales (it currently overrides runBare() and iterates while setting
 Locale.setDefault()). As this should only be run for specific
 methods, how about adding an annotation in addition to @Test (with
 Retention(method)), like @TestLocalized?

 What to do with BaseTokenStreamTestCase? In 2.9 it also overrode
 runBare(), but not anymore (because we only have the new TS API now),
 but this is also a typical example of when we want to rerun tests multiple
 times. One of our plans is that this test now runs all analyzer tests for
 different default versions (iterating over the Version enum constants). We
 then need something like @TestAllVersions. If we jump to
 JUnit4, we should use the new features for a more elegant solution to these
 multiple-run tests.
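
 A minimal sketch of the annotation idea above, assuming plain reflection
 rather than a real JUnit4 runner. The annotation name TestLocalized comes
 from the comment; LocalizedRunnerSketch and SampleLocalizedTest are
 hypothetical names used only for illustration and are not part of Lucene
 or JUnit. A real implementation would more likely plug into JUnit4's own
 runner machinery.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Constructor;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical marker: "run this method once per locale".
// RUNTIME retention is needed so reflection can see it.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface TestLocalized {}

class LocalizedRunnerSketch {
  // Invokes every @TestLocalized method once per given locale and
  // returns the number of invocations. The default locale is restored.
  static int runLocalized(Class<?> testClass, Locale... locales) {
    Locale saved = Locale.getDefault();
    int runs = 0;
    try {
      Constructor<?> ctor = testClass.getDeclaredConstructor();
      ctor.setAccessible(true);
      Object instance = ctor.newInstance();
      for (Method m : testClass.getDeclaredMethods()) {
        if (!m.isAnnotationPresent(TestLocalized.class)) continue;
        m.setAccessible(true);
        for (Locale locale : locales) {
          Locale.setDefault(locale);
          m.invoke(instance);
          runs++;
        }
      }
    } catch (ReflectiveOperationException e) {
      throw new RuntimeException(e);
    } finally {
      Locale.setDefault(saved);
    }
    return runs;
  }

  public static void main(String[] args) {
    int runs = runLocalized(SampleLocalizedTest.class, Locale.getAvailableLocales());
    System.out.println("localized invocations: " + runs);
  }
}

// Stand-in test class: only the annotated method is repeated per locale.
class SampleLocalizedTest {
  static final List<Locale> seen = new ArrayList<Locale>();

  @TestLocalized
  void recordsDefaultLocale() {
    seen.add(Locale.getDefault());
  }

  void notRepeated() {
    throw new IllegalStateException("should not be called by the runner");
  }
}
```

 The same shape would work for a @TestAllVersions-style marker by looping
 over enum constants instead of locales.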

 One note: It would be good *not* to reformat the whole tests with an
 Eclipse cleanup; just change the lines you modified, and don't reformat everything
 or organize imports and so on. Otherwise it's hard to find out what has changed.

  Allow Junit4 tests in our environment.
  --
 
  Key: LUCENE-2037
  URL: https://issues.apache.org/jira/browse/LUCENE-2037
  Project: Lucene - Java
   Issue Type: Improvement
   Components: Other
 Affects Versions: 3.1
  Environment: Development
 Reporter: Erick Erickson
 Assignee: Erick Erickson
 Priority: Minor
  Fix For: 3.1
 
  Attachments: junit-4.7.jar, LUCENE-2037.patch
 
Original Estimate: 8h
   Remaining Estimate: 8h
 
  Now that we're dropping Java 1.4 compatibility for 3.0, we can
 incorporate Junit4 in testing. Junit3 and junit4 tests can coexist, so no
 tests should have to be rewritten. We should start this for the 3.1 release
 so we can get a clean 3.0 out smoothly.
  It's probably worthwhile to convert a small set of tests as an exemplar.

 --
 This message is automatically generated by JIRA.
 -
 You can reply to this email to add a comment to the issue online.


 -
 To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
 For additional commands, e-mail: java-dev-h...@lucene.apache.org




Re: [jira] Issue Comment Edited: (LUCENE-2037) Allow Junit4 tests in our environment.

2009-11-15 Thread Erick Erickson
That thought occurred to me earlier, but I don't know enough specifics yet.
I intend to find out, though.

Erick


On Sun, Nov 15, 2009 at 8:46 AM, Robert Muir (JIRA) j...@apache.org wrote:


[
 https://issues.apache.org/jira/browse/LUCENE-2037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12778101#action_12778101]

 Robert Muir edited comment on LUCENE-2037 at 11/15/09 1:45 PM:
 ---

 Is there some way to use Junit4 parameterized tests to do this
 LocalizedTestCase-type thing, so we don't have to override runBare()?


  was (Author: rcmuir):
Is there some way to use Junit4 parameterized tests to do this
 LocalizedTestCase-type thing, so we don't have to override runBase()?


  Allow Junit4 tests in our environment.
  --
 
  Key: LUCENE-2037
  URL: https://issues.apache.org/jira/browse/LUCENE-2037
  Project: Lucene - Java
   Issue Type: Improvement
   Components: Other
 Affects Versions: 3.1
  Environment: Development
 Reporter: Erick Erickson
 Assignee: Erick Erickson
 Priority: Minor
  Fix For: 3.1
 
  Attachments: junit-4.7.jar, LUCENE-2037.patch
 
Original Estimate: 8h
   Remaining Estimate: 8h
 
  Now that we're dropping Java 1.4 compatibility for 3.0, we can
 incorporate Junit4 in testing. Junit3 and junit4 tests can coexist, so no
 tests should have to be rewritten. We should start this for the 3.1 release
 so we can get a clean 3.0 out smoothly.
  It's probably worthwhile to convert a small set of tests as an exemplar.





Re: Junit4

2009-11-14 Thread Erick Erickson
Well, the patch is in shape to submit, but looking at the various comments
on the 3.0 release, I guess I should wait until 3.0 is actually out the door
before submitting unless someone just can't wait.

How do we include a new jar file in a patch?

Best
Erick

On Fri, Nov 13, 2009 at 6:25 PM, Erick Erickson erickerick...@gmail.com wrote:

 OK thanks for adding me to the ACL. I'll have it tomorrow sometime. Does
 anyone object to deprecating LuceneTestCase with notations to use
 LuceneTestCaseJ4?

 I tried two approaches, both work. Both allow you to use LuceneTestCaseJ4
 rather than LuceneTestCase as a superclass, with the caveat you have to use
 the proper annotations with the J4 variant.

 The difference is that for one approach, I copied LuceneTestCase to
 LuceneTestCaseJ4 and hacked. The other approach was extracting the meat of
 LuceneTestCase to a common class, and using that class as a member of both
 variants, delegating to avoid code duplication.

 Personally, I think it'll be cleanest to just clone LuceneTestCase and NOT
 extract to common. Eventually LuceneTestCase will fade away, enhancements
 should be made to the J4 variant as needed. But if folks have strong
 opinions, let me know.

 Best
 Erick


 On Fri, Nov 13, 2009 at 5:02 PM, Chris Hostetter hossman_luc...@fucit.org
  wrote:

 : putting too many irons in the fire, especially non-critical ones. I
 don't
 : see a way to assign it to myself, either I'm missing something or I'm
 just
 : underprivileged <G>, so if someone would go ahead and assign it to me
 I'll
 : work on it post 3.0.

 Jira's ACLs prevent issues from being assigned to people who aren't listed
 in the Contributors group.  The policy has been to add people to that
 list (for issue assignment) on request, so i hooked you up.

 (NOTE: if anyone else has issues they're actively working on and would
 like to be flagged as a Contributor in Jira so that the issues can be
 assigned directly to you for tracking purpose, please speak up)



 -Hoss







[jira] Updated: (LUCENE-2037) Allow Junit4 tests in our environment.

2009-11-14 Thread Erick Erickson (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-2037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson updated LUCENE-2037:
---

Attachment: junit-4.7.jar
LUCENE-2037.patch

LuceneTestCaseJ4 should replace LuceneTestCase. There's a bit of overkill here 
to emulate the override of runBare in LuceneTestCase, but I thought it was 
worth it to work out the mechanisms.

We'll need to put the junit 4.7 jar in the right place.

 Allow Junit4 tests in our environment.
 --

 Key: LUCENE-2037
 URL: https://issues.apache.org/jira/browse/LUCENE-2037
 Project: Lucene - Java
  Issue Type: Improvement
  Components: Other
Affects Versions: 3.1
 Environment: Development
Reporter: Erick Erickson
Assignee: Erick Erickson
Priority: Minor
 Fix For: 3.1

 Attachments: junit-4.7.jar, LUCENE-2037.patch

   Original Estimate: 8h
  Remaining Estimate: 8h

 Now that we're dropping Java 1.4 compatibility for 3.0, we can incorporate 
 Junit4 in testing. Junit3 and junit4 tests can coexist, so no tests should 
 have to be rewritten. We should start this for the 3.1 release so we can get 
 a clean 3.0 out smoothly.
 It's probably worthwhile to convert a small set of tests as an exemplar.




[jira] Updated: (LUCENE-1844) Speed up junit tests

2009-11-14 Thread Erick Erickson (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson updated LUCENE-1844:
---

Attachment: (was: LUCENE-1844.patch)

 Speed up junit tests
 

 Key: LUCENE-1844
 URL: https://issues.apache.org/jira/browse/LUCENE-1844
 Project: Lucene - Java
  Issue Type: Improvement
Reporter: Mark Miller
 Attachments: FastCnstScoreQTest.patch, hi_junit_test_runtimes.png


 As Lucene grows, so does the number of JUnit tests. This is obviously a good 
 thing, but it comes with longer and longer test times. Now that we also run 
 back compat tests in a standard test run, this problem is essentially doubled.
 There are some ways this may get better, including running parallel tests. 
 You will need the hardware to fully take advantage, but it should be a nice 
 gain. There is already an issue for this, and Junit 4.6, 4.7 have the 
 beginnings of something we might be able to count on soon. 4.6 was buggy, and 
 4.7 still doesn't come with nice ant integration. Parallel tests will come 
 though.
 Beyond parallel testing, I think we also need to concentrate on keeping our 
 tests lean. We don't want to sacrifice coverage or quality, but I'm sure 
 there is plenty of fat to skim.
 I've started making a list of some of the longer tests - I think with some 
 work we can make our tests much faster - and then with parallelization, I 
 think we could see some really great gains.




[jira] Updated: (LUCENE-1844) Speed up junit tests

2009-11-14 Thread Erick Erickson (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson updated LUCENE-1844:
---

Attachment: LUCENE-1844.patch

This supersedes the first patch I submitted. Apply after LUCENE-2037.

Please render judgment on whether it's really OK for TestBooleanMinShouldMatch
to cut off checking the queries after 100.
 

 Speed up junit tests
 

 Key: LUCENE-1844
 URL: https://issues.apache.org/jira/browse/LUCENE-1844
 Project: Lucene - Java
  Issue Type: Improvement
Reporter: Mark Miller
 Attachments: FastCnstScoreQTest.patch, hi_junit_test_runtimes.png





[jira] Updated: (LUCENE-1844) Speed up junit tests

2009-11-14 Thread Erick Erickson (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson updated LUCENE-1844:
---

Attachment: (was: LUCENE-1844.patch)

 Speed up junit tests
 

 Key: LUCENE-1844
 URL: https://issues.apache.org/jira/browse/LUCENE-1844
 Project: Lucene - Java
  Issue Type: Improvement
Reporter: Mark Miller
 Attachments: FastCnstScoreQTest.patch, hi_junit_test_runtimes.png





[jira] Updated: (LUCENE-1844) Speed up junit tests

2009-11-14 Thread Erick Erickson (JIRA)

 [ 
https://issues.apache.org/jira/browse/LUCENE-1844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Erick Erickson updated LUCENE-1844:
---

Attachment: LUCENE-1844.patch

Saves 3-4 minutes overall. Arbitrarily limited the TestBooleanMinShouldMatch to 
stop checking queries after 100.

I don't see much point in checking the *same* queries again and again in 
TestCustomScoreQuery, so I just moved the check outside the loop.
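
A rough sketch of the two speedups described above, using made-up stand-ins
rather than the actual Lucene test code: hoisting a check whose inputs never
change out of the loop, and capping how many items get verified. The class
HoistedCheckSketch, its methods, and the cap parameter are all hypothetical.

```java
import java.util.Arrays;
import java.util.List;

class HoistedCheckSketch {
  // Counts how often the expensive check ran, to make the saving visible.
  static int checksRun = 0;

  // Stand-in for an expensive verification of a fixed set of queries.
  static void expensiveCheck(List<String> queries) {
    checksRun++;
    for (String q : queries) {
      if (q.length() == 0) throw new IllegalStateException("empty query");
    }
  }

  // Before: the same queries are re-verified on every iteration.
  static void checkInsideLoop(List<String> queries, int iterations) {
    for (int i = 0; i < iterations; i++) {
      expensiveCheck(queries);
    }
  }

  // After: the invariant check runs once; the loop keeps only varying work.
  static void checkOutsideLoop(List<String> queries, int iterations) {
    expensiveCheck(queries);
    for (int i = 0; i < iterations; i++) {
      // per-iteration work that actually differs would go here
    }
  }

  // Cap on verified items: returns how many were actually checked.
  static int checkAtMost(List<String> queries, int cap) {
    int limit = Math.min(queries.size(), cap);
    expensiveCheck(queries.subList(0, limit));
    return limit;
  }

  public static void main(String[] args) {
    List<String> queries = Arrays.asList("a", "b", "c");
    checkInsideLoop(queries, 1000);
    System.out.println("inside loop:  " + checksRun + " checks");
    checksRun = 0;
    checkOutsideLoop(queries, 1000);
    System.out.println("outside loop: " + checksRun + " checks");
  }
}
```

Whether a cap like 100 still gives enough coverage is a judgment call that
the sketch does not answer.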

Apply this patch *after* LUCENE-2037 since TestCustomScoreQuery happens to be 
common to both patches.

Sorry about the noise with the license grant...

 Speed up junit tests
 

 Key: LUCENE-1844
 URL: https://issues.apache.org/jira/browse/LUCENE-1844
 Project: Lucene - Java
  Issue Type: Improvement
Reporter: Mark Miller
 Attachments: FastCnstScoreQTest.patch, hi_junit_test_runtimes.png, 
 LUCENE-1844.patch





Re: Junit4

2009-11-13 Thread Erick Erickson
OK thanks for adding me to the ACL. I'll have it tomorrow sometime. Does
anyone object to deprecating LuceneTestCase with notations to use
LuceneTestCaseJ4?

I tried two approaches, both work. Both allow you to use LuceneTestCaseJ4
rather than LuceneTestCase as a superclass, with the caveat you have to use
the proper annotations with the J4 variant.

The difference is that for one approach, I copied LuceneTestCase to
LuceneTestCaseJ4 and hacked. The other approach was extracting the meat of
LuceneTestCase to a common class, and using that class as a member of both
variants, delegating to avoid code duplication.

Personally, I think it'll be cleanest to just clone LuceneTestCase and NOT
extract to common. Eventually LuceneTestCase will fade away, enhancements
should be made to the J4 variant as needed. But if folks have strong
opinions, let me know.
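
For comparison, the second approach (extract the shared logic and delegate
to it from both base classes) might look roughly like this. All class and
method names below are hypothetical, and JUnit itself is left out so the
sketch stands alone; in real code the J4 variant's methods would carry
JUnit4's @Before/@After annotations.

```java
import java.util.ArrayList;
import java.util.List;

// Shared logic pulled out of the original base class (hypothetical).
class SharedTestSupport {
  private final List<String> events = new ArrayList<String>();

  void before(String label) { events.add("before:" + label); }

  void after(String label) { events.add("after:" + label); }

  List<String> events() { return events; }
}

// JUnit3-style base: lifecycle driven by setUp()/tearDown() overrides.
class LegacyBaseSketch {
  final SharedTestSupport support = new SharedTestSupport();

  public void setUp() { support.before("legacy"); }

  public void tearDown() { support.after("legacy"); }
}

// JUnit4-style base: in real code these would be annotated methods.
class J4BaseSketch {
  final SharedTestSupport support = new SharedTestSupport();

  /* @Before */ public void prepare() { support.before("j4"); }

  /* @After */ public void cleanup() { support.after("j4"); }
}

class DelegationDemo {
  public static void main(String[] args) {
    LegacyBaseSketch legacy = new LegacyBaseSketch();
    legacy.setUp();
    legacy.tearDown();
    System.out.println(legacy.support.events());
  }
}
```

The cost of this shape is an extra class to maintain for as long as both
variants exist, which is the trade-off against simply cloning.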

Best
Erick

On Fri, Nov 13, 2009 at 5:02 PM, Chris Hostetter
hossman_luc...@fucit.org wrote:

 : putting too many irons in the fire, especially non-critical ones. I don't
 : see a way to assign it to myself, either I'm missing something or I'm
 just
 : underprivileged <G>, so if someone would go ahead and assign it to me
 I'll
 : work on it post 3.0.

 Jira's ACLs prevent issues from being assigned to people who aren't listed
 in the Contributors group.  The policy has been to add people to that
 list (for issue assignment) on request, so i hooked you up.

 (NOTE: if anyone else has issues they're actively working on and would
 like to be flagged as a Contributor in Jira so that the issues can be
 assigned directly to you for tracking purpose, please speak up)



 -Hoss






Re: [jira] Commented: (LUCENE-1257) Port to Java5

2009-11-10 Thread Erick Erickson
About formatting. I know the "how to contribute" section of the Wiki warns
against gratuitous reformatting, but if *someone* with commit privileges
wanted to, they could format an entire tree in Eclipse from the context
menu of, say, the contrib directory. It'd have to be coordinated for a
moment when not too many others were editing the code...

I mention this since we're doing a bunch of non-functional changes for the
3.0 release, and it might be a reasonable thing to do so future commits were
easier to compare, at least after the reformatting was done. As long as
we're all using the same formatting, it might be worthwhile.

Somebody mentioned uploading a new codestyle.xml for Eclipse. Were there any
changes or is this just getting the one from SOLR up there? Because I'm
using IntelliJ.

Erick

On Tue, Nov 10, 2009 at 7:08 PM, Uwe Schindler (JIRA) j...@apache.org wrote:


[
 https://issues.apache.org/jira/browse/LUCENE-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12776184#action_12776184]

 Uwe Schindler commented on LUCENE-1257:
 ---

 Kay Kay: We only have @SuppressWarnings in a few places in core, marked with
 a big TODO (to be resolved when flex indexing comes). The intended
 @SuppressWarnings are only in places where generic arrays are created.
 There is no way to fix this (see Sun Generics Howto).
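
 To illustrate the generic-array case: Java does not allow
 `new List<String>[n]`, so the usual workaround is an unchecked cast from a
 raw array, which is where a narrowly scoped @SuppressWarnings ends up. This
 is a generic sketch, not code from Lucene; GenericArraySketch and
 newBuckets are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

class GenericArraySketch {
  // `new List<String>[n]` does not compile, so a raw array is created and
  // cast. The cast is unchecked, hence the suppression on this method only.
  @SuppressWarnings("unchecked")
  static List<String>[] newBuckets(int n) {
    List<String>[] buckets = (List<String>[]) new List[n];
    for (int i = 0; i < n; i++) {
      buckets[i] = new ArrayList<String>();
    }
    return buckets;
  }

  public static void main(String[] args) {
    List<String>[] buckets = newBuckets(4);
    buckets[2].add("term");
    System.out.println(buckets.length + " buckets, bucket 2 = " + buckets[2]);
  }
}
```

 Keeping the annotation on the smallest possible scope means other
 unchecked operations elsewhere still produce warnings.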

  Port to Java5
  -
 
  Key: LUCENE-1257
  URL: https://issues.apache.org/jira/browse/LUCENE-1257
  Project: Lucene - Java
   Issue Type: Improvement
   Components: Analysis, Examples, Index, Other, Query/Scoring,
 QueryParser, Search, Store, Term Vectors
 Affects Versions: 3.0
 Reporter: Cédric Champeau
 Assignee: Uwe Schindler
 Priority: Minor
  Fix For: 3.0
 
  Attachments: instantiated_fieldable.patch,
 LUCENE-1257-BooleanQuery.patch, LUCENE-1257-BooleanScorer_2.patch,
 LUCENE-1257-BufferedDeletes_DocumentsWriter.patch,
 LUCENE-1257-CheckIndex.patch, LUCENE-1257-CloseableThreadLocal.patch,
 LUCENE-1257-CompoundFileReaderWriter.patch,
 LUCENE-1257-ConcurrentMergeScheduler.patch,
 LUCENE-1257-DirectoryReader.patch,
 LUCENE-1257-DisjunctionMaxQuery-more_type_safety.patch,
 LUCENE-1257-DocFieldProcessorPerThread.patch, LUCENE-1257-Document.patch,
 LUCENE-1257-FieldCacheImpl.patch, LUCENE-1257-FieldCacheRangeFilter.patch,
 LUCENE-1257-IndexDeleter.patch,
 LUCENE-1257-IndexDeletionPolicy_IndexFileDeleter.patch,
 LUCENE-1257-iw.patch, LUCENE-1257-MTQWF.patch,
 LUCENE-1257-NormalizeCharMap.patch, LUCENE-1257-o.a.l.util.patch,
 LUCENE-1257-org_apache_lucene_document.patch,
 LUCENE-1257-org_apache_lucene_document.patch,
 LUCENE-1257-org_apache_lucene_document.patch,
 LUCENE-1257-SegmentInfos.patch, LUCENE-1257-StringBuffer.patch,
 LUCENE-1257-StringBuffer.patch, LUCENE-1257-StringBuffer.patch,
 LUCENE-1257-TopDocsCollector.patch, LUCENE-1257-WordListLoader.patch,
 LUCENE-1257_analysis.patch, LUCENE-1257_BooleanFilter_Generics.patch,
 LUCENE-1257_contrib_benchmark.patch, LUCENE-1257_contrib_benchmark_2.patch,
 LUCENE-1257_contrib_highlighting.patch, LUCENE-1257_contrib_memory.patch,
 LUCENE-1257_contrib_misc.patch, LUCENE-1257_contrib_smartcn.patch,
 LUCENE-1257_javacc_upgrade.patch, LUCENE-1257_lucil.patch,
 LUCENE-1257_lucli.patch, LUCENE-1257_messages.patch,
 LUCENE-1257_more_unnecessary_casts.patch,
 LUCENE-1257_MultiFieldQueryParser.patch,
 LUCENE-1257_o.a.l.queryParser.patch, LUCENE-1257_o.a.l.store.patch,
 LUCENE-1257_o_a_l_demo.patch, LUCENE-1257_o_a_l_index_test.patch,
 LUCENE-1257_o_a_l_index_test.patch, LUCENE-1257_o_a_l_search.patch,
 LUCENE-1257_o_a_l_search_spans.patch,
 LUCENE-1257_org_apache_lucene_index.patch,
 LUCENE-1257_org_apache_lucene_index.patch,
 LUCENE-1257_precendence_parser.patch, LUCENE-1257_queryParser_jj.patch,
 LUCENE-1257_swing_wikipedia_wordnet_xmlqp.patch,
 LUCENE-1257_unnecessary_casts.patch, LUCENE-1257_unnnecessary_casts_2.patch,
 lucene1257surround1.patch, lucene1257surround1.patch,
 shinglematrixfilter_generified.patch
 
 
  For my needs I've updated Lucene so that it uses Java 5 constructs. I
 know Java 5 migration had been planned for 2.1 someday in the past, but
 don't know when it is planned now. This patch against the trunk includes :
  - most obvious generics usage (there are tons of usages of sets, ...
 Those which are commonly used have been generified)
  - PriorityQueue generification
  - replacement of indexed for loops with for each constructs
  - removal of unnecessary unboxing
  The code is in my opinion much more readable with those features (you
 actually *know* what is stored in collections when reading the code, without
 needing to look up field definitions every time) and it simplifies many
 algorithms.
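
 A small before/after sketch of the kinds of changes listed above (generics,
 for-each loops, dropping explicit unboxing). It is a generic illustration
 and not taken from the patch; Java5PortSketch and its methods are
 hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

class Java5PortSketch {
  // Pre-Java-5 style: raw List, indexed loop, cast plus explicit intValue().
  @SuppressWarnings("rawtypes")
  static int sumLegacy(List values) {
    int sum = 0;
    for (int i = 0; i < values.size(); i++) {
      sum += ((Integer) values.get(i)).intValue();
    }
    return sum;
  }

  // Java 5 style: the element type is visible in the signature, the loop
  // is a for-each, and unboxing is implicit.
  static int sumJava5(List<Integer> values) {
    int sum = 0;
    for (int v : values) {
      sum += v;
    }
    return sum;
  }

  public static void main(String[] args) {
    List<Integer> values = new ArrayList<Integer>();
    values.add(1);
    values.add(2);
    values.add(3);
    System.out.println(sumLegacy(values) + " " + sumJava5(values));
  }
}
```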
  Note that this patch also includes an interface for the Query class. This
 has been done for my company's needs for building custom Query 

Re: [jira] Commented: (LUCENE-1257) Port to Java5

2009-11-10 Thread Erick Erickson
And here I was hoping to make Uwe stay up for *days* without sleep finding
all the gotchas <G>.

Thanks for the response. I'll see if I can update my IntelliJ codestyle
appropriately, but probably won't get there 'til this weekend. I'll upload
it to the Wiki or attach it to a Jira if nobody beats me to it.

Erick


On Tue, Nov 10, 2009 at 7:37 PM, Robert Muir rcm...@gmail.com wrote:

 this was similar to the discussion we had at apachecon, where i wanted
 to create a jira issue as "Uwe Schindler<some invisible unicode space>" and
 suggest a patch to reformat all of contrib!

 (would never attribute such a thing to my name but this formatting issue
 consistently gets in my way)


 On Tue, Nov 10, 2009 at 7:29 PM, Uwe Schindler u...@thetaphi.de wrote:

  Yes this one is new, but it is almost the default Java 1.5 style with
 tabs=2chars and the modified generics formatting.



 I know about the reformatting method in Eclipse, but that would break more
 patches now :-( (a lot are already broken).

 -
 Uwe Schindler
 H.-H.-Meier-Allee 63, D-28213 Bremen
 http://www.thetaphi.de
 eMail: u...@thetaphi.de
   --

 *From:* Erick Erickson [mailto:erickerick...@gmail.com]
 *Sent:* Wednesday, November 11, 2009 1:27 AM
 *To:* java-dev@lucene.apache.org
 *Subject:* Re: [jira] Commented: (LUCENE-1257) Port to Java5



 About formatting. I know the "how to contribute" section of the Wiki warns
 against gratuitous reformatting, but if *someone* with commit privileges
 wanted to, they could format an entire tree in Eclipse from the context
 menu of, say, the contrib directory. It'd have to be coordinated for a
 moment when not too many others were editing the code...

 I mention this since we're doing a bunch of non-functional changes for the
 3.0 release, and it might be a reasonable thing to do so future commits were
 easier to compare, at least after the reformatting was done. As long as
 we're all using the same formatting, it might be worthwhile.

 Somebody mentioned uploading a new codestyle.xml for Eclipse. Were there
 any changes or is this just getting the one from SOLR up there? Because I'm
 using IntelliJ

 Erick

 On Tue, Nov 10, 2009 at 7:08 PM, Uwe Schindler (JIRA) j...@apache.org
 wrote:


[
 https://issues.apache.org/jira/browse/LUCENE-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12776184#action_12776184]


 Uwe Schindler commented on LUCENE-1257:
 ---

 Kay Kay: We only have @SuppressWarnings in a few places in core, marked with
 a big TODO (to be resolved when flex indexing comes). The intended
 @SuppressWarnings are only in places where generic arrays are created.
 There is no way to fix this (see Sun Generics Howto).


  Port to Java5
  -
 
  Key: LUCENE-1257
  URL: https://issues.apache.org/jira/browse/LUCENE-1257
  Project: Lucene - Java
   Issue Type: Improvement
   Components: Analysis, Examples, Index, Other, Query/Scoring,
 QueryParser, Search, Store, Term Vectors
 Affects Versions: 3.0
 Reporter: Cédric Champeau
 Assignee: Uwe Schindler
 Priority: Minor
  Fix For: 3.0
 
  Attachments: instantiated_fieldable.patch,
 LUCENE-1257-BooleanQuery.patch, LUCENE-1257-BooleanScorer_2.patch,
 LUCENE-1257-BufferedDeletes_DocumentsWriter.patch,
 LUCENE-1257-CheckIndex.patch, LUCENE-1257-CloseableThreadLocal.patch,
 LUCENE-1257-CompoundFileReaderWriter.patch,
 LUCENE-1257-ConcurrentMergeScheduler.patch,
 LUCENE-1257-DirectoryReader.patch,
 LUCENE-1257-DisjunctionMaxQuery-more_type_safety.patch,
 LUCENE-1257-DocFieldProcessorPerThread.patch, LUCENE-1257-Document.patch,
 LUCENE-1257-FieldCacheImpl.patch, LUCENE-1257-FieldCacheRangeFilter.patch,
 LUCENE-1257-IndexDeleter.patch,
 LUCENE-1257-IndexDeletionPolicy_IndexFileDeleter.patch,
 LUCENE-1257-iw.patch, LUCENE-1257-MTQWF.patch,
 LUCENE-1257-NormalizeCharMap.patch, LUCENE-1257-o.a.l.util.patch,
 LUCENE-1257-org_apache_lucene_document.patch,
 LUCENE-1257-org_apache_lucene_document.patch,
 LUCENE-1257-org_apache_lucene_document.patch,
 LUCENE-1257-SegmentInfos.patch, LUCENE-1257-StringBuffer.patch,
 LUCENE-1257-StringBuffer.patch, LUCENE-1257-StringBuffer.patch,
 LUCENE-1257-TopDocsCollector.patch, LUCENE-1257-WordListLoader.patch,
 LUCENE-1257_analysis.patch, LUCENE-1257_BooleanFilter_Generics.patch,
 LUCENE-1257_contrib_benchmark.patch, LUCENE-1257_contrib_benchmark_2.patch,
 LUCENE-1257_contrib_highlighting.patch, LUCENE-1257_contrib_memory.patch,
 LUCENE-1257_contrib_misc.patch, LUCENE-1257_contrib_smartcn.patch,
 LUCENE-1257_javacc_upgrade.patch, LUCENE-1257_lucil.patch,
 LUCENE-1257_lucli.patch, LUCENE-1257_messages.patch,
 LUCENE-1257_more_unnecessary_casts.patch,
 LUCENE-1257_MultiFieldQueryParser.patch,
 LUCENE-1257_o.a.l.queryParser.patch, LUCENE-1257_o.a.l.store.patch,
 LUCENE-1257_o_a_l_demo.patch, LUCENE

Re: Lucene - Text Classification.

2009-11-09 Thread Erick Erickson
Please re-post this question on the Lucene users' list; this list is
intended for development discussions.

Best
Erick

On Mon, Nov 9, 2009 at 10:02 AM, lucenenew mitesh.jes...@yahoo.com wrote:


 i want to classify sentences stored as strings to a bunch of keywords
 related
 to a certain category.

 so i will have 10 strings which will be a sentence long. and i will want to
 compare each string to a set of 30 keywords stored somewhere, and then
 compare with another set of 30 keywords, so on.

 i want to rank each string based on the number of times it matches a set of
 keywords. so basically i want to categorize each sentence.

 is this possible with lucene, or would any other approach be more
 efficient.

 will this process take long? in terms of speed of program.

 and what tools would i need?

 any help would be great.

 thanks.
 --
 View this message in context:
 http://old.nabble.com/Lucene---Text-Classification.-tp26267794p26267794.html
 Sent from the Lucene - Java Developer mailing list archive at Nabble.com.





