[jira] [Commented] (LUCENE-3041) Support Query Visiting / Walking
[ https://issues.apache.org/jira/browse/LUCENE-3041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13027279#comment-13027279 ] Chris Male commented on LUCENE-3041:

To follow up on Earwin's comments, I'm going to do the following:

- Leave Query#rewrite out of the walking process. As Earwin said, rewrite provides vital query optimization / conversion to primitive runnable queries. Having this method on Query is a good idea since user Queries can simply implement this method and move on.
- In a separate issue, add a RewriteState-like concept which can be used for caching rewrites, as suggested by Simon. This will give a considerable performance improvement for people doing lots of repeated FuzzyQuerys, for example.
- Change my processing concept into a generic Walker system, which can be used for lots of things in Lucene. Users can implement this Walker to do whatever they want (maybe we can pry Earwin's walker-based highlighter from him? :D)
- Overload IndexSearcher's methods to support passing in a Walker. We need this, instead of simply having the Walker external, because we really want to support per-segment walking.

I'll make a patch for the stuff related to this issue shortly, and spin off the RewriteState stuff.

> Support Query Visiting / Walking
> --------------------------------
>
> Key: LUCENE-3041
> URL: https://issues.apache.org/jira/browse/LUCENE-3041
> Project: Lucene - Java
> Issue Type: Improvement
> Components: Search
> Reporter: Chris Male
> Priority: Minor
> Attachments: LUCENE-3041.patch, LUCENE-3041.patch, LUCENE-3041.patch, LUCENE-3041.patch
>
> Out of the discussion in LUCENE-2868, it could be useful to add a generic Query Visitor / Walker that could be used for more advanced rewriting, optimizations or anything that requires state to be stored as each Query is visited.
> We could keep the interface very simple:
>
> {code}
> public interface QueryVisitor {
>   Query visit(Query query);
> }
> {code}
>
> and then use a reflection-based visitor like Earwin suggested, which would allow implementors to provide visit methods for just the Querys that they are interested in.

--
This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
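As a rough illustration of the reflection-based dispatch suggested here, this is a plain-Java sketch, not a Lucene API; all class names (ReflectiveWalkerSketch, CountingVisitor, the mini Query hierarchy) are made up for the example. The walker looks up the most specific visit(ConcreteQuery) overload on the visitor and leaves queries it has no overload for untouched:

```java
import java.lang.reflect.Method;

public class ReflectiveWalkerSketch {
    // Minimal stand-ins for Lucene's query classes.
    static class Query {}
    static class TermQuery extends Query {}
    static class FuzzyQuery extends Query {}

    // A visitor that only cares about TermQuery; it declares no visit(Query).
    static class CountingVisitor {
        int termQueriesSeen = 0;
        public Query visit(TermQuery q) { termQueriesSeen++; return q; }
    }

    static class ReflectiveWalker {
        private final Object visitor;
        ReflectiveWalker(Object visitor) { this.visitor = visitor; }

        Query walk(Query query) {
            // Walk up the class hierarchy until a matching visit(...) is found.
            for (Class<?> c = query.getClass(); c != Object.class; c = c.getSuperclass()) {
                try {
                    Method m = visitor.getClass().getMethod("visit", c);
                    return (Query) m.invoke(visitor, query);
                } catch (NoSuchMethodException e) {
                    // no overload for this type; try the superclass
                } catch (ReflectiveOperationException e) {
                    throw new RuntimeException(e);
                }
            }
            return query; // visitor is not interested in this query type
        }
    }
}
```

Because unmatched query types fall through unchanged, a visitor only has to declare methods for the query classes it actually wants to handle, which is the appeal of this approach.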
IndexSearcher can't find documents by field value
Hi Friends,

I'm using Lucene to index a file in this format: each line contains 4 elements separated by spaces. Because I want to retrieve any line containing specific text in a specific part, I add each line to the index as a separate document with 4 fields, which I named A, B, C and D. This is the code I use to index my file:

try {
    File file = new File("e://data3");
    BufferedReader reader = new BufferedReader(new FileReader(file));
    IndexWriter writer = new IndexWriter(indexDirectory, new SimpleAnalyzer(), true);
    writer.setUseCompoundFile(true);
    String line;
    while ((line = reader.readLine()) != null) {
        String[] index = line.split(" ");
        Document document = new Document();
        document.add(new Field("A", index[0], Field.Store.YES, Field.Index.UN_TOKENIZED));
        document.add(new Field("B", index[1], Field.Store.YES, Field.Index.UN_TOKENIZED));
        document.add(new Field("C", index[2], Field.Store.YES, Field.Index.UN_TOKENIZED));
        document.add(new Field("D", index[3], Field.Store.YES, Field.Index.UN_TOKENIZED));
        writer.addDocument(document);
        System.out.println(writer.docCount());
    }
    writer.close();
} catch (Exception e) {
    e.printStackTrace();
}

But when I try to search this index for a value that exists in, for example, field A, it fails to find the document (line). My search code is as follows:

try {
    IndexSearcher is = new IndexSearcher(FSDirectory.getDirectory(indexDirectory, false));
    Query q = new TermQuery(new Term("A", "hello"));
    Hits hits = is.search(q);
    for (int i = 0; i < hits.length(); i++) {
        Document doc = hits.doc(i);
        System.out.println("A: " + doc.get("A") + " B: " + doc.get("B") + " C: " + doc.get("C") + " D: " + doc.get("D"));
    }
} catch (Exception e) {
    e.printStackTrace();
}

Kindly let me know if there is any error in my code. Thanks in advance.
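One property of this setup that often causes misses: an UN_TOKENIZED field indexes the raw value as a single term, and TermQuery looks that term up exactly, case and all, so "hello" will not match an indexed "Hello". A plain-Java sketch of those lookup semantics (not Lucene code; ExactTermSketch is illustrative):

```java
import java.util.HashMap;
import java.util.Map;

public class ExactTermSketch {
    // An UN_TOKENIZED field behaves like an exact-match term dictionary.
    static final Map<String, Integer> termToDoc = new HashMap<>();

    static boolean matches(String queryTerm) {
        // TermQuery compares the query term byte-for-byte with the indexed term.
        return termToDoc.containsKey(queryTerm);
    }

    public static void main(String[] args) {
        termToDoc.put("Hello", 1);            // indexed exactly as it appeared in the file
        System.out.println(matches("hello")); // false: case differs, no hit
        System.out.println(matches("Hello")); // true: exact term required
    }
}
```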
[jira] [Issue Comment Edited] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13027245#comment-13027245 ] Lance Norskog edited comment on SOLR-445 at 4/30/11 12:18 AM: -- If the DIH semantics cover all of the use cases, please follow that model: behavior, names, etc. It will be much easier on users. was (Author: lancenorskog): If the DIH semantics cover all of the use cases, please follow that model: behavior, names, etc. It will be much easier on developers. > Update Handlers abort with bad documents > > > Key: SOLR-445 > URL: https://issues.apache.org/jira/browse/SOLR-445 > Project: Solr > Issue Type: Bug > Components: update >Affects Versions: 1.3 >Reporter: Will Johnson >Assignee: Grant Ingersoll > Fix For: Next > > Attachments: SOLR-445-3_x.patch, SOLR-445.patch, SOLR-445.patch, > SOLR-445.patch, SOLR-445.patch, SOLR-445_3x.patch, solr-445.xml > > > Has anyone run into the problem of handling bad documents / failures mid > batch. Ie: > > > 1 > > > 2 > I_AM_A_BAD_DATE > > > 3 > > > Right now solr adds the first doc and then aborts. It would seem like it > should either fail the entire batch or log a message/return a code and then > continue on to add doc 3. Option 1 would seem to be much harder to > accomplish and possibly require more memory while Option 2 would require more > information to come back from the API. I'm about to dig into this but I > thought I'd ask to see if anyone had any suggestions, thoughts or comments. > -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-445) Update Handlers abort with bad documents
[ https://issues.apache.org/jira/browse/SOLR-445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13027245#comment-13027245 ] Lance Norskog commented on SOLR-445: If the DIH semantics cover all of the use cases, please follow that model: behavior, names, etc. It will be much easier on developers. > Update Handlers abort with bad documents > > > Key: SOLR-445 > URL: https://issues.apache.org/jira/browse/SOLR-445 > Project: Solr > Issue Type: Bug > Components: update >Affects Versions: 1.3 >Reporter: Will Johnson >Assignee: Grant Ingersoll > Fix For: Next > > Attachments: SOLR-445-3_x.patch, SOLR-445.patch, SOLR-445.patch, > SOLR-445.patch, SOLR-445.patch, SOLR-445_3x.patch, solr-445.xml > > > Has anyone run into the problem of handling bad documents / failures mid > batch. Ie: > > > 1 > > > 2 > I_AM_A_BAD_DATE > > > 3 > > > Right now solr adds the first doc and then aborts. It would seem like it > should either fail the entire batch or log a message/return a code and then > continue on to add doc 3. Option 1 would seem to be much harder to > accomplish and possibly require more memory while Option 2 would require more > information to come back from the API. I'm about to dig into this but I > thought I'd ask to see if anyone had any suggestions, thoughts or comments. > -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
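The "Option 2" behaviour discussed in this issue (log a message for the bad document, continue with the rest of the batch, and return the failures to the caller) can be sketched in plain Java. BatchAddSketch and its names are illustrative, not Solr's API:

```java
import java.util.ArrayList;
import java.util.List;

public class BatchAddSketch {
    // Per-batch outcome: which docs made it in, and what failed.
    public static class BatchResult {
        public final List<Integer> added = new ArrayList<>();
        public final List<String> errors = new ArrayList<>();
    }

    // Stand-in for whatever actually indexes a single document.
    interface DocConsumer { void add(String doc) throws Exception; }

    public static BatchResult addAll(List<String> docs, DocConsumer handler) {
        BatchResult result = new BatchResult();
        for (int i = 0; i < docs.size(); i++) {
            try {
                handler.add(docs.get(i));
                result.added.add(i);
            } catch (Exception e) {
                // Record the failure and keep going instead of aborting mid-batch.
                result.errors.add("doc " + i + ": " + e.getMessage());
            }
        }
        return result;
    }
}
```

This is the trade-off named in the description: continuing requires more information to come back from the API (the error list), whereas fail-the-whole-batch requires buffering everything first.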
[jira] [Commented] (LUCENE-3023) Land DWPT on trunk
[ https://issues.apache.org/jira/browse/LUCENE-3023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13027207#comment-13027207 ] Michael McCandless commented on LUCENE-3023: +1 to commit! Great work everyone :) > Land DWPT on trunk > -- > > Key: LUCENE-3023 > URL: https://issues.apache.org/jira/browse/LUCENE-3023 > Project: Lucene - Java > Issue Type: Task >Affects Versions: CSF branch, 4.0 >Reporter: Simon Willnauer >Assignee: Simon Willnauer > Fix For: 4.0 > > Attachments: LUCENE-3023-svn-diff.patch, > LUCENE-3023-ws-changes.patch, LUCENE-3023.patch, LUCENE-3023.patch, > LUCENE-3023.patch, LUCENE-3023.patch, LUCENE-3023_CHANGES.patch, > LUCENE-3023_CHANGES.patch, LUCENE-3023_iw_iwc_jdoc.patch, > LUCENE-3023_simonw_review.patch, LUCENE-3023_svndiff.patch, > LUCENE-3023_svndiff.patch, diffMccand.py, diffSources.patch, > diffSources.patch, realtime-TestAddIndexes-3.txt, > realtime-TestAddIndexes-5.txt, > realtime-TestIndexWriterExceptions-assert-6.txt, > realtime-TestIndexWriterExceptions-npe-1.txt, > realtime-TestIndexWriterExceptions-npe-2.txt, > realtime-TestIndexWriterExceptions-npe-4.txt, > realtime-TestOmitTf-corrupt-0.txt > > > With LUCENE-2956 we have resolved the last remaining issue for LUCENE-2324 so > we can proceed landing the DWPT development on trunk soon. I think one of the > bigger issues here is to make sure that all JavaDocs for IW etc. are still > correct though. I will start going through that first. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: [Lucene.Net] new structure
I was thinking along the lines it was for all executables. i'll put them in the build folder then. On Fri, Apr 29, 2011 at 4:15 PM, Troy Howard wrote: > Only thing I would suggest is keeping .cmd/bin files in the build > folder. The bin folder is meant for the compiled artifacts. > > Otherwise, everything else sounds great. > > Thanks, > Troy > > > On Fri, Apr 29, 2011 at 1:08 PM, Michael Herndon < > mhern...@wickedsoftware.net> wrote: > > > If you think it would be beneficial to have the scripts in the branch, I > > can > > do that. > > > > On Fri, Apr 29, 2011 at 3:50 PM, Digy wrote: > > > > > Would you add the same stuff to 2.9.4g branch too? > > > > > > DIGY > > > > > > -Original Message- > > > From: Michael Herndon [mailto:mhern...@wickedsoftware.net] > > > Sent: Friday, April 29, 2011 10:28 PM > > > To: lucene-net-...@lucene.apache.org > > > Subject: Re: [Lucene.Net] new structure > > > > > > I'm going to move ahead with this stuff this weekend unless anyone > > objects. > > > > > > On Sun, Apr 24, 2011 at 4:42 PM, Michael Herndon < > > > mhern...@wickedsoftware.net> wrote: > > > > > > > if you celebrate Easter, Happy Easter, if not, then Happy > > lucene.netclean > > > > up day. > > > > > > > > > > > > couple of questions. would it be cool if I can add a .gitignore to > the > > > root > > > > folder? > > > > > > > > also would it upset anyone if I add .cmd & .sh files to the /bin > > folder > > > > and .xml/.build files to the /build folder ? > > > > > > > > and sand castle and shfb to the /lib folder? > > > > > > > > - Michael > > > > > > > > > > > > On Sat, Apr 23, 2011 at 7:57 AM, Digy wrote: > > > > > > > >> Everything seems to be OK. > > > >> +1 for removing old directory structure. 
> > > >> > > > >> Thanks Troy > > > >> > > > >> DIGY > > > >> > > > >> -Original Message- > > > >> From: Troy Howard [mailto:thowar...@gmail.com] > > > >> Sent: Saturday, April 23, 2011 3:07 AM > > > >> To: lucene-net-...@lucene.apache.org > > > >> Subject: Re: [Lucene.Net] new structure > > > >> > > > >> I guess by 'today' I meant 'In about 6 days'. > > > >> > > > >> Anyhow, I completed the commit of the new directory structure.. I > did > > > not > > > >> delete the OLD directory structure, because they can live > > side-by-side. > > > >> Also, please note that I only created vs2010 solutions and upgraded > > the > > > >> projects to same. > > > >> > > > >> Please pull down the latest revision and validate these changes. If > > all > > > >> goes > > > >> well, I'll delete the old directory structure (everything under the > > 'C#' > > > >> directory). > > > >> > > > >> Thanks, > > > >> Troy > > > >> > > > >> On Sat, Apr 16, 2011 at 3:42 PM, Troy Howard > > > wrote: > > > >> > > > >> > Apologize. I got a bit derailed. Will be commiting today. > > > >> > On Apr 16, 2011 2:20 PM, "Prescott Nasser" > > > > >> wrote: > > > >> > > > > > >> > > > > > >> > > Hey Troy any status update on the new structure? I'm hesistant > to > > do > > > >> > updates since I know you're going to be modifying it all shortly > > > >> > > > > > >> > > ~P > > > >> > > > > > >> > > > > >> > > > >> > > > > > > > > > > > > >
[jira] [Commented] (LUCENE-3053) improve test coverage for Multi*
[ https://issues.apache.org/jira/browse/LUCENE-3053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13027204#comment-13027204 ] Michael McCandless commented on LUCENE-3053: Patch looks good Robert -- make our tests eviler!! > improve test coverage for Multi* > > > Key: LUCENE-3053 > URL: https://issues.apache.org/jira/browse/LUCENE-3053 > Project: Lucene - Java > Issue Type: Task >Reporter: Robert Muir >Assignee: Robert Muir > Fix For: 4.0 > > Attachments: LUCENE-3053.patch, LUCENE-3053.patch, LUCENE-3053.patch > > > It seems like an easy win that when the test calls newSearcher(), > it should sometimes wrap the reader with a SlowMultiReaderWrapper. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Issue Comment Edited] (LUCENE-3055) LUCENE-2372, LUCENE-2389 made it impossible to subclass core analyzers
[ https://issues.apache.org/jira/browse/LUCENE-3055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13027179#comment-13027179 ] Uwe Schindler edited comment on LUCENE-3055 at 4/29/11 8:51 PM: {quote} >From my perspective the most important reason is to avoid a huge performance >trap: previously if you subclassed one of these analyzers, override >tokenStream(), and added SpecialFilter for example, most of the time users >would actually slow down indexing, because now reusableTokenStream() cannot be >used by the indexer. {quote} Additionally, exactly this special case (overwriting one of the methods) was the biggest problem, leading to ugly reflection based checks in Lucene 3.0: In 3.0 StandardAnalyzer correctly implemented both tokenStream() and reuseableTokenStream(). As soon as one subclass only overrided tokenStream(), but the indexer still calling reuseableTokenStream() the changes were not even used, leading to lots of bug reports. Because of this, a reflection based backwards hack was done in 3.0 (see o.a.l.util.VirtualMethod class to make this easier), that prevented the indexer from calling reuseableTokenStream if a subclass suddenly overwrote only one of the methods. With moving forward in 3.1, these backwards hacks even got heavier (e.g. changes in TokenStreams, new base class ReuseableAnalyzerBase,...), so the only solution was to enforce the decorator pattern. The above example by Robert is the correct way to implement your "factory" of TokenStreams. Everything else like subclassing StandardAnalyzer is ugly as it hides what you are really doing. The above pattern does exactly what also Solr's Schema does: You have to explicitely list all your components, making it clear what your TokenStreams are doing. 
Trust me, the above example is shorter than subclassing previous StandardAnalyzer completely (both tokenStream and reuseableTokenStream) and is showing like solrschema.xml what your Analyzer looks like (no hidden stuff in superfactories,...) was (Author: thetaphi): {quote} >From my perspective the most important reason is to avoid a huge performance >trap: previously if you subclassed one of these analyzers, override >tokenStream(), and added SpecialFilter for example, most of the time users >would actually slow down indexing, because now reusableTokenStream() cannot be >used by the indexer. {quote} Additionally, exactly this special case (overwriting one of the methods) was the biggest problem, leading to ugly reflection based checks in Lucene 3.0: In 3.0 StandardAnalyzer correctly implemented both tokenStream() and reuseableTokenStream(). As soon as one subclass only overrided tokenStream(), but the indexer still calling reuseableTokenStream() the changes were not even used, leading to lots of bug reports. Because of this, a reflection based backwards hack was done in 3.0 (see o.a.l.util.VirtualMethod class to make this easier), that prevented the indexer from calling reuseableTokenStream if a subclass suddenly overwrote only one of the methods. With moving forward in 3.1, these backwards hacks even got heavier (e.g. changes in TokenStreams, new base class ReuseableAnalyzerBase,...), so the only solution was to enforce the decorator pattern. The above example by Robert is the correct way to implement you "factory" of TokenStreams. Everything else like subclassing StandardAnalyzer is ugly as it hides what you are really doing. The above pattern does exactly what also Solr's Schemadoes: You have to explicitely list all your components, making it clear what your TokenStreams are doing. 
Trust me, the above example is shorter than subclassing previous StandardAnalyzer completely (both tokenStream and reuseableTokenStream) and is showing like solrschema.xml what your Analyzer looks like (no hidden stuff in superfactories,...) > LUCENE-2372, LUCENE-2389 made it impossible to subclass core analyzers > -- > > Key: LUCENE-3055 > URL: https://issues.apache.org/jira/browse/LUCENE-3055 > Project: Lucene - Java > Issue Type: Bug > Components: Analysis >Affects Versions: 3.1 >Reporter: Ian Soboroff > > LUCENE-2372 and LUCENE-2389 marked all analyzers as final. This makes > ReusableAnalyzerBase useless, and makes it impossible to subclass e.g. > StandardAnalyzer to make a small modification e.g. to tokenStream(). These > issues don't indicate a new method of doing this. The issues don't give a > reason except for design considerations, which seems a poor reason to make a > backward-incompatible change -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira --
[jira] [Commented] (LUCENE-3055) LUCENE-2372, LUCENE-2389 made it impossible to subclass core analyzers
[ https://issues.apache.org/jira/browse/LUCENE-3055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13027179#comment-13027179 ] Uwe Schindler commented on LUCENE-3055:

{quote}
From my perspective the most important reason is to avoid a huge performance trap: previously if you subclassed one of these analyzers, overrode tokenStream(), and added SpecialFilter for example, most of the time users would actually slow down indexing, because now reusableTokenStream() cannot be used by the indexer.
{quote}

Additionally, exactly this special case (overriding one of the methods) was the biggest problem, leading to ugly reflection-based checks in Lucene 3.0: in 3.0, StandardAnalyzer correctly implemented both tokenStream() and reusableTokenStream(). As soon as a subclass overrode only tokenStream(), while the indexer was still calling reusableTokenStream(), the changes were not even used, leading to lots of bug reports. Because of this, a reflection-based backwards hack was added in 3.0 (see the o.a.l.util.VirtualMethod class, which makes this easier) that prevented the indexer from calling reusableTokenStream() if a subclass overrode only one of the methods. Moving forward to 3.1, these backwards hacks got even heavier (e.g. changes in TokenStreams, the new base class ReusableAnalyzerBase, ...), so the only solution was to enforce the decorator pattern.

The above example by Robert is the correct way to implement your "factory" of TokenStreams. Everything else, like subclassing StandardAnalyzer, is ugly as it hides what you are really doing. The above pattern does exactly what Solr's schema does: you have to explicitly list all your components, making it clear what your TokenStreams are doing.
Trust me, the above example is shorter than completely subclassing the previous StandardAnalyzer (both tokenStream() and reusableTokenStream()) and, like Solr's schema.xml, it shows what your Analyzer looks like (no hidden stuff in super-factories, ...)

> LUCENE-2372, LUCENE-2389 made it impossible to subclass core analyzers
> ----------------------------------------------------------------------
>
> Key: LUCENE-3055
> URL: https://issues.apache.org/jira/browse/LUCENE-3055
> Project: Lucene - Java
> Issue Type: Bug
> Components: Analysis
> Affects Versions: 3.1
> Reporter: Ian Soboroff
>
> LUCENE-2372 and LUCENE-2389 marked all analyzers as final. This makes ReusableAnalyzerBase useless, and makes it impossible to subclass e.g. StandardAnalyzer to make a small modification e.g. to tokenStream(). These issues don't indicate a new method of doing this. The issues don't give a reason except for design considerations, which seems a poor reason to make a backward-incompatible change.
[jira] [Commented] (LUCENE-3055) LUCENE-2372, LUCENE-2389 made it impossible to subclass core analyzers
[ https://issues.apache.org/jira/browse/LUCENE-3055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13027171#comment-13027171 ] Robert Muir commented on LUCENE-3055:

Hi Ian, you are right, the justifications don't totally explain the reasoning behind this change.

From my perspective the most important reason is to avoid a huge performance trap: previously, if you subclassed one of these analyzers, overrode tokenStream(), and added SpecialFilter for example, most of the time users would actually slow down indexing, because now reusableTokenStream() cannot be used by the indexer. This created worst-case situations like LUCENE-2279.

Instead, the recommended approach is to just let analyzers be tokenstream factories (which is all they are). They aren't really "extendable", only "overridable", since they are just factories for tokenstreams, and overriding them creates the worst-case performance trap where new objects are created for every document. I would recommend writing your analyzer by extending ReusableAnalyzerBase instead, which is easy and safe:

{noformat}
Analyzer analyzer = new ReusableAnalyzerBase() {
  protected TokenStreamComponents createComponents(String fieldName, Reader reader) {
    Tokenizer tokenizer = new WhitespaceTokenizer(...);
    TokenStream filteredStream = new FooTokenFilter(tokenizer, ...);
    filteredStream = new BarTokenFilter(filteredStream, ...);
    return new TokenStreamComponents(tokenizer, filteredStream);
  }
};
{noformat}

> LUCENE-2372, LUCENE-2389 made it impossible to subclass core analyzers
> ----------------------------------------------------------------------
>
> Key: LUCENE-3055
> URL: https://issues.apache.org/jira/browse/LUCENE-3055
> Project: Lucene - Java
> Issue Type: Bug
> Components: Analysis
> Affects Versions: 3.1
> Reporter: Ian Soboroff
>
> LUCENE-2372 and LUCENE-2389 marked all analyzers as final. This makes ReusableAnalyzerBase useless, and makes it impossible to subclass e.g. StandardAnalyzer to make a small modification e.g. to tokenStream().
> These issues don't indicate a new method of doing this. The issues don't give a reason except for design considerations, which seems a poor reason to make a backward-incompatible change.
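The reuse contract behind ReusableAnalyzerBase can be illustrated with a small self-contained sketch (plain Java, not Lucene's classes; ReuseSketch and its members are made up): the factory builds its token-stream chain once per thread via createComponents() and then just resets the cached chain for each new input, instead of allocating a new chain per document, which is the performance trap described above.

```java
public class ReuseSketch {
    // Minimal stand-in for a token stream that can be re-initialized.
    interface TokenStream { void reset(String text); }

    static class LowerCaseStream implements TokenStream {
        String text;
        public void reset(String text) { this.text = text.toLowerCase(); }
    }

    // Analogue of ReusableAnalyzerBase: subclasses only describe the chain;
    // the base class handles caching and reuse.
    static abstract class ReusableFactory {
        private final ThreadLocal<TokenStream> cached = new ThreadLocal<>();

        protected abstract TokenStream createComponents();

        final TokenStream reusableTokenStream(String text) {
            TokenStream ts = cached.get();
            if (ts == null) {      // first use on this thread: build the chain once
                ts = createComponents();
                cached.set(ts);
            }
            ts.reset(text);        // reuse: no per-document allocation
            return ts;
        }
    }
}
```

Overriding only the "build" step keeps the reuse path intact, which is exactly what went wrong when subclasses overrode tokenStream() directly.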
Re: [Lucene.Net] new structure
Only thing I would suggest is keeping .cmd/bin files in the build folder. The bin folder is meant for the compiled artifacts. Otherwise, everything else sounds great. Thanks, Troy On Fri, Apr 29, 2011 at 1:08 PM, Michael Herndon < mhern...@wickedsoftware.net> wrote: > If you think it would be beneficial to have the scripts in the branch, I > can > do that. > > On Fri, Apr 29, 2011 at 3:50 PM, Digy wrote: > > > Would you add the same stuff to 2.9.4g branch too? > > > > DIGY > > > > -Original Message- > > From: Michael Herndon [mailto:mhern...@wickedsoftware.net] > > Sent: Friday, April 29, 2011 10:28 PM > > To: lucene-net-...@lucene.apache.org > > Subject: Re: [Lucene.Net] new structure > > > > I'm going to move ahead with this stuff this weekend unless anyone > objects. > > > > On Sun, Apr 24, 2011 at 4:42 PM, Michael Herndon < > > mhern...@wickedsoftware.net> wrote: > > > > > if you celebrate Easter, Happy Easter, if not, then Happy > lucene.netclean > > > up day. > > > > > > > > > couple of questions. would it be cool if I can add a .gitignore to the > > root > > > folder? > > > > > > also would it upset anyone if I add .cmd & .sh files to the /bin > folder > > > and .xml/.build files to the /build folder ? > > > > > > and sand castle and shfb to the /lib folder? > > > > > > - Michael > > > > > > > > > On Sat, Apr 23, 2011 at 7:57 AM, Digy wrote: > > > > > >> Everything seems to be OK. > > >> +1 for removing old directory structure. > > >> > > >> Thanks Troy > > >> > > >> DIGY > > >> > > >> -Original Message- > > >> From: Troy Howard [mailto:thowar...@gmail.com] > > >> Sent: Saturday, April 23, 2011 3:07 AM > > >> To: lucene-net-...@lucene.apache.org > > >> Subject: Re: [Lucene.Net] new structure > > >> > > >> I guess by 'today' I meant 'In about 6 days'. > > >> > > >> Anyhow, I completed the commit of the new directory structure.. I did > > not > > >> delete the OLD directory structure, because they can live > side-by-side. 
> > >> Also, please note that I only created vs2010 solutions and upgraded > the > > >> projects to same. > > >> > > >> Please pull down the latest revision and validate these changes. If > all > > >> goes > > >> well, I'll delete the old directory structure (everything under the > 'C#' > > >> directory). > > >> > > >> Thanks, > > >> Troy > > >> > > >> On Sat, Apr 16, 2011 at 3:42 PM, Troy Howard > > wrote: > > >> > > >> > Apologize. I got a bit derailed. Will be commiting today. > > >> > On Apr 16, 2011 2:20 PM, "Prescott Nasser" > > >> wrote: > > >> > > > > >> > > > > >> > > Hey Troy any status update on the new structure? I'm hesistant to > do > > >> > updates since I know you're going to be modifying it all shortly > > >> > > > > >> > > ~P > > >> > > > > >> > > > >> > > >> > > > > > > > >
RE: [Lucene.Net] new structure
I just want to keep the 2.9.4g & trunk in par. The only divergence for now is LUCENENET-172 which will be applied to 2.9.4 eventually. DIGY -Original Message- From: Michael Herndon [mailto:mhern...@wickedsoftware.net] Sent: Friday, April 29, 2011 11:08 PM To: lucene-net-...@lucene.apache.org Subject: Re: [Lucene.Net] new structure If you think it would be beneficial to have the scripts in the branch, I can do that. On Fri, Apr 29, 2011 at 3:50 PM, Digy wrote: > Would you add the same stuff to 2.9.4g branch too? > > DIGY > > -Original Message- > From: Michael Herndon [mailto:mhern...@wickedsoftware.net] > Sent: Friday, April 29, 2011 10:28 PM > To: lucene-net-...@lucene.apache.org > Subject: Re: [Lucene.Net] new structure > > I'm going to move ahead with this stuff this weekend unless anyone objects. > > On Sun, Apr 24, 2011 at 4:42 PM, Michael Herndon < > mhern...@wickedsoftware.net> wrote: > > > if you celebrate Easter, Happy Easter, if not, then Happy lucene.netclean > > up day. > > > > > > couple of questions. would it be cool if I can add a .gitignore to the > root > > folder? > > > > also would it upset anyone if I add .cmd & .sh files to the /bin folder > > and .xml/.build files to the /build folder ? > > > > and sand castle and shfb to the /lib folder? > > > > - Michael > > > > > > On Sat, Apr 23, 2011 at 7:57 AM, Digy wrote: > > > >> Everything seems to be OK. > >> +1 for removing old directory structure. > >> > >> Thanks Troy > >> > >> DIGY > >> > >> -Original Message- > >> From: Troy Howard [mailto:thowar...@gmail.com] > >> Sent: Saturday, April 23, 2011 3:07 AM > >> To: lucene-net-...@lucene.apache.org > >> Subject: Re: [Lucene.Net] new structure > >> > >> I guess by 'today' I meant 'In about 6 days'. > >> > >> Anyhow, I completed the commit of the new directory structure.. I did > not > >> delete the OLD directory structure, because they can live side-by-side. 
> >> Also, please note that I only created vs2010 solutions and upgraded the > >> projects to same. > >> > >> Please pull down the latest revision and validate these changes. If all > >> goes > >> well, I'll delete the old directory structure (everything under the 'C#' > >> directory). > >> > >> Thanks, > >> Troy > >> > >> On Sat, Apr 16, 2011 at 3:42 PM, Troy Howard > wrote: > >> > >> > Apologize. I got a bit derailed. Will be commiting today. > >> > On Apr 16, 2011 2:20 PM, "Prescott Nasser" > >> wrote: > >> > > > >> > > > >> > > Hey Troy any status update on the new structure? I'm hesistant to do > >> > updates since I know you're going to be modifying it all shortly > >> > > > >> > > ~P > >> > > > >> > > >> > >> > > > >
[jira] [Created] (LUCENE-3055) LUCENE-2372, LUCENE-2389 made it impossible to subclass core analyzers
LUCENE-2372, LUCENE-2389 made it impossible to subclass core analyzers
----------------------------------------------------------------------

Key: LUCENE-3055
URL: https://issues.apache.org/jira/browse/LUCENE-3055
Project: Lucene - Java
Issue Type: Bug
Components: Analysis
Affects Versions: 3.1
Reporter: Ian Soboroff

LUCENE-2372 and LUCENE-2389 marked all analyzers as final. This makes ReusableAnalyzerBase useless, and makes it impossible to subclass e.g. StandardAnalyzer to make a small modification e.g. to tokenStream(). These issues don't indicate a new method of doing this. The issues don't give a reason except for design considerations, which seems a poor reason to make a backward-incompatible change.
Re: Code Freeze on realtime_search branch
2011/4/29 Michael McCandless : > Sorry, but, no :) > > So feel free to keep working towards removing this limitation!! > > This change makes IndexWriter's flush (where it writes the added > documents in RAM to disk as a new segment) fully concurrent, so that > while one segment is being flushed (which could take a longish time, > eg on a slowish IO system), other threads are now free to continue > indexing (where they were blocked before). On computers with > substantial CPU concurrency, and fast "enough" IO systems, this change > should give a big increase in indexing throughput. > > That said, I do think this change is a step towards what you seek > (allowing multiple IndexWriters, even in separate JVMs maybe on > separate computers, to write into an index at once). > > Mike thank you for clarifying this; maybe I don't even need to remove the locking if I can run some of those participant threads in the remote nodes. I'll keep you updated, but unfortunately can't start working on it sooner. Sanne > > http://blog.mikemccandless.com > > On Fri, Apr 29, 2011 at 2:16 PM, Sanne Grinovero > wrote: >> Hello, >> this is totally awesome! >> >> Does it imply we don't need the IndexWriter lock anymore? And hence >> that people sharing the Lucene Directory across multiple JVMs can have >> both write at the same time? >> >> I had intentions to *try* removing such limitations this summer, but >> if this is the case I will spend my time testing this carefully >> instead, or if some kind of locking is still required I'd appreciate >> some pointers so that I'll be able to remove them. >> >> Regards, >> Sanne >> >> 2011/4/29 Simon Willnauer : >>> Hey folks, >>> >>> LUCENE-3023 aims to land the considerably large >>> DocumentsWriterPerThread (DWPT) refactoring on trunk. >>> During the last weeks we have put lots of efforts into cleaning the >>> code up, fixing javadocs and run test locally >>> as well as on Jenkins. 
We reached the point where we are able to >>> create a final patch for review and land this >>> exciting refactoring on trunk very soon. I committed the CHANGES.TXT >>> entry (also appended below) a couple of minutes ago so from now on >>> we freeze the branch for final review (Robert can you create a new >>> "final" patch and upload to LUCENE-3023). >>> Any comments should go to [1] or as a reply to this email. If there is >>> no blocker coming up we plan to reintegrate the >>> branch and commit it to trunk early next week. For those who want some >>> background what DWPT does read: [2] >>> >>> Note: this change will not change the index file format so there is no >>> need to reindex for trunk users. Yet, I will send a heads up next week >>> with an >>> overview of what has changed. >>> >>> Simon >>> >>> [1] https://issues.apache.org/jira/browse/LUCENE-3023 >>> [2] >>> http://blog.jteam.nl/2011/04/01/gimme-all-resources-you-have-i-can-use-them/ >>> >>> >>> * LUCENE-2956, LUCENE-2573, LUCENE-2324, LUCENE-2555: Changes from >>> DocumentsWriterPerThread: >>> >>> - IndexWriter now uses a DocumentsWriter per thread when indexing >>> documents. >>> Each DocumentsWriterPerThread indexes documents in its own private >>> segment, >>> and the in memory segments are no longer merged on flush. Instead, each >>> segment is separately flushed to disk and subsequently merged with normal >>> segment merging. >>> >>> - DocumentsWriterPerThread (DWPT) is now flushed concurrently based on a >>> FlushPolicy. When a DWPT is flushed, a fresh DWPT is swapped in so that >>> indexing may continue concurrently with flushing. The selected >>> DWPT flushes all its RAM resident documents to disk. Note: Segment >>> flushes >>> don't flush all RAM resident documents but only the documents private to >>> the DWPT selected for flushing. >>> >>> - Flushing is now controlled by a FlushPolicy that is called for every add, >>> update or delete on IndexWriter. 
By default DWPTs are flushed either on >>> maxBufferedDocs per DWPT or the global active used memory. Once the >>> active >>> memory exceeds ramBufferSizeMB only the largest DWPT is selected for >>> flushing and the memory used by this DWPT is subtracted from the active >>> memory and added to a flushing memory pool, which can lead to temporarily >>> higher memory usage due to ongoing indexing. >>> >>> - IndexWriter now can utilize ramBufferSize > 2048 MB. Each DWPT can >>> address >>> up to 2048 MB memory such that the ramBufferSize is now bounded by the >>> max >>> number of DWPT available in the used DocumentsWriterPerThreadPool. >>> IndexWriter's net memory consumption can grow far beyond the 2048 MB >>> limit if >>> the application can use all available DWPTs. To prevent a DWPT from >>> exhausting its address space IndexWriter will forcefully flush a DWPT if >>> its >>> hard memory limit is exceeded. The RAMPerThreadHardLimitMB can be >>> controlled >>> via IndexWriterConfig and defaults to 1945 MB.
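For reference, the knobs described in the CHANGES entry map onto IndexWriterConfig roughly as follows. This is an untested sketch against the trunk (4.0) API; the path and the 4096 MB value are illustrative, not recommendations:

```java
import java.io.File;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;

public class ConfigSketch {
  public static void main(String[] args) throws Exception {
    Directory dir = FSDirectory.open(new File("/path/to/index")); // illustrative path
    IndexWriterConfig iwc = new IndexWriterConfig(Version.LUCENE_40,
        new StandardAnalyzer(Version.LUCENE_40));
    // Global flush trigger; with DWPT this may now exceed the old 2048 MB ceiling.
    iwc.setRAMBufferSizeMB(4096.0);
    // Per-DWPT hard limit guarding each writer's 2048 MB address space
    // (1945 MB is the default stated above).
    iwc.setRAMPerThreadHardLimitMB(1945);
    IndexWriter writer = new IndexWriter(dir, iwc);
    // ... index documents ...
    writer.close();
  }
}
```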
Re: [Lucene.Net] new structure
I'm going to move ahead with this stuff this weekend unless anyone objects. On Sun, Apr 24, 2011 at 4:42 PM, Michael Herndon < mhern...@wickedsoftware.net> wrote: > if you celebrate Easter, Happy Easter, if not, then Happy lucene.net clean > up day. > > > couple of questions. would it be cool if I can add a .gitignore to the root > folder? > > also would it upset anyone if I add .cmd & .sh files to the /bin folder > and .xml/.build files to the /build folder ? > > and sand castle and shfb to the /lib folder? > > - Michael > > > On Sat, Apr 23, 2011 at 7:57 AM, Digy wrote: > >> Everything seems to be OK. >> +1 for removing old directory structure. >> >> Thanks Troy >> >> DIGY >> >> -Original Message- >> From: Troy Howard [mailto:thowar...@gmail.com] >> Sent: Saturday, April 23, 2011 3:07 AM >> To: lucene-net-...@lucene.apache.org >> Subject: Re: [Lucene.Net] new structure >> >> I guess by 'today' I meant 'In about 6 days'. >> >> Anyhow, I completed the commit of the new directory structure. I did not >> delete the OLD directory structure, because they can live side-by-side. >> Also, please note that I only created vs2010 solutions and upgraded the >> projects to same. >> >> Please pull down the latest revision and validate these changes. If all >> goes >> well, I'll delete the old directory structure (everything under the 'C#' >> directory). >> >> Thanks, >> Troy >> >> On Sat, Apr 16, 2011 at 3:42 PM, Troy Howard wrote: >> >> > Apologies. I got a bit derailed. Will be committing today. >> > On Apr 16, 2011 2:20 PM, "Prescott Nasser" >> wrote: >> > > >> > > >> > > Hey Troy any status update on the new structure? I'm hesitant to do >> > updates since I know you're going to be modifying it all shortly >> > > >> > > ~P >> > > >> > >> >> >
[jira] [Created] (SOLR-2482) DataImportHandler; reload-config; response in case of failure & further requests
DataImportHandler; reload-config; response in case of failure & further requests Key: SOLR-2482 URL: https://issues.apache.org/jira/browse/SOLR-2482 Project: Solr Issue Type: Improvement Components: contrib - DataImportHandler, web gui Reporter: Stefan Matheis (steffkes) Priority: Minor Attachments: reload-config-error.html Reloading while the config-file is valid is completely fine, but if the config is broken, the response is plain HTML containing the full stacktrace (see attachment). Further requests contain a {{status}} element with ??DataImportHandler started. Not Initialized. No commands can be run??, but respond with HTTP status 200 OK :/ Would be nice if: * the response in case of error could also be XML formatted * it contained the exception message (in my case ??The end-tag for element type "entity" must end with a '>' delimiter.??) in a separate field * a better/correct HTTP status were used for the latter requests; I would suggest {{503 Service Unavailable}} Then we would be able to display the error message to the user when the config gets broken, and for the further requests we could rely on the HTTP status, with no need to check the content of the XML response.
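The last point, relying on the HTTP status instead of scraping the response body, can be sketched with a hypothetical client-side helper. All names here are illustrative, not Solr API:

```java
// Hypothetical helper showing why the suggested 503 status helps: the client
// can key off one integer instead of string-matching the XML body for the
// "Not Initialized" message. Illustrative only -- not part of Solr.
class DihStatus {
    static final int SC_OK = 200;
    static final int SC_SERVICE_UNAVAILABLE = 503; // suggested for "Not Initialized"

    // With the proposed behaviour: a single status-code check suffices.
    static boolean handlerUsable(int httpStatus) {
        return httpStatus == SC_OK;
    }

    // What clients must do today: 200 OK arrives even when DIH is unusable,
    // so the response body has to be inspected as well.
    static boolean handlerUsable(int httpStatus, String body) {
        return httpStatus == SC_OK && !body.contains("Not Initialized");
    }
}
```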
[jira] [Updated] (SOLR-2482) DataImportHandler; reload-config; response in case of failure & further requests
[ https://issues.apache.org/jira/browse/SOLR-2482?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stefan Matheis (steffkes) updated SOLR-2482: Attachment: reload-config-error.html > DataImportHandler; reload-config; response in case of failure & further > requests > > > Key: SOLR-2482 > URL: https://issues.apache.org/jira/browse/SOLR-2482 > Project: Solr > Issue Type: Improvement > Components: contrib - DataImportHandler, web gui >Reporter: Stefan Matheis (steffkes) >Priority: Minor > Attachments: reload-config-error.html > > > Reloading while the config-file is valid is completely fine, but if the > config is broken, the response is plain HTML containing the full stacktrace > (see attachment). Further requests contain a {{status}} element with > ??DataImportHandler started. Not Initialized. No commands can be run??, but > respond with HTTP status 200 OK :/ > Would be nice if: > * the response in case of error could also be XML formatted > * it contained the exception message (in my case ??The end-tag for element type > "entity" must end with a '>' delimiter.??) in a separate field > * a better/correct HTTP status were used for the latter requests; I would > suggest {{503 Service Unavailable}} > Then we would be able to display the error message to the user when the config gets > broken, and for the further requests we could rely on the HTTP status, with > no need to check the content of the XML response.
Re: Code Freeze on realtime_search branch
Sorry, but, no :) So feel free to keep working towards removing this limitation!! This change makes IndexWriter's flush (where it writes the added documents in RAM to disk as a new segment) fully concurrent, so that while one segment is being flushed (which could take a longish time, eg on a slowish IO system), other threads are now free to continue indexing (where they were blocked before). On computers with substantial CPU concurrency, and fast "enough" IO systems, this change should give a big increase in indexing throughput. That said, I do think this change is a step towards what you seek (allowing multiple IndexWriters, even in separate JVMs maybe on separate computers, to write into an index at once). Mike http://blog.mikemccandless.com On Fri, Apr 29, 2011 at 2:16 PM, Sanne Grinovero wrote: > Hello, > this is totally awesome! > > Does it imply we don't need the IndexWriter lock anymore? And hence > that people sharing the Lucene Directory across multiple JVMs can have > both write at the same time? > > I had intentions to *try* removing such limitations this summer, but > if this is the case I will spend my time testing this carefully > instead, or if some kind of locking is still required I'd appreciate > some pointers so that I'll be able to remove them. > > Regards, > Sanne > > 2011/4/29 Simon Willnauer : >> Hey folks, >> >> LUCENE-3023 aims to land the considerably large >> DocumentsWriterPerThread (DWPT) refactoring on trunk. >> During the last weeks we have put a lot of effort into cleaning the >> code up, fixing javadocs, and running tests locally >> as well as on Jenkins. We reached the point where we are able to >> create a final patch for review and land this >> exciting refactoring on trunk very soon. I committed the CHANGES.TXT >> entry (also appended below) a couple of minutes ago so from now on >> we freeze the branch for final review (Robert can you create a new >> "final" patch and upload to LUCENE-3023). 
>> Any comments should go to [1] or as a reply to this email. If there is >> no blocker coming up we plan to reintegrate the >> branch and commit it to trunk early next week. For those who want some >> background what DWPT does read: [2] >> >> Note: this change will not change the index file format so there is no >> need to reindex for trunk users. Yet, I will send a heads up next week >> with an >> overview of what has changed. >> >> Simon >> >> [1] https://issues.apache.org/jira/browse/LUCENE-3023 >> [2] >> http://blog.jteam.nl/2011/04/01/gimme-all-resources-you-have-i-can-use-them/ >> >> >> * LUCENE-2956, LUCENE-2573, LUCENE-2324, LUCENE-2555: Changes from >> DocumentsWriterPerThread: >> >> - IndexWriter now uses a DocumentsWriter per thread when indexing documents. >> Each DocumentsWriterPerThread indexes documents in its own private >> segment, >> and the in memory segments are no longer merged on flush. Instead, each >> segment is separately flushed to disk and subsequently merged with normal >> segment merging. >> >> - DocumentsWriterPerThread (DWPT) is now flushed concurrently based on a >> FlushPolicy. When a DWPT is flushed, a fresh DWPT is swapped in so that >> indexing may continue concurrently with flushing. The selected >> DWPT flushes all its RAM resident documents to disk. Note: Segment >> flushes >> don't flush all RAM resident documents but only the documents private to >> the DWPT selected for flushing. >> >> - Flushing is now controlled by a FlushPolicy that is called for every add, >> update or delete on IndexWriter. By default DWPTs are flushed either on >> maxBufferedDocs per DWPT or the global active used memory. Once the active >> memory exceeds ramBufferSizeMB only the largest DWPT is selected for >> flushing and the memory used by this DWPT is subtracted from the active >> memory and added to a flushing memory pool, which can lead to temporarily >> higher memory usage due to ongoing indexing. 
>> >> - IndexWriter now can utilize ramBufferSize > 2048 MB. Each DWPT can address >> up to 2048 MB memory such that the ramBufferSize is now bounded by the max >> number of DWPT available in the used DocumentsWriterPerThreadPool. >> IndexWriter's net memory consumption can grow far beyond the 2048 MB limit >> if >> the application can use all available DWPTs. To prevent a DWPT from >> exhausting its address space IndexWriter will forcefully flush a DWPT if >> its >> hard memory limit is exceeded. The RAMPerThreadHardLimitMB can be >> controlled >> via IndexWriterConfig and defaults to 1945 MB. >> Since IndexWriter flushes DWPT concurrently not all memory is released >> immediately. Applications should still use a ramBufferSize significantly >> lower than the JVM's available heap memory since under high load multiple >> flushing DWPT can consume substantial transient memory when IO performance >> is slow relative to indexing rate. >> >> - IndexWriter#commit now doesn't blo
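The "fresh DWPT is swapped in so that indexing may continue concurrently with flushing" point is, in miniature, a double-buffer swap. A plain-Java sketch of just that mechanism (illustrative only, no Lucene types):

```java
import java.util.ArrayList;
import java.util.List;

// Miniature sketch of the DWPT swap: beginFlush() hands back the full buffer
// to be written out while concurrent add() calls land in a fresh buffer.
// Illustrative only -- not Lucene code.
class SwappingBuffer {
    private List<String> active = new ArrayList<String>();

    synchronized void add(String doc) {
        active.add(doc);
    }

    // Swap in a fresh buffer under the lock; the caller then flushes the
    // returned buffer outside the lock, without blocking further add() calls.
    synchronized List<String> beginFlush() {
        List<String> full = active;
        active = new ArrayList<String>();
        return full;
    }
}
```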
[jira] [Commented] (LUCENE-3023) Land DWPT on trunk
[ https://issues.apache.org/jira/browse/LUCENE-3023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13027135#comment-13027135 ] Yonik Seeley commented on LUCENE-3023: -- This looks awesome guys! I've started some ad-hoc testing via Solr. A single threaded CSV upload (bulk indexing... no real-time reopens) looks pretty much the same, and doing 2 CSV uploads at once was 36% faster (a bit apples-to-oranges since the number of resulting segments was also higher... but even still, looks like a good improvement!) > Land DWPT on trunk > -- > > Key: LUCENE-3023 > URL: https://issues.apache.org/jira/browse/LUCENE-3023 > Project: Lucene - Java > Issue Type: Task >Affects Versions: CSF branch, 4.0 >Reporter: Simon Willnauer >Assignee: Simon Willnauer > Fix For: 4.0 > > Attachments: LUCENE-3023-svn-diff.patch, > LUCENE-3023-ws-changes.patch, LUCENE-3023.patch, LUCENE-3023.patch, > LUCENE-3023.patch, LUCENE-3023.patch, LUCENE-3023_CHANGES.patch, > LUCENE-3023_CHANGES.patch, LUCENE-3023_iw_iwc_jdoc.patch, > LUCENE-3023_simonw_review.patch, LUCENE-3023_svndiff.patch, > LUCENE-3023_svndiff.patch, diffMccand.py, diffSources.patch, > diffSources.patch, realtime-TestAddIndexes-3.txt, > realtime-TestAddIndexes-5.txt, > realtime-TestIndexWriterExceptions-assert-6.txt, > realtime-TestIndexWriterExceptions-npe-1.txt, > realtime-TestIndexWriterExceptions-npe-2.txt, > realtime-TestIndexWriterExceptions-npe-4.txt, > realtime-TestOmitTf-corrupt-0.txt > > > With LUCENE-2956 we have resolved the last remaining issue for LUCENE-2324 so > we can proceed landing the DWPT development on trunk soon. I think one of the > bigger issues here is to make sure that all JavaDocs for IW etc. are still > correct though. I will start going through that first.
[jira] [Updated] (LUCENE-3054) add assert to sorts catch broken comparators in tests
[ https://issues.apache.org/jira/browse/LUCENE-3054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-3054: Attachment: LUCENE-3054.patch I expanded the patch to all the sorts, just to find all the weird sorting/comparators going on. It also finds some false positives: ones that are documented as inconsistent with equals, ones in tests, etc. But we can at least look into the ones it finds. > add assert to sorts catch broken comparators in tests > - > > Key: LUCENE-3054 > URL: https://issues.apache.org/jira/browse/LUCENE-3054 > Project: Lucene - Java > Issue Type: Task >Affects Versions: 3.1 >Reporter: Robert Muir > Attachments: LUCENE-3054.patch, LUCENE-3054.patch > > > Looking at Otis's sort problem on the mailing list, he said: > {noformat} > * looked for other places where this call is made - found it in > MultiPhraseQuery$MultiPhraseWeight and changed that call from > ArrayUtil.quickSort to ArrayUtil.mergeSort > * now we no longer see SorterTemplate.quickSort in deep recursion when we do a > thread dump > {noformat} > I thought this was interesting because PostingsAndFreq's comparator > looks like it needs a tiebreaker. > I think in our sorts we should add some asserts to try to catch some of these > broken comparators.
[jira] [Updated] (LUCENE-3054) add assert to sorts catch broken comparators in tests
[ https://issues.apache.org/jira/browse/LUCENE-3054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Otis Gospodnetic updated LUCENE-3054: - Affects Version/s: 3.1 Btw. this is with Lucene 3.1 For full thread: http://search-lucene.com/m/ytANA59Q9G1 > add assert to sorts catch broken comparators in tests > - > > Key: LUCENE-3054 > URL: https://issues.apache.org/jira/browse/LUCENE-3054 > Project: Lucene - Java > Issue Type: Task >Affects Versions: 3.1 >Reporter: Robert Muir > Attachments: LUCENE-3054.patch > > > Looking at Otis's sort problem on the mailing list, he said: > {noformat} > * looked for other places where this call is made - found it in > MultiPhraseQuery$MultiPhraseWeight and changed that call from > ArrayUtil.quickSort to ArrayUtil.mergeSort > * now we no longer see SorterTemplate.quickSort in deep recursion when we do a > thread dump > {noformat} > I thought this was interesting because PostingsAndFreq's comparator > looks like it needs a tiebreaker. > I think in our sorts we should add some asserts to try to catch some of these > broken comparators.
Re: Code Freeze on realtime_search branch
Hello, this is totally awesome! Does it imply we don't need the IndexWriter lock anymore? And hence that people sharing the Lucene Directory across multiple JVMs can have both write at the same time? I had intentions to *try* removing such limitations this summer, but if this is the case I will spend my time testing this carefully instead, or if some kind of locking is still required I'd appreciate some pointers so that I'll be able to remove them. Regards, Sanne 2011/4/29 Simon Willnauer : > Hey folks, > > LUCENE-3023 aims to land the considerably large > DocumentsWriterPerThread (DWPT) refactoring on trunk. > During the last weeks we have put a lot of effort into cleaning the > code up, fixing javadocs, and running tests locally > as well as on Jenkins. We reached the point where we are able to > create a final patch for review and land this > exciting refactoring on trunk very soon. I committed the CHANGES.TXT > entry (also appended below) a couple of minutes ago so from now on > we freeze the branch for final review (Robert can you create a new > "final" patch and upload to LUCENE-3023). > Any comments should go to [1] or as a reply to this email. If there is > no blocker coming up we plan to reintegrate the > branch and commit it to trunk early next week. For those who want some > background what DWPT does read: [2] > > Note: this change will not change the index file format so there is no > need to reindex for trunk users. Yet, I will send a heads up next week > with an > overview of what has changed. > > Simon > > [1] https://issues.apache.org/jira/browse/LUCENE-3023 > [2] > http://blog.jteam.nl/2011/04/01/gimme-all-resources-you-have-i-can-use-them/ > > > * LUCENE-2956, LUCENE-2573, LUCENE-2324, LUCENE-2555: Changes from > DocumentsWriterPerThread: > > - IndexWriter now uses a DocumentsWriter per thread when indexing documents. > Each DocumentsWriterPerThread indexes documents in its own private segment, > and the in memory segments are no longer merged on flush. 
Instead, each > segment is separately flushed to disk and subsequently merged with normal > segment merging. > > - DocumentsWriterPerThread (DWPT) is now flushed concurrently based on a > FlushPolicy. When a DWPT is flushed, a fresh DWPT is swapped in so that > indexing may continue concurrently with flushing. The selected > DWPT flushes all its RAM resident documents to disk. Note: Segment flushes > don't flush all RAM resident documents but only the documents private to > the DWPT selected for flushing. > > - Flushing is now controlled by a FlushPolicy that is called for every add, > update or delete on IndexWriter. By default DWPTs are flushed either on > maxBufferedDocs per DWPT or the global active used memory. Once the active > memory exceeds ramBufferSizeMB only the largest DWPT is selected for > flushing and the memory used by this DWPT is subtracted from the active > memory and added to a flushing memory pool, which can lead to temporarily > higher memory usage due to ongoing indexing. > > - IndexWriter now can utilize ramBufferSize > 2048 MB. Each DWPT can address > up to 2048 MB memory such that the ramBufferSize is now bounded by the max > number of DWPT available in the used DocumentsWriterPerThreadPool. > IndexWriter's net memory consumption can grow far beyond the 2048 MB limit > if > the application can use all available DWPTs. To prevent a DWPT from > exhausting its address space IndexWriter will forcefully flush a DWPT if > its > hard memory limit is exceeded. The RAMPerThreadHardLimitMB can be > controlled > via IndexWriterConfig and defaults to 1945 MB. > Since IndexWriter flushes DWPT concurrently not all memory is released > immediately. Applications should still use a ramBufferSize significantly > lower than the JVM's available heap memory since under high load multiple > flushing DWPT can consume substantial transient memory when IO performance > is slow relative to indexing rate. 
> > - IndexWriter#commit now doesn't block concurrent indexing while flushing all > 'currently' RAM resident documents to disk. Yet, flushes that occur while a > full flush is running are queued and will happen after all DWPT involved > in the full flush are done flushing. Applications using multiple threads > during indexing and trigger a full flush (eg call commit() or open a new > NRT reader) can use significantly more transient memory. > > - IndexWriter#addDocument and IndexWriter.updateDocument can block indexing > threads if the number of active + number of flushing DWPT exceed a > safety limit. By default this happens if 2 * max number available thread > states (DWPTPool) is exceeded. This safety limit prevents applications from > exhausting their available memory if flushing can't keep up with > concurrently indexing threads. > > - IndexWriter only applies and flushes deletes if the maxBufferedDelTerms > l
[jira] [Updated] (LUCENE-3054) add assert to sorts catch broken comparators in tests
[ https://issues.apache.org/jira/browse/LUCENE-3054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-3054: Attachment: LUCENE-3054.patch Really ugly prototype... I expect the generics/sort policeman will want to jump in here anyway :) but it does catch that problem: {noformat} [junit] Testsuite: org.apache.lucene.index.TestCodecs [junit] Testcase: testSepPositionAfterMerge(org.apache.lucene.index.TestCodecs):FAILED [junit] insane comparator for: org.apache.lucene.search.PhraseQuery$PostingsAndFreq {noformat} > add assert to sorts catch broken comparators in tests > - > > Key: LUCENE-3054 > URL: https://issues.apache.org/jira/browse/LUCENE-3054 > Project: Lucene - Java > Issue Type: Task >Reporter: Robert Muir > Attachments: LUCENE-3054.patch > > > Looking at Otis's sort problem on the mailing list, he said: > {noformat} > * looked for other places where this call is made - found it in > MultiPhraseQuery$MultiPhraseWeight and changed that call from > ArrayUtil.quickSort to ArrayUtil.mergeSort > * now we no longer see SorterTemplate.quickSort in deep recursion when we do a > thread dump > {noformat} > I thought this was interesting because PostingsAndFreq's comparator > looks like it needs a tiebreaker. > I think in our sorts we should add some asserts to try to catch some of these > broken comparators.
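Independent of the actual patch, the core of such an "insane comparator" assert can be sketched in plain Java: wrap a Comparator and verify, for each pair the sort hands it, a property the sort relies on. The class and the single check shown are illustrative, not the patch code:

```java
import java.util.Comparator;

// Illustrative "sane comparator" guard: delegates compare() but asserts
// antisymmetry (sgn(compare(a,b)) == -sgn(compare(b,a))), a property that a
// comparator missing its 0/equal case typically violates.
final class CheckedComparator<T> implements Comparator<T> {
    private final Comparator<T> delegate;

    CheckedComparator(Comparator<T> delegate) {
        this.delegate = delegate;
    }

    public int compare(T a, T b) {
        int ab = delegate.compare(a, b);
        int ba = delegate.compare(b, a);
        if (Integer.signum(ab) != -Integer.signum(ba)) {
            throw new AssertionError("insane comparator for: " + delegate.getClass().getName());
        }
        return ab;
    }
}
```

A sort wired through such a wrapper fails loudly on the broken comparator instead of silently misordering (or recursing deeply).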
[jira] [Created] (LUCENE-3054) add assert to sorts catch broken comparators in tests
add assert to sorts catch broken comparators in tests - Key: LUCENE-3054 URL: https://issues.apache.org/jira/browse/LUCENE-3054 Project: Lucene - Java Issue Type: Task Reporter: Robert Muir Attachments: LUCENE-3054.patch Looking at Otis's sort problem on the mailing list, he said: {noformat} * looked for other places where this call is made - found it in MultiPhraseQuery$MultiPhraseWeight and changed that call from ArrayUtil.quickSort to ArrayUtil.mergeSort * now we no longer see SorterTemplate.quickSort in deep recursion when we do a thread dump {noformat} I thought this was interesting because PostingsAndFreq's comparator looks like it needs a tiebreaker. I think in our sorts we should add some asserts to try to catch some of these broken comparators.
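The tiebreaker point can be illustrated with plain Java (a hypothetical PostingsAndFreq-like class, not the actual Lucene code): without a tiebreaker, two distinct elements compare as 0, so their relative order depends on which sort algorithm runs; adding a second key makes the ordering total and deterministic:

```java
import java.util.Comparator;

// Hypothetical stand-in for PostingsAndFreq, sorted primarily by docFreq.
// Small non-negative values assumed, so int subtraction cannot overflow.
class PF {
    final int docFreq;
    final int position; // second field, usable as a tiebreaker
    PF(int docFreq, int position) { this.docFreq = docFreq; this.position = position; }
}

class Tiebreak {
    // Without a tiebreaker: distinct elements can compare as 0, so the final
    // order of ties is whatever the sort algorithm happens to produce.
    static final Comparator<PF> NO_TIEBREAK = new Comparator<PF>() {
        public int compare(PF a, PF b) { return a.docFreq - b.docFreq; }
    };

    // With a tiebreaker: ties fall through to a second field, so the
    // ordering is total and every sort produces the same result.
    static final Comparator<PF> WITH_TIEBREAK = new Comparator<PF>() {
        public int compare(PF a, PF b) {
            if (a.docFreq != b.docFreq) return a.docFreq - b.docFreq;
            return a.position - b.position;
        }
    };
}
```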
[jira] [Commented] (LUCENE-3051) don't call SegmentInfo.sizeInBytes for the merging segments
[ https://issues.apache.org/jira/browse/LUCENE-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13027095#comment-13027095 ] Michael McCandless commented on LUCENE-3051: Thanks Simon. bq. mike patch looks good but are we sure we are not accessing the 'live' SI somewhere down the path in unsynced context? Well, we do pass the live info to readPool.get, which could then pass it to SegmentReader.get, if the reader was not already pooled. While in theory other threads could change that info (say, if we are applying deletes), I believe readerPool prevents that because if dels are being applied as a merge is kicking off they will share the same reader, and the 2nd call to get will just return that reader. Definitely somewhat iffy though... I'm pretty sure we do not access SI.sizeInBytes elsewhere in IW for these segments being merged... > don't call SegmentInfo.sizeInBytes for the merging segments > --- > > Key: LUCENE-3051 > URL: https://issues.apache.org/jira/browse/LUCENE-3051 > Project: Lucene - Java > Issue Type: Bug >Affects Versions: 4.0 >Reporter: Michael McCandless >Assignee: Michael McCandless >Priority: Minor > Fix For: 3.2, 4.0 > > Attachments: LUCENE-3051.patch > > > Selckin has been running Lucene's tests on the RT branch, and hit this: > {noformat} > [junit] Testsuite: org.apache.lucene.index.TestIndexWriter > [junit] Testcase: > testDeleteAllSlowly(org.apache.lucene.index.TestIndexWriter): FAILED > [junit] Some threads threw uncaught exceptions! > [junit] junit.framework.AssertionFailedError: Some threads threw uncaught > exceptions! 
> [junit] at > org.apache.lucene.util.LuceneTestCase.tearDown(LuceneTestCase.java:535) > [junit] at > org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1246) > [junit] at > org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1175) > [junit] > [junit] > [junit] Tests run: 67, Failures: 1, Errors: 0, Time elapsed: 38.357 sec > [junit] > [junit] - Standard Error - > [junit] NOTE: reproduce with: ant test -Dtestcase=TestIndexWriter > -Dtestmethod=testDeleteAllSlowly > -Dtests.seed=-4291771462012978364:4550117847390778918 > [junit] The following exceptions were thrown by threads: > [junit] *** Thread: Lucene Merge Thread #1 *** > [junit] org.apache.lucene.index.MergePolicy$MergeException: > java.io.FileNotFoundException: _4_1.del > [junit] at > org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:507) > [junit] at > org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:472) > [junit] Caused by: java.io.FileNotFoundException: _4_1.del > [junit] at > org.apache.lucene.store.FSDirectory.fileLength(FSDirectory.java:290) > [junit] at > org.apache.lucene.store.MockDirectoryWrapper.fileLength(MockDirectoryWrapper.java:549) > [junit] at > org.apache.lucene.index.SegmentInfo.sizeInBytes(SegmentInfo.java:287) > [junit] at > org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:3280) > [junit] at > org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:2956) > [junit] at > org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:379) > [junit] at > org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:447) > [junit] NOTE: test params are: codec=RandomCodecProvider: {=SimpleText, > f6=Pulsing(freqCutoff=15), f7=MockFixedIntBlock(blockSize=1606), > f8=SimpleText, f9=MockSep, f1=MockVariableIntBlock(baseBlockSize=99), > 
f0=MockFixedIntBlock(blockSize=1606), f3=Pulsing(freqCutoff=15), f2=MockSep, > f5=SimpleText, f4=Standard, f=MockFixedIntBlock(blockSize=1606), c=MockSep, > termVector=MockRandom, d9=MockFixedIntBlock(blockSize=1606), > d8=Pulsing(freqCutoff=15), d5=SimpleText, d4=Standard, d7=MockRandom, > d6=MockVariableIntBlock(baseBlockSize=99), d25=MockRandom, d0=MockRandom, > c29=MockFixedIntBlock(blockSize=1606), > d24=MockVariableIntBlock(baseBlockSize=99), d1=Standard, c28=Standard, > d23=SimpleText, d2=MockFixedIntBlock(blockSize=1606), c27=MockRandom, > d22=Standard, d3=MockVariableIntBlock(baseBlockSize=99), > d21=Pulsing(freqCutoff=15), d20=MockSep, > c22=MockFixedIntBlock(blockSize=1606), c21=Pulsing(freqCutoff=15), > c20=MockRandom, d29=MockFixedIntBlock(blockSize=1606), c26=Standard, > d28=Pulsing(freqCutoff=15), c25=MockRandom, d27=MockRandom, c24=MockSep, > d26=MockVariableIntBlock(baseBlockSize=99), c23=SimpleText, e9=MockRandom, > e8=MockSep, e7=SimpleText, e6=
[jira] [Commented] (LUCENE-3041) Support Query Visting / Walking
[ https://issues.apache.org/jira/browse/LUCENE-3041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13027080#comment-13027080 ] Earwin Burrfoot commented on LUCENE-3041: - I vehemently oppose introducing the "visitor design pattern" (classic double-dispatch version) into the Query API. It is a badly broken replacement (ie, cannot be easily extended) for multiple dispatch. Also, from the looks of it (short IRC discussion), user-written visitors and rewrite() API have totally different aims. - rewrite() is very specific (it is a pre-search preparation that produces runnable query, eg expands multi-term queries into OR sequences or wrapped filters), but should work over any kinds of user-written Queries with possibly exotic behaviours (eg, take rewrite from the cache). Consequently, the logic is tightly coupled to each Query-impl innards. - user-written visitors on the other hand, may have a multitude of purposes (wildly varying logic for node handling + navigation - eg, some may want to see MTQs expanded, and some may not) over relatively fixed number of possible node types. So the best possible solution so far is to keep rewrite() as-is - it serves its purpose quite well. And introduce generic reflection-based multiple-dispatch visitor that can walk any kind of hierarchies (eg, in my project I rewrite ASTs to ASTs, ASTs to Queries, and Queries to bags of Terms) so people can transform their query trees. The current patch contains a derivative of [my original version|https://gist.github.com/dfebaf79f5524e6ea8b4]. And here's a [test/example|https://gist.github.com/e5eb67d762be0bce8d28] This visitor keeps all logic on itself and thus cannot replace rewrite(). 
> Support Query Visting / Walking > --- > > Key: LUCENE-3041 > URL: https://issues.apache.org/jira/browse/LUCENE-3041 > Project: Lucene - Java > Issue Type: Improvement > Components: Search >Reporter: Chris Male >Priority: Minor > Attachments: LUCENE-3041.patch, LUCENE-3041.patch, LUCENE-3041.patch, > LUCENE-3041.patch > > > Out of the discussion in LUCENE-2868, it could be useful to add a generic > Query Visitor / Walker that could be used for more advanced rewriting, > optimizations or anything that requires state to be stored as each Query is > visited. > We could keep the interface very simple: > {code} > public interface QueryVisitor { > Query visit(Query query); > } > {code} > and then use a reflection-based visitor like Earwin suggested, which would > allow implementors to provide visit methods for just the Querys they are > interested in. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
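The reflection-based, multiple-dispatch walker Earwin describes can be sketched in plain Java. This is a simplified stand-in, not his gist or the attached patch: the Query, TermQuery and BooleanQuery classes below are minimal mocks of Lucene's hierarchy, used only to show the dispatch mechanism.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

// Minimal mocks of Lucene's Query hierarchy -- illustration only.
abstract class Query {}

class TermQuery extends Query {
  final String term;
  TermQuery(String term) { this.term = term; }
}

class BooleanQuery extends Query {
  final List<Query> clauses = new ArrayList<Query>();
}

// Walks up the runtime class of each node looking for the most specific
// public visit(XxxQuery) overload, so subclasses only override what they need.
class QueryWalker {
  Query dispatch(Query q) {
    for (Class<?> c = q.getClass(); c != Query.class; c = c.getSuperclass()) {
      try {
        Method m = getClass().getMethod("visit", c);
        return (Query) m.invoke(this, q);
      } catch (NoSuchMethodException e) {
        // no overload for this exact type; try the superclass
      } catch (Exception e) {
        throw new RuntimeException(e);
      }
    }
    return visit(q); // fall back to the catch-all
  }

  public Query visit(Query q) { return q; } // default: leave the node alone
}

// Example walker in the spirit of "Queries to bags of Terms".
class TermCollector extends QueryWalker {
  final List<String> terms = new ArrayList<String>();

  public Query visit(TermQuery q) { terms.add(q.term); return q; }

  public Query visit(BooleanQuery q) {
    for (Query clause : q.clauses) dispatch(clause);
    return q;
  }
}
```

Because navigation lives in the walker itself (the BooleanQuery overload decides whether and how to recurse), this style supports the "wildly varying logic for node handling + navigation" point above; and, as noted, since all logic sits on the visitor it cannot replace rewrite().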
[jira] [Commented] (LUCENE-3051) don't call SegmentInfo.sizeInBytes for the merging segments
[ https://issues.apache.org/jira/browse/LUCENE-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13027078#comment-13027078 ] Simon Willnauer commented on LUCENE-3051: - Mike, the patch looks good, but are we sure we are not accessing the 'live' SI somewhere down the path in an unsynced context? > don't call SegmentInfo.sizeInBytes for the merging segments > --- > > Key: LUCENE-3051 > URL: https://issues.apache.org/jira/browse/LUCENE-3051 > Project: Lucene - Java > Issue Type: Bug >Affects Versions: 4.0 >Reporter: Michael McCandless >Assignee: Michael McCandless >Priority: Minor > Fix For: 3.2, 4.0 > > Attachments: LUCENE-3051.patch > > > Selckin has been running Lucene's tests on the RT branch, and hit this: > {noformat} > [junit] Testsuite: org.apache.lucene.index.TestIndexWriter > [junit] Testcase: > testDeleteAllSlowly(org.apache.lucene.index.TestIndexWriter): FAILED > [junit] Some threads threw uncaught exceptions! > [junit] junit.framework.AssertionFailedError: Some threads threw uncaught > exceptions! 
> [junit] at > org.apache.lucene.util.LuceneTestCase.tearDown(LuceneTestCase.java:535) > [junit] at > org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1246) > [junit] at > org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1175) > [junit] > [junit] > [junit] Tests run: 67, Failures: 1, Errors: 0, Time elapsed: 38.357 sec > [junit] > [junit] - Standard Error - > [junit] NOTE: reproduce with: ant test -Dtestcase=TestIndexWriter > -Dtestmethod=testDeleteAllSlowly > -Dtests.seed=-4291771462012978364:4550117847390778918 > [junit] The following exceptions were thrown by threads: > [junit] *** Thread: Lucene Merge Thread #1 *** > [junit] org.apache.lucene.index.MergePolicy$MergeException: > java.io.FileNotFoundException: _4_1.del > [junit] at > org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:507) > [junit] at > org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:472) > [junit] Caused by: java.io.FileNotFoundException: _4_1.del > [junit] at > org.apache.lucene.store.FSDirectory.fileLength(FSDirectory.java:290) > [junit] at > org.apache.lucene.store.MockDirectoryWrapper.fileLength(MockDirectoryWrapper.java:549) > [junit] at > org.apache.lucene.index.SegmentInfo.sizeInBytes(SegmentInfo.java:287) > [junit] at > org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:3280) > [junit] at > org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:2956) > [junit] at > org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:379) > [junit] at > org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:447) > [junit] NOTE: test params are: codec=RandomCodecProvider: {=SimpleText, > f6=Pulsing(freqCutoff=15), f7=MockFixedIntBlock(blockSize=1606), > f8=SimpleText, f9=MockSep, f1=MockVariableIntBlock(baseBlockSize=99), > 
f0=MockFixedIntBlock(blockSize=1606), f3=Pulsing(freqCutoff=15), f2=MockSep, > f5=SimpleText, f4=Standard, f=MockFixedIntBlock(blockSize=1606), c=MockSep, > termVector=MockRandom, d9=MockFixedIntBlock(blockSize=1606), > d8=Pulsing(freqCutoff=15), d5=SimpleText, d4=Standard, d7=MockRandom, > d6=MockVariableIntBlock(baseBlockSize=99), d25=MockRandom, d0=MockRandom, > c29=MockFixedIntBlock(blockSize=1606), > d24=MockVariableIntBlock(baseBlockSize=99), d1=Standard, c28=Standard, > d23=SimpleText, d2=MockFixedIntBlock(blockSize=1606), c27=MockRandom, > d22=Standard, d3=MockVariableIntBlock(baseBlockSize=99), > d21=Pulsing(freqCutoff=15), d20=MockSep, > c22=MockFixedIntBlock(blockSize=1606), c21=Pulsing(freqCutoff=15), > c20=MockRandom, d29=MockFixedIntBlock(blockSize=1606), c26=Standard, > d28=Pulsing(freqCutoff=15), c25=MockRandom, d27=MockRandom, c24=MockSep, > d26=MockVariableIntBlock(baseBlockSize=99), c23=SimpleText, e9=MockRandom, > e8=MockSep, e7=SimpleText, e6=MockFixedIntBlock(blockSize=1606), > e5=Pulsing(freqCutoff=15), c17=MockFixedIntBlock(blockSize=1606), > e3=Standard, d12=MockVariableIntBlock(baseBlockSize=99), > c16=Pulsing(freqCutoff=15), e4=SimpleText, > d11=MockFixedIntBlock(blockSize=1606), c19=MockSep, e1=MockSep, > d14=Pulsing(freqCutoff=15), c18=SimpleText, e2=Pulsing(freqCutoff=15), > d13=MockSep, e0=MockVariableIntBlock(baseBlockSize=99), d10=Standard, > d19=MockVariableIntBlock(baseBlockSize=99), c11=SimpleText, c10=Standard, > d16=Pulsing(freqCutoff=15), c13=MockRandom, > c12=MockVariab
[jira] [Updated] (LUCENE-3051) don't call SegmentInfo.sizeInBytes for the merging segments
[ https://issues.apache.org/jira/browse/LUCENE-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-3051: --- Attachment: LUCENE-3051.patch Moves the computation of estimatedMergeBytes into mergeInit (sync'd on IW so it's safe to access the SI). > don't call SegmentInfo.sizeInBytes for the merging segments > --- > > Key: LUCENE-3051 > URL: https://issues.apache.org/jira/browse/LUCENE-3051 > Project: Lucene - Java > Issue Type: Bug >Affects Versions: 4.0 >Reporter: Michael McCandless >Assignee: Michael McCandless >Priority: Minor > Fix For: 3.2, 4.0 > > Attachments: LUCENE-3051.patch > > > Selckin has been running Lucene's tests on the RT branch, and hit this: > {noformat} > [junit] Testsuite: org.apache.lucene.index.TestIndexWriter > [junit] Testcase: > testDeleteAllSlowly(org.apache.lucene.index.TestIndexWriter): FAILED > [junit] Some threads threw uncaught exceptions! > [junit] junit.framework.AssertionFailedError: Some threads threw uncaught > exceptions! 
> [junit] at > org.apache.lucene.util.LuceneTestCase.tearDown(LuceneTestCase.java:535) > [junit] at > org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1246) > [junit] at > org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1175) > [junit] > [junit] > [junit] Tests run: 67, Failures: 1, Errors: 0, Time elapsed: 38.357 sec > [junit] > [junit] - Standard Error - > [junit] NOTE: reproduce with: ant test -Dtestcase=TestIndexWriter > -Dtestmethod=testDeleteAllSlowly > -Dtests.seed=-4291771462012978364:4550117847390778918 > [junit] The following exceptions were thrown by threads: > [junit] *** Thread: Lucene Merge Thread #1 *** > [junit] org.apache.lucene.index.MergePolicy$MergeException: > java.io.FileNotFoundException: _4_1.del > [junit] at > org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:507) > [junit] at > org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:472) > [junit] Caused by: java.io.FileNotFoundException: _4_1.del > [junit] at > org.apache.lucene.store.FSDirectory.fileLength(FSDirectory.java:290) > [junit] at > org.apache.lucene.store.MockDirectoryWrapper.fileLength(MockDirectoryWrapper.java:549) > [junit] at > org.apache.lucene.index.SegmentInfo.sizeInBytes(SegmentInfo.java:287) > [junit] at > org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:3280) > [junit] at > org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:2956) > [junit] at > org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:379) > [junit] at > org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:447) > [junit] NOTE: test params are: codec=RandomCodecProvider: {=SimpleText, > f6=Pulsing(freqCutoff=15), f7=MockFixedIntBlock(blockSize=1606), > f8=SimpleText, f9=MockSep, f1=MockVariableIntBlock(baseBlockSize=99), > 
f0=MockFixedIntBlock(blockSize=1606), f3=Pulsing(freqCutoff=15), f2=MockSep, > f5=SimpleText, f4=Standard, f=MockFixedIntBlock(blockSize=1606), c=MockSep, > termVector=MockRandom, d9=MockFixedIntBlock(blockSize=1606), > d8=Pulsing(freqCutoff=15), d5=SimpleText, d4=Standard, d7=MockRandom, > d6=MockVariableIntBlock(baseBlockSize=99), d25=MockRandom, d0=MockRandom, > c29=MockFixedIntBlock(blockSize=1606), > d24=MockVariableIntBlock(baseBlockSize=99), d1=Standard, c28=Standard, > d23=SimpleText, d2=MockFixedIntBlock(blockSize=1606), c27=MockRandom, > d22=Standard, d3=MockVariableIntBlock(baseBlockSize=99), > d21=Pulsing(freqCutoff=15), d20=MockSep, > c22=MockFixedIntBlock(blockSize=1606), c21=Pulsing(freqCutoff=15), > c20=MockRandom, d29=MockFixedIntBlock(blockSize=1606), c26=Standard, > d28=Pulsing(freqCutoff=15), c25=MockRandom, d27=MockRandom, c24=MockSep, > d26=MockVariableIntBlock(baseBlockSize=99), c23=SimpleText, e9=MockRandom, > e8=MockSep, e7=SimpleText, e6=MockFixedIntBlock(blockSize=1606), > e5=Pulsing(freqCutoff=15), c17=MockFixedIntBlock(blockSize=1606), > e3=Standard, d12=MockVariableIntBlock(baseBlockSize=99), > c16=Pulsing(freqCutoff=15), e4=SimpleText, > d11=MockFixedIntBlock(blockSize=1606), c19=MockSep, e1=MockSep, > d14=Pulsing(freqCutoff=15), c18=SimpleText, e2=Pulsing(freqCutoff=15), > d13=MockSep, e0=MockVariableIntBlock(baseBlockSize=99), d10=Standard, > d19=MockVariableIntBlock(baseBlockSize=99), c11=SimpleText, c10=Standard, > d16=Pulsing(freqCutoff=15), c13=MockRandom, > c12=MockVariableIntBlock(baseBlockSize=99),
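The fix described in the patch above can be sketched abstractly: compute the size estimate once inside a method synchronized on the writer, and have the merge thread read only that snapshot instead of calling sizeInBytes() on the live SegmentInfos. The stub names below (WriterStub, OneMergeStub, SegmentInfoStub, estimatedMergeBytes) are illustrative stand-ins for IndexWriter, MergePolicy.OneMerge and SegmentInfo, not the actual patch.

```java
import java.util.List;

// Stand-in for SegmentInfo: its size can change while the writer is active
// (e.g. when a .del file is written), so unsynced reads are racy.
class SegmentInfoStub {
  long sizeInBytes;
  SegmentInfoStub(long size) { sizeInBytes = size; }
}

// Stand-in for MergePolicy.OneMerge: carries a precomputed size snapshot.
class OneMergeStub {
  final List<SegmentInfoStub> segments;
  volatile long estimatedMergeBytes; // computed once, then safe to read unsynced
  OneMergeStub(List<SegmentInfoStub> segments) { this.segments = segments; }
}

class WriterStub {
  // mergeInit runs synchronized on the writer, so reading the live
  // segment infos here cannot race with deletes/flushes mutating them.
  synchronized void mergeInit(OneMergeStub merge) {
    long total = 0;
    for (SegmentInfoStub si : merge.segments) {
      total += si.sizeInBytes;
    }
    merge.estimatedMergeBytes = total;
  }

  // The merge thread later reads only the snapshot, never the live infos.
  long mergeMiddle(OneMergeStub merge) {
    return merge.estimatedMergeBytes;
  }
}
```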
[jira] [Commented] (LUCENE-3023) Land DWPT on trunk
[ https://issues.apache.org/jira/browse/LUCENE-3023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13027058#comment-13027058 ] Michael Busch commented on LUCENE-3023: --- Just wanted to say: you guys totally rock! Great teamwork here with all the work involved of getting the branch merged back. I'm sorry I couldn't help much in the last few weeks. > Land DWPT on trunk > -- > > Key: LUCENE-3023 > URL: https://issues.apache.org/jira/browse/LUCENE-3023 > Project: Lucene - Java > Issue Type: Task >Affects Versions: CSF branch, 4.0 >Reporter: Simon Willnauer >Assignee: Simon Willnauer > Fix For: 4.0 > > Attachments: LUCENE-3023-svn-diff.patch, > LUCENE-3023-ws-changes.patch, LUCENE-3023.patch, LUCENE-3023.patch, > LUCENE-3023.patch, LUCENE-3023.patch, LUCENE-3023_CHANGES.patch, > LUCENE-3023_CHANGES.patch, LUCENE-3023_iw_iwc_jdoc.patch, > LUCENE-3023_simonw_review.patch, LUCENE-3023_svndiff.patch, > LUCENE-3023_svndiff.patch, diffMccand.py, diffSources.patch, > diffSources.patch, realtime-TestAddIndexes-3.txt, > realtime-TestAddIndexes-5.txt, > realtime-TestIndexWriterExceptions-assert-6.txt, > realtime-TestIndexWriterExceptions-npe-1.txt, > realtime-TestIndexWriterExceptions-npe-2.txt, > realtime-TestIndexWriterExceptions-npe-4.txt, > realtime-TestOmitTf-corrupt-0.txt > > > With LUCENE-2956 we have resolved the last remaining issue for LUCENE-2324 so > we can proceed landing the DWPT development on trunk soon. I think one of the > bigger issues here is to make sure that all JavaDocs for IW etc. are still > correct though. I will start going through that first. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3041) Support Query Visting / Walking
[ https://issues.apache.org/jira/browse/LUCENE-3041?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13027054#comment-13027054 ] David Smiley commented on LUCENE-3041: -- Yes! I enthusiastically support introducing the visitor design pattern into the Query API. I've polled the community on this before and got positive responses from a few committers, but I haven't yet had the time to do anything. It's great to see you've gotten the ball rolling, Chris. I haven't looked at your patch yet. Query.rewrite() is definitely a candidate for reworking in terms of this new pattern. > Support Query Visting / Walking > --- > > Key: LUCENE-3041 > URL: https://issues.apache.org/jira/browse/LUCENE-3041 > Project: Lucene - Java > Issue Type: Improvement > Components: Search >Reporter: Chris Male >Priority: Minor > Attachments: LUCENE-3041.patch, LUCENE-3041.patch, LUCENE-3041.patch, > LUCENE-3041.patch > > > Out of the discussion in LUCENE-2868, it could be useful to add a generic > Query Visitor / Walker that could be used for more advanced rewriting, > optimizations or anything that requires state to be stored as each Query is > visited. > We could keep the interface very simple: > {code} > public interface QueryVisitor { > Query visit(Query query); > } > {code} > and then use a reflection-based visitor like Earwin suggested, which would > allow implementors to provide visit methods for just the Querys they are > interested in. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-3053) improve test coverage for Multi*
[ https://issues.apache.org/jira/browse/LUCENE-3053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-3053: Attachment: LUCENE-3053.patch Updated patch, fixes another false fail in xml-query-parser (http://www.selckin.be/trunk-3053-p2-0.txt) > improve test coverage for Multi* > > > Key: LUCENE-3053 > URL: https://issues.apache.org/jira/browse/LUCENE-3053 > Project: Lucene - Java > Issue Type: Task >Reporter: Robert Muir >Assignee: Robert Muir > Fix For: 4.0 > > Attachments: LUCENE-3053.patch, LUCENE-3053.patch, LUCENE-3053.patch > > > It seems like an easy win that when the test calls newSearcher(), > it should sometimes wrap the reader with a SlowMultiReaderWrapper. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
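The idea in the issue description, randomized wrapping inside a newSearcher()-style helper, can be sketched with stand-in classes. ReaderStub and the slowWrapped flag below are hypothetical; the real test framework would wrap the IndexReader with SlowMultiReaderWrapper.

```java
import java.util.Random;

// Stand-in for IndexReader; the flag marks where the real code would
// return new SlowMultiReaderWrapper(reader).
class ReaderStub {
  boolean slowWrapped = false;
}

class SearcherTestUtil {
  // Sometimes wrap the reader, so tests randomly exercise the slow
  // composite-reader code paths as well as the per-segment ones.
  static ReaderStub maybeWrap(Random random, ReaderStub reader) {
    if (random.nextBoolean()) {
      reader.slowWrapped = true;
    }
    return reader;
  }
}
```

The test's random seed decides which path runs, so failures found this way stay reproducible via -Dtests.seed, as in the reports above.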
[jira] [Resolved] (LUCENE-3052) PerFieldCodecWrapper.loadTermsIndex concurrency problem
[ https://issues.apache.org/jira/browse/LUCENE-3052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless resolved LUCENE-3052. Resolution: Fixed Committed a missing sync'd in the test's codec. > PerFieldCodecWrapper.loadTermsIndex concurrency problem > --- > > Key: LUCENE-3052 > URL: https://issues.apache.org/jira/browse/LUCENE-3052 > Project: Lucene - Java > Issue Type: Bug >Affects Versions: 4.0 >Reporter: Michael McCandless >Assignee: Michael McCandless > Fix For: 4.0 > > > Selckin's while(1) testing on RT branch hit another error: > {noformat} > [junit] Testsuite: org.apache.lucene.TestExternalCodecs > [junit] Testcase: > testPerFieldCodec(org.apache.lucene.TestExternalCodecs):Caused an > ERROR > [junit] (null) > [junit] java.lang.NullPointerException > [junit] at > org.apache.lucene.index.PerFieldCodecWrapper$FieldsReader.loadTermsIndex(PerFieldCodecWrapper.java:202) > [junit] at > org.apache.lucene.index.SegmentReader.loadTermsIndex(SegmentReader.java:1005) > [junit] at > org.apache.lucene.index.IndexWriter$ReaderPool.get(IndexWriter.java:652) > [junit] at > org.apache.lucene.index.IndexWriter$ReaderPool.get(IndexWriter.java:609) > [junit] at > org.apache.lucene.index.BufferedDeletesStream.applyDeletes(BufferedDeletesStream.java:276) > [junit] at > org.apache.lucene.index.IndexWriter.applyAllDeletes(IndexWriter.java:2660) > [junit] at > org.apache.lucene.index.IndexWriter.maybeApplyDeletes(IndexWriter.java:2651) > [junit] at > org.apache.lucene.index.IndexWriter.getReader(IndexWriter.java:381) > [junit] at > org.apache.lucene.index.IndexReader.open(IndexReader.java:316) > [junit] at > org.apache.lucene.TestExternalCodecs.testPerFieldCodec(TestExternalCodecs.java:541) > [junit] at > org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1246) > [junit] at > org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1175) > [junit] > [junit] > [junit] Tests run: 1, 
Failures: 0, Errors: 1, Time elapsed: 0.909 sec > [junit] > [junit] - Standard Error - > [junit] NOTE: reproduce with: ant test -Dtestcase=TestExternalCodecs > -Dtestmethod=testPerFieldCodec > -Dtests.seed=-7296204858082494534:5010909751437000758 > [junit] WARNING: test method: 'testPerFieldCodec' left thread running: > merge thread: _i(4.0):Cv130 _m(4.0):Cv30 _n(4.0):cv10 into _o > [junit] RESOURCE LEAK: test method: 'testPerFieldCodec' left 1 thread(s) > running > [junit] NOTE: test params are: codec=PreFlex, locale=zh_TW, > timezone=America/Santo_Domingo > [junit] NOTE: all tests run in this JVM: > [junit] [TestDemo, TestExternalCodecs] > [junit] NOTE: Linux 2.6.37-gentoo amd64/Sun Microsystems Inc. 1.6.0_25 > (64-bit)/cpus=8,threads=2,free=104153512,total=125632512 > [junit] - --- > [junit] TEST org.apache.lucene.TestExternalCodecs FAILED > [junit] Exception in thread "Lucene Merge Thread #5" > org.apache.lucene.util.ThreadInterruptedException: > java.lang.InterruptedException: sleep interrupted > [junit] at > org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:505) > [junit] at > org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:472) > [junit] Caused by: java.lang.InterruptedException: sleep interrupted > [junit] at java.lang.Thread.sleep(Native Method) > [junit] at > org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:503) > [junit] ... 1 more > {noformat} > I suspect this is also a trunk issue, but I can't reproduce it yet. > I think this is happening because the codecs HashMap is changing (via another > thread), while .loadTermsIndex is called. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
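Mike's hypothesis above is a read of a HashMap that another thread is concurrently mutating. One conventional way to make such a registry safe, sketched here with a hypothetical CodecRegistryStub (the real fix in PerFieldCodecWrapper may well differ), is to use a ConcurrentHashMap, or to synchronize both the mutating and the reading path on the same lock:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// With a plain HashMap, a lookup racing a put can observe partial state
// (a plausible source of the NPE above). ConcurrentHashMap gives each
// get() a consistent view without locking the whole map.
class CodecRegistryStub {
  private final Map<String, String> codecs = new ConcurrentHashMap<String, String>();

  void register(String field, String codecName) {
    codecs.put(field, codecName);
  }

  String lookup(String field) {
    String codecName = codecs.get(field);
    if (codecName == null) {
      throw new IllegalStateException("no codec registered for field " + field);
    }
    return codecName;
  }
}
```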
[jira] [Updated] (LUCENE-3053) improve test coverage for Multi*
[ https://issues.apache.org/jira/browse/LUCENE-3053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-3053: Attachment: LUCENE-3053.patch Update patch: fixes false fail in TestMatchAllDocsQuery found by selckin: http://www.selckin.be/trunk-3053-0.txt > improve test coverage for Multi* > > > Key: LUCENE-3053 > URL: https://issues.apache.org/jira/browse/LUCENE-3053 > Project: Lucene - Java > Issue Type: Task >Reporter: Robert Muir >Assignee: Robert Muir > Fix For: 4.0 > > Attachments: LUCENE-3053.patch, LUCENE-3053.patch > > > It seems like an easy win that when the test calls newSearcher(), > it should sometimes wrap the reader with a SlowMultiReaderWrapper. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3053) improve test coverage for Multi*
[ https://issues.apache.org/jira/browse/LUCENE-3053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13026995#comment-13026995 ] Robert Muir commented on LUCENE-3053: - Ah please ignore that one: pretty sure this one is LUCENE-3025/LUCENE-2991 all over again... it fails on trunk too. > improve test coverage for Multi* > > > Key: LUCENE-3053 > URL: https://issues.apache.org/jira/browse/LUCENE-3053 > Project: Lucene - Java > Issue Type: Task >Reporter: Robert Muir >Assignee: Robert Muir > Fix For: 4.0 > > Attachments: LUCENE-3053.patch > > > It seems like an easy win that when the test calls newSearcher(), > it should sometimes wrap the reader with a SlowMultiReaderWrapper. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3053) improve test coverage for Multi*
[ https://issues.apache.org/jira/browse/LUCENE-3053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13026993#comment-13026993 ] Robert Muir commented on LUCENE-3053: - I did hit one fail: {noformat} ant test -Dtestcase=TestIndexWriterExceptions -Dtestmethod=testExceptionsDuringCommit -Dtests.seed=-2996541401386755449:-7422779128529852458 {noformat} Not sure if it's Windows-only, and likely unrelated, but for the seed to work you probably need to apply this patch... > improve test coverage for Multi* > > > Key: LUCENE-3053 > URL: https://issues.apache.org/jira/browse/LUCENE-3053 > Project: Lucene - Java > Issue Type: Task >Reporter: Robert Muir >Assignee: Robert Muir > Fix For: 4.0 > > Attachments: LUCENE-3053.patch > > > It seems like an easy win that when the test calls newSearcher(), > it should sometimes wrap the reader with a SlowMultiReaderWrapper. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-3023) Land DWPT on trunk
[ https://issues.apache.org/jira/browse/LUCENE-3023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-3023: -- Attachment: LUCENE-3023-ws-changes.patch Here finally all whitespace changes in one patch. They will be committed, but are not included in the main patch. > Land DWPT on trunk > -- > > Key: LUCENE-3023 > URL: https://issues.apache.org/jira/browse/LUCENE-3023 > Project: Lucene - Java > Issue Type: Task >Affects Versions: CSF branch, 4.0 >Reporter: Simon Willnauer >Assignee: Simon Willnauer > Fix For: 4.0 > > Attachments: LUCENE-3023-svn-diff.patch, > LUCENE-3023-ws-changes.patch, LUCENE-3023.patch, LUCENE-3023.patch, > LUCENE-3023.patch, LUCENE-3023.patch, LUCENE-3023_CHANGES.patch, > LUCENE-3023_CHANGES.patch, LUCENE-3023_iw_iwc_jdoc.patch, > LUCENE-3023_simonw_review.patch, LUCENE-3023_svndiff.patch, > LUCENE-3023_svndiff.patch, diffMccand.py, diffSources.patch, > diffSources.patch, realtime-TestAddIndexes-3.txt, > realtime-TestAddIndexes-5.txt, > realtime-TestIndexWriterExceptions-assert-6.txt, > realtime-TestIndexWriterExceptions-npe-1.txt, > realtime-TestIndexWriterExceptions-npe-2.txt, > realtime-TestIndexWriterExceptions-npe-4.txt, > realtime-TestOmitTf-corrupt-0.txt > > > With LUCENE-2956 we have resolved the last remaining issue for LUCENE-2324 so > we can proceed landing the DWPT development on trunk soon. I think one of the > bigger issues here is to make sure that all JavaDocs for IW etc. are still > correct though. I will start going through that first. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-3053) improve test coverage for Multi*
[ https://issues.apache.org/jira/browse/LUCENE-3053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir updated LUCENE-3053: Attachment: LUCENE-3053.patch Here's a patch. I think I fixed the various false fails, but it would be good to 'beast' the tests a few times to see if there are any left. Also tried to make TestRegexpRandom2 meaner... > improve test coverage for Multi* > > > Key: LUCENE-3053 > URL: https://issues.apache.org/jira/browse/LUCENE-3053 > Project: Lucene - Java > Issue Type: Task >Reporter: Robert Muir >Assignee: Robert Muir > Fix For: 4.0 > > Attachments: LUCENE-3053.patch > > > It seems like an easy win that when the test calls newSearcher(), > it should sometimes wrap the reader with a SlowMultiReaderWrapper. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (LUCENE-3053) improve test coverage for Multi*
improve test coverage for Multi* Key: LUCENE-3053 URL: https://issues.apache.org/jira/browse/LUCENE-3053 Project: Lucene - Java Issue Type: Task Reporter: Robert Muir Assignee: Robert Muir Fix For: 4.0 Attachments: LUCENE-3053.patch It seems like an easy win that when the test calls newSearcher(), it should sometimes wrap the reader with a SlowMultiReaderWrapper. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-3023) Land DWPT on trunk
[ https://issues.apache.org/jira/browse/LUCENE-3023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-3023: -- Attachment: LUCENE-3023-svn-diff.patch Here is the final SVN diff. To work around some itches with SVN, the following was done: - reverted everything outside the lucene sub folder - used the previously created manual diff to get a list of all changed files (using the patchutils command lsdiff) - used "svn -q status | sed 's/^//' > ../svn-files.txt" to get all files affected after the merge - intersected both file lists (the lsdiff one and the svn status one) to find all files that are in reality unchanged, but still affected by SVN (these are all files that were added after branching - this is a known limitation of SVN. Files added after branching are "replaced" by merge reintegrate, losing all history). Stored those files in unchanged.txt - used the intersected file list and reverted everything: cat ../unchanged.txt | xargs svn revert - finally did a record-only merge again to fix merge props reverted by the previous revert My checkout is now ready to commit. If we hit some minor problems with the patch, please wait until after the commit to fix them. If there are serious problems, we can fix them in the realtime branch and merge manually (I can do that later). 
> Land DWPT on trunk > -- > > Key: LUCENE-3023 > URL: https://issues.apache.org/jira/browse/LUCENE-3023 > Project: Lucene - Java > Issue Type: Task >Affects Versions: CSF branch, 4.0 >Reporter: Simon Willnauer >Assignee: Simon Willnauer > Fix For: 4.0 > > Attachments: LUCENE-3023-svn-diff.patch, LUCENE-3023.patch, > LUCENE-3023.patch, LUCENE-3023.patch, LUCENE-3023.patch, > LUCENE-3023_CHANGES.patch, LUCENE-3023_CHANGES.patch, > LUCENE-3023_iw_iwc_jdoc.patch, LUCENE-3023_simonw_review.patch, > LUCENE-3023_svndiff.patch, LUCENE-3023_svndiff.patch, diffMccand.py, > diffSources.patch, diffSources.patch, realtime-TestAddIndexes-3.txt, > realtime-TestAddIndexes-5.txt, > realtime-TestIndexWriterExceptions-assert-6.txt, > realtime-TestIndexWriterExceptions-npe-1.txt, > realtime-TestIndexWriterExceptions-npe-2.txt, > realtime-TestIndexWriterExceptions-npe-4.txt, > realtime-TestOmitTf-corrupt-0.txt > > > With LUCENE-2956 we have resolved the last remaining issue for LUCENE-2324 so > we can proceed landing the DWPT development on trunk soon. I think one of the > bigger issues here is to make sure that all JavaDocs for IW etc. are still > correct though. I will start going through that first. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[HUDSON] Lucene-Solr-tests-only-3.x - Build # 7550 - Failure
Build: https://builds.apache.org/hudson/job/Lucene-Solr-tests-only-3.x/7550/ 1 tests failed. REGRESSION: org.apache.lucene.collation.TestCollationKeyAnalyzer.testThreadSafe Error Message: Java heap space Stack Trace: java.lang.OutOfMemoryError: Java heap space at java.util.Arrays.copyOf(Arrays.java:2894) at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:117) at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:589) at java.lang.StringBuffer.append(StringBuffer.java:337) at java.text.RuleBasedCollator.getCollationKey(RuleBasedCollator.java:617) at org.apache.lucene.collation.CollationKeyFilter.incrementToken(CollationKeyFilter.java:93) at org.apache.lucene.collation.CollationTestBase.assertThreadSafe(CollationTestBase.java:304) at org.apache.lucene.collation.TestCollationKeyAnalyzer.testThreadSafe(TestCollationKeyAnalyzer.java:89) at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1091) at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1023) Build Log (for compile errors): [...truncated 9240 lines...] - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-3023) Land DWPT on trunk
[ https://issues.apache.org/jira/browse/LUCENE-3023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Uwe Schindler updated LUCENE-3023: -- Attachment: LUCENE-3023.patch I merged the frozen branch again. Attached is a first patch for reviewing code changes (not an SVN diff), created by the following command between 2 fresh checkouts, one of them "svn merge --reintegrate": {noformat} diff -urNb --strip-trailing-cr trunk-lusolr1 trunk-lusolr2 | filterdiff -x "*.svn*" --strip 1 --clean > LUCENE-3023.patch {noformat} This patch is not intended to be applied; it's more to verify the changes (therefore all whitespace changes created by merging were excluded). > Land DWPT on trunk > -- > > Key: LUCENE-3023 > URL: https://issues.apache.org/jira/browse/LUCENE-3023 > Project: Lucene - Java > Issue Type: Task >Affects Versions: CSF branch, 4.0 >Reporter: Simon Willnauer >Assignee: Simon Willnauer > Fix For: 4.0 > > Attachments: LUCENE-3023.patch, LUCENE-3023.patch, LUCENE-3023.patch, > LUCENE-3023.patch, LUCENE-3023_CHANGES.patch, LUCENE-3023_CHANGES.patch, > LUCENE-3023_iw_iwc_jdoc.patch, LUCENE-3023_simonw_review.patch, > LUCENE-3023_svndiff.patch, LUCENE-3023_svndiff.patch, diffMccand.py, > diffSources.patch, diffSources.patch, realtime-TestAddIndexes-3.txt, > realtime-TestAddIndexes-5.txt, > realtime-TestIndexWriterExceptions-assert-6.txt, > realtime-TestIndexWriterExceptions-npe-1.txt, > realtime-TestIndexWriterExceptions-npe-2.txt, > realtime-TestIndexWriterExceptions-npe-4.txt, > realtime-TestOmitTf-corrupt-0.txt > > > With LUCENE-2956 we have resolved the last remaining issue for LUCENE-2324 so > we can proceed landing the DWPT development on trunk soon. I think one of the > bigger issues here is to make sure that all JavaDocs for IW etc. are still > correct though. I will start going through that first. -- This message is automatically generated by JIRA. 
For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Highlighting in fields other than content
I think your issue may depend on the keyword field having stored="false" or the field type not defining a Tokenizer. You may find the following useful: http://wiki.apache.org/solr/FieldOptionsByUseCase My 2 cents, Tommaso 2011/4/29 Pavel Kukačka > Hello, > >I've got a (probably trivial) issue I can't resolve with Solr 3.1: > I have a document with common fields (title, keywords, content) and I'm > trying to use highlighting. >With the content field there is no problem; works normally. However, > when I search for a document via its keyword, the document is found, but > the response doesn't have the highlighted snippet - there is only an > empty node - like this: > ** > . > . > . > > > > > > > As for the highlighting params, I have set: >hl=on >hl.fl=* > > > If I just substitute the searchterm for something from the content, the > resulting response is fine - like this: > > . > . > . > > > > ustanovení těchto VOP, ZOP, smlouvy či id="highlighting">družstvem na straně jedné a > Klienty na straně druhé > na jiného > > . > . > . > > > Does anyone see what I've omitted? > > Cheers, > Pavel > > > - > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org > For additional commands, e-mail: dev-h...@lucene.apache.org > >
Re: Highlighting in fields other than content
What is your field definition for "keyword"? In particular, is it stored? This page might help. http://wiki.apache.org/solr/FieldOptionsByUseCase?highlight=%28termvector%29|%28retrieve%29|%28contents%29 Best Erick On Fri, Apr 29, 2011 at 8:56 AM, Pavel Kukačka wrote: > Hello, > > I've got a (probably trivial) issue I can't resolve with Solr 3.1: > I have a document with common fields (title, keywords, content) and I'm > trying to use highlighting. > With the content field there is no problem; works normally. However, > when I search for a document via its keyword, the document is found, but > the response doesn't have the highlighted snippet - there is only an > empty node - like this: > ** > . > . > . > > > > > > > As for the highlighting params, I have set: > hl=on > hl.fl=* > > > If I just substitute the searchterm for something from the content, the > resulting response is fine - like this: > > . > . > . > > > > ustanovení těchto VOP, ZOP, smlouvy či id="highlighting">družstvem na straně jedné a > Klienty na straně druhé > na jiného > > . > . > . > > > Does anyone see what I've omitted? > > Cheers, > Pavel > > > - > To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org > For additional commands, e-mail: dev-h...@lucene.apache.org > > - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Highlighting in fields other than content
Hello, I've got a (probably trivial) issue I can't resolve with Solr 3.1: I have a document with common fields (title, keywords, content) and I'm trying to use highlighting. With the content field there is no problem; works normally. However, when I search for a document via its keyword, the document is found, but the response doesn't have the highlighted snippet - there is only an empty node - like this: ** . . . As for the highlighting params, I have set: hl=on hl.fl=* If I just substitute the searchterm for something from the content, the resulting response is fine - like this: . . . ustanovení těchto VOP, ZOP, smlouvy či družstvem na straně jedné a Klienty na straně druhé na jiného . . . Does anyone see what I've omitted? Cheers, Pavel - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
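A likely cause, echoed in both replies above, is that the keywords field is not stored: the Solr highlighter can only build snippets from field text it can read back. A minimal sketch of a schema.xml field definition that supports highlighting; the field and type names here are illustrative, not taken from the poster's actual schema:

```xml
<!-- Hypothetical example: stored="true" lets the highlighter read the
     original text back; termVectors="true" is optional but can speed
     highlighting up for larger fields. -->
<field name="keywords" type="text_general" indexed="true" stored="true"
       termVectors="true"/>
```

Note that changing stored (or termVectors) on an existing field only takes effect for documents indexed after the change, so affected documents need to be reindexed.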
[jira] [Commented] (SOLR-2471) Localparams not working with 2 fq parameters using qt=name
[ https://issues.apache.org/jira/browse/SOLR-2471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13026951#comment-13026951 ] Yonik Seeley commented on SOLR-2471: bq. Is it possible to have two QT parameters in the same call to Solr? Nope, see the response from Hoss: "qt selects the request handler used, but when local params are parsed the handler has already been chosen (there is only one handler per request)" > Localparams not working with 2 fq parameters using qt=name > -- > > Key: SOLR-2471 > URL: https://issues.apache.org/jira/browse/SOLR-2471 > Project: Solr > Issue Type: Bug >Reporter: Bill Bell > > We are having a problem with the following query. If we have two localparams > (using fq) and use QT= it does not work. > This does not find any results: > http://localhost:8983/solr/provs/select?qname=john&qspec=dent&fq={!type=dismax > qt=namespec v=$qspec}&fq={!type=dismax qt=dismaxname > v=$qname}&q=_val_:"{!type=dismax qt=namespec v=$qspec}" _val_:"{!type=dismax > qt=dismaxname > v=$qname}"&fl=specialties_desc,score,hgid,specialties_search,specialties_ngram,first_middle_last_name&wt=csv&facet=true&facet.field=specialties_desc&sort=score > desc&rows=1000&start=0 > This works okay. It returns a few results. > http://localhost:8983/solr/provs/select?qname=john&qspec=dent&fq={!type=dismax > qf=$qqf v=$qspec}&fq={!type=dismax qt=dismaxname > v=$qname}&q=_val_:"{!type=dismax qf=$qqf v=$qspec}" _val_:"{!type=dismax > qt=dismaxname v=$qname}" &qqf=specialties_ngram^1.0 > specialties_search^2.0&fl=specialties_desc,score,hgid,specialties_search,specialties_ngram,first_middle_last_name&wt=csv&facet=true&facet.field=specialties_desc&sort=score > desc&rows=1000&start=0 > We would like to use a QT for both terms but it seems there is some kind of > bug when using two localparams and dismax filters with QT. -- This message is automatically generated by JIRA. 
[jira] [Assigned] (LUCENE-3052) PerFieldCodecWrapper.loadTermsIndex concurrency problem
[ https://issues.apache.org/jira/browse/LUCENE-3052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless reassigned LUCENE-3052: -- Assignee: Michael McCandless > PerFieldCodecWrapper.loadTermsIndex concurrency problem > --- > > Key: LUCENE-3052 > URL: https://issues.apache.org/jira/browse/LUCENE-3052 > Project: Lucene - Java > Issue Type: Bug >Affects Versions: 4.0 >Reporter: Michael McCandless >Assignee: Michael McCandless > Fix For: 4.0 > > > Selckin's while(1) testing on RT branch hit another error: > {noformat} > [junit] Testsuite: org.apache.lucene.TestExternalCodecs > [junit] Testcase: > testPerFieldCodec(org.apache.lucene.TestExternalCodecs):Caused an > ERROR > [junit] (null) > [junit] java.lang.NullPointerException > [junit] at > org.apache.lucene.index.PerFieldCodecWrapper$FieldsReader.loadTermsIndex(PerFieldCodecWrapper.java:202) > [junit] at > org.apache.lucene.index.SegmentReader.loadTermsIndex(SegmentReader.java:1005) > [junit] at > org.apache.lucene.index.IndexWriter$ReaderPool.get(IndexWriter.java:652) > [junit] at > org.apache.lucene.index.IndexWriter$ReaderPool.get(IndexWriter.java:609) > [junit] at > org.apache.lucene.index.BufferedDeletesStream.applyDeletes(BufferedDeletesStream.java:276) > [junit] at > org.apache.lucene.index.IndexWriter.applyAllDeletes(IndexWriter.java:2660) > [junit] at > org.apache.lucene.index.IndexWriter.maybeApplyDeletes(IndexWriter.java:2651) > [junit] at > org.apache.lucene.index.IndexWriter.getReader(IndexWriter.java:381) > [junit] at > org.apache.lucene.index.IndexReader.open(IndexReader.java:316) > [junit] at > org.apache.lucene.TestExternalCodecs.testPerFieldCodec(TestExternalCodecs.java:541) > [junit] at > org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1246) > [junit] at > org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1175) > [junit] > [junit] > [junit] Tests run: 1, Failures: 0, Errors: 1, Time 
elapsed: 0.909 sec > [junit] > [junit] - Standard Error - > [junit] NOTE: reproduce with: ant test -Dtestcase=TestExternalCodecs > -Dtestmethod=testPerFieldCodec > -Dtests.seed=-7296204858082494534:5010909751437000758 > [junit] WARNING: test method: 'testPerFieldCodec' left thread running: > merge thread: _i(4.0):Cv130 _m(4.0):Cv30 _n(4.0):cv10 into _o > [junit] RESOURCE LEAK: test method: 'testPerFieldCodec' left 1 thread(s) > running > [junit] NOTE: test params are: codec=PreFlex, locale=zh_TW, > timezone=America/Santo_Domingo > [junit] NOTE: all tests run in this JVM: > [junit] [TestDemo, TestExternalCodecs] > [junit] NOTE: Linux 2.6.37-gentoo amd64/Sun Microsystems Inc. 1.6.0_25 > (64-bit)/cpus=8,threads=2,free=104153512,total=125632512 > [junit] - --- > [junit] TEST org.apache.lucene.TestExternalCodecs FAILED > [junit] Exception in thread "Lucene Merge Thread #5" > org.apache.lucene.util.ThreadInterruptedException: > java.lang.InterruptedException: sleep interrupted > [junit] at > org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:505) > [junit] at > org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:472) > [junit] Caused by: java.lang.InterruptedException: sleep interrupted > [junit] at java.lang.Thread.sleep(Native Method) > [junit] at > org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:503) > [junit] ... 1 more > {noformat} > I suspect this is also a trunk issue, but I can't reproduce it yet. > I think this is happening because the codecs HashMap is changing (via another > thread), while .loadTermsIndex is called. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3052) PerFieldCodecWrapper.loadTermsIndex concurrency problem
[ https://issues.apache.org/jira/browse/LUCENE-3052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13026950#comment-13026950 ] Michael McCandless commented on LUCENE-3052: This repro line seems to work: {noformat} ant test-core -Dtestcase=TestExternalCodecs -Dtests.seed=-7296204858082494534:5010909751437000758 -Dtests.iter=200 -Dtests.iter.min=1 {noformat} > PerFieldCodecWrapper.loadTermsIndex concurrency problem > --- > > Key: LUCENE-3052 > URL: https://issues.apache.org/jira/browse/LUCENE-3052 > Project: Lucene - Java > Issue Type: Bug >Affects Versions: 4.0 >Reporter: Michael McCandless > Fix For: 4.0 > > > Selckin's while(1) testing on RT branch hit another error: > {noformat} > [junit] Testsuite: org.apache.lucene.TestExternalCodecs > [junit] Testcase: > testPerFieldCodec(org.apache.lucene.TestExternalCodecs):Caused an > ERROR > [junit] (null) > [junit] java.lang.NullPointerException > [junit] at > org.apache.lucene.index.PerFieldCodecWrapper$FieldsReader.loadTermsIndex(PerFieldCodecWrapper.java:202) > [junit] at > org.apache.lucene.index.SegmentReader.loadTermsIndex(SegmentReader.java:1005) > [junit] at > org.apache.lucene.index.IndexWriter$ReaderPool.get(IndexWriter.java:652) > [junit] at > org.apache.lucene.index.IndexWriter$ReaderPool.get(IndexWriter.java:609) > [junit] at > org.apache.lucene.index.BufferedDeletesStream.applyDeletes(BufferedDeletesStream.java:276) > [junit] at > org.apache.lucene.index.IndexWriter.applyAllDeletes(IndexWriter.java:2660) > [junit] at > org.apache.lucene.index.IndexWriter.maybeApplyDeletes(IndexWriter.java:2651) > [junit] at > org.apache.lucene.index.IndexWriter.getReader(IndexWriter.java:381) > [junit] at > org.apache.lucene.index.IndexReader.open(IndexReader.java:316) > [junit] at > org.apache.lucene.TestExternalCodecs.testPerFieldCodec(TestExternalCodecs.java:541) > [junit] at > org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1246) > [junit] at 
> org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1175) > [junit] > [junit] > [junit] Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0.909 sec > [junit] > [junit] - Standard Error - > [junit] NOTE: reproduce with: ant test -Dtestcase=TestExternalCodecs > -Dtestmethod=testPerFieldCodec > -Dtests.seed=-7296204858082494534:5010909751437000758 > [junit] WARNING: test method: 'testPerFieldCodec' left thread running: > merge thread: _i(4.0):Cv130 _m(4.0):Cv30 _n(4.0):cv10 into _o > [junit] RESOURCE LEAK: test method: 'testPerFieldCodec' left 1 thread(s) > running > [junit] NOTE: test params are: codec=PreFlex, locale=zh_TW, > timezone=America/Santo_Domingo > [junit] NOTE: all tests run in this JVM: > [junit] [TestDemo, TestExternalCodecs] > [junit] NOTE: Linux 2.6.37-gentoo amd64/Sun Microsystems Inc. 1.6.0_25 > (64-bit)/cpus=8,threads=2,free=104153512,total=125632512 > [junit] - --- > [junit] TEST org.apache.lucene.TestExternalCodecs FAILED > [junit] Exception in thread "Lucene Merge Thread #5" > org.apache.lucene.util.ThreadInterruptedException: > java.lang.InterruptedException: sleep interrupted > [junit] at > org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:505) > [junit] at > org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:472) > [junit] Caused by: java.lang.InterruptedException: sleep interrupted > [junit] at java.lang.Thread.sleep(Native Method) > [junit] at > org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:503) > [junit] ... 1 more > {noformat} > I suspect this is also a trunk issue, but I can't reproduce it yet. > I think this is happening because the codecs HashMap is changing (via another > thread), while .loadTermsIndex is called. -- This message is automatically generated by JIRA. 
[jira] [Updated] (LUCENE-3052) PerFieldCodecWrapper.loadTermsIndex concurrency problem
[ https://issues.apache.org/jira/browse/LUCENE-3052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless updated LUCENE-3052: --- Affects Version/s: 4.0 Fix Version/s: 4.0 > PerFieldCodecWrapper.loadTermsIndex concurrency problem > --- > > Key: LUCENE-3052 > URL: https://issues.apache.org/jira/browse/LUCENE-3052 > Project: Lucene - Java > Issue Type: Bug >Affects Versions: 4.0 >Reporter: Michael McCandless > Fix For: 4.0 > > > Selckin's while(1) testing on RT branch hit another error: > {noformat} > [junit] Testsuite: org.apache.lucene.TestExternalCodecs > [junit] Testcase: > testPerFieldCodec(org.apache.lucene.TestExternalCodecs):Caused an > ERROR > [junit] (null) > [junit] java.lang.NullPointerException > [junit] at > org.apache.lucene.index.PerFieldCodecWrapper$FieldsReader.loadTermsIndex(PerFieldCodecWrapper.java:202) > [junit] at > org.apache.lucene.index.SegmentReader.loadTermsIndex(SegmentReader.java:1005) > [junit] at > org.apache.lucene.index.IndexWriter$ReaderPool.get(IndexWriter.java:652) > [junit] at > org.apache.lucene.index.IndexWriter$ReaderPool.get(IndexWriter.java:609) > [junit] at > org.apache.lucene.index.BufferedDeletesStream.applyDeletes(BufferedDeletesStream.java:276) > [junit] at > org.apache.lucene.index.IndexWriter.applyAllDeletes(IndexWriter.java:2660) > [junit] at > org.apache.lucene.index.IndexWriter.maybeApplyDeletes(IndexWriter.java:2651) > [junit] at > org.apache.lucene.index.IndexWriter.getReader(IndexWriter.java:381) > [junit] at > org.apache.lucene.index.IndexReader.open(IndexReader.java:316) > [junit] at > org.apache.lucene.TestExternalCodecs.testPerFieldCodec(TestExternalCodecs.java:541) > [junit] at > org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1246) > [junit] at > org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1175) > [junit] > [junit] > [junit] Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0.909 sec > 
[junit] > [junit] - Standard Error - > [junit] NOTE: reproduce with: ant test -Dtestcase=TestExternalCodecs > -Dtestmethod=testPerFieldCodec > -Dtests.seed=-7296204858082494534:5010909751437000758 > [junit] WARNING: test method: 'testPerFieldCodec' left thread running: > merge thread: _i(4.0):Cv130 _m(4.0):Cv30 _n(4.0):cv10 into _o > [junit] RESOURCE LEAK: test method: 'testPerFieldCodec' left 1 thread(s) > running > [junit] NOTE: test params are: codec=PreFlex, locale=zh_TW, > timezone=America/Santo_Domingo > [junit] NOTE: all tests run in this JVM: > [junit] [TestDemo, TestExternalCodecs] > [junit] NOTE: Linux 2.6.37-gentoo amd64/Sun Microsystems Inc. 1.6.0_25 > (64-bit)/cpus=8,threads=2,free=104153512,total=125632512 > [junit] - --- > [junit] TEST org.apache.lucene.TestExternalCodecs FAILED > [junit] Exception in thread "Lucene Merge Thread #5" > org.apache.lucene.util.ThreadInterruptedException: > java.lang.InterruptedException: sleep interrupted > [junit] at > org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:505) > [junit] at > org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:472) > [junit] Caused by: java.lang.InterruptedException: sleep interrupted > [junit] at java.lang.Thread.sleep(Native Method) > [junit] at > org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:503) > [junit] ... 1 more > {noformat} > I suspect this is also a trunk issue, but I can't reproduce it yet. > I think this is happening because the codecs HashMap is changing (via another > thread), while .loadTermsIndex is called. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Created] (LUCENE-3052) PerFieldCodecWrapper.loadTermsIndex concurrency problem
PerFieldCodecWrapper.loadTermsIndex concurrency problem --- Key: LUCENE-3052 URL: https://issues.apache.org/jira/browse/LUCENE-3052 Project: Lucene - Java Issue Type: Bug Reporter: Michael McCandless Selckin's while(1) testing on RT branch hit another error: {noformat} [junit] Testsuite: org.apache.lucene.TestExternalCodecs [junit] Testcase: testPerFieldCodec(org.apache.lucene.TestExternalCodecs): Caused an ERROR [junit] (null) [junit] java.lang.NullPointerException [junit] at org.apache.lucene.index.PerFieldCodecWrapper$FieldsReader.loadTermsIndex(PerFieldCodecWrapper.java:202) [junit] at org.apache.lucene.index.SegmentReader.loadTermsIndex(SegmentReader.java:1005) [junit] at org.apache.lucene.index.IndexWriter$ReaderPool.get(IndexWriter.java:652) [junit] at org.apache.lucene.index.IndexWriter$ReaderPool.get(IndexWriter.java:609) [junit] at org.apache.lucene.index.BufferedDeletesStream.applyDeletes(BufferedDeletesStream.java:276) [junit] at org.apache.lucene.index.IndexWriter.applyAllDeletes(IndexWriter.java:2660) [junit] at org.apache.lucene.index.IndexWriter.maybeApplyDeletes(IndexWriter.java:2651) [junit] at org.apache.lucene.index.IndexWriter.getReader(IndexWriter.java:381) [junit] at org.apache.lucene.index.IndexReader.open(IndexReader.java:316) [junit] at org.apache.lucene.TestExternalCodecs.testPerFieldCodec(TestExternalCodecs.java:541) [junit] at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1246) [junit] at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1175) [junit] [junit] [junit] Tests run: 1, Failures: 0, Errors: 1, Time elapsed: 0.909 sec [junit] [junit] - Standard Error - [junit] NOTE: reproduce with: ant test -Dtestcase=TestExternalCodecs -Dtestmethod=testPerFieldCodec -Dtests.seed=-7296204858082494534:5010909751437000758 [junit] WARNING: test method: 'testPerFieldCodec' left thread running: merge thread: _i(4.0):Cv130 _m(4.0):Cv30 _n(4.0):cv10 into _o [junit] 
RESOURCE LEAK: test method: 'testPerFieldCodec' left 1 thread(s) running [junit] NOTE: test params are: codec=PreFlex, locale=zh_TW, timezone=America/Santo_Domingo [junit] NOTE: all tests run in this JVM: [junit] [TestDemo, TestExternalCodecs] [junit] NOTE: Linux 2.6.37-gentoo amd64/Sun Microsystems Inc. 1.6.0_25 (64-bit)/cpus=8,threads=2,free=104153512,total=125632512 [junit] - --- [junit] TEST org.apache.lucene.TestExternalCodecs FAILED [junit] Exception in thread "Lucene Merge Thread #5" org.apache.lucene.util.ThreadInterruptedException: java.lang.InterruptedException: sleep interrupted [junit] at org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:505) [junit] at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:472) [junit] Caused by: java.lang.InterruptedException: sleep interrupted [junit] at java.lang.Thread.sleep(Native Method) [junit] at org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:503) [junit] ... 1 more {noformat} I suspect this is also a trunk issue, but I can't reproduce it yet. I think this is happening because the codecs HashMap is changing (via another thread), while .loadTermsIndex is called. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
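The diagnosis at the end of this report (a HashMap being changed by one thread while .loadTermsIndex reads it from another) is a classic hazard: java.util.HashMap makes no guarantees under concurrent modification, so a reader can observe nulls or corrupted internal state. One common remedy is to hold the mapping in a ConcurrentHashMap. The sketch below is illustrative only, with made-up names; it is not the actual PerFieldCodecWrapper code:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative sketch: a per-field registry that one thread mutates while
// others read. A plain HashMap offers no guarantees here and get() can
// return null or misbehave mid-rehash; ConcurrentHashMap makes the
// lookup safe without external locking.
public class CodecRegistry {
    private final Map<String, String> codecs = new ConcurrentHashMap<>();

    public void register(String field, String codecName) {
        codecs.put(field, codecName); // safe to call concurrently with lookup()
    }

    public String lookup(String field) {
        return codecs.get(field); // never observes corrupt internal state
    }
}
```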
Re: jira issues falling off the radar -- "Next" JIRA version
On Fri, Apr 29, 2011 at 12:12 AM, David Smiley (@MITRE.org) wrote: > (Comments on SOLR-2191 between Mark and me were starting to get off-topic with > respect to the issue so I am continuing the conversation here) > > A lot of JIRA issues seem to fall off the radar, IMO. I'm talking about > issues that have patches and are basically ready to go. There are multiple > ways to address this but at the moment I am going to just bring up one. > Looking at the versions in JIRA one can assign an issue to > https://issues.apache.org/jira/browse/SOLR#selectedTab=com.atlassian.jira.plugin.system.project%3Aversions-panel > I see the version named "Next", with this description: "Placeholder for > commiters to track issues that are not ready to commit, but seem close > enough to being ready to warrant focus before the next feature release". > This version and what it implies is a common pattern in use of JIRA that I > too use for projects I manage for my employer. It appears that for the 3.1 > release, nobody looked through the issues assigned to "Next", and > consequently, some issues like SOLR-2191 were forgotten despite being ready > to go. Looking through the wiki I see information on how to do a release > http://wiki.apache.org/solr/HowToRelease and release suggestions but no > information on what to do in advance of a release. I also don't see any > administrative tasks on managing the "Next" version in JIRA. So I think > either the "Next" version should be used effectively, or if that isn't going > to happen then delete this version. I agree Next is dangerous! It'd be nice if Jira could auto-magically treat Next as whatever release really is "next". EG, say we all agree 3.2 is our next release, then ideally Jira would treat all Next issues as if they were marked with 3.2. But... lacking that, maybe we really shouldn't use Next at all, and just use 3.2? 
Having to step through these issues and move them to the next release on releasing is also healthy, ie, it's good that we see/review them, think about why we didn't get it done on the current release, etc. > On a related note, I don't know what to make of the "1.5" version, nor what > to make of issues marked as Closed for "Next". Some house cleaning is in > order. We should clean these up. Should we just roll them over to 3.2? Mike http://blog.mikemccandless.com - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-2104) DIH special command $deleteDocById doesn't skip the document and doesn't increment the deleted statistics
[ https://issues.apache.org/jira/browse/SOLR-2104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13026939#comment-13026939 ] Juan Pablo Mora commented on SOLR-2104: --- In Solr 3.1 it doesn't update the statistics either. I think this is a bug. > DIH special command $deleteDocById doesn't skip the document and doesn't > increment the deleted statistics > > > Key: SOLR-2104 > URL: https://issues.apache.org/jira/browse/SOLR-2104 > Project: Solr > Issue Type: Bug > Components: contrib - DataImportHandler >Affects Versions: 1.4, 1.4.1 >Reporter: Ephraim Ofir >Priority: Minor > > 1. Not sure it's a bug, but looks like a bug to me - if the query returns any > values other than $deleteDocById for the row you want deleted, it deletes the > row but also re-adds it with the rest of the data, so in effect the row isn't > deleted. In order to work around this issue, you have to either make sure no > data other than $deleteDocById= exists in rows to be deleted or add > $skipDoc='true' > (which I think is a little counter-intuitive, but was the better choice in my > case). My query looks something like: > SELECT u.id, >u.name, >... >IF(u.delete_flag > 0, u.id, NULL) AS $deleteDocById, >IF(u.delete_flag > 0, 'true', NULL) AS $skipDoc FROM users_tb u > 2. $deleteDocById doesn't update the statistics of deleted documents. > This has 2 downsides, the obvious one is that you don't know if/how many > documents were deleted, the not-so-obvious one is that if your import > contains only deleted items, it won't be committed automatically by DIH and > you'll have to commit it manually. -- This message is automatically generated by JIRA.
Code Freeze on realtime_search branch
Hey folks, LUCENE-3023 aims to land the considerably large DocumentsWriterPerThread (DWPT) refactoring on trunk. During the last weeks we have put lots of effort into cleaning up the code, fixing javadocs and running tests locally as well as on Jenkins. We reached the point where we are able to create a final patch for review and land this exciting refactoring on trunk very soon. I committed the CHANGES.TXT entry (also appended below) a couple of minutes ago, so from now on we freeze the branch for final review (Robert, can you create a new "final" patch and upload it to LUCENE-3023?). Any comments should go to [1] or as a reply to this email. If there is no blocker coming up we plan to reintegrate the branch and commit it to trunk early next week. For those who want some background on what DWPT does, read [2]. Note: this change will not change the index file format so there is no need to reindex for trunk users. Yet, I will send a heads up next week with an overview of what has changed. Simon [1] https://issues.apache.org/jira/browse/LUCENE-3023 [2] http://blog.jteam.nl/2011/04/01/gimme-all-resources-you-have-i-can-use-them/ * LUCENE-2956, LUCENE-2573, LUCENE-2324, LUCENE-2555: Changes from DocumentsWriterPerThread: - IndexWriter now uses a DocumentsWriter per thread when indexing documents. Each DocumentsWriterPerThread indexes documents in its own private segment, and the in memory segments are no longer merged on flush. Instead, each segment is separately flushed to disk and subsequently merged with normal segment merging. - DocumentsWriterPerThread (DWPT) is now flushed concurrently based on a FlushPolicy. When a DWPT is flushed, a fresh DWPT is swapped in so that indexing may continue concurrently with flushing. The selected DWPT flushes all its RAM resident documents to disk. Note: Segment flushes don't flush all RAM resident documents but only the documents private to the DWPT selected for flushing. 
- Flushing is now controlled by a FlushPolicy that is called for every add, update or delete on IndexWriter. By default DWPTs are flushed either on maxBufferedDocs per DWPT or the global active used memory. Once the active memory exceeds ramBufferSizeMB only the largest DWPT is selected for flushing and the memory used by this DWPT is subtracted from the active memory and added to a flushing memory pool, which can lead to temporarily higher memory usage due to ongoing indexing. - IndexWriter now can utilize ramBufferSize > 2048 MB. Each DWPT can address up to 2048 MB memory such that the ramBufferSize is now bounded by the max number of DWPTs available in the used DocumentsWriterPerThreadPool. IndexWriter's net memory consumption can grow far beyond the 2048 MB limit if the application can use all available DWPTs. To prevent a DWPT from exhausting its address space IndexWriter will forcefully flush a DWPT if its hard memory limit is exceeded. The RAMPerThreadHardLimitMB can be controlled via IndexWriterConfig and defaults to 1945 MB. Since IndexWriter flushes DWPTs concurrently not all memory is released immediately. Applications should still use a ramBufferSize significantly lower than the JVM's available heap memory since under high load multiple flushing DWPTs can consume substantial transient memory when IO performance is slow relative to indexing rate. - IndexWriter#commit now doesn't block concurrent indexing while flushing all 'currently' RAM resident documents to disk. Yet, flushes that occur while a full flush is running are queued and will happen after all DWPTs involved in the full flush are done flushing. Applications using multiple threads during indexing and triggering a full flush (e.g. calling commit() or opening a new NRT reader) can use significantly more transient memory. - IndexWriter#addDocument and IndexWriter.updateDocument can block indexing threads if the number of active + number of flushing DWPTs exceeds a safety limit. 
By default this happens if 2 * the max number of available thread states (DWPTPool) is exceeded. This safety limit prevents applications from exhausting their available memory if flushing can't keep up with concurrently indexing threads. - IndexWriter only applies and flushes deletes if the maxBufferedDelTerms limit is reached during indexing. No segment flushes will be triggered due to this setting. - IndexWriter#flush(boolean, boolean) doesn't synchronize on IndexWriter anymore. A dedicated flushLock has been introduced to prevent multiple full flushes happening concurrently. - DocumentsWriter doesn't write shared doc stores anymore. (Mike McCandless, Michael Busch, Simon Willnauer)
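The flush selection described in the CHANGES entry above ("only the largest DWPT is selected for flushing") can be illustrated with a small self-contained sketch. All class and method names below are made up for illustration; this is not the actual Lucene FlushPolicy API:

```java
import java.util.Arrays;
import java.util.List;

// Toy model of the flush heuristic: once the summed active RAM exceeds the
// global budget, only the single largest DWPT is picked for flushing.
public class FlushSketch {

    // stand-in for ramBufferSizeMB (64 MB here, purely for illustration)
    static final long RAM_BUFFER_BYTES = 64L * 1024 * 1024;

    // dwptBytes: bytes currently buffered by each DWPT; returns the index
    // of the DWPT to flush, or -1 if the active memory is under budget
    static int selectLargest(List<Long> dwptBytes) {
        long active = 0;
        for (long b : dwptBytes) {
            active += b;
        }
        if (active <= RAM_BUFFER_BYTES) {
            return -1;
        }
        int largest = 0;
        for (int i = 1; i < dwptBytes.size(); i++) {
            if (dwptBytes.get(i) > dwptBytes.get(largest)) {
                largest = i;
            }
        }
        return largest;
    }

    public static void main(String[] args) {
        // three DWPTs totalling 70 MB: over the 64 MB budget, so the
        // 40 MB writer (index 1) is selected
        List<Long> dwpts = Arrays.asList(10L << 20, 40L << 20, 20L << 20);
        System.out.println(selectLargest(dwpts)); // prints 1
    }
}
```

The real policy additionally moves the selected DWPT's bytes into a separate flushing pool, which is why the entry above warns about temporarily higher memory usage while flushes are in flight.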
[jira] [Assigned] (LUCENE-3051) don't call SegmentInfo.sizeInBytes for the merging segments
[ https://issues.apache.org/jira/browse/LUCENE-3051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless reassigned LUCENE-3051:
------------------------------------------

    Assignee: Michael McCandless

> don't call SegmentInfo.sizeInBytes for the merging segments
> -----------------------------------------------------------
>
>                 Key: LUCENE-3051
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3051
>             Project: Lucene - Java
>          Issue Type: Bug
>    Affects Versions: 4.0
>            Reporter: Michael McCandless
>            Assignee: Michael McCandless
>            Priority: Minor
>             Fix For: 3.2, 4.0
>
> Selckin has been running Lucene's tests on the RT branch, and hit this:
> {noformat}
> [junit] Testsuite: org.apache.lucene.index.TestIndexWriter
> [junit] Testcase: testDeleteAllSlowly(org.apache.lucene.index.TestIndexWriter): FAILED
> [junit] Some threads threw uncaught exceptions!
> [junit] junit.framework.AssertionFailedError: Some threads threw uncaught exceptions!
> [junit]   at org.apache.lucene.util.LuceneTestCase.tearDown(LuceneTestCase.java:535)
> [junit]   at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1246)
> [junit]   at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1175)
> [junit]
> [junit]
> [junit] Tests run: 67, Failures: 1, Errors: 0, Time elapsed: 38.357 sec
> [junit]
> [junit] - Standard Error -
> [junit] NOTE: reproduce with: ant test -Dtestcase=TestIndexWriter -Dtestmethod=testDeleteAllSlowly -Dtests.seed=-4291771462012978364:4550117847390778918
> [junit] The following exceptions were thrown by threads:
> [junit] *** Thread: Lucene Merge Thread #1 ***
> [junit] org.apache.lucene.index.MergePolicy$MergeException: java.io.FileNotFoundException: _4_1.del
> [junit]   at org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:507)
> [junit]   at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:472)
> [junit] Caused by: java.io.FileNotFoundException: _4_1.del
> [junit]   at org.apache.lucene.store.FSDirectory.fileLength(FSDirectory.java:290)
> [junit]   at org.apache.lucene.store.MockDirectoryWrapper.fileLength(MockDirectoryWrapper.java:549)
> [junit]   at org.apache.lucene.index.SegmentInfo.sizeInBytes(SegmentInfo.java:287)
> [junit]   at org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:3280)
> [junit]   at org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:2956)
> [junit]   at org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:379)
> [junit]   at org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:447)
> [junit] NOTE: test params are: codec=RandomCodecProvider: {=SimpleText, f6=Pulsing(freqCutoff=15), f7=MockFixedIntBlock(blockSize=1606), f8=SimpleText, f9=MockSep, f1=MockVariableIntBlock(baseBlockSize=99), f0=MockFixedIntBlock(blockSize=1606), f3=Pulsing(freqCutoff=15), f2=MockSep, f5=SimpleText, f4=Standard, f=MockFixedIntBlock(blockSize=1606), c=MockSep, termVector=MockRandom, d9=MockFixedIntBlock(blockSize=1606), d8=Pulsing(freqCutoff=15), d5=SimpleText, d4=Standard, d7=MockRandom, d6=MockVariableIntBlock(baseBlockSize=99), d25=MockRandom, d0=MockRandom, c29=MockFixedIntBlock(blockSize=1606), d24=MockVariableIntBlock(baseBlockSize=99), d1=Standard, c28=Standard, d23=SimpleText, d2=MockFixedIntBlock(blockSize=1606), c27=MockRandom, d22=Standard, d3=MockVariableIntBlock(baseBlockSize=99), d21=Pulsing(freqCutoff=15), d20=MockSep, c22=MockFixedIntBlock(blockSize=1606), c21=Pulsing(freqCutoff=15), c20=MockRandom, d29=MockFixedIntBlock(blockSize=1606), c26=Standard, d28=Pulsing(freqCutoff=15), c25=MockRandom, d27=MockRandom, c24=MockSep, d26=MockVariableIntBlock(baseBlockSize=99), c23=SimpleText, e9=MockRandom, e8=MockSep, e7=SimpleText, e6=MockFixedIntBlock(blockSize=1606), e5=Pulsing(freqCutoff=15), c17=MockFixedIntBlock(blockSize=1606), e3=Standard, d12=MockVariableIntBlock(baseBlockSize=99), c16=Pulsing(freqCutoff=15), e4=SimpleText, d11=MockFixedIntBlock(blockSize=1606), c19=MockSep, e1=MockSep, d14=Pulsing(freqCutoff=15), c18=SimpleText, e2=Pulsing(freqCutoff=15), d13=MockSep, e0=MockVariableIntBlock(baseBlockSize=99), d10=Standard, d19=MockVariableIntBlock(baseBlockSize=99), c11=SimpleText, c10=Standard, d16=Pulsing(freqCutoff=15), c13=MockRandom, c12=MockVariableIntBlock(baseBlockSize=99), d15=MockSep, d18=SimpleText, c15=MockFixedIntBlock(blockSize=1606), d17=Standard, c14=Pulsing(freqCutoff=15), b3=MockSep, b2=SimpleText, b5
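The "reproduce with: ant test ... -Dtests.seed=..." line in the log works because the test framework derives all of its randomness from that one seed, so a failing run can be replayed exactly. A minimal illustration of the underlying idea (this is an illustrative sketch, not Lucene's actual test infrastructure):

```java
import java.util.Random;

// Seed-based reproducibility, the idea behind the -Dtests.seed flag:
// two Randoms constructed from the same seed produce identical
// sequences, so any "random" failure can be replayed deterministically.
class SeedReplay {
    // draw n pseudo-random longs from a generator seeded with `seed`
    static long[] draw(long seed, int n) {
        Random random = new Random(seed);
        long[] values = new long[n];
        for (int i = 0; i < n; i++) {
            values[i] = random.nextLong();
        }
        return values;
    }
}
```

Running `draw` twice with the same seed yields the same sequence, which is exactly what makes the seed printed in a test failure useful.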
[jira] [Created] (LUCENE-3051) don't call SegmentInfo.sizeInBytes for the merging segments
don't call SegmentInfo.sizeInBytes for the merging segments
-----------------------------------------------------------

                 Key: LUCENE-3051
                 URL: https://issues.apache.org/jira/browse/LUCENE-3051
             Project: Lucene - Java
          Issue Type: Bug
    Affects Versions: 4.0
            Reporter: Michael McCandless
            Priority: Minor
             Fix For: 3.2, 4.0

Selckin has been running Lucene's tests on the RT branch, and hit a FileNotFoundException (_4_1.del) thrown from SegmentInfo.sizeInBytes during a merge; the full stack trace and random-codec test parameters are quoted in the reassignment notification above.
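The failure above happens because SegmentInfo.sizeInBytes re-reads file lengths while a concurrent commit can delete those files (here _4_1.del) out from under it. As the issue title suggests, one way to avoid this is to snapshot each merging segment's size once, while its files are guaranteed to exist, and consult the snapshot afterwards. The class and method names below are illustrative only, not Lucene's actual API:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: record each segment's size before the merge
// starts, so the merge thread never touches the directory again for
// files that a concurrent commit may have deleted.
class SegmentSizeCache {
    private final Map<String, Long> cachedSizes = new HashMap<>();

    // record the size while the segment's files are known to exist
    void record(String segmentName, long sizeInBytes) {
        cachedSizes.put(segmentName, sizeInBytes);
    }

    // later lookups hit the snapshot instead of the file system
    long sizeOf(String segmentName) {
        Long size = cachedSizes.get(segmentName);
        if (size == null) {
            throw new IllegalStateException("size not recorded for " + segmentName);
        }
        return size;
    }
}
```

The design point is simply that the size becomes immutable state captured at merge start, so it stays valid even after the underlying files are gone.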
[jira] [Commented] (LUCENE-3023) Land DWPT on trunk
[ https://issues.apache.org/jira/browse/LUCENE-3023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13026932#comment-13026932 ]

Simon Willnauer commented on LUCENE-3023:
-----------------------------------------

I committed the CHANGES.TXT patch to the branch. I think we should freeze the branch now so Robert can create a final patch. We should let that patch linger for a while, but I plan to commit this to trunk on Monday. Good work, everybody!

> Land DWPT on trunk
> ------------------
>
>                 Key: LUCENE-3023
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3023
>             Project: Lucene - Java
>          Issue Type: Task
>    Affects Versions: CSF branch, 4.0
>            Reporter: Simon Willnauer
>            Assignee: Simon Willnauer
>             Fix For: 4.0
>
>         Attachments: LUCENE-3023.patch, LUCENE-3023.patch, LUCENE-3023.patch, LUCENE-3023_CHANGES.patch, LUCENE-3023_CHANGES.patch, LUCENE-3023_iw_iwc_jdoc.patch, LUCENE-3023_simonw_review.patch, LUCENE-3023_svndiff.patch, LUCENE-3023_svndiff.patch, diffMccand.py, diffSources.patch, diffSources.patch, realtime-TestAddIndexes-3.txt, realtime-TestAddIndexes-5.txt, realtime-TestIndexWriterExceptions-assert-6.txt, realtime-TestIndexWriterExceptions-npe-1.txt, realtime-TestIndexWriterExceptions-npe-2.txt, realtime-TestIndexWriterExceptions-npe-4.txt, realtime-TestOmitTf-corrupt-0.txt
>
> With LUCENE-2956 we have resolved the last remaining issue for LUCENE-2324 so we can proceed landing the DWPT development on trunk soon. I think one of the bigger issues here is to make sure that all JavaDocs for IW etc. are still correct though. I will start going through that first.

--
This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-3023) Land DWPT on trunk
[ https://issues.apache.org/jira/browse/LUCENE-3023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless updated LUCENE-3023:
---------------------------------------

    Attachment: LUCENE-3023_CHANGES.patch

Small edits to Simon's CHANGES entry.
[jira] [Updated] (LUCENE-3023) Land DWPT on trunk
[ https://issues.apache.org/jira/browse/LUCENE-3023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michael McCandless updated LUCENE-3023:
---------------------------------------

    Attachment: diffSources.patch

Iteration on diffSources.py -- adds usage line, copyright header. I think it's ready to be committed!
[jira] [Updated] (LUCENE-3023) Land DWPT on trunk
[ https://issues.apache.org/jira/browse/LUCENE-3023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Simon Willnauer updated LUCENE-3023:
------------------------------------

    Attachment: LUCENE-3023_CHANGES.patch

Here is my first cut at CHANGES.TXT for landing on trunk. Review would be much appreciated.
[jira] [Commented] (LUCENE-3023) Land DWPT on trunk
[ https://issues.apache.org/jira/browse/LUCENE-3023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13026894#comment-13026894 ]

Simon Willnauer commented on LUCENE-3023:
-----------------------------------------

bq. I put it under a new 'dev-tools/scripts' dir...

+1. Mike, can you add a little doc string to the script explaining what it does and how to use it? I think we should also have a wiki page that explains how to reintegrate a branch, just like we have one for merging changes into a branch.