[ 
https://issues.apache.org/jira/browse/LUCENENET-600?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16497674#comment-16497674
 ] 

Jens Melgaard commented on LUCENENET-600:
-----------------------------------------

Another option would be to use ANTLR4 to generate a parser. I wanted to add 
that information because I have been looking for an ANTLR4 grammar for Lucene 
for ages and it was rather difficult to find.

In the end I stumbled on [https://github.com/lrowe/lucenequery] 

I have been struggling to integrate it into the Visual Studio + .NET Standard 
tool chain. I managed to get it working well enough to at least build it, but 
it is certainly not a pleasant development experience (yet). 
According to [https://github.com/tunnelvisionlabs/antlr4cs], it should work 
better if I installed Java to do the generation, but I didn't bother.

Project example can be seen here: 
[https://github.com/dotJEM/json-index/tree/Lucene-v4.8/DotJEM.Json.Index/DotJEM.Json.Index.QueryParsers]

(The scope here is quite a bit broader than just a query parser, and it is only 
partially inspired by the lrowe grammar; the main idea was just to get 
something simpler working to begin with.)

 

> Creating an IndexWriter with a RAMDirectory causes two exceptions to be thrown
> ------------------------------------------------------------------------------
>
>                 Key: LUCENENET-600
>                 URL: https://issues.apache.org/jira/browse/LUCENENET-600
>             Project: Lucene.Net
>          Issue Type: Bug
>          Components: Lucene.Net Core
>    Affects Versions: Lucene.Net 4.8.0
>            Reporter: Howard van Rooijen
>            Priority: Minor
>
> I have a document scoring algorithm built on top of Lucene. I've just 
> upgraded it to the 4.8.0-beta00005 packages (great job by the way).
> We essentially create an in memory index for a single document in order to do 
> some parsing / processing / scoring / classification.
> I noticed while running our test suite that the CPU was spiking and also 
> noticed that a large number of first chance exceptions were being generated 
> by these two lines of code:
> {{var directory = new RAMDirectory();}}
> {{var indexWriter = new IndexWriter(directory, new 
> IndexWriterConfig(LuceneVersion.LUCENE_48, new 
> ScorableDocumentAnalyzer(LuceneVersion.LUCENE_48)));}}
> The first exception is:
> {{'System.IO.FileNotFoundException' in Lucene.Net.dll ("segments.gen"). }}
> The second exception is:
> {{'Lucene.Net.Index.IndexNotFoundException' in Lucene.Net.dll ("no segments* 
> file found in RAMDirectory@21af1a5 
> lockFactory=Lucene.Net.Store.SingleInstanceLockFactory:}}
> Based on reading / research, I believe this is because the RAMDirectory is 
> initialised to be null, and when the IndexWriter is created it tries to query 
> the RAMDirectory and a FileNotFoundException is thrown.
> Is it possible to initialise it as empty rather than null, so that reading 
> the directory would not throw an exception? This might involve adding a 
> "segments.gen" entry and a matching "segments_n" segmentinfo entry. 
> Alternatively, is it possible not to throw an exception in this use case? 
> Or do you have a suggestion for how it would be possible to manually 
> initialise the RAMDirectory before passing it to the IndexWriter?
> Because these two lines are being called per request, we're seeing 2 
> exceptions per request, which seems like an expensive way of initialising an 
> IndexWriter. We've already had to replace QueryParser with SimpleQueryParser 
> because QueryParser was throwing 50+ exceptions internally when being 
> instantiated.
> If anyone can point me in the right direction, I'd be more than happy to try 
> and create a fix / PR. But since RAMDirectory is often used for unit testing 
> scenarios, I'm wondering whether anyone has any deep knowledge about why the 
> current behaviour is the default. 
> Many Thanks,
> Howard
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
