RE: Test failure question
Hi,

testBarelyCloseEnough(), testExact(), testMulipleTerms(), etc? If so, then NUnit is not doing this. I tested by writing to stdout. NUnit calls SetUp before each test and TearDown after each test. Add Console.WriteLine calls and see the result. Let me show:
--
[TestFixture]
public class TestPhraseQuery
{
    [SetUp]
    protected void SetUp()
    {
        directory = new RAMDirectory();
        IndexWriter writer = new IndexWriter(directory, new WhitespaceAnalyzer(), true);
        ...
        Console.WriteLine("set up");
    }

    [TearDown]
    protected void TearDown()
    {
        searcher.Close();
        directory.Close();
        Console.WriteLine("tear down");
    }

    [Test]
    public void TestNotCloseEnough()
    {
        query.SetSlop(2);
        ...
        MockAssert.AreEqual(0, hits.Length());
        Console.WriteLine("not close");
    }
--
The output:
---
set up
barely
tear down
set up
tear down
...

Pasha Bizhan

- To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
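The ordering described in this message (set up, test, tear down, repeated for every test) can be illustrated with a minimal Java sketch. This is a hand-rolled loop, not NUnit or JUnit; the class name FixtureLifecycleSketch and the runAll helper are hypothetical and exist only for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical stand-in for a test fixture; it only records the call order. */
class FixtureLifecycleSketch {
    final List<String> log = new ArrayList<>();

    void setUp() { log.add("set up"); }
    void tearDown() { log.add("tear down"); }
    void testBarelyCloseEnough() { log.add("barely"); }
    void testNotCloseEnough() { log.add("not close"); }

    /** Wraps every test in its own setUp/tearDown pair, as the message describes. */
    static List<String> runAll() {
        FixtureLifecycleSketch fixture = new FixtureLifecycleSketch();
        List<Runnable> tests = Arrays.asList(
                fixture::testBarelyCloseEnough,
                fixture::testNotCloseEnough);
        for (Runnable test : tests) {
            fixture.setUp();
            try {
                test.run();
            } finally {
                // tearDown runs even if the test throws
                fixture.tearDown();
            }
        }
        return fixture.log;
    }

    public static void main(String[] args) {
        // prints: set up, barely, tear down, set up, not close, tear down (one per line)
        runAll().forEach(System.out::println);
    }
}
```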
RE: Benchmarking results
Hi,

From: Marvin Humphrey [mailto:[EMAIL PROTECTED]]

The test corpus was Reuters-21578, Distribution 1.0. Reuters-21578 is available from David D. Lewis' professional home page, currently: http://www.research.att.com/~lewis

The correct link is http://www.daviddlewis.com/resources/testcollections/reuters21578/

Pasha Bizhan
RE: Test corpus
Hi,

From: Marvin Humphrey [mailto:[EMAIL PROTECTED]]

I'm looking for a test corpus to use for some benchmarking and parsing tests. I can whip one up myself, but it would be nice to use something standardized. I'd like something that doesn't require a license/fee, so that other people can run the same tests. At least 1000 docs, a few hundred words each. Any suggestions?

See the Corpora section at http://wiki.apache.org/jakarta-lucene/Resources

Pasha Bizhan
LUCENE-460
Hi,

A question about the latest CVS changes and hash codes: http://issues.apache.org/jira/browse/LUCENE-460

Could anybody explain the magic numbers 0x6634D93C, 0x2742E74A, and the others? Do they have any special meaning? Is this documented anywhere?

Pasha
RE: Advanced query language
Hi,

From: Erik Hatcher [mailto:[EMAIL PROTECTED]]

MoreLikeThis minNumberShouldMatch=3 maxQueryTerms=30

We're back to MoreLikeThis - it's not currently a Query subclass. How do you envision this sort of thing fitting in if it's not a Query?

But the MoreLikeThis class produces a Query. It's similar to Google's define: search. I think Google handles such queries itself and then redirects the search somewhere else. QueryParser can handle such searches too, using alternative logic to create the Query. For example, we can extend QueryParser with special (syntax) handlers which will create the Query. Something like this:
--
class LikeHandler {};
LikeHandler likeHandler = new LikeHandler(...);
string queryString = "like:(red quick fox)";
Query q = QueryParser.parse(queryString, analyzer, likeHandler);
--
QueryParser scans the input, finds the special command (like:), and then looks up the handler for this command. If the handler exists, the QP calls it to create the Query. There are disadvantages, of course.

Pasha Bizhan
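The handler lookup proposed above might be sketched in Java roughly as follows. This is only an illustration under stated assumptions: SyntaxHandlerSketch, register, and parse are hypothetical names, not part of Lucene's QueryParser, and the handlers return a String description where a real implementation would build a Query.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical sketch: route "command:(body)" query strings to registered handlers. */
class SyntaxHandlerSketch {
    // Command name (e.g. "like") -> handler that turns the body into a query description.
    private final Map<String, Function<String, String>> handlers = new HashMap<>();

    void register(String command, Function<String, String> handler) {
        handlers.put(command, handler);
    }

    String parse(String queryString) {
        int colon = queryString.indexOf(':');
        if (colon > 0) {
            Function<String, String> handler = handlers.get(queryString.substring(0, colon));
            if (handler != null) {
                String body = queryString.substring(colon + 1).trim();
                // Strip one pair of enclosing parentheses, if present.
                if (body.length() >= 2 && body.startsWith("(") && body.endsWith(")")) {
                    body = body.substring(1, body.length() - 1);
                }
                return handler.apply(body);
            }
        }
        // No handler registered for this prefix: fall back to default parsing (stubbed here).
        return "default[" + queryString + "]";
    }

    public static void main(String[] args) {
        SyntaxHandlerSketch parser = new SyntaxHandlerSketch();
        parser.register("like", body -> "moreLikeThis[" + body + "]");
        System.out.println(parser.parse("like:(red quick fox)")); // moreLikeThis[red quick fox]
        System.out.println(parser.parse("title:fox"));            // default[title:fox]
    }
}
```

One consequence of this design is that an ordinary field query such as title:fox is untouched unless someone registers a handler under the same name, which is one place the disadvantages mentioned above could show up.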
RE: Advanced query language
Hi,

From: markharw00d [mailto:[EMAIL PROTECTED]]

Re: MoreLikeThis queries. Yes, they can be usefully wrapped as queries (see attached simple example). In fact it was my attempts at bastardising QueryParser to support them that brought home its limitations. I ended up with a subclass hack that (mis)used the field name to parse a query string like:123 where 123 was a doc id. With the QueryParser syntax I was not able to pass other parameters which MoreLikeThis could usefully use to control the behaviour of this query type, e.g. choice of fieldname(s) used, max number of terms generated, minNumberShouldTerms to match etc etc.

That is with the _current_ QP syntax. With the syntax handlers from my previous message, you would be able to pass the parameters to the handler:

string query = "like(param1, param2,...): (bla-bla-bla)";

The syntax of the parameters is not significant to the QP. The QP does not need to know anything about the parameters' syntax:

string query = "like(percentTermsToMatch=0.25f,docId=44,...):...";

Or

string query = "like(0.25f,44): ...";

This is not unusual, each query type has potentially multiple optional parameters that tweak its behaviour. If I don't have a query language that names the parameters explicitly (say, XML) I end up having to define what looks like a function with a long list of parameters: like (123,,,4,,,). Ack.

Exactly.

Pasha Bizhan
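The point that the parser need not understand the parameters can be sketched in Java as follows. The class ParameterizedCommandSketch and its split method are hypothetical, as is the exact grammar accepted by the regular expression; the sketch only separates the command name, the raw parameter text, and the body, and leaves interpretation of the parameters to whichever handler receives them.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch: split "command(params): body" without interpreting params. */
class ParameterizedCommandSketch {
    // Group 1: command name. Group 2: optional raw parameter text. Group 3: query body.
    private static final Pattern COMMAND =
            Pattern.compile("^(\\w+)(?:\\(([^)]*)\\))?:\\s*(.*)$");

    /** Returns {command, rawParams, body}, or null if the input has no command prefix. */
    static String[] split(String queryString) {
        Matcher m = COMMAND.matcher(queryString.trim());
        if (!m.matches()) {
            return null;
        }
        // The parameter text stays opaque; only the handler would parse it.
        String rawParams = m.group(2) == null ? "" : m.group(2);
        return new String[] { m.group(1), rawParams, m.group(3) };
    }

    public static void main(String[] args) {
        String[] parts = split("like(percentTermsToMatch=0.25f,docId=44): (red quick fox)");
        System.out.println(parts[0]); // like
        System.out.println(parts[1]); // percentTermsToMatch=0.25f,docId=44
        System.out.println(parts[2]); // (red quick fox)
    }
}
```

Because the parameter text is passed through untouched, both the named form and the positional form like(0.25f,44) go through the same split; each handler decides which form it accepts.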
[jira] Commented: (LUCENE-474) High Frequency Terms/Phrases at the Index level
[ http://issues.apache.org/jira/browse/LUCENE-474?page=comments#action_12358629 ]

Pasha Bizhan commented on LUCENE-474:

Look for the HighFreqTerms class in the contrib area: http://svn.apache.org/viewcvs.cgi/lucene/java/trunk/contrib/miscellaneous/src/java/org/apache/lucene/misc/HighFreqTerms.java?rev=164963&view=log

High Frequency Terms/Phrases at the Index level
---
Key: LUCENE-474
URL: http://issues.apache.org/jira/browse/LUCENE-474
Project: Lucene - Java
Type: New Feature
Versions: 1.4
Reporter: Suri Babu B

We should be able to find all the high frequency terms/phrases (where frequency is the search criteria / benchmark)
[jira] Commented: (LUCENE-474) High Frequency Terms/Phrases at the Index level
[ http://issues.apache.org/jira/browse/LUCENE-474?page=comments#action_12358643 ]

Pasha Bizhan commented on LUCENE-474:

I understand what high-frequency terms are. But what are high-frequency phrases? Could you please explain your index structure?
RE: class for delete/add access to an index
Hi,

From: Daniel Naber [mailto:[EMAIL PROTECTED]]

What do you think? If this gets accepted, it also needs a better name.

Please also add an API for searching, like this one: http://searchblackbox.com/sdk/api/SearchBlackBox.SearchEngine.ExecuteSearch.html

Pasha Bizhan
Re: Questions about DeleteFile method
Hi,

George Aroush [EMAIL PROTECTED] wrote:

All: Speaking of my port work for 1.9 RC1, I don't have a clear idea what to do about java.util.zip. There is no equivalent in .NET and it is being used in Lucene 1.9 RC1 for Index.FieldsWriter and Index.FieldsReader. Any suggestion?

SharpZipLib. We use it for our port :)) Our current compatibility tests work well, but we do not have final results yet.

Pasha Bizhan
http://lucenedotnet.com
Re: java.util.zip (was Questions about DeleteFile method)
Hi,

Monsur Hossain [EMAIL PROTECTED] wrote:

Hmm, but upon first look I don't see a direct analog to the Inflater/Deflater methods.

using ICSharpCode.SharpZipLib.Zip;
using ICSharpCode.SharpZipLib.Zip.Compression;

// Create the compressor with highest level of compression
Deflater compressor = new Deflater();
compressor.SetLevel(Deflater.BEST_COMPRESSION);

and so on.

Pasha Bizhan
http://lucenedotnet.com
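For comparison, the java.util.zip side that a port has to mirror might look like the sketch below. It uses only the standard Deflater and Inflater classes; the class name DeflateRoundTripSketch and its helper methods are hypothetical, and this is not claimed to be the code in Lucene's FieldsWriter or FieldsReader.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

/** Hypothetical sketch: compress and decompress a byte array with java.util.zip. */
class DeflateRoundTripSketch {
    static byte[] compress(byte[] input) {
        // Create the compressor with the highest level of compression.
        Deflater compressor = new Deflater();
        compressor.setLevel(Deflater.BEST_COMPRESSION);
        compressor.setInput(input);
        compressor.finish();

        ByteArrayOutputStream out = new ByteArrayOutputStream(input.length);
        byte[] buf = new byte[1024];
        while (!compressor.finished()) {
            int count = compressor.deflate(buf);
            out.write(buf, 0, count);
        }
        compressor.end();
        return out.toByteArray();
    }

    static byte[] decompress(byte[] input) {
        Inflater decompressor = new Inflater();
        decompressor.setInput(input);
        ByteArrayOutputStream out = new ByteArrayOutputStream(input.length);
        byte[] buf = new byte[1024];
        try {
            while (!decompressor.finished()) {
                int count = decompressor.inflate(buf);
                if (count == 0 && (decompressor.needsInput() || decompressor.needsDictionary())) {
                    // Avoid looping forever on truncated or dictionary-based data.
                    throw new DataFormatException("incomplete or unsupported deflate stream");
                }
                out.write(buf, 0, count);
            }
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("input is not valid deflate data", e);
        } finally {
            decompressor.end();
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] original = "quick quick quick brown fox".getBytes(StandardCharsets.UTF_8);
        byte[] restored = decompress(compress(original));
        System.out.println(Arrays.equals(original, restored)); // true
    }
}
```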