RE: [Lucene.Net] [jira] [Created] (LUCENENET-469) Convert Java Iterator classes to implement IEnumerable<T>
Hi Andy,

I have used similar extension methods for a long time. What I especially like about extension methods is that they don't require changes in the Lucene.Net kernel and keep further ports of Lucene.java independent from .NET structures.

+1 for a Lucene.Net extensions in contrib.
Even a +1 for a Lucene.Net.Extensions for the core.

DIGY

-----Original Message-----
From: Andy Pook [mailto:andy.p...@gmail.com]
Sent: Friday, June 08, 2012 10:26 PM
To: lucene-net-dev@lucene.apache.org
Subject: Re: [Lucene.Net] [jira] [Created] (LUCENENET-469) Convert Java Iterator classes to implement IEnumerable<T>

If we don't want to add IEnumerable (though it seems that IEnumerable could be added in parallel with the existing pattern) could we add a bunch of extension methods? Something like the following...

{noformat}
public static class LuceneExtensions
{
    public static IEnumerable<Term> GetEnumerable(this TermEnum termEnum)
    {
        yield return termEnum.Term();
        while (termEnum.Next())
            yield return termEnum.Term();
    }
}
{noformat}

Then you can...

{noformat}
foreach (var e in myTermEnum.GetEnumerable())
{
    // do stuff with e
}
{noformat}

Not as elegant as a direct implementation but gives easy enough access to foreach semantics.

The second option is to realize that you don't need to explicitly implement IEnumerable. You just need a GetEnumerator method. So just add...

{noformat}
// foreach only needs a GetEnumerator() whose result has MoveNext() and Current
public IEnumerator<Term> GetEnumerator()
{
    yield return Term();
    while (Next())
        yield return Term();
}
{noformat}

Now you get nice foreach semantics without even mentioning IEnumerable. Compiler magic is your friend :-)

BTW: Dispose() is only called automatically when exiting a using block. Exiting a foreach will not.
Cheers,
Andy

On 24 January 2012 06:37, Christopher Currens (Created) (JIRA) j...@apache.org wrote:

Convert Java Iterator classes to implement IEnumerable<T>
---------------------------------------------------------
Key: LUCENENET-469
URL: https://issues.apache.org/jira/browse/LUCENENET-469
Project: Lucene.Net
Issue Type: Sub-task
Components: Lucene.Net Contrib, Lucene.Net Core
Affects Versions: Lucene.Net 2.9.4, Lucene.Net 3.0.3, Lucene.Net 2.9.4g
Environment: all
Reporter: Christopher Currens
Fix For: Lucene.Net 3.0.3

The Iterator pattern in Java is equivalent to IEnumerable in .NET. Classes that were directly ported from Java using the Iterator pattern cannot be used with Linq or foreach blocks in .NET.

{{Next()}} would be equivalent to .NET's {{MoveNext()}}, and in the case below, {{Term()}} would be .NET's {{Current}} property. In cases such as the one below, it will require {{TermEnum}} to become an abstract class with {{Term}} and {{DocFreq}} properties, which would be returned from another class or method that implemented {{IEnumerable<TermEnum>}}.

{noformat}
public abstract class TermEnum : IDisposable
{
    public abstract bool Next();
    public abstract Term Term();
    public abstract int DocFreq();
    public abstract void Close();
    public abstract void Dispose();
}
{noformat}

would instead look something like:

{noformat}
public abstract class TermFreq
{
    public abstract Term Term { get; }
    public abstract int DocFreq { get; }
}

public abstract class TermEnum : IEnumerable<TermFreq>, IDisposable
{
    // ...
}
{noformat}

Keep in mind that if the class being converted implements {{IDisposable}}, the class that is enumerating the terms (in this case {{TermEnum}}) should inherit from both {{IEnumerable<T>}} *and* {{IDisposable}}. This won't be any change to the user, as the compiler automatically calls {{Dispose()}} when used in a {{foreach}} loop.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

-----
Checked by AVG - www.avg.com
Version: 2012.0.2177 / Virus Database: 2433/5056 - Release Date: 06/08/12
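A minimal sketch of the extension-method idea from this thread, for illustration only. `FakeTermEnum` is a hypothetical stand-in for a Java-style cursor and is not the Lucene.Net `TermEnum`; the `try`/`finally` that disposes the cursor is an assumption about ownership, not something the thread agreed on. It relies on the fact that `foreach` disposes the enumerator it obtains, so an iterator's `finally` block runs even when the loop exits early.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a Java-style cursor such as TermEnum.
// It starts positioned on the first term, like the examples in the thread assume.
public sealed class FakeTermEnum : IDisposable
{
    private readonly string[] terms;
    private int pos; // index of the current term

    public bool Disposed { get; private set; }

    public FakeTermEnum(params string[] terms) { this.terms = terms; }

    // Current term, or null when the cursor is exhausted (or empty).
    public string Term() { return pos < terms.Length ? terms[pos] : null; }

    // Advances the cursor; returns false when there are no more terms.
    public bool Next() { pos++; return pos < terms.Length; }

    public void Dispose() { Disposed = true; }
}

public static class CursorExtensions
{
    // Adapts the cursor to IEnumerable<string> so it works with foreach and LINQ.
    public static IEnumerable<string> GetEnumerable(this FakeTermEnum termEnum)
    {
        try
        {
            if (termEnum.Term() == null) yield break; // empty cursor
            do
            {
                yield return termEnum.Term();
            } while (termEnum.Next());
        }
        finally
        {
            // Assumption: the wrapper owns the cursor. This block runs when
            // enumeration finishes or when foreach disposes the enumerator early.
            termEnum.Dispose();
        }
    }
}
```

With this shape, `foreach (var t in cursor.GetEnumerable())` releases the cursor whether the loop completes or breaks out early. If callers are expected to keep the cursor open after enumerating, the `finally` block would have to go.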
RE: Welcome Itamar Syn-Hershko as a new committer
Welcome Itamar!

DIGY

-----Original Message-----
From: Prescott Nasser [mailto:geobmx...@hotmail.com]
Sent: Wednesday, May 23, 2012 12:06 AM
To: lucene-net-dev@lucene.apache.org; lucene-net-u...@lucene.apache.org
Subject: Welcome Itamar Syn-Hershko as a new committer

Hey all,

I'd like to officially welcome Itamar as a new committer. I know the community appreciates the work you've been doing with the Spatial contrib project and the past help you've provided on the mailing lists.

Please join me in welcoming Itamar,

~Prescott
[jira] [Closed] (LUCENENET-486) Wildcard queries are not analyzed
[ https://issues.apache.org/jira/browse/LUCENENET-486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Digy closed LUCENENET-486.
--------------------------
Resolution: Not A Problem

http://wiki.apache.org/lucene-java/LuceneFAQ#Are_Wildcard.2C_Prefix.2C_and_Fuzzy_queries_case_sensitive.3F

Wildcard queries are not analyzed
---------------------------------
Key: LUCENENET-486
URL: https://issues.apache.org/jira/browse/LUCENENET-486
Project: Lucene.Net
Issue Type: Bug
Components: Lucene.Net Contrib, Lucene.Net Core
Affects Versions: Lucene.Net 2.9.2, Lucene.Net 2.9.4
Environment: Windows 7, Visual Studio 2010, .net 4.0
Reporter: Björn
Attachments: LuceneTest.zip

The Lucene 'QueryParser' doesn't analyze wildcard queries. The function 'GetPrefixQuery' (QueryParser.cs) returns the string without any analysis.

I have performed some queries to show the problem. The analyzer is the 'Contrib.Analyzers.DE.GermanAnalyzer'.

indexed word: 'Häuser'; in the index stemmed as: 'hau'
  query: Hau*; hit: yes
  query: Hause*; hit: no; This should be a hit.

indexed word: 'Angebote'; in the index stemmed as: 'angebo'
  query: Angebo*; hit: yes
  query: Angebot*; hit: no; This should be a hit.
  query: Angebote*; hit: no; This should be a hit.

indexed word: 'Björn'; in the index stemmed as: 'bjor'
  query: Bjor*; hit: yes
  query: Björ*; hit: no; This should be a hit.
[jira] [Closed] (LUCENENET-486) Wildcard queries are not analyzed
[ https://issues.apache.org/jira/browse/LUCENENET-486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Digy closed LUCENENET-486.
--------------------------
Resolution: Won't Fix

Wildcard queries are not analyzed
---------------------------------
Key: LUCENENET-486
URL: https://issues.apache.org/jira/browse/LUCENENET-486
[jira] [Commented] (LUCENENET-486) Wildcard queries are not analyzed
[ https://issues.apache.org/jira/browse/LUCENENET-486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13257792#comment-13257792 ]

Digy commented on LUCENENET-486:
---------------------------------
bq. Of course I have solved the problem for myself.

Then no problem

DIGY

Wildcard queries are not analyzed
---------------------------------
Key: LUCENENET-486
URL: https://issues.apache.org/jira/browse/LUCENENET-486
RE: Wildcard queries are not analyzed
GetPrefixQuery doesn't use analyzers, and it is a well-known issue of Lucene.

Suppose a hypothetical analyzer (with stemming) which stems 'went' as 'go', and you want to search for 'wentworth miller'. A search like 'went*' would be converted to 'go*', which I guess wouldn't be what you want.

DIGY

-----Original Message-----
From: Björn Kremer [mailto:b...@patorg.de]
Sent: Tuesday, April 17, 2012 12:59 PM
To: lucene-net-dev@lucene.apache.org
Subject: Wildcard queries are not analyzed

Hello,

maybe I have found a little Lucene problem: wildcard queries are not analyzed correctly. I'm using the German analyzer with the 'GermanDIN2Stemmer'. In the Lucene index my name ('Björn') is stored as 'bjorn'. If I perform a wildcard query like 'björ*', the function 'GetPrefixQuery' does not analyze the search term. So the resulting query is 'björ*' instead of 'bjor*'. (björ* = no match, bjor* = match)

Thank You
Björn
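One way to get the 'bjor*' behaviour Björn asks for without running a stemmer over the prefix is to apply only character-level normalization (lowercasing and diacritic folding) to the prefix text before the prefix query is built. The sketch below is an illustration under assumptions: `PrefixNormalizer` and `Fold` are hypothetical names, not Lucene.Net APIs, and the approach only helps when the index terms were lowercased and folded in the same way.

```csharp
using System;
using System.Globalization;
using System.Text;

public static class PrefixNormalizer
{
    // Lowercases the prefix and strips combining marks (e.g. 'ö' becomes 'o').
    // No stemming is applied, so the prefix is never rewritten to another word.
    public static string Fold(string prefix)
    {
        // FormD splits accented letters into a base letter plus combining marks.
        string decomposed = prefix.ToLowerInvariant().Normalize(NormalizationForm.FormD);
        var sb = new StringBuilder(decomposed.Length);
        foreach (char c in decomposed)
        {
            // Drop the combining marks, keep everything else.
            if (CharUnicodeInfo.GetUnicodeCategory(c) != UnicodeCategory.NonSpacingMark)
                sb.Append(c);
        }
        return sb.ToString().Normalize(NormalizationForm.FormC);
    }
}
```

A caller would pass the text in front of the `*` through `Fold` and build the prefix query from the result. Because no stemming happens, a prefix such as 'went' is left alone, which sidesteps the 'went*' to 'go*' problem described above; by the same token, 'Hause*' would still miss a term that was stemmed to 'hau'.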
RE: [Lucene.Net] [VOTE] Apache-Lucene-2.9.4g-incubating-RC1 Release (take 2)
+1

DIGY

-----Original Message-----
From: Prescott Nasser [mailto:geobmx...@hotmail.com]
Sent: Thursday, January 26, 2012 1:56 AM
To: lucene-net-dev@lucene.apache.org
Cc: lucene-net-...@incubator.apache.org
Subject: RE: [Lucene.Net] [VOTE] Apache-Lucene-2.9.4g-incubating-RC1 Release (take 2)

Thanks for the +1. We need one more vote here, then Stefan will be comfortable giving us a plus one, which will give us two plus ones in general, and I'll only have to beg for one more :)

Sent from my Windows Phone

From: Michael Herndon
Sent: 1/25/2012 11:15 AM
To: lucene-net-dev@lucene.apache.org
Cc: lucene-net-...@incubator.apache.org
Subject: Re: [Lucene.Net] [VOTE] Apache-Lucene-2.9.4g-incubating-RC1 Release (take 2)

Verified tests pass and checksums match, so +1.

@P, I remember that thread. Those guys stay busy though, and a devops mentality is different from a dev's. Our needs probably exceed what the svn CMS is meant for, due to documentation.

I am curious if infra allows for, or would allow us, to throw up a static mono/asp.net mvc site in the future, just so that we could dogfood the site with search using Lucene.Net and then have it index certain pages or sites (wiki, tutorials, static site, docs).

We'll probably need to dig out our CMS options again and weigh them against short-term and long-term goals.

On Wed, Jan 25, 2012 at 12:31 PM, Prescott Nasser geobmx...@hotmail.com wrote:

You know, even making a small change to the website like updating the news takes like 30 minutes to run now because of all the files. It's absolutely ridiculous. I got chided by the CMS group, yet when asked how we put documentation online with the new system there were crickets.

Sent from my Windows Phone

From: Michael Herndon
Sent: 1/25/2012 8:26 AM
To: lucene-net-dev@lucene.apache.org
Cc: lucene-net-...@incubator.apache.org
Subject: Re: [Lucene.Net] [VOTE] Apache-Lucene-2.9.4g-incubating-RC1 Release (take 2)

I was not able to download the binaries till this morning. The wiki was also having issues.
I ran RAT on the released source; that seems fine. I did a compare on the src zip and the tag, and it matches. The only things I saw are nitpicks.

In the ReadMe, the link should point to its respective tag instead of RC3 for just 2_9_4: https://svn.apache.org/repos/asf/incubator/lucene.net/tags/Lucene.Net_2_9_4_RC3/lib/ should be

When releasing the source in the future, we should either include a script that pulls the lib for the developers who want to compile from source inside a tag when the project is built using the solution, or we should invest in using something like NuGet for dependencies, so that the dependencies are automatically fetched somehow and we can remove those from svn/scm altogether. The source currently violates the "don't make me think about it" principle.

I know we all dislike chms, but until we figure out a better way of posting the generated MSDN-style documentation online, we should include that in releases as well. The static website version generates a high number of static html files, and our current CMS requires that those files are pushed into SVN, which just is not feasible. Committing that all at once will choke infra's setup (and if they hired ninjas to pay us a visit, I probably wouldn't blame them), and doing partial commits is just borderline insanity.

Just waiting on all the tests to finish running. http://xkcd.com/303/

On Wed, Jan 25, 2012 at 6:29 AM, Stefan Bodewig bode...@apache.org wrote:

On 2012-01-25, Michael Herndon wrote:

> Stefan what did you use to check the eof of files for svn?

Pretty much a long and boring manual process. I did something like

find . -name \*.cs -print0 | xargs -0 -e svn ps svn:eol-style native

i.e. tried to set the eol-style property on all C# source files. This won't do anything if the property is already set, and svn status will tell you it has changed something if the property hasn't been set before.
svn will also fail if the file in question contains inconsistent line ends; this is the case for the NUnit doc files and even some of the Lucene.NET sources. Repeat for all other file extensions that should map to text files.

> I'm setting up RAT on my local. Are there any other tools that you or ASF recommends in general to validate releases?

I think Sebb has a bunch of scripts he uses, but I never bothered to look them up. If so, they'd be inside the committers svn repo.

For this release you don't even need to check line-feeds; the properties have not been set on all files. The patch I provided a while ago only applied to trunk. To me this is no reason to stop the release, in particular since most files have Windows line-ends and Prescott built the release on Windows, so the files would be the same with and without svn:eol-style anyway.

I intend to provide a new patch for the 3.0.3 branch once
RE: [Lucene.Net] Extracting IsStored fields from a search result
Hi Chris,

There are many methods of IndexReader you can make use of. For example, GetFieldNames to get the field names in the index, or Terms to get the terms. I posted here (http://pastebin.com/aiCy26mj) a simple example to get all terms in the field myfield.

DIGY

-----Original Message-----
From: Chris Lomax [mailto:chri...@helloevery1.com]
Sent: Wednesday, January 11, 2012 1:27 PM
To: lucene-net-dev@lucene.apache.org
Subject: [Lucene.Net] Extracting IsStored fields from a search result

I am trying to extract any field that is IsStored for use in filtering lists, e.g. when you access an ecommerce site and you get the filter lists down the left hand side. I wanted to store anything that I wanted to filter on, and when I performed a search I wanted to create a lookup list from these IsStored values.

I started something yesterday where I iterated over the fields returned from the search results and picked out anything that was IsStored, but I feel there must be a more efficient way of doing this. I cannot find any documentation relating to this type of extraction. Could anyone offer any advice on this?

Also, the dll does not work with vb.net, as there are members with the same name in the classes. For example, ScoreDocs in Lucene.Net.Search.TopDocs has both a variable and a property declaration, but they are both Public, so VB.NET cannot see the property. I had to modify the class so the variable was private.

Thanks
Chris
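The per-hit approach Chris describes can be sketched as below. This is only an illustration of the grouping step under assumptions: `StoredField` and `FilterLists` are hypothetical stand-ins, not Lucene.Net types, and each hit is modelled as a plain sequence of fields rather than a real search result.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for one field of a search hit.
public sealed class StoredField
{
    public string Name { get; private set; }
    public string Value { get; private set; }
    public bool IsStored { get; private set; }

    public StoredField(string name, string value, bool isStored)
    {
        Name = name;
        Value = value;
        IsStored = isStored;
    }
}

public static class FilterLists
{
    // For every stored field name, collects the distinct values seen in the
    // hits together with the number of hits that carry each value.
    public static Dictionary<string, SortedDictionary<string, int>> Build(
        IEnumerable<IEnumerable<StoredField>> hits)
    {
        var result = new Dictionary<string, SortedDictionary<string, int>>();
        foreach (var hit in hits)
        {
            foreach (var field in hit)
            {
                if (!field.IsStored) continue; // only stored values can be read back

                SortedDictionary<string, int> values;
                if (!result.TryGetValue(field.Name, out values))
                {
                    values = new SortedDictionary<string, int>(StringComparer.Ordinal);
                    result[field.Name] = values;
                }

                int count;
                values.TryGetValue(field.Value, out count); // count stays 0 when absent
                values[field.Value] = count + 1;
            }
        }
        return result;
    }
}
```

This walks every hit, so its cost grows with the size of the result set; enumerating the terms of a field through the IndexReader, as DIGY suggests, avoids loading the hits but yields the values for the whole index rather than for one search.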
[Lucene.Net] [jira] [Commented] (LUCENENET-463) Would like to be able to use a SimpleSpanFragmenter for extrcting whole sentances
[ https://issues.apache.org/jira/browse/LUCENENET-463?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13180965#comment-13180965 ]

Digy commented on LUCENENET-463:
---------------------------------
My guess: Either it is not as widely used as you think, or no one is willing to port it :)

DIGY

Would like to be able to use a SimpleSpanFragmenter for extrcting whole sentances
---------------------------------------------------------------------------------
Key: LUCENENET-463
URL: https://issues.apache.org/jira/browse/LUCENENET-463
Project: Lucene.Net
Issue Type: New Feature
Components: Lucene.Net Contrib
Reporter: Steven
Priority: Minor

This is described in the Java version, but it does not seem to be in the .NET port. Is there a reason for this? I would have imagined lots of people doing document work would want it.
RE: [Lucene.Net] Lucene.Net 3 onwards and 2.9.4g
When I started that g branch, I had no intention to change the API, but in the end it resulted in a few changes like StopAnalyzer(List<string> stopWords), Query.ExtractTerms(ICollection<string>) etc. But I think a drop-in replacement will work for most Lucene.Net users. (Of course some contribs have also been modified accordingly.)

Changing ArrayLists/collections to their generic counterparts, GetEnumerator's to foreach, AnonymousClass's to Func's or Action's, and fixing LUCENENET-172 are things most people would not notice.

This g version also includes some other patches that were fixed for .GE. (>=) Lucene 3.1. (Which ones? I would have to go back through my commits.)

So, there isn't much change in the API; there are more changes for developers and more stable code. (At least I think so, since I have used this g version in a production environment for months without any problem. In short, 2.9.4g is a superset of 2.9.4 at the bugfix level.)

As a result, creating a new branch for a .NET-friendly Lucene.Net or continuing on this branch is just a matter of taste.

DIGY

-----Original Message-----
From: Scott Lombard [mailto:lombardena...@gmail.com]
Sent: Thursday, December 29, 2011 5:05 PM
To: lucene-net-...@lucene.apache.org
Subject: RE: [Lucene.Net] Lucene.Net 3 onwards and 2.9.4g

I don't see the g branch differing all that much from the line-by-line port. All the g branch does is change some data types to generics, but line by line the code is the same once the generics are declared.

I don't see 2.9.4g being any closer to a .NET style version than 2.9.4. While it does use generics for list-style variable types, the underlying classes are still the same, and all of the problems with 2.9.4 not being .NET enough would be true in 2.9.4g.

I would have to defer to Digy on whether it changes how an end user interacts with Lucene.NET. If it does not affect how the end user interacts with Lucene.NET, then I think we should merge it into the trunk and go from there on 3.0.3.
Scott

-----Original Message-----
From: Prescott Nasser [mailto:geobmx...@hotmail.com]
Sent: Wednesday, December 28, 2011 8:28 PM
To: lucene-net-...@lucene.apache.org
Subject: RE: [Lucene.Net] Lucene.Net 3 onwards and 2.9.4g

Any reason we can't continue this g branch and make it more and more .NET like? I was thinking about what we've expressed as goals: we want a line-by-line port, since it's easy to maintain parity with Java and easy to compare. We also want a more .NET version, and the g branch gets this started, although it's not as .NET as people want (I think).

What if we used the g branch as our .NET version and continued to make it more .NET like, and kept the trunk as the line-by-line port? The g branch seems like a good start to the more .NET version anyway, so we might as well build off of that?
RE: [Lucene.Net] Lucene.Net 3 onwards and 2.9.4g
I forgot to mention, 2.9.4g implements IDisposable for many of the classes that have a Close() method, which can be thought of as a .NET-friendly API.

DIGY
RE: [Lucene.Net] Lucene.Net 3 onwards and 2.9.4g
> but I guess the future of 2.9.4g depends on the extent that it is becoming more .NET like

My intention while I was creating that branch was just to make 2.9.4 a little bit more .NET like (+ maybe some performance). I used a lot of code from 3.0.3 Java, so it is somewhere between 2.9.4 and 3.0.3. But I didn't think of it as a separate branch to evolve on its own path. It is (or I think it is) the final version of 2.9.

DIGY

-----Original Message-----
From: Christopher Currens [mailto:currens.ch...@gmail.com]
Sent: Wednesday, December 28, 2011 9:20 PM
To: lucene-net-dev@lucene.apache.org
Cc: lucene-net-u...@lucene.apache.org
Subject: Re: [Lucene.Net] Lucene.Net 3 onwards and 2.9.4g

One of the benefits of moving forward with the conversion of the Java Lucene is that they're using more recent versions of Java that support things like generics and enums, so the direct port is getting more and more like .NET, though not in all respects of course.

I'm of the mind, though, that one of the larger annoyances, Iterables, should be converted to Enumerables in the direct port. It is a pain to use in .NET without it inheriting from IEnumerable, since it can't be used in a foreach loop or with LINQ. Also, since the direct port isn't perfect anyway, it seems a port of the IDEA of iterating would be more in the spirit of what we're trying to accomplish, since the code would pretty much be the same, just with different method names.

I sort of got off topic there for a second, but I guess the future of 2.9.4g depends on the extent that it is becoming more .NET like. Obviously, while Java is starting to use similar constructs to those we have in .NET, it will never be perfect. Admittedly, I haven't looked at 2.9.4g in a little while, so I'm not sure how much it now differs from 3.x, since there's a relatively large change there already.
Thanks,
Christopher

On Thu, Dec 22, 2011 at 9:13 PM, Prescott Nasser geobmx...@hotmail.com wrote:

That's a great question. I know a lot of people like the generics, and I don't really want it to disappear. I'd like to keep it in parity with the trunk.

But I know we also have a goal of making Lucene.Net more .NET like (further than 2.9.4g), and I don't know how that fits in. We are a pretty small community and I know everyone has some pretty busy schedules, so it takes us considerable time to make big progress.

Trying to keep three different code bases probably isn't the right way to go.

Date: Fri, 23 Dec 2011 13:02:03 +1100
From: mitiag...@gmail.com
To: lucene-net-u...@lucene.apache.org
Subject: [Lucene.Net] Lucene.Net 3 onwards and 2.9.4g

I was browsing the Roadmap emails from November in the Lucene developer list. It remains unclear in what state the Lucene 3 porting is, but my question is more about 2.9.4g. Is it a kind of experimental dead-end variation of 2.9.4 with generics? Am I right in classifying it as a more .NET-like 2.9.4 which is unrelated to the roadmap's Lucene 3 porting effort?
[Lucene.Net] [jira] [Updated] (LUCENENET-459) Italian stemmer (from SnowballAnalyzer) does not work
[ https://issues.apache.org/jira/browse/LUCENENET-459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Digy updated LUCENENET-459:
---------------------------
Affects Version/s: Lucene.Net 2.9.4g
Fix Version/s: Lucene.Net 2.9.4g

Italian stemmer (from SnowballAnalyzer) does not work
-----------------------------------------------------
Key: LUCENENET-459
URL: https://issues.apache.org/jira/browse/LUCENENET-459
Project: Lucene.Net
Issue Type: Bug
Components: Lucene.Net Contrib
Affects Versions: Lucene.Net 2.9.2, Lucene.Net 2.9.4, Lucene.Net 2.9.4g
Reporter: Santiago M. Mola
Fix For: Lucene.Net 2.9.4g

Italian stemmer does not work. Consider this code:

var englishAnalyzer = new SnowballAnalyzer("English");
var tk = englishAnalyzer.TokenStream("text", new StringReader("horses"));
var ta = (TermAttribute)tk.GetAttribute(typeof(TermAttribute));
tk.IncrementToken();
Console.WriteLine("English stemmer: horses -> " + ta.Term());

var italianAnalyzer = new SnowballAnalyzer("Italian");
tk = italianAnalyzer.TokenStream("text", new StringReader("abbandonata"));
ta = (TermAttribute)tk.GetAttribute(typeof(TermAttribute));
tk.IncrementToken();
Console.WriteLine("Italian stemmer: abbandonata -> " + ta.Term());

It outputs:

English stemmer: horses -> hors
Italian stemmer: abbandonata -> abbandonata

While Java Lucene 2.9.4 outputs:

English stemmer: horses -> hors
Italian stemmer: abbandonata -> abbandon
[Lucene.Net] [jira] [Commented] (LUCENENET-459) Italian stemmer (from SnowballAnalyzer) does not work
[ https://issues.apache.org/jira/browse/LUCENENET-459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13175117#comment-13175117 ] Digy commented on LUCENENET-459: Fixed in 2.9.4g branch. DIGY Italian stemmer (from SnowballAnalyzer) does not work - Key: LUCENENET-459 URL: https://issues.apache.org/jira/browse/LUCENENET-459 Project: Lucene.Net Issue Type: Bug Components: Lucene.Net Contrib Affects Versions: Lucene.Net 2.9.2, Lucene.Net 2.9.4, Lucene.Net 2.9.4g Reporter: Santiago M. Mola Fix For: Lucene.Net 2.9.4g Italian stemmer does not work. Consider this code: var englishAnalyzer = new SnowballAnalyzer("English"); var tk = englishAnalyzer.TokenStream("text", new StringReader("horses")); var ta = (TermAttribute)tk.GetAttribute(typeof(TermAttribute)); tk.IncrementToken(); Console.WriteLine("English stemmer: horses -> " + ta.Term()); var italianAnalyzer = new SnowballAnalyzer("Italian"); tk = italianAnalyzer.TokenStream("text", new StringReader("abbandonata")); ta = (TermAttribute)tk.GetAttribute(typeof(TermAttribute)); tk.IncrementToken(); Console.WriteLine("Italian stemmer: abbandonata -> " + ta.Term()); It outputs: English stemmer: horses -> hors Italian stemmer: abbandonata -> abbandonata While Java Lucene 2.9.4 outputs: English stemmer: horses -> hors Italian stemmer: abbandonata -> abbandon
[Lucene.Net] [jira] [Resolved] (LUCENENET-412) Replacing ArrayLists, Hashtables etc. with appropriate Generics.
[ https://issues.apache.org/jira/browse/LUCENENET-412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Digy resolved LUCENENET-412. Resolution: Fixed 2.9.4g is ready to go DIGY Replacing ArrayLists, Hashtables etc. with appropriate Generics. Key: LUCENENET-412 URL: https://issues.apache.org/jira/browse/LUCENENET-412 Project: Lucene.Net Issue Type: Improvement Affects Versions: Lucene.Net 2.9.4 Reporter: Digy Priority: Minor Fix For: Lucene.Net 2.9.4g Attachments: IEquatable for QuerySubclasses.patch, LUCENENET-412.patch, lucene_2.9.4g_exceptions_fix This will move Lucene.Net.2.9.4 closer to lucene.3.0.3 and allow some performance gains.
RE: [Lucene.Net] Re: Memory Leak in 2.9.2.2
FYI, 2.9.4 can be compiled against .NET 2.0 with a few minor changes in CloseableThreadLocal (like uncommenting the ThreadLocal<T> class and replacing extension-method calls with static calls to CloseableThreadLocalExtensions) DIGY -Original Message- From: Christopher Currens [mailto:currens.ch...@gmail.com] Sent: Wednesday, November 30, 2011 7:26 PM To: lucene-net-dev@lucene.apache.org Subject: Re: [Lucene.Net] Re: Memory Leak in 2.9.2.2 Trevor, I'm not sure if you can use 2.9.4, though; it looks like you're using VS2005 and .NET 2.0. 2.9.4 targets .NET 4.0, and I'm fairly certain we use classes only available in 4.0 (or 3.5?). However, if you can, I would suggest updating, as 2.9.4 should be a fairly stable release. The leak I'm talking about is addressed here: https://issues.apache.org/jira/browse/LUCENE-2467, and the ported code isn't available in 2.9.2, but I've confirmed the patch is in 2.9.4. It may or may not be what your issue is. You say that it was at one time working fine; I assume you mean no memory leak. I would take some time to see what else in your code has changed. Make sure you're calling Close on whatever needs to be closed (IndexWriter/IndexReader/Analyzers, etc). Unfortunately for us, memory leaks are hard to debug over email, and it's difficult for us to tell if it's any change to your code or an issue with Lucene.Net. As far as I can tell, this is the only memory leak I can find that affects 2.9.2. Thanks, Christopher On Wed, Nov 30, 2011 at 8:25 AM, Prescott Nasser geobmx...@hotmail.com wrote: We just released 2.9.4 - the website didn't update last night, so I'll have to try and update it later today. But if you follow the link to download 2.9.2 dist you'll see folders for 2.9.4. 
I'll send an email to the user and dev lists once i get the website to update From: Trevor Watson Sent: 11/30/2011 8:14 AM To: lucene-net-dev@lucene.apache.org Subject: [Lucene.Net] Re: Memory Leak in 2.9.2.2 You said pre 2.9.3 I checked the apache lucene.net page to try to see if I could get a copy of 2.9.3, but it was never on the site, just 2.9.2.2 and 2.9.4(g). Was this an un-released version? Or am I looking in the wrong spot for updates to lucene.net? Thanks for all your help On Tue, Nov 29, 2011 at 2:59 PM, Trevor Watson powersearchsoftw...@gmail.com wrote: I can send you the dll that I am using if you would like. The documents are _mostly_ small documents. Emails and office docs size of plain text On Tuesday, November 29, 2011, Christopher Currens currens.ch...@gmail.com wrote: Do you know how big the documents are that you are trying to delete/update? I'm trying to find a copy of 2.9.2 to see if I can reproduce it. Thanks, Christopher On Tue, Nov 29, 2011 at 9:11 AM, Trevor Watson powersearchsoftw...@gmail.com wrote: Sorry for the duplicate post. I was on the road and posted both via my web mail and office mail by mistake The increase is a very gradual, the program starts at about 160,000k according to task manager (I know that's not entirely accurate, but it was the best I had at the time) and would, after adding 25,000-40,000 result in an out of memory exception (800,000k according to taskmanager). I tried building a copy of 2.9.4 to test, but could not find one that worked in visual studio 2005 I did notice using Ants memory profiler that there were a number of byte[32789] arrays that I didn't know where they came from in memory. On Monday, November 28, 2011, Christopher Currens currens.ch...@gmail.com wrote: Hi Trevor, What kind of memory increase are we talking about? Also, how big are the documents that you are indexing, the ones returned from getFileInfoDoc()? Is it putting an entire file into the index? 
Pre 2.9.3 versions had issues with holding onto allocated byte arrays far beyond when they were used. The memory could only be freed via closing the IndexWriter. I'm a little unclear on exactly what's happening. Are you noticing memory spike and stay constant at that level or is it a gradual increase? Is it causing your application to error, (ie OutOfMemory exception, etc)? Thanks, Christopher On Mon, Nov 28, 2011 at 5:59 PM, Trevor Watson powersearchsoftw...@gmail.com wrote: I'm attempting to use Lucene.Net v2.9.2.2 in a Visual Studio 2005 (.NET 2.0) environment. We had a piece of software that WAS working. I'm not sure what has changed however, the following code results in a memory leak in the Lucene.Net component (or a failure to clean up used memory). The code in issue is here: private void
RE: [Lucene.Net] Re: Memory Leak in 2.9.2.2
OK, here is the code that can be compiled against .NET 2.0 http://pastebin.com/k2f7JfPd DIGY -Original Message- From: Granroth, Neal V. [mailto:neal.granr...@thermofisher.com] Sent: Wednesday, November 30, 2011 9:26 PM To: lucene-net-dev@lucene.apache.org Subject: RE: [Lucene.Net] Re: Memory Leak in 2.9.2.2 DIGY, Thanks for the tip, but could you be a little more specific? Where and how are extension-method calls replaced? For example, how would I change the CloseableThreadLocalExtensions method public static void Set<T>(this ThreadLocal<T> t, T val) { t.Value = val; } to eliminate the compile error "Error 2 Cannot define a new extension method because the compiler required type 'System.Runtime.CompilerServices.ExtensionAttribute' cannot be found. Are you missing a reference to System.Core.dll?" due to the Set<T> parameter "this ThreadLocal<T> t"? - Neal -Original Message- From: Digy [mailto:digyd...@gmail.com] Sent: Wednesday, November 30, 2011 12:27 PM To: lucene-net-dev@lucene.apache.org Subject: RE: [Lucene.Net] Re: Memory Leak in 2.9.2.2 FYI, 2.9.4 can be compiled against .NET 2.0 with a few minor changes in CloseableThreadLocal (like uncommenting the ThreadLocal<T> class and replacing extension-method calls with static calls to CloseableThreadLocalExtensions) DIGY
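For readers following Neal's question, the change DIGY describes can be sketched roughly as below. This is not the content of the pastebin link; it is a minimal illustration under stated assumptions. The `ThreadLocal<T>` class here is a hypothetical, simplified stand-in written only so the sample compiles on its own, and the `Get` helper and `Demo` class are likewise made up for the example. The point is that removing the `this` modifier turns the extension method into an ordinary static method, which no longer needs `ExtensionAttribute` from System.Core.dll, and every call site changes from `t.Set(val)` to `CloseableThreadLocalExtensions.Set(t, val)`.

```csharp
using System;

// Hypothetical stand-in for the ThreadLocal<T> class mentioned in the thread.
// The real class in Lucene.Net's support code may look different.
public class ThreadLocal<T>
{
    // One slot per thread; a simplification, shared by all instances of the same T.
    [ThreadStatic]
    private static T slot;

    public T Value
    {
        get { return slot; }
        set { slot = value; }
    }
}

public static class CloseableThreadLocalExtensions
{
    // Before (C# 3.0 extension method, requires System.Core.dll):
    //   public static void Set<T>(this ThreadLocal<T> t, T val) { t.Value = val; }

    // After: same body without the "this" modifier, so it is a plain static
    // method that a compiler targeting .NET 2.0 accepts.
    public static void Set<T>(ThreadLocal<T> t, T val)
    {
        t.Value = val;
    }

    // Hypothetical companion helper, added only to make the demo observable.
    public static T Get<T>(ThreadLocal<T> t)
    {
        return t.Value;
    }
}

public static class Demo
{
    public static string Run()
    {
        ThreadLocal<string> local = new ThreadLocal<string>();

        // Call site before: local.Set("hello");
        CloseableThreadLocalExtensions.Set(local, "hello");

        return CloseableThreadLocalExtensions.Get(local);
    }
}
```

Neal's later suggestion of a conditional compile block would amount to keeping both forms side by side behind a preprocessor symbol, so that the .NET 4.0 build keeps the extension-method syntax.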
RE: [Lucene.Net] Re: Memory Leak in 2.9.2.2
If I recall correctly, the last memory leak problem for 2.9.2 was reported in ~August by RavenDB, and it was fixed in 2.9.4(g) DIGY -Original Message- From: Christopher Currens [mailto:currens.ch...@gmail.com] Sent: Wednesday, November 30, 2011 11:33 PM To: lucene-net-dev@lucene.apache.org Subject: Re: [Lucene.Net] Re: Memory Leak in 2.9.2.2 Trevor, Unfortunately I was unable to reproduce the memory leak you're experiencing in 2.9.2. Particularly with byte[], of the 18,277 that were created, only 13 were not garbage collected, and it's likely that they are not related to Lucene (it's possible they are static, therefore would only be destroyed with the AppDomain, outside of what the profiler can trace). I tried to emulate the code you showed us and there were no signs of any allocated arrays that weren't cleaned up. That doesn't mean there isn't one in your code, but I just can't reproduce it with what you've shown us. If it's possible you can write a small program that has the same behavior, that could help us track it down. As a side note, what was a little disconcerting, though, was that in 2.9.4 with the same code, it created 28,565 byte[], and there were quite a few more left uncollected (2,805 arrays). The allocations are happening in DocumentsWriter.ByteBlockAllocator; I'll have to look at it later though, to see if it's even a problem. Thanks, Christopher On Wed, Nov 30, 2011 at 12:41 PM, Granroth, Neal V. neal.granr...@thermofisher.com wrote: Or maybe put the changes within a conditional compile code block? Thanks DIGY, works great. 
- Neal -Original Message- From: Prescott Nasser [mailto:geobmx...@hotmail.com] Sent: Wednesday, November 30, 2011 2:35 PM To: lucene-net-dev@lucene.apache.org Subject: RE: [Lucene.Net] Re: Memory Leak in 2.9.2.2 Probably makes for a good wiki entry Sent from my Windows Phone From: Digy Sent: 11/30/2011 12:04 PM To: lucene-net-dev@lucene.apache.org Subject: RE: [Lucene.Net] Re: Memory Leak in 2.9.2.2 OK, here is the code that can be compiled against .NET 2.0 http://pastebin.com/k2f7JfPd DIGY
RE: [Lucene.Net] Re: Memory Leak in 2.9.2.2
... and it was related to CloseableThreadLocal (fixed in 2.9.4(g)), which now creates a compilation problem against .NET 2.0 :) DIGY -Original Message- From: Digy [mailto:digyd...@gmail.com] Sent: Thursday, December 01, 2011 12:09 AM To: 'lucene-net-dev@lucene.apache.org' Subject: RE: [Lucene.Net] Re: Memory Leak in 2.9.2.2 If I recall correctly, the last memory leak problem for 2.9.2 was reported in ~August by RavenDB, and it was fixed in 2.9.4(g) DIGY
RE: [Lucene.Net] Roadmap
Chris, Sorry if you took my comments about the pain of porting personally. That wasn't my intention. +1 for all your changes/divergences. I made/could have made them too. DIGY -Original Message- From: Christopher Currens [mailto:currens.ch...@gmail.com] Sent: Monday, November 21, 2011 11:45 PM To: lucene-net-dev@lucene.apache.org Subject: Re: [Lucene.Net] Roadmap Digy, I used 2.9.4 trunk as the base for the 3.0.3 branch, but I looked to the code in 2.9.4g as a reference for many things, particularly the Support classes. We hit many of the same issues, I'm sure. I moved some of the anonymous classes into a base class where you could inject functions, though not all could be replaced, nor did I replace all that could have been. Some of our code is different: I went for the option for WeakDictionary to be completely generic, as in wrapping a generic dictionary with WeakKey<T> instead of wrapping the already existing WeakHashTable in Support. In hindsight, it may have just been easier to convert the WeakHashTable to generic, but alas, I'm only realizing that now. There is a problem with my WeakDictionary, specifically the function that determines when to clean/compact the dictionary and remove the dead keys. I need a better heuristic for deciding when to run the clean. That's a performance issue though. Regarding the pain of porting, I am a changed man. It's nice, in a sad way, to know that I'm not the only one who experienced those difficulties. I used to be in the camp that porting code that differed from Java wouldn't be difficult at all. However, now I stand corrected! It threw me a curve-ball, for sure. I DO think a line-by-line port can definitely include the things talked about below, i.e. the changes to Dispose and the changes to IEnumerable<T>. Those changes, I think, can be made without a heavy impact on the porting process. 
There was one fairly large change I opted to use that differed quite a bit from Java, however, and that was the use of the TPL in ParallelMultiSearcher. It was far easier to port this way, and I don't think it affects the porting process too much. Java uses a helper class defined at the bottom of the source file that handles it; I'm simply using a built-in one instead. I just need to be careful about it, as it would be really easy to get carried away with it. Thanks, Christopher On Mon, Nov 21, 2011 at 1:20 PM, Digy digyd...@gmail.com wrote: Hi Chris, First of all, thank you for your great work on the 3.0.3 branch. I suppose you took 2.9.4 as a code base to make the 3.0.3 port, since some of your problems are the same as those I faced in the 2.9.4g branch (e.g. Support/MemoryMappedDirectory.cs (but never used in core), IDisposable, introduction of some Action<T>s, Func<T>s, foreach instead of GetEnumerator/MoveNext, IEquatable<T>, WeakDictionary<T>, Set<T> etc.). Since I also used 3.0.3 as a reference, maybe we can use some of 2.9.4g's code in 3.0.3 when necessary (I haven't had time to look into 3.0.3 deeply). Just to ensure the coordination, maybe you should create a new issue in JIRA, so that people send patches to that issue instead of directly committing. @Prescott, 2.9.4g is not behind 2.9.4 at the bug fixes and features level. So, it is (I think) ready for another release. (I have used it in all my projects for a long time.) PS: Hearing the pain of porting code that greatly differs from Java made me just smile (sorry for that :( ). Be ready for responses that get beyond the criticism between the "With all due respect" and "Just my $0.02" parentheses. 
DIGY -Original Message- From: Christopher Currens [mailto:currens.ch...@gmail.com] Sent: Monday, November 21, 2011 10:19 PM To: lucene-net-dev@lucene.apache.org; casper...@caspershouse.com Subject: Re: [Lucene.Net] Roadmap Some of the Lucene classes have Dispose methods, well, ones that call Close (and that Close method may or may not call base.Close(), if needed or not). Virtual dispose methods can be dangerous only in that they're easy to implement wrong. However, it shouldn't be too bad, at least with a line-by-line port, as we would make the call to the base class whenever Lucene does, and that would (should) give us the same behavior, implemented properly. I'm not aware of differences in the JVM, regarding inheritance and base methods being called automatically, particularly Close methods. Slightly unrelated, another annoyance is the use of Java Iterators vs C# Enumerables. A lot of our code is there simply because there are Iterators, but it could be converted to Enumerables. The whole HasNext, Next vs C#'s MoveNext(), Current is annoying, but it's used all over in the base code, and would have to be changed there as well. Either way, I would like to push for that before 3.0.3 is released. IMO, small changes like this still keep the code similar to the line-by-line port, in that it doesn't add any difficulties
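The Close-versus-Dispose concern raised in this thread can be illustrated with the standard .NET dispose pattern. The sketch below is not taken from the 3.0.3 branch; `ResourceBase`, `DerivedResource`, the static `Log` list, and `DisposeDemo` are hypothetical names used only to make the call order visible. It assumes a design in which `Close()` is kept for code ported from Java and simply forwards to `Dispose()`, while subclasses override the protected virtual `Dispose(bool)` and always call the base implementation.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical base class; stands in for any Lucene.Net type that exposes Close().
public class ResourceBase : IDisposable
{
    private bool disposed;

    // Demo-only log so the order of cleanup calls can be inspected.
    public static readonly List<string> Log = new List<string>();

    // Kept so that code ported line-by-line from Java can still call Close().
    public void Close()
    {
        Dispose();
    }

    public void Dispose()
    {
        Dispose(true);
        GC.SuppressFinalize(this);
    }

    // Subclasses override this method rather than Dispose() or Close().
    protected virtual void Dispose(bool disposing)
    {
        if (disposed) return;          // makes repeated calls harmless
        if (disposing)
        {
            Log.Add("base released");  // release managed resources here
        }
        disposed = true;
    }
}

public class DerivedResource : ResourceBase
{
    private bool disposed;

    protected override void Dispose(bool disposing)
    {
        if (!disposed)
        {
            if (disposing) Log.Add("derived released");
            disposed = true;
        }
        // Forgetting this call is the classic way to implement the pattern wrong.
        base.Dispose(disposing);
    }
}

public static class DisposeDemo
{
    public static string Run()
    {
        ResourceBase.Log.Clear();
        using (DerivedResource r = new DerivedResource())
        {
            r.Close();   // explicit Close; the using block calls Dispose again on exit
        }
        return string.Join(",", ResourceBase.Log.ToArray());
    }
}
```

Because each level guards its cleanup with its own flag, calling `Close()` and then leaving the `using` block releases every resource once, which is the behaviour a plain "Dispose calls Close" shortcut does not guarantee across an inheritance chain.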
RE: [Lucene.Net] Roadmap
My English isn't good enough to understand this answer. I hope it is not related to an employee-employer relationship as in the past. DIGY -Original Message- From: Christopher Currens [mailto:currens.ch...@gmail.com] Sent: Tuesday, November 22, 2011 1:08 AM To: lucene-net-dev@lucene.apache.org Subject: Re: [Lucene.Net] Roadmap To clarify, it wasn't as much *difficult* as it was more *painful*. Above, I was implying that it was more difficult than the rest of the code, which by comparison was easier. It wasn't painless to try and map where code changes were from the Java classes into the .NET version. I prefer that style more for its readability and the niceties of working with a .NET style of Lucene; however, as I said before, it significantly slowed down the porting process. I hope it didn't come across that I thought that it was bad code, because it's probably the most readable code we have in the Contrib at the moment. I want to make it clear that my intention right now is to get Lucene.Net up to date with Java. When I read the Java code, I understand its intent, and I make sure the ported code represents it. That takes enough time as it is; having to try and figure out where the code went in Lucene.Net, since it wasn't a 1-1 map, was a MINOR annoyance, especially when you compare it to the issues I had dealing with the differences between the two languages, generics especially. That being said, I don't have a problem with code being converted in a .NET idiomatic way; in fact, I welcome it, if it still allows the changes to be ported with minimal effort. I feel at this point in the project, there are some limitations to how far I'd like it to diverge. Anyway, my opinion, which may not be in agreement with the group as a whole, is that it would be better to bring the codebase up to date, or at least more up to date with Java's, and then maintain a version with a complete .NET-centric API. 
I feel this would be easier, as porting Java's Lucene SVN commits by the week would be a relatively small workload. On Mon, Nov 21, 2011 at 2:41 PM, Troy Howard thowar...@gmail.com wrote: So, if we're getting back to the line by line port discussion... I think either side of this discussion is too extreme. For the case in point Chris just mentioned (which I'm not really sure what part was so difficult, as I ported that library in about 30 minutes from scratch)... anything is a pain if it sticks out in the middle of doing something completely different. The only reason we are able to do this line by line is due to the general similarity between Java and C#'s language syntax. If we were porting Lucene to a completely different language, that had a totally different syntax, the process would go like this: - Look at the original code, understand its intent - Create similar code in the new language that expresses the same intent When applying changes: - Look at the original code diffs, understanding the intent of the change - Look at the ported code, and apply the changed logic's meaning in that language So, it is just a different thought process. In my opinion, it's a better process because it forces the developer to actually think about the code instead of blindly converting syntax (possibly slightly incorrectly and introducing regressions). While there is a large volume of unit tests in Lucene, they are unfortunately not really the right tests and make porting much more difficult, because you can't just rely on the unit tests to verify that your ported code behaves the same. Therefore, it's safer to follow a process that requires the developer to delve deeply into the meaning of the code. Following a line-by-line process is convenient, but doesn't focus on meaning, which I think is more important. Thanks, Troy On Mon, Nov 21, 2011 at 2:23 PM, Christopher Currens currens.ch...@gmail.com wrote: Digy, No worries. 
I wasn't taking them personally. You've been doing this for a lot longer than I have, but I didn't understand you pain until I had to go through it personally. :P Have you looked at Contrib in a while? There's a lot of projects that are in Java's Contrib that are not in Lucene.Net? Is this because there are some that can't easily (if at all) be ported over to .NET or just because they've been neglected? I'm trying to get a handle on what's important to port and what isn't. Figured someone with experience could help me with a starting point over deciding where to start with everything that's missing. Thanks, Christopher On Mon, Nov 21, 2011 at 2:13 PM, Digy digyd...@gmail.com wrote: Chris, Sorry, if you took my comments about pain of porting personally. That wasn't my intension. +1 for all your changes/divergences. I made/could have made them too. DIGY -Original Message- From: Christopher Currens [mailto:currens.ch...@gmail.com] Sent
RE: [Lucene.Net] Roadmap
Hi Chris, First of all, thank you for your great work on the 3.0.3 branch. I suppose you took 2.9.4 as a code base to make the 3.0.3 port, since some of your problems are the same as those I faced in the 2.9.4g branch (e.g. Support/MemoryMappedDirectory.cs (but never used in core), IDisposable, introduction of some Action<T>s, Func<T>s, foreach instead of GetEnumerator/MoveNext, IEquatable<T>, WeakDictionary<T>, Set<T> etc.). Since I also used 3.0.3 as a reference, maybe we can use some of 2.9.4g's code in 3.0.3 when necessary (I haven't had time to look into 3.0.3 deeply). Just to ensure the coordination, maybe you should create a new issue in JIRA, so that people send patches to that issue instead of directly committing. @Prescott, 2.9.4g is not behind 2.9.4 at the bug fixes and features level. So, it is (I think) ready for another release. (I have used it in all my projects for a long time.) PS: Hearing the pain of porting code that greatly differs from Java made me just smile (sorry for that :( ). Be ready for responses that get beyond the criticism between the "With all due respect" and "Just my $0.02" parentheses. DIGY -Original Message- From: Christopher Currens [mailto:currens.ch...@gmail.com] Sent: Monday, November 21, 2011 10:19 PM To: lucene-net-...@lucene.apache.org; casper...@caspershouse.com Subject: Re: [Lucene.Net] Roadmap Some of the Lucene classes have Dispose methods, well, ones that call Close (and that Close method may or may not call base.Close(), if needed or not). Virtual dispose methods can be dangerous only in that they're easy to implement wrong. However, it shouldn't be too bad, at least with a line-by-line port, as we would make the call to the base class whenever Lucene does, and that would (should) give us the same behavior, implemented properly. I'm not aware of differences in the JVM, regarding inheritance and base methods being called automatically, particularly Close methods. Slightly unrelated, another annoyance is the use of Java Iterators vs C# Enumerables. 
A lot of our code is there simply because there are Iterators, but it could be converted to Enumerables. The whole HasNext, Next vs C#'s MoveNext(), Current is annoying, but it's used all over in the base code, and would have to be changed there as well. Either way, I would like to push for that before 3.0.3 is released. IMO, small changes like this still keep the code similar to the line-by-line port, in that it doesn't add any difficulties in the porting process, but provides great benefits to the users of the code, to have a .NET-centric API. I don't think it would violate the project description we have listed on our Incubator page, either.

Thanks,
Christopher

On Mon, Nov 21, 2011 at 12:03 PM, casper...@caspershouse.com <casper...@caspershouse.com> wrote:

+1 on the suggestion to move Close -> IDisposable; not being able to use "using" is such a pain, and an eyesore on the code. Although it will have to be done properly, and not just have Dispose call Close (you should have proper protected virtual Dispose methods to take inheritance into account, etc).

- Nick

From: Christopher Currens <currens.ch...@gmail.com>
Sent: Monday, November 21, 2011 2:56 PM
To: lucene-net-...@lucene.apache.org
Subject: Re: [Lucene.Net] Roadmap

Regarding the 3.0.3 branch I started last week, I've put in a lot of late nights and gotten far more done in a week and a half than I expected. The list of changes is very large, and fortunately, I've documented it in some files that are in the branch's root for certain projects. I'll list what changes have been made so far, and some of the concerns I have about them, as well as what still needs to be done. You can read them all in detail in the files that are in the branch.

All changes in 3.0.3 have been ported to Lucene.Net and Lucene.Net.Test, except BooleanClause, LockStressTest, MMapDirectory, NIOFSDirectory, DummyConcurrentLock, NamedThreadFactory, and ThreadInterruptedException.
MMapDirectory and NIOFSDirectory were never ported for 2.9.4 in the first place, so I'm not worried about those. LockStressTest is a command-line tool; porting it should be easy, but it's not essential to a 3.0.3 release, IMO. DummyConcurrentLock also seems unnecessary (and non-portable) for .NET, since it's based around Java's Lock class and is only used to bypass locking, which can be done by passing new Object() to the method.

NamedThreadFactory I'm unsure about. It's used in ParallelMultiSearcher (in which I've opted to use the TPL), and seems to be only used for debugging, possibly testing. Either way, I'm not sure it's necessary. Also, named threads would mean we probably would have to move the class away from the TPL, which greatly simplified the code and parallelization of it all, as I can't see a way to set names for a Task. I suppose
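Nick's remark about "proper protected virtual Dispose methods" refers to the usual .NET dispose pattern. A minimal sketch of that pattern follows, assuming hypothetical class names (IndexResource, DerivedResource) that are not taken from the Lucene.Net code base; how the real classes map Close onto Dispose may well differ.

```csharp
using System;

// Hypothetical base class; the name is illustrative, not a Lucene.Net type.
public class IndexResource : IDisposable
{
    private bool _disposed;

    public bool IsDisposed { get { return _disposed; } }

    // Kept so that code ported line by line from Java can still call Close().
    public void Close()
    {
        Dispose();
    }

    public void Dispose()
    {
        Dispose(true);
        GC.SuppressFinalize(this);
    }

    // Derived classes override this and call base.Dispose(disposing).
    protected virtual void Dispose(bool disposing)
    {
        if (_disposed) return;
        if (disposing)
        {
            // Release managed resources owned by the base class here.
        }
        _disposed = true;
    }
}

// Hypothetical derived class showing how cleanup is chained to the base class.
public class DerivedResource : IndexResource
{
    public bool DerivedCleanedUp { get; private set; }

    protected override void Dispose(bool disposing)
    {
        if (disposing)
        {
            DerivedCleanedUp = true; // stands in for releasing derived-class resources
        }
        base.Dispose(disposing);
    }
}
```

With this shape a caller can write using (var r = new DerivedResource()) { ... } and both levels of cleanup run once, whether the caller uses Close() or Dispose().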
RE: [Lucene.Net] Roadmap
Troy,

I am not against it if you can continue to understand and port so easily. No one here (I think) wants Java-flavored code.

DIGY

-----Original Message-----
From: Troy Howard [mailto:thowar...@gmail.com]
Sent: Tuesday, November 22, 2011 12:42 AM
To: lucene-net-...@lucene.apache.org
Subject: Re: [Lucene.Net] Roadmap

So, if we're getting back to the line by line port discussion... I think either side of this discussion is too extreme. For the case in point Chris just mentioned (which I'm not really sure what part was so difficult, as I ported that library in about 30 minutes from scratch)... anything is a pain if it sticks out in the middle of doing something completely different.

The only reason we are able to do this line by line is due to the general similarity between Java and C#'s language syntax. If we were porting Lucene to a completely different language, that had a totally different syntax, the process would go like this:

- Look at the original code, understand its intent
- Create similar code in the new language that expresses the same intent

When applying changes:

- Look at the original code diffs, understanding the intent of the change
- Look at the ported code, and apply the changed logic's meaning in that language

So, it's just a different thought process. In my opinion, it's a better process because it forces the developer to actually think about the code instead of blindly converting syntax (possibly slightly incorrectly and introducing regressions).

While there is a large volume of unit tests in Lucene, they are unfortunately not really the right tests, and they make porting much more difficult: you can't rely on the unit tests alone to verify that your ported code behaves the same. Therefore, it's safer to follow a process that requires the developer to delve deeply into the meaning of the code. Following a line-by-line process is convenient, but doesn't focus on meaning, which I think is more important.
Thanks,
Troy

On Mon, Nov 21, 2011 at 2:23 PM, Christopher Currens <currens.ch...@gmail.com> wrote:

Digy,

No worries. I wasn't taking them personally. You've been doing this for a lot longer than I have, but I didn't understand your pain until I had to go through it personally. :P

Have you looked at Contrib in a while? There are a lot of projects in Java's Contrib that are not in Lucene.Net. Is this because there are some that can't easily (if at all) be ported over to .NET, or just because they've been neglected? I'm trying to get a handle on what's important to port and what isn't. Figured someone with experience could help me decide where to start with everything that's missing.

Thanks,
Christopher

On Mon, Nov 21, 2011 at 2:13 PM, Digy <digyd...@gmail.com> wrote:

Chris,

Sorry if you took my comments about the pain of porting personally. That wasn't my intention.

+1 for all your changes/divergences. I made/could have made them too.

DIGY

-----Original Message-----
From: Christopher Currens [mailto:currens.ch...@gmail.com]
Sent: Monday, November 21, 2011 11:45 PM
To: lucene-net-...@lucene.apache.org
Subject: Re: [Lucene.Net] Roadmap

Digy,

I used 2.9.4 trunk as the base for the 3.0.3 branch, but I looked to the code in 2.9.4g as a reference for many things, particularly the Support classes. We hit many of the same issues, I'm sure. I moved some of the anonymous classes into a base class where you could inject functions, though not all could be replaced, nor did I replace all that could have been. Some of our code is different: I went for the option of making WeakDictionary completely generic, as in wrapping a generic dictionary with WeakKey<T> instead of wrapping the already existing WeakHashTable in Support. In hindsight, it may have just been easier to convert the WeakHashTable to generic, but alas, I'm only realizing that now.
There is a problem with my WeakDictionary, specifically the function that determines when to clean/compact the dictionary and remove the dead keys. I need a better heuristic for deciding when to run the clean. That's a performance issue though.

Regarding the pain of porting, I am a changed man. It's nice, in a sad way, to know that I'm not the only one who experienced those difficulties. I used to be in the camp that porting code that differed from Java wouldn't be difficult at all. However, now I stand corrected! It threw me a curve-ball, for sure. I DO think a line-by-line port can definitely include the things talked about below, i.e. the changes to Dispose and the changes to IEnumerable<T>. Those changes, I think, can be made without a heavy impact on the porting process.

There was one fairly large change I opted to use that differed quite a bit from Java, however, and that was the use of the TPL in ParallelMultiSearcher. It was far easier to port this way, and I don't think it affects the porting process too
RE: [Lucene.Net] Final patches for rc3
What do folks think about including the LUCENENET-427, LUCENENET-431, LUCENENET-433 and LUCENENET-443 fixes in that release and removing Similarity.Net?

DIGY

-----Original Message-----
From: Prescott Nasser [mailto:geobmx...@hotmail.com]
Sent: Thursday, November 10, 2011 10:19 PM
To: lucene-net-dev@lucene.apache.org
Subject: [Lucene.Net] Final patches for rc3

I'm reviewing all the issues before I cut rc3. Looks like I have the following:

- apply the lucenenet-453 patch from Stefan to fix the line endings
- a couple of license headers
- update the readme
- add an rc3 tag

Is there anything I'm missing? I will have some time tonight to fix these items and then prepare new files for another vote.

-P

Sent from my Windows Phone

-
Checked by AVG - www.avg.com
Version: 2012.0.1869 / Virus Database: 2092/4608 - Release Date: 11/10/11
[Lucene.Net] [jira] [Updated] (LUCENENET-448) GeoHashFilteredDocIdSet does not work at all
[ https://issues.apache.org/jira/browse/LUCENENET-448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Digy updated LUCENENET-448:
---------------------------
    Affects Version/s: (was: Lucene.Net 2.9.4) Lucene.Net 2.9.4g
    Fix Version/s: (was: Lucene.Net 2.9.4) Lucene.Net 2.9.4g

GeoHashFilteredDocIdSet does not work at all
--------------------------------------------
    Key: LUCENENET-448
    URL: https://issues.apache.org/jira/browse/LUCENENET-448
    Project: Lucene.Net
    Issue Type: Bug
    Components: Lucene.Net Contrib
    Affects Versions: Lucene.Net 2.9.4g
    Environment: Windows 7 x64
    Reporter: Jeff Johnson
    Labels: contrib, spatial
    Fix For: Lucene.Net 2.9.4g
    Original Estimate: 1h
    Remaining Estimate: 1h

The GeoHashFilteredDocIdSet is assuming the values are always in the cache which is wrong. A proposed fix for the method is listed here for GeoHashDistanceFilter.cs:

    public GeoHashFilteredDocIdSet(DocIdSet innerSet, string[] geoHashValues, Dictionary<string, double> distanceLookupCache, double lat, double lng, int docBase, double distance, Dictionary<int, double> distances)
        : base(innerSet, (docid) => /* was: public override Match */
        {
            String geoHash = geoHashValues[docid];
            double[] coords = GeoHashUtils.Decode(geoHash);
            double x = coords[0];
            double y = coords[1];
            double cachedDistance;
            distanceLookupCache.TryGetValue(geoHash, out cachedDistance);
            double d;
            if (cachedDistance > 0)
            {
                d = cachedDistance;
            }
            else
            {
                d = DistanceUtils.GetInstance().GetDistanceMi(lat, lng, x, y);
                distanceLookupCache[geoHash] = d;
            }
            if (d < distance)
            {
                distances[docid + docBase] = d;
                return true;
            }
            return false;
        })
    {
        _geoHashValues = geoHashValues;
        _distances = distances;
        _distance = distance;
        _docBase = docBase;
        _lng = lng;
        _lat = lat;
        _distanceLookupCache = distanceLookupCache;
    }

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
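The proposed fix treats a cached value greater than zero as a cache hit. A possible variation, shown here only as a sketch, is to rely on the boolean that Dictionary.TryGetValue returns, so that a legitimately cached distance of 0 also counts as a hit. The helper name DistanceCache.GetOrAdd and the compute delegate are assumptions made for illustration; they are not part of the spatial contrib code.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper; not part of Lucene.Net.
public static class DistanceCache
{
    // Returns the cached distance for geoHash, computing and storing it on a miss.
    // 'compute' stands in for the real distance calculation.
    public static double GetOrAdd(
        Dictionary<string, double> cache,
        string geoHash,
        Func<string, double> compute)
    {
        double d;
        if (!cache.TryGetValue(geoHash, out d))
        {
            // Miss: TryGetValue returned false, so calculate and remember the value.
            d = compute(geoHash);
            cache[geoHash] = d;
        }
        return d;
    }
}
```

The difference from the posted fix only matters when a computed distance is exactly 0 (the document sits on the query point); in that case the posted version recomputes the distance on every call, which is harmless but wasteful.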
[Lucene.Net] [jira] [Resolved] (LUCENENET-448) GeoHashFilteredDocIdSet does not work at all
[ https://issues.apache.org/jira/browse/LUCENENET-448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Digy resolved LUCENENET-448.
----------------------------
    Resolution: Fixed

Hi Jeff,

Your fix is for the 2.9.4g branch and I committed it (although it was hard to compare your code with the current code). Thank you.

GeoHashFilteredDocIdSet does not work at all
--------------------------------------------
    Key: LUCENENET-448
    URL: https://issues.apache.org/jira/browse/LUCENENET-448
    Project: Lucene.Net
    Issue Type: Bug
    Components: Lucene.Net Contrib
    Affects Versions: Lucene.Net 2.9.4g
    Environment: Windows 7 x64
    Reporter: Jeff Johnson
    Labels: contrib, spatial
    Fix For: Lucene.Net 2.9.4g
    Original Estimate: 1h
    Remaining Estimate: 1h

The GeoHashFilteredDocIdSet is assuming the values are always in the cache which is wrong. A proposed fix for the method is listed here for GeoHashDistanceFilter.cs:

    public GeoHashFilteredDocIdSet(DocIdSet innerSet, string[] geoHashValues, Dictionary<string, double> distanceLookupCache, double lat, double lng, int docBase, double distance, Dictionary<int, double> distances)
        : base(innerSet, (docid) => /* was: public override Match */
        {
            String geoHash = geoHashValues[docid];
            double[] coords = GeoHashUtils.Decode(geoHash);
            double x = coords[0];
            double y = coords[1];
            double cachedDistance;
            distanceLookupCache.TryGetValue(geoHash, out cachedDistance);
            double d;
            if (cachedDistance > 0)
            {
                d = cachedDistance;
            }
            else
            {
                d = DistanceUtils.GetInstance().GetDistanceMi(lat, lng, x, y);
                distanceLookupCache[geoHash] = d;
            }
            if (d < distance)
            {
                distances[docid + docBase] = d;
                return true;
            }
            return false;
        })
    {
        _geoHashValues = geoHashValues;
        _distances = distances;
        _distance = distance;
        _docBase = docBase;
        _lng = lng;
        _lat = lat;
        _distanceLookupCache = distanceLookupCache;
    }
[Lucene.Net] [jira] [Commented] (LUCENENET-450) Incorrect use of StreamReader in MoreLikeThis
[ https://issues.apache.org/jira/browse/LUCENENET-450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13141418#comment-13141418 ]

Digy commented on LUCENENET-450:
--------------------------------

Similarity.Net is obsolete and all of its functionality has been moved to Contrib/Queries, as in Java (including unit tests).
https://svn.apache.org/repos/asf/incubator/lucene.net/trunk/src/contrib/README.txt

Incorrect use of StreamReader in MoreLikeThis
---------------------------------------------
    Key: LUCENENET-450
    URL: https://issues.apache.org/jira/browse/LUCENENET-450
    Project: Lucene.Net
    Issue Type: Bug
    Components: Lucene.Net Contrib
    Affects Versions: Lucene.Net 2.9.2, Lucene.Net 2.9.4, Lucene.Net 3.x
    Reporter: Itamar Syn-Hershko
    Assignee: Prescott Nasser
    Fix For: Lucene.Net 2.9.4
    Attachments: Lucenenet-450.patch

In the MoreLikeThis implementation (looking at the trunk), line 772 incorrectly creates a new StreamReader instead of using StringReader. This causes the following exception to be thrown, since the ctor expects a file path and not an input string:

System.ArgumentException: Illegal characters in path.
   at System.Security.Permissions.FileIOPermission.HasIllegalCharacters(String[] str)
   at System.Security.Permissions.FileIOPermission.AddPathList(FileIOPermissionAccess access, AccessControlActions control, String[] pathListOrig, Boolean checkForDuplicates, Boolean needFullPath, Boolean copyPathList)
   at System.Security.Permissions.FileIOPermission..ctor(FileIOPermissionAccess access, AccessControlActions control, String[] pathList, Boolean checkForDuplicates, Boolean needFullPath)
   at System.IO.FileStream.Init(String path, FileMode mode, FileAccess access, Int32 rights, Boolean useRights, FileShare share, Int32 bufferSize, FileOptions options, SECURITY_ATTRIBUTES secAttrs, String msgPath, Boolean bFromProxy, Boolean useLongPath)
   at System.IO.FileStream..ctor(String path, FileMode mode, FileAccess access, FileShare share, Int32 bufferSize, FileOptions options, String msgPath, Boolean bFromProxy)
   at System.IO.FileStream..ctor(String path, FileMode mode, FileAccess access, FileShare share, Int32 bufferSize, FileOptions options)
   at System.IO.StreamReader..ctor(String path, Encoding encoding, Boolean detectEncodingFromByteOrderMarks, Int32 bufferSize)
   at System.IO.StreamReader..ctor(String path, Boolean detectEncodingFromByteOrderMarks)
   at Similarity.Net.MoreLikeThis.RetrieveTerms(Int32 docNum) in MoreLikeThis.cs:line 773
   at Similarity.Net.MoreLikeThis.Like(Int32 docNum) in MoreLikeThis.cs:line 507

Simply replacing StreamReader with StringReader will do the job.
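The difference the report describes can be shown in a few lines. This is a stand-alone sketch, not the MoreLikeThis code itself; the class and method names (ReaderExample, FirstLine) are made up for the example.

```csharp
using System;
using System.IO;

// Hypothetical example class; not part of Lucene.Net.
public static class ReaderExample
{
    // Reads from the text itself. new StreamReader(text) would instead treat
    // 'text' as a file path and try to open that file, which is what produces
    // the "Illegal characters in path" exception in the report.
    public static string FirstLine(string text)
    {
        using (TextReader reader = new StringReader(text))
        {
            return reader.ReadLine();
        }
    }
}
```

Both classes derive from TextReader, so code that only needs a TextReader (as an analyzer's token stream typically does) can take either one; the choice only decides whether the string is the content or the location of the content.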
[Lucene.Net] [jira] [Closed] (LUCENENET-441) Encountered: EOF after : \\\\ during parsing a query
[ https://issues.apache.org/jira/browse/LUCENENET-441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Digy closed LUCENENET-441.
--------------------------
    Resolution: Invalid

Maverick904, please use the mailing lists to ask questions.

Encountered: <EOF> after : during parsing a query
-------------------------------------------------
    Key: LUCENENET-441
    URL: https://issues.apache.org/jira/browse/LUCENENET-441
    Project: Lucene.Net
    Issue Type: Bug
    Components: Lucene.Net Core
    Affects Versions: Lucene.Net 2.9.2
    Environment: .Net Framework 4.0
    Reporter: Maverick904

Cannot parse '\': Lexical error at line 1, column 4. Encountered: <EOF> after :
   at Lucene.Net.QueryParsers.QueryParser.Parse(String query)
   at Lucene.Net.QueryParsers.MultiFieldQueryParser.Parse(Version matchVersion, String query, String[] fields, Occur[] flags, Analyzer analyzer)
   at Lucene.Net.QueryParsers.MultiFieldQueryParser.Parse(String query, String[] fields, Occur[] flags, Analyzer analyzer)

Lexical error at line 1, column 4. Encountered: <EOF> after :
   at Lucene.Net.QueryParsers.QueryParserTokenManager.GetNextToken()
   at Lucene.Net.QueryParsers.QueryParser.Jj_ntk()
   at Lucene.Net.QueryParsers.QueryParser.Modifiers()
   at Lucene.Net.QueryParsers.QueryParser.Query(String field)
   at Lucene.Net.QueryParsers.QueryParser.Parse(String query)
RE: [Lucene.Net] 2.9.4
> Similarity was included, though are there any tests for this project?

Similarity is obsolete (Queries.Net replaces it and has test cases). It has already been removed in 2.9.4g.

DIGY

-----Original Message-----
From: Michael Herndon [mailto:mhern...@wickedsoftware.net]
Sent: Wednesday, September 21, 2011 10:40 PM
To: lucene-net-dev@lucene.apache.org
Subject: Re: [Lucene.Net] 2.9.4

@all, I updated the build scripts to increase their granularity.
https://cwiki.apache.org/LUCENENET/build-system-scripts.html

Similarity was included, though are there any tests for this project? Some of the contrib tests are failing; I saw a few in Contrib.Highlighter just glancing at the output.

I received some feedback from Eric Woodruff. It looks like SHFB/Sandcastle generates plain HTML files; it's been staring me in the face this whole time. I'll need to build in some targets that extract what's needed to push to the site branch. Then I'll start working on nuget.

@Prescott, can the volatile fields be wrapped in a lock statement, and the code that accesses those fields replaced with a call to a property/method that wraps access to that field?

On Wed, Sep 21, 2011 at 1:36 PM, Troy Howard <thowar...@gmail.com> wrote:

I thought it was:

- 2.9.2 and before are 2.0 compatible
- 2.9.4 and before are 3.5 compatible
- After 2.9.4 are 4.0 compatible

Thanks,
Troy

On Wed, Sep 21, 2011 at 10:15 AM, Michael Herndon <mhern...@wickedsoftware.net> wrote:

If that's the case, then we'll need conditional statements for including ThreadLocal<T>.

On Wed, Sep 21, 2011 at 12:47 PM, Prescott Nasser <geobmx...@hotmail.com> wrote:

I thought this was after 2.9.4

Sent from my Windows Phone

-----Original Message-----
From: Michael Herndon
Sent: Wednesday, September 21, 2011 8:30 AM
To: lucene-net-dev@lucene.apache.org
Cc: lucene-net-...@incubator.apache.org
Subject: Re: [Lucene.Net] 2.9.4

@Robert, I believe the overwhelming consensus on the mailing list vote was to move to .NET 4.0 and drop support for previous versions.
I'll take care of the build scripts issue while they are being refactored into smaller chunks this week.

@Troy, agreed.

On Wed, Sep 21, 2011 at 8:08 AM, Robert Jordan <robe...@gmx.net> wrote:

On 20.09.2011 23:48, Prescott Nasser wrote:
> Hey all, seems like we are set with 2.9.4? Feedback has been positive and it's been quiet. Do we feel ready to vote for a new release?

I don't know if the build infrastructure is part of the release. If yes, then there is an open issue: Contrib doesn't build right now because there are some assembly name mismatches between certain *.csproj files and build/scripts/contrib.targets.

The following patches should fix the issue:

https://github.com/robert-j/lucene.net/commit/c5218bca56c19b3407648224781eec7316994a39
https://github.com/robert-j/lucene.net/commit/50bad187655d59968d51d472b57c2a40e201d663

Also, the fix for [LUCENENET-358] is basically making Lucene.Net.dll a .NET 4.0-only assembly:

https://github.com/apache/lucene.net/commit/23ea6f52362fc7dbce48fd012cea129a7350c73c

Did we agree about abandoning .NET <= 3.5?

Robert
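Michael's question to Prescott, whether volatile fields could be replaced by a property that takes a lock, can be sketched as follows. The class and member names (SegmentState, PendingDocs) are invented for the example and do not come from Lucene.Net; which fields were actually affected is not stated in the thread.

```csharp
using System;

// Hypothetical class; illustrates the lock-wrapped property idea only.
public class SegmentState
{
    private readonly object _sync = new object();

    // In Java this might have been declared 'volatile long'. C# does not
    // allow 'volatile' on 64-bit fields such as long or double, so access
    // goes through the property below instead.
    private long _pendingDocs;

    public long PendingDocs
    {
        get { lock (_sync) { return _pendingDocs; } }
        set { lock (_sync) { _pendingDocs = value; } }
    }
}
```

The lock gives both atomic 64-bit reads and writes and the visibility guarantees that volatile would have provided, at the cost of a monitor acquisition on every access. For simple counters, Interlocked.Read and Interlocked.Exchange are a lighter alternative.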
RE: [Lucene.Net] 2.9.4
@Robert

> Also, the fix for [LUCENENET-358] is basically making Lucene.Net.dll a .NET 4.0-only assembly:

There is a commented-out part at the end of CloseableThreadLocal which may seem familiar to you :) There is no harm in uncommenting it, and no conditional compilation is needed. It also passes all test cases.

DIGY

-----Original Message-----
From: Robert Jordan [mailto:robe...@gmx.net]
Sent: Wednesday, September 21, 2011 3:09 PM
To: lucene-net-...@incubator.apache.org
Subject: Re: [Lucene.Net] 2.9.4

On 20.09.2011 23:48, Prescott Nasser wrote:
> Hey all, seems like we are set with 2.9.4? Feedback has been positive and it's been quiet. Do we feel ready to vote for a new release?

I don't know if the build infrastructure is part of the release. If yes, then there is an open issue: Contrib doesn't build right now because there are some assembly name mismatches between certain *.csproj files and build/scripts/contrib.targets.

The following patches should fix the issue:

https://github.com/robert-j/lucene.net/commit/c5218bca56c19b3407648224781eec7316994a39
https://github.com/robert-j/lucene.net/commit/50bad187655d59968d51d472b57c2a40e201d663

Also, the fix for [LUCENENET-358] is basically making Lucene.Net.dll a .NET 4.0-only assembly:

https://github.com/apache/lucene.net/commit/23ea6f52362fc7dbce48fd012cea129a7350c73c

Did we agree about abandoning .NET <= 3.5?

Robert
RE: [Lucene.Net] 2.9.4
You are right about the race condition and the NullReferenceException, but

    static SupportClass.WeakHashTable slots = new SupportClass.WeakHashTable();

wouldn't work, since it is intended to be created in every thread, not just once. Would you patch it, or leave it to me?

Thanks,
DIGY

-----Original Message-----
From: Robert Jordan [mailto:robe...@gmx.net]
Sent: Thursday, September 22, 2011 1:16 AM
To: lucene-net-...@incubator.apache.org
Subject: Re: [Lucene.Net] 2.9.4

Hi Digy,

On 21.09.2011 23:38, Digy wrote:
> @Robert
>> Also, the fix for [LUCENENET-358] is basically making Lucene.Net.dll a .NET 4.0-only assembly:
> There is a commented-out part at the end of CloseableThreadLocal which may seem familiar to you :)

Indeed :) I've missed this comment.

> There is no harm in uncommenting it, and no conditional compilation is needed. It also passes all test cases.

BTW, there is an issue with this commented-out code. If Value is not accessed at least once, Dispose() will fail with a NullReferenceException. There is also a little chance for a race condition. I'd rather get rid of Init() for this code:

    static SupportClass.WeakHashTable slots = new SupportClass.WeakHashTable();

Robert

> DIGY
>
> -----Original Message-----
> From: Robert Jordan [mailto:robe...@gmx.net]
> Sent: Wednesday, September 21, 2011 3:09 PM
> To: lucene-net-...@incubator.apache.org
> Subject: Re: [Lucene.Net] 2.9.4

On 20.09.2011 23:48, Prescott Nasser wrote:
> Hey all, seems like we are set with 2.9.4? Feedback has been positive and it's been quiet. Do we feel ready to vote for a new release?

I don't know if the build infrastructure is part of the release. If yes, then there is an open issue: Contrib doesn't build right now because there are some assembly name mismatches between certain *.csproj files and build/scripts/contrib.targets.
The following patches should fix the issue:

https://github.com/robert-j/lucene.net/commit/c5218bca56c19b3407648224781eec7316994a39
https://github.com/robert-j/lucene.net/commit/50bad187655d59968d51d472b57c2a40e201d663

Also, the fix for [LUCENENET-358] is basically making Lucene.Net.dll a .NET 4.0-only assembly:

https://github.com/apache/lucene.net/commit/23ea6f52362fc7dbce48fd012cea129a7350c73c

Did we agree about abandoning .NET <= 3.5?

Robert
RE: [Lucene.Net] 2.9.4
I reconsidered it, and there is no race condition: a new slot will be created for each thread. But the NullReferenceException bug is still there.

DIGY

-----Original Message-----
From: Robert Jordan [mailto:robe...@gmx.net]
Sent: Thursday, September 22, 2011 1:16 AM
To: lucene-net-...@incubator.apache.org
Subject: Re: [Lucene.Net] 2.9.4

Hi Digy,

On 21.09.2011 23:38, Digy wrote:
> @Robert
>> Also, the fix for [LUCENENET-358] is basically making Lucene.Net.dll a .NET 4.0-only assembly:
> There is a commented-out part at the end of CloseableThreadLocal which may seem familiar to you :)

Indeed :) I've missed this comment.

> There is no harm in uncommenting it, and no conditional compilation is needed. It also passes all test cases.

BTW, there is an issue with this commented-out code. If Value is not accessed at least once, Dispose() will fail with a NullReferenceException. There is also a little chance for a race condition. I'd rather get rid of Init() for this code:

    static SupportClass.WeakHashTable slots = new SupportClass.WeakHashTable();

Robert

> DIGY
>
> -----Original Message-----
> From: Robert Jordan [mailto:robe...@gmx.net]
> Sent: Wednesday, September 21, 2011 3:09 PM
> To: lucene-net-...@incubator.apache.org
> Subject: Re: [Lucene.Net] 2.9.4

On 20.09.2011 23:48, Prescott Nasser wrote:
> Hey all, seems like we are set with 2.9.4? Feedback has been positive and it's been quiet. Do we feel ready to vote for a new release?

I don't know if the build infrastructure is part of the release. If yes, then there is an open issue: Contrib doesn't build right now because there are some assembly name mismatches between certain *.csproj files and build/scripts/contrib.targets.
The following patches should fix the issue:

https://github.com/robert-j/lucene.net/commit/c5218bca56c19b3407648224781eec7316994a39
https://github.com/robert-j/lucene.net/commit/50bad187655d59968d51d472b57c2a40e201d663

Also, the fix for [LUCENENET-358] is basically making Lucene.Net.dll a .NET 4.0-only assembly:

https://github.com/apache/lucene.net/commit/23ea6f52362fc7dbce48fd012cea129a7350c73c

Did we agree about abandoning .NET <= 3.5?

Robert
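The two points in this exchange (a per-thread table cannot be created by a static initializer, and Dispose() must cope with a table that was never created) can be illustrated with a small sketch. It assumes the slots field is marked [ThreadStatic], which the thread implies but does not show, and it substitutes a plain Dictionary for SupportClass.WeakHashTable to stay self-contained. PerThreadValue is a made-up name, not the actual CloseableThreadLocal code.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the commented-out CloseableThreadLocal code.
public class PerThreadValue : IDisposable
{
    // One table per thread. A field initializer here would run only once,
    // on the thread that triggers the static constructor, so the table is
    // created lazily on first write instead.
    [ThreadStatic]
    private static Dictionary<object, object> slots;

    public object Value
    {
        get
        {
            object value;
            if (slots != null && slots.TryGetValue(this, out value)) return value;
            return null;
        }
        set
        {
            if (slots == null) slots = new Dictionary<object, object>();
            slots[this] = value;
        }
    }

    public void Dispose()
    {
        // Guard: on a thread that never set Value, slots is still null.
        if (slots != null) slots.Remove(this);
    }
}
```

Because each thread only ever touches its own table, no locking is needed, which matches Digy's conclusion that there is no race condition. Note that Dispose() here only clears the entry for the calling thread.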
RE: [Lucene.Net] Nuget, Lucene.Net, and Your Thoughts
> http://blogs.msdn.com/b/microsoft_press/archive/2010/02/03/jeffrey-richter-excerpt-2-from-clr-via-c-third-edition.aspx

Yes, this is the trick some obfuscators use. (They also use some scrambling functions to hide the code in the resource.)

DIGY

-----Original Message-----
From: Michael Herndon [mailto:mhern...@wickedsoftware.net]
Sent: Thursday, September 22, 2011 1:36 AM
To: lucene-net-dev@lucene.apache.org
Subject: Re: [Lucene.Net] Nuget, Lucene.Net, and Your Thoughts

@Digy, that could be done post build with ILMerge, or by building an additional uber assembly that stores the other assemblies as resources.

http://blogs.msdn.com/b/microsoft_press/archive/2010/02/03/jeffrey-richter-excerpt-2-from-clr-via-c-third-edition.aspx

We can add the above to the build process if that would interest people.

To some, nuget is just another disruption, and to others it's a godsend. Some might say only hipsters would use nuget; others might say the cool kids with iPhones use nuget (or Android or WP7).

At the end of the day, nuget or combining assemblies are just channels/ways we can make it easier for various developers to consume or get their hands on Lucene.Net.

If anyone else has ideas along those lines and it can be automated, post it in this thread.

On Wed, Sep 21, 2011 at 6:00 PM, Digy <digyd...@gmail.com> wrote:

Even all contribs could be a single project/assembly. That way, users could reference all contribs with a single assembly. I see no harm in putting a few KB of pressure on RAM :)

DIGY

-----Original Message-----
From: Troy Howard [mailto:thowar...@gmail.com]
Sent: Wednesday, September 21, 2011 7:32 AM
To: lucene-net-dev@lucene.apache.org
Subject: Re: [Lucene.Net] Nuget, Lucene.Net, and Your Thoughts

While it may be a bit redundant, why couldn't there be an individual package for each piece of contrib and a Lucene.Net Contrib (All) package that drags them all down? That way users can grab just the bit they need, or if they just want to get the whole thing, grab the All package.
Thanks,
Troy

On Tue, Sep 20, 2011 at 9:11 PM, Aaron Powell <m...@aaron-powell.com> wrote:

I'm going to vote +1 for granular.

With the RC you could look at myget and have a Lucene.Net repository on there, so people can go for unstable builds on myget and stable ones on nuget.

Also, I came across this article which explains how to set up a build server to automatically push to nuget/myget, which could be useful to the maintainers:
http://brendanforster.com/doing-the-build-server-dance-with-nuget.html

Aaron Powell
MVP - Internet Explorer (Development) | FunnelWeb Team Member
http://apowell.me | http://twitter.com/slace | Skype: aaron.l.powell | Github | BitBucket

-----Original Message-----
From: Prescott Nasser [mailto:geobmx...@hotmail.com]
Sent: Wednesday, 21 September 2011 2:05 PM
To: lucene-net-dev@lucene.apache.org
Subject: RE: [Lucene.Net] Nuget, Lucene.Net, and Your Thoughts

> Right now there are two packages: Lucene and Lucene.Contrib. My question to the community is: do you want finer-grained packages, i.e. a package for each contrib project, or should we continue to keep it simple?

+1 Granular, we just need to be good about descriptions.

> Another topic to converse about is would you like to see an out-of-band project nuget feed for nightly builds, branches with new or experimental features, or stable code snapshots for a projected release?

Having a package for the latest RC would probably be a good idea.
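The "uber assembly" idea from the linked CLR via C# excerpt boils down to handling the AppDomain.AssemblyResolve event and loading the dependency from an embedded resource. The following is a rough sketch of that technique, not a tested build step for Lucene.Net: the class name EmbeddedAssemblyLoader and the resource naming scheme ("<assembly name>.dll") are assumptions, and real resource names usually carry the project's default namespace as a prefix.

```csharp
using System;
using System.IO;
using System.Reflection;

// Hypothetical helper; not part of Lucene.Net.
public static class EmbeddedAssemblyLoader
{
    // Call once at startup, before any type from an embedded assembly is used.
    public static void Register()
    {
        AppDomain.CurrentDomain.AssemblyResolve += Resolve;
    }

    // Maps "Some.Assembly, Version=..." to the resource name "Some.Assembly.dll".
    // The naming scheme is an assumption; adjust it to how resources are embedded.
    public static string ResourceNameFor(string assemblyFullName)
    {
        return new AssemblyName(assemblyFullName).Name + ".dll";
    }

    private static Assembly Resolve(object sender, ResolveEventArgs args)
    {
        string resourceName = ResourceNameFor(args.Name);
        using (Stream stream = Assembly.GetExecutingAssembly().GetManifestResourceStream(resourceName))
        {
            if (stream == null) return null; // not embedded; let normal resolution fail

            // Read the whole resource; Stream.Read may return fewer bytes than requested.
            byte[] bytes = new byte[stream.Length];
            int read = 0;
            while (read < bytes.Length)
            {
                int n = stream.Read(bytes, read, bytes.Length - read);
                if (n == 0) break;
                read += n;
            }
            return Assembly.Load(bytes);
        }
    }
}
```

Compared with ILMerge, this keeps the original assemblies byte-for-byte intact inside the host assembly, at the cost of loading them from memory, so they have no file location on disk.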
RE: [Lucene.Net] Nuget, Lucene.Net, and Your Thoughts
I am not against it, but personally I think of it as a toy. I am from the generation where people used vi to write code.

DIGY

-----Original Message-----
From: Aaron Powell [mailto:m...@aaron-powell.com]
Sent: Thursday, September 22, 2011 1:56 AM
To: lucene-net-dev@lucene.apache.org
Subject: RE: [Lucene.Net] Nuget, Lucene.Net, and Your Thoughts

Any particular reason you guys are not interested in NuGet?

Aaron Powell
MVP - Internet Explorer (Development) | FunnelWeb Team Member
http://apowell.me | http://twitter.com/slace | Skype: aaron.l.powell | Github | BitBucket

-----Original Message-----
From: Digy [mailto:digyd...@gmail.com]
Sent: Thursday, 22 September 2011 7:42 AM
To: lucene-net-dev@lucene.apache.org
Subject: RE: [Lucene.Net] Nuget, Lucene.Net, and Your Thoughts

Sorry, but I feel the same as Neal.

DIGY

-----Original Message-----
From: Granroth, Neal V. [mailto:neal.granr...@thermofisher.com]
Sent: Wednesday, September 21, 2011 6:08 PM
To: lucene-net-dev@lucene.apache.org
Subject: RE: [Lucene.Net] Nuget, Lucene.Net, and Your Thoughts

No interest in Nuget whatsoever.

- Neal

-----Original Message-----
From: Michael Herndon [mailto:mhern...@wickedsoftware.net]
Sent: Tuesday, September 20, 2011 10:57 PM
To: lucene-net-dev@lucene.apache.org; lucene-net-u...@lucene.apache.org
Subject: [Lucene.Net] Nuget, Lucene.Net, and Your Thoughts

We're taking a quick poll over the next few days, on the developers mailing list**, to see how people would like to use Lucene.Net through Nuget.

Currently version 2.9.2 is hosted on nuget.org, but that package was not created by the project maintainers, thus nuget is not currently set up in source.

Going forward, we would like to continue what someone else started by creating nuget packages for Lucene.Net. Right now there are two packages: Lucene and Lucene.Contrib. My question to the community is: do you want finer-grained packages, i.e. a package for each contrib project, or should we continue to keep it simple?

The granular approach will let you use only what you need.
We can also create additional higher-level packages which have dependencies on the other ones, possibly a Lucene.Net-Essentials and a Lucene.Net-Full. Or we can keep it simple and continue with only two packages. My concern is that the granular approach might overwhelm people with choice, while the simple approach might be considered bloat, since you import and install assemblies that you might never use. Another topic to discuss: would you like to see an out-of-band project nuget feed for nightly builds, branches with new or experimental features, or stable code snapshots for a projected release? ** When you post, please respond to lucene-net-dev@lucene.apache.org. This was posted to both lists to make sure everyone subscribed to both lists has a chance to voice their use cases or concerns.
RE: [Lucene.Net] Nuget, Lucene.Net, and Your Thoughts
Not that old :) DIGY -Original Message- From: Prescott Nasser [mailto:geobmx...@hotmail.com] Sent: Thursday, September 22, 2011 2:14 AM To: lucene-net-dev@lucene.apache.org Subject: RE: [Lucene.Net] Nuget, Lucene.Net, and Your Thoughts Punch cards or bust! Sent from my Windows Phone -Original Message- From: Digy Sent: Wednesday, September 21, 2011 4:06 PM To: lucene-net-dev@lucene.apache.org Subject: RE: [Lucene.Net] Nuget, Lucene.Net, and Your Thoughts I am not against it, but personally I think of it as a toy. I am from the generation where people used vi to write code. DIGY -Original Message- From: Aaron Powell [mailto:m...@aaron-powell.com] Sent: Thursday, September 22, 2011 1:56 AM To: lucene-net-dev@lucene.apache.org Subject: RE: [Lucene.Net] Nuget, Lucene.Net, and Your Thoughts Any particular reason you guys are not interested in NuGet? Aaron Powell MVP - Internet Explorer (Development) | FunnelWeb Team Member http://apowell.me | http://twitter.com/slace | Skype: aaron.l.powell | Github | BitBucket -Original Message- From: Digy [mailto:digyd...@gmail.com] Sent: Thursday, 22 September 2011 7:42 AM To: lucene-net-dev@lucene.apache.org Subject: RE: [Lucene.Net] Nuget, Lucene.Net, and Your Thoughts Sorry, but I feel the same as Neal. DIGY -Original Message- From: Granroth, Neal V. [mailto:neal.granr...@thermofisher.com] Sent: Wednesday, September 21, 2011 6:08 PM To: lucene-net-dev@lucene.apache.org Subject: RE: [Lucene.Net] Nuget, Lucene.Net, and Your Thoughts No interest in Nuget whatsoever.
- Neal -Original Message- From: Michael Herndon [mailto:mhern...@wickedsoftware.net] Sent: Tuesday, September 20, 2011 10:57 PM To: lucene-net-dev@lucene.apache.org; lucene-net-u...@lucene.apache.org Subject: [Lucene.Net] Nuget, Lucene.Net, and Your Thoughts We're taking a quick poll on the developers mailing list** over the next few days to see how people would like to use Lucene.Net through Nuget. Currently version 2.9.2 is hosted on nuget.org, but that package was not created by the project maintainers, so nuget is not currently set up in source. Going forward, we would like to continue what someone else started by creating nuget packages for Lucene.Net. Right now there are two packages: Lucene and Lucene.Contrib. My question to the community is: do you want finer-grained packages, i.e. a package for each contrib project, or should we continue to keep it simple? The granular approach will let you use only what you need. We can also create additional higher-level packages which have dependencies on the other ones, possibly a Lucene.Net-Essentials and a Lucene.Net-Full. Or we can keep it simple and continue with only two packages. My concern is that the granular approach might overwhelm people with choice, while the simple approach might be considered bloat, since you import and install assemblies that you might never use. Another topic to discuss: would you like to see an out-of-band project nuget feed for nightly builds, branches with new or experimental features, or stable code snapshots for a projected release? ** When you post, please respond to lucene-net-dev@lucene.apache.org. This was posted to both lists to make sure everyone subscribed to both lists has a chance to voice their use cases or concerns.
RE: [Lucene.Net] Nuget, Lucene.Net, and Your Thoughts
Sorry, but I feel the same as Neal. DIGY -Original Message- From: Granroth, Neal V. [mailto:neal.granr...@thermofisher.com] Sent: Wednesday, September 21, 2011 6:08 PM To: lucene-net-...@lucene.apache.org Subject: RE: [Lucene.Net] Nuget, Lucene.Net, and Your Thoughts No interest in Nuget whatsoever. - Neal -Original Message- From: Michael Herndon [mailto:mhern...@wickedsoftware.net] Sent: Tuesday, September 20, 2011 10:57 PM To: lucene-net-...@lucene.apache.org; lucene-net-u...@lucene.apache.org Subject: [Lucene.Net] Nuget, Lucene.Net, and Your Thoughts We're taking a quick poll on the developers mailing list** over the next few days to see how people would like to use Lucene.Net through Nuget. Currently version 2.9.2 is hosted on nuget.org, but that package was not created by the project maintainers, so nuget is not currently set up in source. Going forward, we would like to continue what someone else started by creating nuget packages for Lucene.Net. Right now there are two packages: Lucene and Lucene.Contrib. My question to the community is: do you want finer-grained packages, i.e. a package for each contrib project, or should we continue to keep it simple? The granular approach will let you use only what you need. We can also create additional higher-level packages which have dependencies on the other ones, possibly a Lucene.Net-Essentials and a Lucene.Net-Full. Or we can keep it simple and continue with only two packages. My concern is that the granular approach might overwhelm people with choice, while the simple approach might be considered bloat, since you import and install assemblies that you might never use. Another topic to discuss: would you like to see an out-of-band project nuget feed for nightly builds, branches with new or experimental features, or stable code snapshots for a projected release? ** When you post, please respond to lucene-net-...@lucene.apache.org.
This was posted to both lists to make sure everyone subscribed to both lists has a chance to voice their use cases or concerns.
RE: [Lucene.Net] 2.9.4
I think there was a sync problem between our emails :( DIGY -Original Message- From: Robert Jordan [mailto:robe...@gmx.net] Sent: Thursday, September 22, 2011 1:22 AM To: lucene-net-...@incubator.apache.org Subject: Re: [Lucene.Net] 2.9.4 On 22.09.2011 00:16, Robert Jordan wrote: Hi Digy, On 21.09.2011 23:38, Digy wrote: @Robert Also, the fix for [LUCENENET-358] is basically making Lucene.Net.dll a .NET 4.0-only assembly: There is a commented-out part at the end of the CloseableThreadLocal which may seem familiar to you :) Indeed :) I missed this comment. No harm in uncommenting it and no conditional compilation is needed. It also passes all test cases. BTW, there is an issue with this commented-out code. If Value is not accessed at least once, Dispose() will fail with a NullReferenceException. There is also a small chance of a race condition. Scratch this. The whole point of ThreadLocal<T> is actually Init(). Sorry for the noise :) Robert
[Lucene.Net] [jira] [Updated] (LUCENENET-444) Snowball stemmers (Portuguese, Hungarian, Romanian, Turkish)
[ https://issues.apache.org/jira/browse/LUCENENET-444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Digy updated LUCENENET-444: --- Fix Version/s: (was: Lucene.Net 3.x) Lucene.Net 2.9.4g Snowball stemmers (Portuguese, Hungarian, Romanian, Turkish) Key: LUCENENET-444 URL: https://issues.apache.org/jira/browse/LUCENENET-444 Project: Lucene.Net Issue Type: New Feature Reporter: Digy Priority: Trivial Fix For: Lucene.Net 2.9.4, Lucene.Net 2.9.4g Some missing stemmers + a modified portuguese stemmer. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[Lucene.Net] [jira] [Updated] (LUCENENET-444) Snowball stemmers (Portuguese, Hungarian, Romanian, Turkish)
[ https://issues.apache.org/jira/browse/LUCENENET-444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Digy updated LUCENENET-444: --- Attachment: PortugueseStemmer.cs TurkishStemmer.cs RomanianStemmer.cs HungarianStemmer.cs Snowball stemmers (Portuguese, Hungarian, Romanian, Turkish) Key: LUCENENET-444 URL: https://issues.apache.org/jira/browse/LUCENENET-444 Project: Lucene.Net Issue Type: New Feature Reporter: Digy Priority: Trivial Fix For: Lucene.Net 2.9.4, Lucene.Net 2.9.4g Attachments: HungarianStemmer.cs, PortugueseStemmer.cs, RomanianStemmer.cs, TurkishStemmer.cs Some missing stemmers + a modified portuguese stemmer. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[Lucene.Net] [jira] [Resolved] (LUCENENET-444) Snowball stemmers (Portuguese, Hungarian, Romanian, Turkish)
[ https://issues.apache.org/jira/browse/LUCENENET-444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Digy resolved LUCENENET-444. Resolution: Fixed Snowball stemmers (Portuguese, Hungarian, Romanian, Turkish) Key: LUCENENET-444 URL: https://issues.apache.org/jira/browse/LUCENENET-444 Project: Lucene.Net Issue Type: New Feature Reporter: Digy Priority: Trivial Fix For: Lucene.Net 2.9.4, Lucene.Net 2.9.4g Attachments: HungarianStemmer.cs, PortugueseStemmer.cs, RomanianStemmer.cs, TurkishStemmer.cs Some missing stemmers + a modified portuguese stemmer. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[Lucene.Net] [jira] [Commented] (LUCENENET-444) Snowball stemmers (Portuguese, Hungarian, Romanian, Turkish)
[ https://issues.apache.org/jira/browse/LUCENENET-444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13105418#comment-13105418 ] Digy commented on LUCENENET-444: committed to trunk and the 2.9.4g branch. Snowball stemmers (Portuguese, Hungarian, Romanian, Turkish) Key: LUCENENET-444 URL: https://issues.apache.org/jira/browse/LUCENENET-444 Project: Lucene.Net Issue Type: New Feature Reporter: Digy Priority: Trivial Fix For: Lucene.Net 2.9.4, Lucene.Net 2.9.4g Attachments: HungarianStemmer.cs, PortugueseStemmer.cs, RomanianStemmer.cs, TurkishStemmer.cs Some missing stemmers + a modified Portuguese stemmer. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[Lucene.Net] [jira] [Commented] (LUCENENET-443) SpellChecker finaliser calls close regardless of if closed already
[ https://issues.apache.org/jira/browse/LUCENENET-443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13105478#comment-13105478 ] Digy commented on LUCENENET-443: +1 for IDisposable in 2.9.4g (since Analyzers, Searchers, Directories, IndexReader and IndexWriter already implement it). SpellChecker finaliser calls close regardless of if closed already -- Key: LUCENENET-443 URL: https://issues.apache.org/jira/browse/LUCENENET-443 Project: Lucene.Net Issue Type: Improvement Components: Lucene.Net Contrib Affects Versions: Lucene.Net 2.9.2 Reporter: Stuart Robinson Labels: lucene, spellcheck, spellchecker The SpellChecker class currently has no publicly visible way of accessing the closed field. It also calls close in the finaliser; since close can throw an exception, this can kill the host process when the GC runs the finaliser. I propose two changes: Change the already existing method IsClosed() to public: public bool IsClosed() { return closed; } and add a check on this in the finaliser: ~SpellChecker() { if (!IsClosed()) this.Close(); } Ideally this class should implement IDisposable, but I think that would be a bigger job than this two-line change. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
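The IDisposable route mentioned in LUCENENET-443 could look roughly like the sketch below. It is not the contrib SpellChecker: `SpellCheckerSketch` and `ReleaseCount` are hypothetical names used only to show the bookkeeping, on the assumption that the goal is an idempotent Close() and a finaliser that never throws.

```csharp
using System;

// Hypothetical stand-in for the contrib SpellChecker; only the
// close/dispose bookkeeping is shown, no index access.
public class SpellCheckerSketch : IDisposable
{
    private bool closed;

    // Counts how often resources were actually released (illustration only).
    public int ReleaseCount { get; private set; }

    public bool IsClosed()
    {
        return closed;
    }

    // Safe to call more than once: only the first call releases anything.
    public void Close()
    {
        Dispose();
    }

    public void Dispose()
    {
        Dispose(true);
        // Once disposed, the finaliser has nothing left to do.
        GC.SuppressFinalize(this);
    }

    protected virtual void Dispose(bool disposing)
    {
        if (closed) return;
        closed = true;
        ReleaseCount++;
        // A real class would release its searcher/directory here.
    }

    ~SpellCheckerSketch()
    {
        // An exception escaping a finaliser terminates the process,
        // so nothing is allowed to propagate from here.
        try { Dispose(false); }
        catch (Exception) { /* nothing useful can be done during GC */ }
    }
}
```

With this shape, callers can use a `using` block, and calling Close() followed by Dispose() releases resources only once.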
RE: [Lucene.Net] Test case for: possible infinite loop bug in portuguese snowball stemmer?
I created a working Portuguese stemmer ( http://people.apache.org/~digy/PortugueseStemmerNew.cs ) from http://snowball.tartarus.org/archives/snowball-discuss/0943.html http://snowball.tartarus.org/archives/snowball-discuss/att-0943/01-SnowballCSharp.zip Since it has a BSD license (http://snowball.tartarus.org/license.php), I don't think I can update the PortugueseStemmer.cs under contrib. DIGY -Original Message- From: Robert Stewart [mailto:robert_stew...@epam.com] Sent: Tuesday, September 13, 2011 5:55 PM To: lucene-net-...@lucene.apache.org Subject: Re: [Lucene.Net] Test case for: possible infinite loop bug in portuguese snowball stemmer? Here is a test case: string text = @"Califórnia"; Lucene.Net.Analysis.KeywordTokenizer tokenizer = new KeywordTokenizer(new StringReader(text)); Lucene.Net.Analysis.Snowball.SnowballFilter stemmer = new Lucene.Net.Analysis.Snowball.SnowballFilter(tokenizer, "Portuguese"); Lucene.Net.Analysis.Token token; while ((token = stemmer.Next()) != null) { System.Console.WriteLine(token.Term()); } It seems to go into an infinite loop: the call to stemmer.Next() never returns. I am not sure if this is the only stemmer I am having trouble with, and it happens to us on a near-daily basis. Thanks, Bob On Sep 13, 2011, at 9:37 AM, Robert Stewart wrote: Are there any known issues with snowball stemmers (Portuguese in particular) going into an infinite loop? I have a recurring problem where IndexWriter locks up on AddDocument and never returns (it has taken up to 3 days before we noticed), requiring the process to be killed manually. It seems to happen only on Portuguese documents from what I can tell so far, and the stack trace when the thread is aborted is always as follows: System.Threading.ThreadAbortException: Thread was being aborted.
at System.RuntimeMethodHandle._InvokeMethodFast(IRuntimeMethodInfo method, Object target, Object[] arguments, SignatureStruct sig, MethodAttributes methodAttributes, RuntimeType typeOwner) at System.RuntimeMethodHandle.InvokeMethodFast(IRuntimeMethodInfo method, Object target, Object[] arguments, Signature sig, MethodAttributes methodAttributes, RuntimeType typeOwner) at System.Reflection.RuntimeMethodInfo.Invoke(Object obj, BindingFlags invokeAttr, Binder binder, Object[] parameters, CultureInfo culture, Boolean skipVisibilityChecks) at System.Reflection.RuntimeMethodInfo.Invoke(Object obj, BindingFlags invokeAttr, Binder binder, Object[] parameters, CultureInfo culture) at Lucene.Net.Analysis.Snowball.SnowballFilter.Next() System.SystemException: System.Threading.ThreadAbortException: Thread was being aborted. at System.RuntimeMethodHandle._InvokeMethodFast(IRuntimeMethodInfo method, Object target, Object[] arguments, SignatureStruct sig, MethodAttributes methodAttributes, RuntimeType typeOwner) at System.RuntimeMethodHandle.InvokeMethodFast(IRuntimeMethodInfo method, Object target, Object[] arguments, Signature sig, MethodAttributes methodAttributes, RuntimeType typeOwner) at System.Reflection.RuntimeMethodInfo.Invoke(Object obj, BindingFlags invokeAttr, Binder binder, Object[] parameters, CultureInfo culture, Boolean skipVisibilityChecks) at System.Reflection.RuntimeMethodInfo.Invoke(Object obj, BindingFlags invokeAttr, Binder binder, Object[] parameters, CultureInfo culture) at Lucene.Net.Analysis.Snowball.SnowballFilter.Next() at Lucene.Net.Analysis.Snowball.SnowballFilter.Next() at Lucene.Net.Analysis.TokenStream.IncrementToken() at Lucene.Net.Index.DocInverterPerField.ProcessFields(Fieldable[] fields, Int32 count) at Lucene.Net.Index.DocFieldProcessorPerThread.ProcessDocument() at Lucene.Net.Index.DocumentsWriter.UpdateDocument(Document doc, Analyzer analyzer, Term delTerm) at Lucene.Net.Index.IndexWriter.AddDocument(Document doc, Analyzer analyzer) Is 
there another list of contrib/snowball issues? I have not been able to reproduce a small test case yet however. Have there been any such issues with stemmers in the past? Thanks, Bob
RE: [Lucene.Net] 2.9.4
Good news. Thanks Itamar. DIGY -Original Message- From: itamar.synhers...@gmail.com [mailto:itamar.synhers...@gmail.com] On Behalf Of Itamar Syn-Hershko Sent: Saturday, September 10, 2011 8:23 PM To: lucene-net-dev@lucene.apache.org Subject: Re: [Lucene.Net] 2.9.4 We have been running extensive tests for 30 hours now against the 2.9.4 branch, and did not detect any leaks. We will keep it running a few more days, if you wish to wait for more conclusive findings. On Wed, Sep 7, 2011 at 5:07 PM, Prescott Nasser geobmx...@hotmail.com wrote: 2.9.4 would make it in, I assume, because that will be our next official release. Sent from my Windows Phone -Original Message- From: Michael Herndon Sent: Wednesday, September 07, 2011 5:12 AM To: lucene-net-dev@lucene.apache.org Subject: Re: [Lucene.Net] 2.9.4 What version is going to make it to nuget? 2.9.4 or 2.9.4g? ooo totally forgot about nuget. We definitely need to get that set up. On Wed, Sep 7, 2011 at 6:46 AM, digy digy digyd...@gmail.com wrote: Since it includes some level of divergence from Java, I committed it only to the 2.9.4g branch. https://issues.apache.org/jira/browse/LUCENE-1930 https://issues.apache.org/jira/browse/LUCENENET-431 DIGY On Wed, Sep 7, 2011 at 1:03 PM, Itamar Syn-Hershko ita...@code972.com wrote: Ok, core compiles, and all tests pass. We are now running long tests to measure memory usage among other things. There is one show stopper though. There was a patch sent by Matt Warren for Spatial.Net that doesn't seem to be in. See http://groups.google.com/group/ravendb/msg/7517f095810c48f3 Any chance you can get it in to 2.9.4? On Wed, Sep 7, 2011 at 1:01 AM, Itamar Syn-Hershko ita...@code972.com wrote: Ok, great, we will run RavenDB on top of 2.9.4 in the next few days and will let you know how it went.
On Tue, Sep 6, 2011 at 8:59 PM, Michael Herndon mhern...@wickedsoftware.net wrote: I can't tell if the apache git mirror is updated via a scheduler or from commit hooks, but it generally stays close to being on par with svn. I'll check next time I push something to svn. But both of those items have made it to the mirror. - michael On Tue, Sep 6, 2011 at 1:44 PM, Digy digyd...@gmail.com wrote: I don't know how often the github mirror is updated. These are the original locations: 2.9.4 https://svn.apache.org/repos/asf/incubator/lucene.net/trunk/ 2.9.4g https://svn.apache.org/repos/asf/incubator/lucene.net/branches/Lucene.Net_2_9_4g/ Both versions include the ThreadLocal fix + signing. Thanks, DIGY -Original Message- From: itamar.synhers...@gmail.com [mailto: itamar.synhers...@gmail.com ] On Behalf Of Itamar Syn-Hershko Sent: Tuesday, September 06, 2011 2:34 AM To: lucene-net-dev@lucene.apache.org Subject: Re: [Lucene.Net] 2.9.4 Not a problem, we will test RavenDB on a separate branch, also for potential memory leaks. Digy, can you make sure the github mirror contains an updated 2.9.4 tag I can pull from, which includes the latest ThreadLocal fix + the strong-name signing patch applied to it? 2011/9/6 Digy digyd...@gmail.com To avoid misunderstanding... Community==all Lucene.Net users DIGY -Original Message- From: Digy [mailto:digyd...@gmail.com] Sent: Monday, September 05, 2011 11:46 PM To: 'lucene-net-dev@lucene.apache.org' Subject: RE: [Lucene.Net] 2.9.4 Not a bad idea, but I would prefer the community's feedback instead of testing against all projects using Lucene.Net DIGY -Original Message- From: Matt Warren [mailto:mattd...@gmail.com] Sent: Monday, September 05, 2011 11:09 PM To: lucene-net-dev@lucene.apache.org Subject: Re: [Lucene.Net] 2.9.4 If you want to test it against a large project you could take a look at how RavenDB uses it?
At the moment it's using 2.9.2 ( https://github.com/ayende/ravendb/tree/master/SharedLibs/Sources/Lucene2.9.2 ) but if you were to recompile it against 2.9.4 and check that all its unit tests still run, that would give you quite a large test case. On 5 September 2011 19:22, Prescott Nasser geobmx...@hotmail.com wrote: Hey All, How do people feel about the 2.9.4 code base? I've been using it for some time, and for my use cases it's been excellent. Do we feel we are ready to package this up and make it an official release? Or do we have some tasks left to take care of? ~Prescott
Re: [Lucene.Net] How to add document to more than one index (but only analyze once)?
How about indexing the new document(s) in memory using a RAMDirectory, then calling indexWriter.AddIndexesNoOptimize for the NRT and master indexes? DIGY On Fri, Sep 9, 2011 at 5:33 PM, Robert Stewart robert_stew...@epam.com wrote: Is it possible to add a document to more than one index at the same time, such that document fields are only analyzed one time? For instance, to add a document to both a master index and a smaller near real-time index. I would like to avoid analyzing document fields more than once, but I don't see if that is possible at all using the Lucene API. Thanks, Bob
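A rough sketch of the RAMDirectory suggestion, written against the Lucene.Net 2.9 API and requiring a reference to Lucene.Net.dll. `DualIndexSketch` and `AddToAll` are hypothetical names, and it is an assumption here that paying for one merge per target index is cheaper than re-analyzing the document.

```csharp
using Lucene.Net.Analysis;
using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.Store;

// Hypothetical helper: analyzes a document once, then copies the
// resulting segment into every target index.
public static class DualIndexSketch
{
    public static void AddToAll(Document doc, Analyzer analyzer, params IndexWriter[] targets)
    {
        // Throwaway in-memory index; analysis happens only here.
        RAMDirectory ramDir = new RAMDirectory();
        IndexWriter ramWriter = new IndexWriter(
            ramDir, analyzer, true, IndexWriter.MaxFieldLength.UNLIMITED);
        try
        {
            ramWriter.AddDocument(doc);
        }
        finally
        {
            // Closing commits the segment so it can be merged elsewhere.
            ramWriter.Close();
        }

        foreach (IndexWriter target in targets)
        {
            // Merges the already-inverted segment; no analyzer is involved.
            target.AddIndexesNoOptimize(new Directory[] { ramDir });
        }
    }
}
```

A call would then look like `DualIndexSketch.AddToAll(doc, analyzer, masterWriter, nrtWriter)`, where both writers are assumed to be open already.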
Re: [Lucene.Net] 2.9.4
Since it includes some level of divergence from Java, I committed it only to the 2.9.4g branch. https://issues.apache.org/jira/browse/LUCENE-1930 https://issues.apache.org/jira/browse/LUCENENET-431 DIGY On Wed, Sep 7, 2011 at 1:03 PM, Itamar Syn-Hershko ita...@code972.com wrote: Ok, core compiles, and all tests pass. We are now running long tests to measure memory usage among other things. There is one show stopper though. There was a patch sent by Matt Warren for Spatial.Net that doesn't seem to be in. See http://groups.google.com/group/ravendb/msg/7517f095810c48f3 Any chance you can get it in to 2.9.4? On Wed, Sep 7, 2011 at 1:01 AM, Itamar Syn-Hershko ita...@code972.com wrote: Ok, great, we will run RavenDB on top of 2.9.4 in the next few days and will let you know how it went. On Tue, Sep 6, 2011 at 8:59 PM, Michael Herndon mhern...@wickedsoftware.net wrote: I can't tell if the apache git mirror is updated via a scheduler or from commit hooks, but it generally stays close to being on par with svn. I'll check next time I push something to svn. But both of those items have made it to the mirror. - michael On Tue, Sep 6, 2011 at 1:44 PM, Digy digyd...@gmail.com wrote: I don't know how often the github mirror is updated. These are the original locations: 2.9.4 https://svn.apache.org/repos/asf/incubator/lucene.net/trunk/ 2.9.4g https://svn.apache.org/repos/asf/incubator/lucene.net/branches/Lucene.Net_2_9_4g/ Both versions include the ThreadLocal fix + signing. Thanks, DIGY -Original Message- From: itamar.synhers...@gmail.com [mailto:itamar.synhers...@gmail.com ] On Behalf Of Itamar Syn-Hershko Sent: Tuesday, September 06, 2011 2:34 AM To: lucene-net-dev@lucene.apache.org Subject: Re: [Lucene.Net] 2.9.4 Not a problem, we will test RavenDB on a separate branch, also for potential memory leaks. Digy, can you make sure the github mirror contains an updated 2.9.4 tag I can pull from, which includes the latest ThreadLocal fix + the strong-name signing patch applied to it?
2011/9/6 Digy digyd...@gmail.com To avoid misunderstanding... Community==all Lucene.Net users DIGY -Original Message- From: Digy [mailto:digyd...@gmail.com] Sent: Monday, September 05, 2011 11:46 PM To: 'lucene-net-dev@lucene.apache.org' Subject: RE: [Lucene.Net] 2.9.4 Not a bad idea, but I would prefer the community's feedback instead of testing against all projects using Lucene.Net DIGY -Original Message- From: Matt Warren [mailto:mattd...@gmail.com] Sent: Monday, September 05, 2011 11:09 PM To: lucene-net-dev@lucene.apache.org Subject: Re: [Lucene.Net] 2.9.4 If you want to test it against a large project you could take a look at how RavenDB uses it? At the moment it's using 2.9.2 ( https://github.com/ayende/ravendb/tree/master/SharedLibs/Sources/Lucene2.9.2 ) but if you were to recompile it against 2.9.4 and check that all its unit tests still run, that would give you quite a large test case. On 5 September 2011 19:22, Prescott Nasser geobmx...@hotmail.com wrote: Hey All, How do people feel about the 2.9.4 code base? I've been using it for some time, and for my use cases it's been excellent. Do we feel we are ready to package this up and make it an official release? Or do we have some tasks left to take care of? ~Prescott
RE: [Lucene.Net] 2.9.4
I don't know how often the github mirror is updated. These are the original locations: 2.9.4 https://svn.apache.org/repos/asf/incubator/lucene.net/trunk/ 2.9.4g https://svn.apache.org/repos/asf/incubator/lucene.net/branches/Lucene.Net_2_9_4g/ Both versions include the ThreadLocal fix + signing. Thanks, DIGY -Original Message- From: itamar.synhers...@gmail.com [mailto:itamar.synhers...@gmail.com] On Behalf Of Itamar Syn-Hershko Sent: Tuesday, September 06, 2011 2:34 AM To: lucene-net-dev@lucene.apache.org Subject: Re: [Lucene.Net] 2.9.4 Not a problem, we will test RavenDB on a separate branch, also for potential memory leaks. Digy, can you make sure the github mirror contains an updated 2.9.4 tag I can pull from, which includes the latest ThreadLocal fix + the strong-name signing patch applied to it? 2011/9/6 Digy digyd...@gmail.com To avoid misunderstanding... Community==all Lucene.Net users DIGY -Original Message- From: Digy [mailto:digyd...@gmail.com] Sent: Monday, September 05, 2011 11:46 PM To: 'lucene-net-dev@lucene.apache.org' Subject: RE: [Lucene.Net] 2.9.4 Not a bad idea, but I would prefer the community's feedback instead of testing against all projects using Lucene.Net DIGY -Original Message- From: Matt Warren [mailto:mattd...@gmail.com] Sent: Monday, September 05, 2011 11:09 PM To: lucene-net-dev@lucene.apache.org Subject: Re: [Lucene.Net] 2.9.4 If you want to test it against a large project you could take a look at how RavenDB uses it? At the moment it's using 2.9.2 ( https://github.com/ayende/ravendb/tree/master/SharedLibs/Sources/Lucene2.9.2 ) but if you were to recompile it against 2.9.4 and check that all its unit tests still run, that would give you quite a large test case. On 5 September 2011 19:22, Prescott Nasser geobmx...@hotmail.com wrote: Hey All, How do people feel about the 2.9.4 code base? I've been using it for some time, and for my use cases it's been excellent. Do we feel we are ready to package this up and make it an official release?
Or do we have some tasks left to take care of? ~Prescott
[Lucene.Net] [jira] [Resolved] (LUCENENET-414) The definition of CharArraySet is dangerously confusing and leads to bugs when used.
[ https://issues.apache.org/jira/browse/LUCENENET-414?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Digy resolved LUCENENET-414. Resolution: Fixed Fixed. DIGY The definition of CharArraySet is dangerously confusing and leads to bugs when used. Key: LUCENENET-414 URL: https://issues.apache.org/jira/browse/LUCENENET-414 Project: Lucene.Net Issue Type: Bug Components: Lucene.Net Core Affects Versions: Lucene.Net 2.9.2 Environment: Irrelevant Reporter: Vincent Van Den Berghe Priority: Minor Fix For: Lucene.Net 2.9.4, Lucene.Net 2.9.4g Right now, CharArraySet derives from System.Collections.Hashtable, but doesn't actually use this base type for storing elements. However, StandardAnalyzer.STOP_WORDS_SET is exposed as a System.Collections.Hashtable. The obvious code to build your own stopword set from StandardAnalyzer.STOP_WORDS_SET plus your own stopwords looks like this: CharArraySet stopWords = new CharArraySet(StandardAnalyzer.STOP_WORDS_SET, ignoreCase: false); foreach (string domainSpecificStopWord in DomainSpecificStopWords) stopWords.Add(domainSpecificStopWord); ... but it will fail, because the CharArraySet constructor accepts an ICollection, which will be passed the Hashtable instance of STOP_WORDS_SET: the resulting stopWords will only contain the DomainSpecificStopWords, and not those from STOP_WORDS_SET. One workaround would be to replace the first line with this: CharArraySet stopWords = new CharArraySet(StandardAnalyzer.STOP_WORDS_SET.Count + DomainSpecificStopWords.Length, ignoreCase: false); foreach (string domainSpecificStopWord in (CharArraySet)StandardAnalyzer.STOP_WORDS_SET) stopWords.Add(domainSpecificStopWord); ... but this relies on an implementation detail (STOP_WORDS_SET is really an UnmodifiableCharArraySet, which is itself a CharArraySet).
It works because it forces the foreach() to use the correct CharArraySet.GetEnumerator(), which is defined as a new method (this has a bad code smell to it). At least 2 possibilities exist to solve this problem: - Make CharArraySet use the Hashtable instance and a custom comparator, instead of its own implementation. - Make CharArraySet use HashSet<char[]>, defined in .NET 4.0. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
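The pitfall described in LUCENENET-414 can be reproduced without Lucene.Net. The sketch below uses a hypothetical `HidingSet` that, like CharArraySet as described in the issue, derives from Hashtable, stores its elements elsewhere, and declares GetEnumerator() with `new`. Which enumerator a foreach picks then depends on the static type of the expression.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical miniature of the problem: a Hashtable subclass that keeps
// its elements in a private list instead of in the base class.
public class HidingSet : Hashtable
{
    private readonly List<string> items = new List<string>();

    public void Add(string item) { items.Add(item); }

    // "new" hides Hashtable.GetEnumerator() instead of overriding it.
    public new IEnumerator GetEnumerator() { return items.GetEnumerator(); }
}

public static class HidingDemo
{
    // Static type ICollection: foreach goes through IEnumerable, which is
    // implemented by Hashtable, so the (empty) base table is enumerated.
    public static int CountVia(ICollection source)
    {
        int n = 0;
        foreach (object o in source) n++;
        return n;
    }

    // Static type HidingSet: foreach binds to the hiding GetEnumerator().
    public static int CountDirect(HidingSet source)
    {
        int n = 0;
        foreach (object o in source) n++;
        return n;
    }
}
```

For a set holding two strings, `CountDirect` sees both elements while `CountVia` sees none, which is the same effect as passing STOP_WORDS_SET to a constructor that takes an ICollection.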
RE: [Lucene.Net] 2.9.4
+1 for an official release. DIGY -Original Message- From: Prescott Nasser [mailto:geobmx...@hotmail.com] Sent: Monday, September 05, 2011 9:22 PM To: lucene-net-...@lucene.apache.org; lucene-net-u...@lucene.apache.org Subject: [Lucene.Net] 2.9.4 Hey All, How do people feel about the 2.9.4 code base? I've been using it for some time, and for my use cases it's been excellent. Do we feel we are ready to package this up and make it an official release? Or do we have some tasks left to take care of? ~Prescott
[Lucene.Net] [jira] [Updated] (LUCENENET-442) ParallelMultiSearcher threads don't handle all exceptions
[ https://issues.apache.org/jira/browse/LUCENENET-442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Digy updated LUCENENET-442: --- Attachment: LUCENENET-442.patch Thanks Andy. Nice catch. I prepared a patch for 2.9.4g and will commit to the 2.9.4g branch and trunk soon. DIGY ParallelMultiSearcher threads don't handle all exceptions - Key: LUCENENET-442 URL: https://issues.apache.org/jira/browse/LUCENENET-442 Project: Lucene.Net Issue Type: Bug Components: Lucene.Net Core Affects Versions: Lucene.Net 2.9.2 Reporter: Andy Twidle Fix For: Lucene.Net 2.9.4, Lucene.Net 2.9.4g Attachments: LUCENENET-442.patch The ParallelMultiSearcher doesn't allow non-IOException exceptions to be managed by the calling application. LUCENENET-388 worked around one specific example of this, but any genuine Lucene exception (e.g. BooleanQuery.TooManyClauses) will also fall foul of this pattern. In our specific instance we could treat the symptoms and raise the max clause count, but I'm sure there will be more. Could the System.IOException be generalised to System.Exception? Or would that be too much deviation from the Java code base? -- Example stack trace of an exception thrown by a Searcher executed: Framework Version: v4.0.30319 Description: The process was terminated due to an unhandled exception.
Exception Info: Lucene.Net.Search.BooleanQuery+TooManyClauses Stack: at Lucene.Net.Search.BooleanQuery.Add(Lucene.Net.Search.BooleanClause) at Lucene.Net.Search.BooleanQuery.Add(Lucene.Net.Search.Query, Occur) at Lucene.Net.Search.PrefixQuery.Rewrite(Lucene.Net.Index.IndexReader) at Lucene.Net.Search.BooleanQuery.Rewrite(Lucene.Net.Index.IndexReader) at Lucene.Net.Search.IndexSearcher.Rewrite(Lucene.Net.Search.Query) at Lucene.Net.Search.Query.Weight(Lucene.Net.Search.Searcher) at Lucene.Net.Search.Searcher.CreateWeight(Lucene.Net.Search.Query) at Lucene.Net.Search.Searcher.Search(Lucene.Net.Search.Query, Lucene.Net.Search.Filter, Lucene.Net.Search.HitCollector) at Lucene.Net.Search.Searcher.Search(Lucene.Net.Search.Query, Lucene.Net.Search.HitCollector) at Lucene.Net.Search.QueryWrapperFilter.Bits(Lucene.Net.Index.IndexReader) at Lucene.Net.Search.CachingWrapperFilter.Bits(Lucene.Net.Index.IndexReader) at Lucene.Net.Search.IndexSearcher.Search(Lucene.Net.Search.Weight, Lucene.Net.Search.Filter, Lucene.Net.Search.HitCollector) at Lucene.Net.Search.IndexSearcher.Search(Lucene.Net.Search.Weight, Lucene.Net.Search.Filter, Int32) at Lucene.Net.Search.MultiSearcherThread.Run() at System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean) at System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object) at System.Threading.ThreadHelper.ThreadStart() -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
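The generalisation asked about in LUCENENET-442 amounts to catching every exception on the worker thread and rethrowing it on the caller's thread. A minimal sketch of that idea follows; `CapturingWorker` is a hypothetical helper, not the MultiSearcherThread from Lucene.Net and not the attached patch.

```csharp
using System;
using System.Threading;

// Hypothetical helper: runs work on its own thread and hands any
// exception back to whoever joins it, instead of killing the process.
public sealed class CapturingWorker
{
    private readonly Thread thread;
    private Exception error;

    public CapturingWorker(Action work)
    {
        thread = new Thread(delegate()
        {
            try { work(); }
            catch (Exception e) { error = e; } // any exception, not only IOException
        });
    }

    public void Start() { thread.Start(); }

    // Blocks until the worker finishes, then surfaces its failure, if any.
    public void Join()
    {
        thread.Join();
        if (error != null)
            throw new InvalidOperationException("Worker failed", error);
    }
}
```

The original exception is kept as InnerException, so the calling application can still inspect, for example, a TooManyClauses failure and decide how to handle it.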
RE: [Lucene.Net] 2.9.4
To avoid misunderstanding... Community == all Lucene.Net users

DIGY

-----Original Message-----
From: Digy [mailto:digyd...@gmail.com]
Sent: Monday, September 05, 2011 11:46 PM
To: 'lucene-net-dev@lucene.apache.org'
Subject: RE: [Lucene.Net] 2.9.4

Not a bad idea, but I would prefer the community's feedback instead of testing against all projects using Lucene.Net

DIGY

-----Original Message-----
From: Matt Warren [mailto:mattd...@gmail.com]
Sent: Monday, September 05, 2011 11:09 PM
To: lucene-net-dev@lucene.apache.org
Subject: Re: [Lucene.Net] 2.9.4

If you want to test it against a large project you could take a look at how RavenDB uses it? At the moment it's using 2.9.2 ( https://github.com/ayende/ravendb/tree/master/SharedLibs/Sources/Lucene2.9.2 ) but if you were to recompile it against 2.9.4 and check that all its unit tests still run, that would give you quite a large test case.

On 5 September 2011 19:22, Prescott Nasser geobmx...@hotmail.com wrote:

Hey All,

How do people feel about the 2.9.4 code base? I've been using it for some time; for my use cases it's been excellent. Do we feel we are ready to package this up and make it an official release? Or do we have some tasks left to take care of?

~Prescott
RE: [Lucene.Net] 2.9.4
Not a bad idea, but I would prefer the community's feedback instead of testing against all projects using Lucene.Net

DIGY

-----Original Message-----
From: Matt Warren [mailto:mattd...@gmail.com]
Sent: Monday, September 05, 2011 11:09 PM
To: lucene-net-...@lucene.apache.org
Subject: Re: [Lucene.Net] 2.9.4

If you want to test it against a large project you could take a look at how RavenDB uses it? At the moment it's using 2.9.2 ( https://github.com/ayende/ravendb/tree/master/SharedLibs/Sources/Lucene2.9.2 ) but if you were to recompile it against 2.9.4 and check that all its unit tests still run, that would give you quite a large test case.

On 5 September 2011 19:22, Prescott Nasser geobmx...@hotmail.com wrote:

Hey All,

How do people feel about the 2.9.4 code base? I've been using it for some time; for my use cases it's been excellent. Do we feel we are ready to package this up and make it an official release? Or do we have some tasks left to take care of?

~Prescott
[Lucene.Net] [jira] [Commented] (LUCENENET-358) CloseableThreadLocal memory leak in LocalDataStoreSlot (with workaround)
[ https://issues.apache.org/jira/browse/LUCENENET-358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13092541#comment-13092541 ]

Digy commented on LUCENENET-358:

New CloseableThreadLocal implementation and its test case committed to trunk.

DIGY

CloseableThreadLocal memory leak in LocalDataStoreSlot (with workaround)
-
Key: LUCENENET-358
URL: https://issues.apache.org/jira/browse/LUCENENET-358
Project: Lucene.Net
Issue Type: Bug
Affects Versions: Lucene.Net 2.9.2, Lucene.Net 2.9.4, Lucene.Net 2.9.4g
Environment: Microsoft Windows Server 2008 Enterprise x64. SP2. .NET Framework 4.0
Reporter: Rezgar Cadro
Assignee: Digy
Priority: Critical
Labels: memory, CloseableThreadLocal, LocalDataStoreSlot, leak
Fix For: Lucene.Net 2.9.4, Lucene.Net 2.9.4g
Attachments: CloseableThreadLocal MemoryLeak.patch, CloseableThreadLocal.diff, CloseableThreadLocal.diff, CloseableThreadLocal.patch, TestMemLeakage.zip

Recently we have been suffering from a severe memory leak when executing intense open/close operations on IndexSearcher and IndexModifier. Memory profiling showed that memory is being held by LocalDataStore[] objects. After some digging, the root of the problem was found in the CloseableThreadLocal class:

private System.LocalDataStoreSlot t = System.Threading.Thread.AllocateDataSlot();

What we see is that every instantiated CloseableThreadLocal object causes a new data slot to be allocated for every thread. Thread.AllocateDataSlot() does not simply allocate a new slot to replace an old one; it enlarges an existing per-thread buffer, appending to the end of the internal LocalDataStore[] collection, which causes a severe memory leak.

Since the t variable is instantiated on every object creation, and (in the current class implementation) every object is used by a single thread, replacing

private System.LocalDataStoreSlot t = System.Threading.Thread.AllocateDataSlot();

with a simple

private object dataSlot;

and removing the hardRefs Dictionary solves the problem and prevents the memory leak.

We have tried to implement the expected behavior by using the [ThreadStatic] attribute instead of LocalDataStoreSlot, but the attempt failed because of unexpected exceptions being thrown.

Patch can be found at Lucene.Net repository under
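The reporter's workaround can be sketched as follows. This is a minimal illustration only, assuming (as the report does) that each instance is used by a single thread; `SingleThreadSlot` is a hypothetical name and this is not the CloseableThreadLocal implementation that was committed:

```csharp
using System;

// Hypothetical stand-in for the reporter's workaround; not the committed CloseableThreadLocal.
public sealed class SingleThreadSlot
{
    // Plain field instead of a LocalDataStoreSlot obtained from Thread.AllocateDataSlot().
    // Only valid under the assumption that one thread uses each instance.
    private object dataSlot;

    public object Get() { return dataSlot; }

    public void Set(object value) { dataSlot = value; }

    // Drop the reference so the stored value can be garbage collected.
    public void Close() { dataSlot = null; }
}
```

Because nothing is registered with the thread, creating and discarding many instances leaves no per-thread residue; the trade-off is that the value is no longer thread-local at all.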
[Lucene.Net] [jira] [Updated] (LUCENENET-358) CloseableThreadLocal memory leak in LocalDataStoreSlot (with workaround)
[ https://issues.apache.org/jira/browse/LUCENENET-358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Digy updated LUCENENET-358:
---
Attachment: TestMemLeakage.zip

TestMemLeakage.zip shows that bug.
[Lucene.Net] [jira] [Commented] (LUCENENET-441) Encountered: EOF after : \\\\ during parsing a query
[ https://issues.apache.org/jira/browse/LUCENENET-441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13091175#comment-13091175 ]

Digy commented on LUCENENET-441:

What does your query look like? What is your question?

Encountered: EOF after : during parsing a query
--
Key: LUCENENET-441
URL: https://issues.apache.org/jira/browse/LUCENENET-441
Project: Lucene.Net
Issue Type: Bug
Components: Lucene.Net Core
Affects Versions: Lucene.Net 2.9.2
Environment: .Net Framework 4.0
Reporter: Maverick904

Cannot parse '\': Lexical error at line 1, column 4. Encountered: EOF after :
|
   at Lucene.Net.QueryParsers.QueryParser.Parse(String query)
   at Lucene.Net.QueryParsers.MultiFieldQueryParser.Parse(Version matchVersion, String query, String[] fields, Occur[] flags, Analyzer analyzer)
   at Lucene.Net.QueryParsers.MultiFieldQueryParser.Parse(String query, String[] fields, Occur[] flags, Analyzer analyzer)
Lexical error at line 1, column 4. Encountered: EOF after :
|
   at Lucene.Net.QueryParsers.QueryParserTokenManager.GetNextToken()
   at Lucene.Net.QueryParsers.QueryParser.Jj_ntk()
   at Lucene.Net.QueryParsers.QueryParser.Modifiers()
   at Lucene.Net.QueryParsers.QueryParser.Query(String field)
   at Lucene.Net.QueryParsers.QueryParser.Parse(String query)
|
[Lucene.Net] [jira] [Issue Comment Edited] (LUCENENET-437) Port Contrib.Shingle from Java
[ https://issues.apache.org/jira/browse/LUCENENET-437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13067389#comment-13067389 ]

Digy edited comment on LUCENENET-437 at 7/18/11 11:28 PM:
--
bq. It ensures equality, but does not ensure inequality.

Sorry, but I must object again. It ensures inequality, but doesn't ensure equality. (If the hash codes are not equal, the objects are not equal either; but having the same hash code doesn't say anything about equality.)

was (Author: digydigy):
bq. It ensures equality, but does not ensure inequality.
Sorry but I must object again. It ensures inequality, but doesn't ensure equality.

Port Contrib.Shingle from Java
--
Key: LUCENENET-437
URL: https://issues.apache.org/jira/browse/LUCENENET-437
Project: Lucene.Net
Issue Type: Task
Components: Lucene.Net Contrib, Lucene.Net Test
Affects Versions: Lucene.Net 2.9.4, Lucene.Net 2.9.4g
Reporter: Troy Howard
Assignee: Troy Howard
Priority: Minor
Fix For: Lucene.Net 2.9.4, Lucene.Net 2.9.4g

Port Contrib.Shingle from Java
[Lucene.Net] [jira] [Commented] (LUCENENET-437) Port Contrib.Shingle from Java
[ https://issues.apache.org/jira/browse/LUCENENET-437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13067409#comment-13067409 ]

Digy commented on LUCENENET-437:

Already fixed in 2.9.4g
[Lucene.Net] [jira] [Updated] (LUCENENET-437) Port Contrib.Shingle from Java
[ https://issues.apache.org/jira/browse/LUCENENET-437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Digy updated LUCENENET-437:
---
Fix Version/s: Lucene.Net 2.9.4g
[Lucene.Net] [jira] [Commented] (LUCENENET-437) Port Contrib.Shingle from Java
[ https://issues.apache.org/jira/browse/LUCENENET-437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13067361#comment-13067361 ]

Digy commented on LUCENENET-437:

The Java docs say:

public boolean equals(Object o)
Compares the specified object with this list for equality. Returns true if and only if the specified object is also a list, both lists have the same size, and *all corresponding pairs of elements in the two lists are equal* [No reference for Hashcode - DIGY]. (Two elements e1 and e2 are equal if (e1==null ? e2==null : e1.equals(e2)).) In other words, two lists are defined to be equal if they contain the same elements in the same order. This definition ensures that the equals method works properly across different implementations of the List interface.

Yes, the sample was from Eric Lippert's blog, to show why GetHashCode should not be used for comparing objects.

bq. The issue you're describing is more of a problem with the .NET implementation of GetHashcode() rather than the correctness of using hashcode for comparison.

No, the problem is not in the implementation of GetHashCode. In any implementation you may have some unexpected collisions (since it is a 4-byte number). GetHashCode isn't meant for uniqueness or object identification. It's meant to provide a random distribution. Therefore the problem really lies in using it for equality comparison.

DIGY
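The distinction argued above (equal lists must have equal hash codes, but equal hash codes say nothing about equality) can be shown with a small self-contained sketch. `ListEquality` is a hypothetical helper written for this illustration; it is not Lucene.Net's EquatableList or ListComparer, and the 31-multiplier hash is simply the formula documented for Java's List.hashCode:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration; not Lucene.Net's EquatableList.
public static class ListEquality
{
    // Element-wise comparison: same size, same elements, same order.
    public static bool SequenceEquals<T>(IList<T> a, IList<T> b)
    {
        if (ReferenceEquals(a, b)) return true;
        if (a == null || b == null || a.Count != b.Count) return false;
        EqualityComparer<T> cmp = EqualityComparer<T>.Default;
        for (int i = 0; i < a.Count; i++)
        {
            if (!cmp.Equals(a[i], b[i])) return false;
        }
        return true;
    }

    // Hash derived from the elements. Equal lists always get equal hashes,
    // but two different lists can still collide.
    public static int SequenceHash<T>(IList<T> list)
    {
        int h = 1;
        foreach (T item in list)
        {
            h = unchecked(31 * h + (item == null ? 0 : item.GetHashCode()));
        }
        return h;
    }
}
```

For example, the int lists { 0, 31 } and { 1, 0 } both hash to 992 under this formula, yet `SequenceEquals` correctly reports them as different. A hash comparison is therefore only useful as a cheap early "not equal" check before the element-wise comparison.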
[Lucene.Net] [jira] [Commented] (LUCENENET-437) Port Contrib.Shingle from Java
[ https://issues.apache.org/jira/browse/LUCENENET-437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13067366#comment-13067366 ]

Digy commented on LUCENENET-437:

See, even with the worst implementation, the Equals method should work.

{code}
// Requires: using System; using System.Collections;
private void Form1_Load(object sender, EventArgs e)
{
    Hashtable h = new Hashtable();
    MyClass m1 = new MyClass() { I = 1 };
    MyClass m2 = new MyClass() { I = 2 };
    h.Add(m1, 1);
    h.Add(m2, 2);
    // Both keys hash to 1, yet the lookup still returns the value stored for m2.
    System.Diagnostics.Debug.Assert(h[m2].Equals(2));
}

public class MyClass
{
    public int I;

    // Deliberately the worst possible hash: every instance collides.
    public override int GetHashCode()
    {
        return 1;
    }
}
{code}
[Lucene.Net] [jira] [Commented] (LUCENENET-437) Port Contrib.Shingle from Java
[ https://issues.apache.org/jira/browse/LUCENENET-437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13067375#comment-13067375 ]

Digy commented on LUCENENET-437:

bq. This ensures that list1.equals(list2) implies that list1.hashCode()==list2.hashCode() for any two lists, list1 and list2, as required by the general contract of Object.hashCode.

but it doesn't ensure that if list1.hashCode()==list2.hashCode() then list1.equals(list2) should be true, as I showed using Eric Lippert's sample.
[Lucene.Net] [jira] [Commented] (LUCENENET-437) Port Contrib.Shingle from Java
[ https://issues.apache.org/jira/browse/LUCENENET-437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13067389#comment-13067389 ]

Digy commented on LUCENENET-437:

bq. It ensures equality, but does not ensure inequality.

Sorry but I must object again. It ensures inequality, but doesn't ensure equality.
[Lucene.Net] [jira] [Updated] (LUCENENET-437) Port Contrib.Shingle from Java
[ https://issues.apache.org/jira/browse/LUCENENET-437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Digy updated LUCENENET-437:
---
Affects Version/s: Lucene.Net 2.9.4g
Fix Version/s: Lucene.Net 2.9.4g
[Lucene.Net] [jira] [Commented] (LUCENENET-437) Port Contrib.Shingle from Java
[ https://issues.apache.org/jira/browse/LUCENENET-437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13066717#comment-13066717 ]

Digy commented on LUCENENET-437:

Using HashCode for equality comparison is not a good idea.

{code}
// Requires: using System.Collections.Generic;
ListComparer<string> comp = new ListComparer<string>();

bool b = comp.Equals(new List<string>() { "\uA0A2\uA0A2" }, new List<string>() { });
System.Diagnostics.Debug.Assert(b == false);

b = comp.Equals(new List<string>() { "\uA0A2\uA0A2" }, new List<string>() { "\uA0A2\uA0A2\uA0A2\uA0A2" });
System.Diagnostics.Debug.Assert(b == false);

b = new Lucene.Net.Support.EquatableList<string>() { "\uA0A2\uA0A2" }.Equals(new Lucene.Net.Support.EquatableList<string>() { });
System.Diagnostics.Debug.Assert(b == false);

// Assign the result so the final assertion checks this comparison.
b = new Lucene.Net.Support.EquatableList<string>() { "\uA0A2\uA0A2" }.Equals(new Lucene.Net.Support.EquatableList<string>() { "\uA0A2\uA0A2\uA0A2\uA0A2" });
System.Diagnostics.Debug.Assert(b == false);
{code}

DIGY
[Lucene.Net] [jira] [Reopened] (LUCENENET-437) Port Contrib.Shingle from Java
[ https://issues.apache.org/jira/browse/LUCENENET-437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Digy reopened LUCENENET-437:
Re: [Lucene.Net] Incubator Status Page
On Sun, Jul 10, 2011 at 6:24 PM, Stefan Bodewig bode...@apache.org wrote:

Hi all,

http://incubator.apache.org/projects/lucene.net.html contains quite a few blanks that I think we could easily fill. I intend to either add some N/A or real dates where I can during the coming week.

On the IP issues part (copyright and distribution rights) I trust the Lucene PMC has been taking care of this before Lucene.NET headed back to the Incubator, and after that all contributions have come either directly from people with a CLA on file or as patches via JIRA where the "ASF may use this" checkbox has been checked - is this correct?

absolutely.

For the project specific tasks I'd ask all of you to fill in whatever you feel like adding. All Lucene.NET committers should be able to modify the status page.

Stefan

DIGY
[Lucene.Net] [jira] [Commented] (LUCENENET-434) Remove AnonymousXXXX classes to increase readablity
[ https://issues.apache.org/jira/browse/LUCENENET-434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13062351#comment-13062351 ]

Digy commented on LUCENENET-434:

very nice.

Remove Anonymous classes to increase readablity
---
Key: LUCENENET-434
URL: https://issues.apache.org/jira/browse/LUCENENET-434
Project: Lucene.Net
Issue Type: Improvement
Components: Lucene.Net Core
Affects Versions: Lucene.Net 2.9.4g
Reporter: Scott Lombard
Assignee: Scott Lombard
Fix For: Lucene.Net 2.9.4g
Attachments: TeeSinkTokenFilter.patch
Original Estimate: 168h
Time Spent: 5h
Remaining Estimate: 163h

Replace the Anonymous classes inherited from JLCA, which make the code impossible to read. Follow Digy's template to replace the single abstract method with Func or Action, like in FilterCache<T>,

from:
protected abstract object MergeDeletes(IndexReader reader, object value);

to:
Func<IndexReader, object, object> MergeDeletes;

Determine a solution for the classes with more than 1 abstract method without diverging much from Java.
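The abstract-method-to-delegate conversion described in the issue can be sketched like this. The class names are hypothetical and `string` stands in for IndexReader so the example compiles on its own; it is an illustration of the pattern, not the FilterCache code itself:

```csharp
using System;

// Hypothetical types for illustration; string stands in for IndexReader.

// Before: a single abstract method forces a subclass per use
// (in the port, one generated "Anonymous" class for each).
public abstract class AbstractMergeCache
{
    protected abstract object MergeDeletes(string reader, object value);

    public object Get(string reader, object value)
    {
        return MergeDeletes(reader, value);
    }
}

// After: the abstract method becomes a delegate supplied by the caller.
public sealed class DelegateMergeCache
{
    public Func<string, object, object> MergeDeletes;

    public object Get(string reader, object value)
    {
        return MergeDeletes(reader, value);
    }
}
```

A caller can then write `new DelegateMergeCache { MergeDeletes = (reader, value) => value }` inline, with no extra class. This only covers types with one abstract method; the issue leaves the multi-method case open.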
[Lucene.Net] [jira] [Resolved] (LUCENENET-432) Concurrency issues in SegmentInfo.Files() (LUCENE-2584)
[ https://issues.apache.org/jira/browse/LUCENENET-432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Digy resolved LUCENENET-432.

Resolution: Fixed
Fix Version/s: Lucene.Net 2.9.4 Lucene.Net 2.9.2
Assignee: Digy

Patch committed to trunk and the 2.9.4g branch

Concurrency issues in SegmentInfo.Files() (LUCENE-2584)
---
Key: LUCENENET-432
URL: https://issues.apache.org/jira/browse/LUCENENET-432
Project: Lucene.Net
Issue Type: Bug
Affects Versions: Lucene.Net 2.9.4, Lucene.Net 2.9.4g
Reporter: Digy
Assignee: Digy
Fix For: Lucene.Net 2.9.2, Lucene.Net 2.9.4
Attachments: SegmentInfo.patch

A multi-threaded call of files() in SegmentInfo could lead to a ConcurrentModificationException if one thread has not yet finished adding to the ArrayList (files) while another thread has already obtained it as cached.

https://issues.apache.org/jira/browse/LUCENE-2584
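One common way to avoid the race described above is to build the list locally and publish it only once it is complete. The following is a sketch under that assumption; `SegmentFileList` and the file extensions used are illustrative, and this is not the attached SegmentInfo.patch:

```csharp
using System.Collections.Generic;

// Hypothetical sketch; not the SegmentInfo.patch attached to the issue.
public sealed class SegmentFileList
{
    private readonly string name;

    // Only ever assigned a fully built list.
    private volatile List<string> cachedFiles;

    public SegmentFileList(string name)
    {
        this.name = name;
    }

    public IList<string> Files()
    {
        List<string> cached = cachedFiles;
        if (cached != null) return cached;

        // Build into a local list so no other thread can observe a half-filled cache.
        var local = new List<string>();
        local.Add(name + ".fnm");
        local.Add(name + ".frq");
        local.Add(name + ".prx");

        // A single reference assignment publishes the finished list.
        cachedFiles = local;
        return local;
    }
}
```

Two threads may both build the list on a cold cache, which costs a little duplicate work but never exposes a list that is still being modified.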
[Lucene.Net] [jira] [Resolved] (LUCENENET-430) Contrib.ChainedFilter
[ https://issues.apache.org/jira/browse/LUCENENET-430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Digy resolved LUCENENET-430.

Resolution: Fixed

Instead of creating a small project, I put it into Contrib.Analyzers.

Contrib.ChainedFilter
-
Key: LUCENENET-430
URL: https://issues.apache.org/jira/browse/LUCENENET-430
Project: Lucene.Net
Issue Type: New Feature
Affects Versions: Lucene.Net 2.9.4g
Reporter: Digy
Priority: Minor
Fix For: Lucene.Net 2.9.4g
Attachments: ChainedFilter.cs, ChainedFilterTest.cs

Port of lucene.Java 3.0.3's ChainedFilter and its test cases.

See the StackOverflow question: How to combine multiple filters within one search?
http://stackoverflow.com/questions/6570477/multiple-filters-in-lucene-net
[Lucene.Net] [jira] [Commented] (LUCENENET-418) LuceneTestCase should not have a static method could throw exceptions.
[ https://issues.apache.org/jira/browse/LUCENENET-418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13061125#comment-13061125 ]

Digy commented on LUCENENET-418:

It works! Thanks.

DIGY

LuceneTestCase should not have a static method could throw exceptions.
-
Key: LUCENENET-418
URL: https://issues.apache.org/jira/browse/LUCENENET-418
Project: Lucene.Net
Issue Type: Bug
Components: Lucene.Net Test
Affects Versions: Lucene.Net 3.x
Environment: Linux, OSX, etc
Reporter: michael herndon
Assignee: michael herndon
Labels: test
Fix For: Lucene.Net 2.9.4g
Original Estimate: 2m
Remaining Estimate: 2m

Throwing an exception from a static method in the base class for 90% of the tests makes the issue hard to debug in NUnit. The test results came back saying that TestFixtureSetup was causing an issue, even though it was the static constructor causing problems, and this then propagates to all the tests that stem from LuceneTestCase.

The TEMP_DIR needs to be moved to a static util class as a property or even a mixin method. This cost me hours to debug and figure out the real issue, as the underlying exception never bubbled up.
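The suggestion to move TEMP_DIR into a static utility property can be sketched as below. The class name `TestPaths`, the property name and the directory name are assumptions made for the example; the actual change made for this issue may look different:

```csharp
using System;
using System.IO;

// Hypothetical utility; the names TestPaths, TempDir and "lucene-net-tests" are assumptions.
public static class TestPaths
{
    private static readonly object Sync = new object();
    private static string tempDir;

    // Resolved on first use, so a failure is thrown from the property access
    // in the test that asked for the directory, not from a static constructor
    // of the shared base test class.
    public static string TempDir
    {
        get
        {
            lock (Sync)
            {
                if (tempDir == null)
                {
                    string path = Path.Combine(Path.GetTempPath(), "lucene-net-tests");
                    Directory.CreateDirectory(path);
                    tempDir = path;
                }
                return tempDir;
            }
        }
    }
}
```

Since the lookup happens inside an ordinary property getter, any I/O exception carries its own stack trace and is not wrapped in a type initialization failure that hides the original cause.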
[Lucene.Net] [jira] [Created] (LUCENENET-433) AttributeSource can have an invalid computed state (LUCENE-3042)
AttributeSource can have an invalid computed state (LUCENE-3042)

Key: LUCENENET-433
URL: https://issues.apache.org/jira/browse/LUCENENET-433
Project: Lucene.Net
Issue Type: Bug
Reporter: Digy
Fix For: Lucene.Net 2.9.4, Lucene.Net 2.9.4g

If you work a tokenstream, consume it, then reuse it and add an attribute to it, the computed state is wrong. Thus, for example, clearAttributes() will not actually clear the attribute added. So in some situations, addAttribute is not actually clearing the computed state when it should.

https://issues.apache.org/jira/browse/LUCENE-3042
[Lucene.Net] [jira] [Commented] (LUCENENET-433) AttributeSource can have an invalid computed state (LUCENE-3042)
[ https://issues.apache.org/jira/browse/LUCENENET-433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13061214#comment-13061214 ]

Digy commented on LUCENENET-433:

Here is the test case

{code}
[Test]
public void Test_LUCENE_3042_LUCENENET_433()
{
    String testString = "t";

    Analyzer analyzer = new Lucene.Net.Analysis.Standard.StandardAnalyzer();
    TokenStream stream = analyzer.ReusableTokenStream("dummy", new System.IO.StringReader(testString));
    stream.Reset();
    while (stream.IncrementToken())
    {
        // consume
    }
    stream.End();
    stream.Close();

    // Reusing the analyzer must still produce the single token "t".
    AssertAnalyzesToReuse(analyzer, testString, new String[] { "t" });
}
{code}
[Lucene.Net] [jira] [Resolved] (LUCENENET-172) This patch fixes the unexceptional exceptions ecountered in FastCharStream and SupportClass
[ https://issues.apache.org/jira/browse/LUCENENET-172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Digy resolved LUCENENET-172.

Resolution: Fixed
Assignee: Digy (was: Scott Lombard)

Fixed in 2.9.4g. No fix for 2.9.4

This patch fixes the unexceptional exceptions ecountered in FastCharStream and SupportClass
---
Key: LUCENENET-172
URL: https://issues.apache.org/jira/browse/LUCENENET-172
Project: Lucene.Net
Issue Type: Improvement
Components: Lucene.Net Core
Affects Versions: Lucene.Net 2.3.1, Lucene.Net 2.3.2
Reporter: Ben Martz
Assignee: Digy
Priority: Minor
Fix For: Lucene.Net 2.9.4, Lucene.Net 2.9.4g
Attachments: lucene_2.3.1_exceptions_fix.patch, lucene_2.9.4g_exceptions_fix

The java version of Lucene handles end-of-file in FastCharStream by throwing an exception. This behavior has been ported to .NET but the behavior carries an unacceptable cost in the .NET environment. This patch is based on the prior work in LUCENENET-8 and LUCENENET-11, which I gratefully acknowledge for the solution. While I understand that this patch is outside of the current project specification in that it deviates from the pure nature of the port, I believe that it is very important to make the patch available to any developer looking to leverage Lucene.Net in their project. Thanks for your consideration.
[Lucene.Net] [jira] [Commented] (LUCENENET-433) AttributeSource can have an invalid computed state (LUCENE-3042)
[ https://issues.apache.org/jira/browse/LUCENENET-433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13061304#comment-13061304 ]

Digy commented on LUCENENET-433:

Committed to 2.9.4g branch

Attachments: LUCENENET-433.patch
[Lucene.Net] [jira] [Commented] (LUCENENET-172) This patch fixes the unexceptional exceptions ecountered in FastCharStream and SupportClass
[ https://issues.apache.org/jira/browse/LUCENENET-172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13061595#comment-13061595 ]

Digy commented on LUCENENET-172:

Already fixed for 2.9.4g
[Lucene.Net] [jira] [Updated] (LUCENENET-172) This patch fixes the unexceptional exceptions ecountered in FastCharStream and SupportClass
[ https://issues.apache.org/jira/browse/LUCENENET-172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Digy updated LUCENENET-172:

Fix Version/s: Lucene.Net 2.9.4g

This patch fixes the unexceptional exceptions ecountered in FastCharStream and SupportClass

Key: LUCENENET-172
URL: https://issues.apache.org/jira/browse/LUCENENET-172
Project: Lucene.Net
Issue Type: Improvement
Components: Lucene.Net Core
Affects Versions: Lucene.Net 2.3.1, Lucene.Net 2.3.2
Reporter: Ben Martz
Assignee: Scott Lombard
Priority: Minor
Fix For: Lucene.Net 2.9.4, Lucene.Net 2.9.4g
Attachments: lucene_2.3.1_exceptions_fix.patch, lucene_2.9.4g_exceptions_fix

The Java version of Lucene handles end-of-file in FastCharStream by throwing an exception. This behavior has been ported to .NET, but it carries an unacceptable cost in the .NET environment. This patch is based on the prior work in LUCENENET-8 and LUCENENET-11, which I gratefully acknowledge for the solution. While I understand that this patch is outside of the current project specification in that it deviates from the pure nature of the port, I believe that it is very important to make the patch available to any developer looking to leverage Lucene.Net in their project. Thanks for your consideration.
[Lucene.Net] [jira] [Commented] (LUCENENET-431) Spatial.Net Cartesian won't find docs in radius in certain cases
[ https://issues.apache.org/jira/browse/LUCENENET-431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13060709#comment-13060709 ]

Digy commented on LUCENENET-431:

Thanks Olle and Matt, I committed the LUCENE-1930 patch to the 2.9.4g branch (+ added Olle's test case). (Another divergence from lucene.java, since this patch is still waiting to be applied.) DIGY

Spatial.Net Cartesian won't find docs in radius in certain cases

Key: LUCENENET-431
URL: https://issues.apache.org/jira/browse/LUCENENET-431
Project: Lucene.Net
Issue Type: Bug
Components: Lucene.Net Contrib
Affects Versions: Lucene.Net 2.9.4
Environment: Windows 7 x64
Reporter: Olle Jacobsen
Labels: spatialsearch

To replicate, change Lucene.Net.Contrib.Spatial.Test.TestCartesian to the following, which should return 3 results.

Line 42: private double _lat = 55.6880508001;
43: private double _lng = 13.5871808352; // This passes: 13.6271808352
73: AddPoint(writer, "Within radius", 55.6880508001, 13.5717346673);
74: AddPoint(writer, "Within radius", 55.6821978456, 13.6076183965);
75: AddPoint(writer, "Within radius", 55.673251569, 13.5946697607);
76: AddPoint(writer, "Close but not in radius", 55.8634157297, 13.5497731987);
77: AddPoint(writer, "Faar away", 40.7137578228, -74.0126901936);
130: const double miles = 5.0;
156: Console.WriteLine("Distances should be 3 " + distances.Count);
157: Console.WriteLine("Results should be 3 " + results);
159: Assert.AreEqual(3, distances.Count); // fixed a store of only needed distances
160: Assert.AreEqual(3, results);
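As a sanity check on the expected count of 3, the coordinates from the report can be run through a plain great-circle (haversine) distance. This is only an independent cross-check under stated assumptions: it is not the Cartesian tier code in Spatial.Net, RadiusSketch is a hypothetical name, and the Earth radius constant is an assumed mean value.

```csharp
using System;

public static class RadiusSketch
{
    // Assumed mean Earth radius in miles; not taken from Spatial.Net.
    private const double EarthRadiusMiles = 3958.8;

    private static double ToRadians(double degrees)
    {
        return degrees * Math.PI / 180.0;
    }

    // Haversine great-circle distance between two lat/lng points, in miles.
    public static double DistanceMiles(double lat1, double lng1,
                                       double lat2, double lng2)
    {
        double dLat = ToRadians(lat2 - lat1);
        double dLng = ToRadians(lng2 - lng1);
        double a = Math.Sin(dLat / 2) * Math.Sin(dLat / 2)
                 + Math.Cos(ToRadians(lat1)) * Math.Cos(ToRadians(lat2))
                 * Math.Sin(dLng / 2) * Math.Sin(dLng / 2);
        return 2 * EarthRadiusMiles * Math.Asin(Math.Sqrt(a));
    }

    // Counts how many of the five points from the report lie within the radius.
    public static int CountWithin(double miles)
    {
        double lat = 55.6880508001, lng = 13.5871808352; // search centre (lines 42-43)
        double[,] points =
        {
            { 55.6880508001, 13.5717346673 },  // "Within radius"
            { 55.6821978456, 13.6076183965 },  // "Within radius"
            { 55.673251569, 13.5946697607 },   // "Within radius"
            { 55.8634157297, 13.5497731987 },  // "Close but not in radius"
            { 40.7137578228, -74.0126901936 }  // "Faar away"
        };
        int count = 0;
        for (int i = 0; i < points.GetLength(0); i++)
        {
            if (DistanceMiles(lat, lng, points[i, 0], points[i, 1]) <= miles)
                count++;
        }
        return count;
    }
}
```

Under these assumptions, RadiusSketch.CountWithin(5.0) yields 3, which agrees with the assertions on lines 159 and 160 of the modified test.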
[Lucene.Net] [jira] [Resolved] (LUCENENET-431) Spatial.Net Cartesian won't find docs in radius in certain cases
[ https://issues.apache.org/jira/browse/LUCENENET-431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Digy resolved LUCENENET-431.

Resolution: Fixed
Fix Version/s: Lucene.Net 2.9.4g
Assignee: Digy

Spatial.Net Cartesian won't find docs in radius in certain cases

Key: LUCENENET-431
URL: https://issues.apache.org/jira/browse/LUCENENET-431
Project: Lucene.Net
Issue Type: Bug
Components: Lucene.Net Contrib
Affects Versions: Lucene.Net 2.9.4
Environment: Windows 7 x64
Reporter: Olle Jacobsen
Assignee: Digy
Labels: spatialsearch
Fix For: Lucene.Net 2.9.4g

To replicate, change Lucene.Net.Contrib.Spatial.Test.TestCartesian to the following, which should return 3 results.

Line 42: private double _lat = 55.6880508001;
43: private double _lng = 13.5871808352; // This passes: 13.6271808352
73: AddPoint(writer, "Within radius", 55.6880508001, 13.5717346673);
74: AddPoint(writer, "Within radius", 55.6821978456, 13.6076183965);
75: AddPoint(writer, "Within radius", 55.673251569, 13.5946697607);
76: AddPoint(writer, "Close but not in radius", 55.8634157297, 13.5497731987);
77: AddPoint(writer, "Faar away", 40.7137578228, -74.0126901936);
130: const double miles = 5.0;
156: Console.WriteLine("Distances should be 3 " + distances.Count);
157: Console.WriteLine("Results should be 3 " + results);
159: Assert.AreEqual(3, distances.Count); // fixed a store of only needed distances
160: Assert.AreEqual(3, results);
[Lucene.Net] [jira] [Commented] (LUCENENET-418) LuceneTestCase should not have a static method could throw exceptions.
[ https://issues.apache.org/jira/browse/LUCENENET-418?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13059955#comment-13059955 ]

Digy commented on LUCENENET-418:

It fails in both builds.

LuceneTestCase should not have a static method could throw exceptions.

Key: LUCENENET-418
URL: https://issues.apache.org/jira/browse/LUCENENET-418
Project: Lucene.Net
Issue Type: Bug
Components: Lucene.Net Test
Affects Versions: Lucene.Net 3.x
Environment: Linux, OSX, etc
Reporter: michael herndon
Assignee: michael herndon
Labels: test
Fix For: Lucene.Net 2.9.4g
Original Estimate: 2m
Remaining Estimate: 2m

Throwing an exception from a static method in the base class for 90% of the tests makes it hard to debug the issue in NUnit. The test results came back saying that TestFixtureSetup was causing an issue even though it was the static constructor causing problems, and this then propagates to all the tests that stem from LuceneTestCase. The TEMP_DIR needs to be moved to a static util class as a property or even a mixin method. This took me hours to debug and figure out the real issue, as the underlying exception never bubbled up.
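The suggestion to move TEMP_DIR into a static utility property could look roughly like the sketch below. TestPaths, TempDir and InTempDir are hypothetical names, not the Paths class that was later committed; the fallback to Path.GetTempPath() is an assumption. The point is only that the lookup runs when a test asks for the path, so any failure surfaces inside that test rather than inside a type initializer shared by every fixture.

```csharp
using System;
using System.IO;

// Hypothetical helper; resolved on demand instead of in a static constructor.
public static class TestPaths
{
    // Looks up the TEMP environment variable each time it is read and
    // falls back to the platform temp path (assumed behaviour for the sketch).
    public static string TempDir
    {
        get
        {
            string dir = Environment.GetEnvironmentVariable("TEMP");
            if (string.IsNullOrEmpty(dir))
                dir = Path.GetTempPath();
            if (string.IsNullOrEmpty(dir))
                throw new InvalidOperationException(
                    "No temp directory: set TEMP or check the platform temp path.");
            return dir;
        }
    }

    // Builds a path under the temp directory without hard-coded separators.
    public static string InTempDir(string relativePath)
    {
        return Path.Combine(TempDir, relativePath);
    }
}
```

A test would then call TestPaths.InTempDir("testIndex") where it needs the directory, and a missing or unreadable location would be reported against that test with a readable message.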
[Lucene.Net] [jira] [Updated] (LUCENENET-430) Contrib.ChainedFilter
[ https://issues.apache.org/jira/browse/LUCENENET-430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Digy updated LUCENENET-430:

Attachment: ChainedFilterTest.cs
            ChainedFilter.cs

Contrib.ChainedFilter

Key: LUCENENET-430
URL: https://issues.apache.org/jira/browse/LUCENENET-430
Project: Lucene.Net
Issue Type: New Feature
Affects Versions: Lucene.Net 2.9.4g
Reporter: Digy
Priority: Minor
Fix For: Lucene.Net 2.9.4g
Attachments: ChainedFilter.cs, ChainedFilterTest.cs

Port of lucene.Java 3.0.3's ChainedFilter and its test cases. See the StackOverflow question: How to combine multiple filters within one search? http://stackoverflow.com/questions/6570477/multiple-filters-in-lucene-net
[Lucene.Net] [jira] [Created] (LUCENENET-430) Contrib.ChainedFilter
Contrib.ChainedFilter

Key: LUCENENET-430
URL: https://issues.apache.org/jira/browse/LUCENENET-430
Project: Lucene.Net
Issue Type: New Feature
Affects Versions: Lucene.Net 2.9.4g
Reporter: Digy
Priority: Minor
Fix For: Lucene.Net 2.9.4g
Attachments: ChainedFilter.cs, ChainedFilterTest.cs

Port of lucene.Java 3.0.3's ChainedFilter and its test cases. See the StackOverflow question: How to combine multiple filters within one search? http://stackoverflow.com/questions/6570477/multiple-filters-in-lucene-net
RE: [Lucene.Net] Is a Lucene.Net Line-by-Line Jave port needed?
> I've add a Paths Class under the Lucene.Net.Tests Util folder

Since it is Lucene.Net specific code, may Support be a better place? DIGY

-----Original Message-----
From: Michael Herndon [mailto:mhern...@wickedsoftware.net]
Sent: Friday, July 01, 2011 11:53 PM
To: lucene-net-dev@lucene.apache.org
Subject: Re: [Lucene.Net] Is a Lucene.Net Line-by-Line Jave port needed?

@Rory, Jira in the past had the ability to create sub tasks or convert a current task to a sub task. I'm guessing that Apache's Jira should be able to do that.

@All, I've added a Paths class under the Lucene.Net.Tests Util folder (feel free to rename or refactor as long as the tests still work) to help with working with paths in general. This should help with forming paths relative to the temp directory or the project root. NUnit's Shadow Copy tends to throw people off when trying to get the path of the current assembly being tested to build up a relative path; the Paths class should help with that.

- Michael

On Fri, Jul 1, 2011 at 4:09 PM, Rory Plaire codekai...@gmail.com wrote:

My thinking is just a separate ticket for each one. This makes the work easier to manage and gives a better sense about how much work is left, as well as makes it easier to prioritize independent issues. We could link all the sub-issues to a single task / feature / whatever (that is, if JIRA has that capability). -r

On Fri, Jul 1, 2011 at 12:48 PM, Michael Herndon mhern...@wickedsoftware.net wrote:

I think whatever makes sense to do. Possibly create one jira for now with a running list that can be modified, and possibly, as people pull from that list, cross things off or create a separate ticket that links back to the main one.

On Fri, Jul 1, 2011 at 3:35 PM, Rory Plaire codekai...@gmail.com wrote:

@Michael - Should that list be in JIRA? It would be easier to manage, I think... If yes, I'll happily do it. -r

On Fri, Jul 1, 2011 at 8:04 AM, Michael Herndon mhern...@wickedsoftware.net wrote:

* need to document what the build script does. whut grammerz?

On Fri, Jul 1, 2011 at 10:52 AM, Michael Herndon mhern...@wickedsoftware.net wrote:

@Rory, @All, The only tickets I currently have for those are LUCENENET-419 and LUCENENET-418. 418, I should be able to push into the 2.9.4g branch tonight. 419 is a long term goal and not as important as getting the tests fixed, or having the tests broken down into what is actually a unit test, functional test, perf or long running test. I can get into more of why it needs to be done. I'll also need to make document the what build script currently does on the wiki and make a few notes about testing, like using the RAMDirectory, etc.

Things that need to get done or even be discussed.

* There needs to be a running list of things to do/not to do with testing. I don't know if this goes in a jira or whether we keep a running list on the wiki or site for people to pick up and help with.
* Tests need to run on mono and not fail (there is a good deal of failing tests on mono, mostly due to the temp directory having C:\ in the path).
* Assert.Throws<ExceptionType>() needs to be used instead of Try/Catch Assert.Fail. **
* File path combines to the temp directory need helper methods, e.g. having this in a hundred places is bad: new System.IO.FileInfo(System.IO.Path.Combine(Support.AppSettings.Get("tempDir", ""), "testIndex"));
* We should still be testing deprecated methods, but we need to use #pragma warning disable/restore 0618 for testing those. Otherwise compiler warnings are too numerous to be anywhere near helpful.
* We should only be using deprecated methods in places where they are being explicitly tested; other tests that need that functionality in order to validate those tests should be refactored to use methods that are not deprecated.
* Identify code that could be abstracted into test utility classes.
* Infrastructure validation tests need to be made for anything that seems like infrastructure, e.g. does the temp directory exist, do the folders that the tests use inside the temp directory exist, can we read/write to those folders. (If a ton of tests fail due to the file system, we should be able to point out that it was due to permissions or missing folders, files, etc.)
* Identify what classes need an interface, abstract class or inherited class in order to create testing mocks. (Once those classes are created, they should be documented in the wiki.)

** Assert.Throws needs to replace stuff like the following. We should also be checking the messages for exceptions and make sure they make sense
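Two of the items in the list above (asserting on thrown exceptions, and silencing warning 0618 only where a deprecated member is tested on purpose) can be sketched together. To keep the sketch free of a test-framework dependency, Throws<T> below is a hypothetical stand-in that mimics the shape of NUnit's Assert.Throws<T>; OldParse and ParsePositive are likewise made-up members. Note that the C# directive pair is #pragma warning disable / #pragma warning restore.

```csharp
using System;

public static class ThrowsSketch
{
    // Hypothetical stand-in for NUnit's Assert.Throws<T>: runs the action,
    // returns the expected exception, and fails loudly if nothing was thrown.
    public static T Throws<T>(Action action) where T : Exception
    {
        try
        {
            action();
        }
        catch (T expected)
        {
            return expected;
        }
        throw new InvalidOperationException(
            "Expected " + typeof(T).Name + " but no exception was thrown.");
    }

    // Made-up deprecated member that a test still wants to cover.
    [Obsolete("Use ParsePositive instead.")]
    public static int OldParse(string s)
    {
        return ParsePositive(s);
    }

    public static int ParsePositive(string s)
    {
        int value = int.Parse(s);
        if (value <= 0)
            throw new ArgumentOutOfRangeException("s", "value must be positive");
        return value;
    }

    // Exercises the deprecated member and returns the exception message so
    // the caller can check that the message makes sense.
    public static string DeprecatedCallMessage()
    {
#pragma warning disable 0618 // deprecated member is the thing under test
        ArgumentOutOfRangeException ex =
            Throws<ArgumentOutOfRangeException>(() => OldParse("-1"));
#pragma warning restore 0618
        return ex.Message;
    }
}
```

Compared with a try/catch followed by Assert.Fail, this keeps the expected exception type in one place and hands back the exception, so the message check asked for in the last list item is a one-line follow-up.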
RE: [Lucene.Net] Is a Lucene.Net Line-by-Line Jave port needed?
Troy, What do you mean by merging? 2.9.4g is a superset of 2.9.4 and has

* bug fixes like LUCENENET-414
* new features like LUCENENET-429, MemoryMappedDirectory (although not used yet), PartiallyTrustedAppDomain tests
* improvements like LUCENENET-427, LUCENENET-418, refactoring of SupportClass
* API changes like
  - StopAnalyzer(List<string> stopWords)
  - Query.ExtractTerms(ICollection<string>)
  - TopDocs.TotalHits, TopDocs.ScoreDocs
* readability changes like replacing some abstract methods with Func, while(XXX.MoveNext())s with foreachs, etc.

Is it something like creating a 2.9.4 tag and replacing trunk with 2.9.4g? DIGY

-----Original Message-----
From: Troy Howard [mailto:thowar...@gmail.com]
Sent: Friday, July 01, 2011 12:36 AM
To: lucene-net-dev@lucene.apache.org
Subject: Re: [Lucene.Net] Is a Lucene.Net Line-by-Line Jave port needed?

DIGY - Re: Why do I wait.. That's mostly because I intend to make some deep changes, which would make merging the 2.9.4g branch back to trunk difficult. So, it's easier to merge those changes first. Also, I won't have enough time to make my changes until a little way in the future, but probably do have the time to put together another release, so I'll do that first because it fits with my work/life schedule.

Thanks, Troy
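The "while(XXX.MoveNext())s with foreachs" item in DIGY's list of 2.9.4g changes refers to a mechanical rewrite that does not change behaviour. The sketch below shows the two equivalent forms on a plain IEnumerable<int>; EnumerationSketch is a hypothetical name and the code is not taken from the 2.9.4g branch.

```csharp
using System.Collections.Generic;

public static class EnumerationSketch
{
    // Style carried over from the Java port: drive the enumerator by hand.
    public static int SumWithWhile(IEnumerable<int> values)
    {
        int sum = 0;
        using (IEnumerator<int> e = values.GetEnumerator())
        {
            while (e.MoveNext())
                sum += e.Current;
        }
        return sum;
    }

    // Same traversal with foreach; the compiler generates the MoveNext loop
    // and the enumerator disposal.
    public static int SumWithForeach(IEnumerable<int> values)
    {
        int sum = 0;
        foreach (int value in values)
            sum += value;
        return sum;
    }
}
```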
RE: [Lucene.Net] Is a Lucene.Net Line-by-Line Jave port needed?
OK. Maybe I asked the wrong question. Suppose I committed IsolatedStorageDirectory only to trunk. How would you merge this branch and trunk? DIGY.

-----Original Message-----
From: Troy Howard [mailto:thowar...@gmail.com]
Sent: Sunday, July 03, 2011 12:28 AM
To: lucene-net-dev@lucene.apache.org
Subject: RE: [Lucene.Net] Is a Lucene.Net Line-by-Line Jave port needed?

Yes. But if there are commits to trunk before that happens it's a merge. -T
RE: [Lucene.Net] Is a Lucene.Net Line-by-Line Jave port needed?
Michael, You interpret the report as "whoever commits code wins"? But when I look at it, I see a lot of talk, no work. The .Net community is not interested in contributing. I really don't understand what hinders people from working on Lucene.Net. As I did for 2.9.4g, grab the code, do whatever you want on it and submit it back. If it doesn't fit the project's direction it can still find a place in contrib or in a branch. All of the approaches can live side by side happily in the Lucene.Net repository.

Troy, I also don't understand why you wait for 2.9.4g. It is a *branch* and has nothing to do with the trunk. It need not be an official release and can live in a branch as a PoC.

As a result, I got bored of listening to "this should be done that way". What I want to see is "I did it that way, should we continue with this". DIGY

-----Original Message-----
From: Troy Howard [mailto:thowar...@gmail.com]
Sent: Thursday, June 30, 2011 10:47 PM
To: lucene-net-dev@lucene.apache.org
Subject: Re: [Lucene.Net] Is a Lucene.Net Line-by-Line Jave port needed?

Michael, I agree with everything you said. My point in saying "whoever commits code wins" was to illustrate the reality of how and why the project has the current form. Building consensus is difficult. It is an essential first step before we can do something like make a list of bite-sized pieces of work that others can work on. This is why my real message of "Let's find a way to accommodate both" is so important. It allows us to build consensus, so that we can settle on a direction and structure our work. Until we accomplish that, it really is "whoever commits code wins", and that is an unhealthy and unmaintainable way to operate.

From a technical perspective, your statements about the unit tests are completely accurate. They really need a LOT of reworking. That's the very first step before making any significant changes. Part of the problem is that the tests themselves are not well written. The other part is that the Lucene object model was not designed for testability, and it makes writing good tests more difficult, and certain tests might not be possible. It will be difficult to write good unit tests without re-structuring. The biggest issue is the use of abstract classes with base behaviour vs interfaces or fully abstracted classes. Makes mocking tough. This is the direction I was going when I started the Lucere project. I'd like to start in on that work after the 2.9.4g release.

Thanks, Troy

On Thu, Jun 30, 2011 at 12:04 PM, Michael Herndon mhern...@wickedsoftware.net wrote:

I'd say that is all the more reason that we need to work smarter and not harder. I'd also say that's a good reason to make sure we build consensus rather than just saying whoever commits code wins. And it's a damn good reason to focus on the effort of growing the number of contributors and lowering the barrier to submitting patches, breaking things down into pieces that people would feel confident to work on without being overwhelmed by the complexity of Lucene.Net.

There is a pretty big gap between Lucene 2.9.x and Lucene 4.0, and the internals and index formats are significantly different, including nixing the current vint file format and using byte[] array slices for Terms instead of char[]. So while porting 1 to 1 will require less knowledge or thought, it's most likely going to require more hours of work. And it's definitely not going to guarantee the stability of the code or that it's great code.

I'd have to say that it's not as stable as most would believe at the moment. Most of the tests avoid anything that remotely looks like it knows about the DRY principle, and there is a static constructor in the core test case that throws an exception if it doesn't find an environment variable TEMP, which will fail 90% of the tests, and NUnit will be unable to give you a clear reason why. Just to name a few issues I came across working towards getting Lucene.Net into CI. I haven't even started really digging in under the covers of the code yet.

So my suggestion is to chew on this a bit more and build consensus, avoid fracturing people into sides. Be open to reservations and concerns that others have and continue to address them.

- Michael

On Thu, Jun 30, 2011 at 2:10 PM, Digy digyd...@gmail.com wrote:

Although there are a lot of people using Lucene.Net, this is our contribution report for the past 5 years.

https://issues.apache.org/jira/secure/ConfigureReport.jspa?atl_token=A5KQ-2QAV-T4JA-FDED|3204f7e696067a386144705075c074e991db2a2b|lin&versionId=-1&issueStatus=all&selectedProjectId=12310290&reportKey=com.sourcelabs.jira.plugin.report.contributions%3Acontributionreport&Next=Next

DIGY

-----Original Message-----
From: Ayende Rahien [mailto:aye...@ayende.com]
Sent: Thursday, June 30, 2011 8:16 PM
To: Rory Plaire; lucene-net-dev@lucene.apache.org
Subject: RE: [Lucene.Net] Is a Lucene.Net Line-by-Line Jave port needed
RE: [Lucene.Net] Is a Lucene.Net Line-by-Line Jave port needed?
> A) I don't to want to commit anything thats going to piss alot of people off, B) I don't want to spend time/waste time on modifications that are going to be rejected.

What I've learnt from the Apache Way is creating a JIRA issue if you are hesitant. If no one answers in a reasonable time (mostly), then commit. DIGY

-----Original Message-----
From: Michael Herndon [mailto:mhern...@wickedsoftware.net]
Sent: Thursday, June 30, 2011 11:58 PM
To: lucene-net-dev@lucene.apache.org
Subject: Re: [Lucene.Net] Is a Lucene.Net Line-by-Line Jave port needed?

@Troy, I've already started working towards fixing unit testing issues, and prototyping some things that sure DRY up the testing just so that I can get the tests running on mono. Those changes are currently in a private git repo; however, since we don't have a CI, I need to make some time to manually test those on at least 3 different OSes (Windows, OSX, and Ubuntu) before putting those back into the 2.9.4g branch. The reason being I need those in working order so that I can do a write up on pulling those from source and at least running the build script to compile everything and run the tests for you. I don't know about everyone else, but that's a starting point I look for when I go to work on something or commit something back. They should make their way back sometime this month.

I think the next thing I'll do is put my money where my mouth is, spend time breaking down the rest of the CI tasks, then seeing how much stuff I can get documented into the wiki. The simple faceted search is a decent starting template.

@Digy I agree with the "talk, no work". Though coming from the outside in, I still cringe when I make any commits at the moment (even that little .gitignore file).

A) I don't want to commit anything that's going to piss a lot of people off,
B) I don't want to spend time/waste time on modifications that are going to be rejected,
C) it took a good deal of going through things before I felt comfortable even making a commit,
D) yes, I know I just need to get over it and so does everyone else (hence the obsession with the unit tests at the moment).

And I think a key to rallying people to get over it, including myself, is to make the point you had more clear across the board: *If it doesn't fit to the project's direction it can still find a place in contrib or in branch. All of the approaches can live side by side happily in the Lucene.Net repository.* +1, because that makes me feel there is more leeway to experiment and any decent effort will at least go somewhere to live and not be wasted.
RE: [Lucene.Net] Is a Lucene.Net Line-by-Line Jave port needed?
I cannot say I like this approach, but until we find an automated way (with good results), it seems to be the only way we can use.

DIGY

-----Original Message-----
From: Troy Howard [mailto:thowar...@gmail.com]
Sent: Friday, July 01, 2011 12:43 AM
To: lucene-net-dev@lucene.apache.org
Subject: Re: [Lucene.Net] Is a Lucene.Net Line-by-Line Jave port needed?

Scott -

The idea of the automated port is still worth doing. Perhaps it makes sense for someone more passionate about the line-by-line idea to do that work? I would say, focus on what makes sense to you. Being productive, regardless of the specific direction, is what will be most valuable. Once you start, others will join and momentum will build. That is how these things work.

I like DIGY's approach too, but the problem with it is that it is a never-ending manual task. The theory behind the automated port is that it may reduce the manual work. It is complicated, but once it's built and works, it will save a lot of future development hours. If it's built in a sufficiently general manner, it could be useful for other projects like Lucene.Net that want to automate a port from Java to C#. It might make sense for that to be a separate project from Lucene.Net though.

-T

On Thu, Jun 30, 2011 at 2:13 PM, Scott Lombard lombardena...@gmail.com wrote:

Ok, I think I asked the wrong question. I am trying to figure out where to put my time. I was thinking about working on the automated porting system, but when I saw the response to the .NET 4.0 discussions I started to question whether that is the right direction. The community seemed to be more interested in the .NET features. The complexity of the automated tool is going to become very high, and it will probably end up producing a line-for-line style port. So I keep asking myself whether the automated tool is worth it. I don't think it is. I like the method Digy has been using for porting the code. So I guess for me the real question is: Digy, where do you see 2.9.4g going next, and what do you need help with?

Scott -Original Message- From: Digy [mailto:digyd...@gmail.com] Sent: Thursday, June 30, 2011 4:20 PM To: lucene-net-dev@lucene.apache.org Subject: RE: [Lucene.Net] Is a Lucene.Net Line-by-Line Jave port needed? Michael, You interpret the report as whoever commits code wins? But when I look at it, I see a lof of talk, no work. .Net community is not interested in contributing. I really don't understand what hinders people to work on Lucene.Net. As I did for 2.9.4g, grab the code, do whatever you want on it and submit back. If it doesn't fit to the project's direction it can still find a place in contrib or in branch. All of the approaches can live side by side happily in the Lucene.Net repository. Troy, I also don't understand why do you wait for 2.9.4g? It is a *branch* and has nothing to do with the trunk. It need not be an offical release and can live in branch as a PoC. As a result, I got bored to listen to this should be done that way. What I want to see is I did it that way, should we continue with this. DIGY -Original Message- From: Troy Howard [mailto:thowar...@gmail.com] Sent: Thursday, June 30, 2011 10:47 PM To: lucene-net-dev@lucene.apache.org Subject: Re: [Lucene.Net] Is a Lucene.Net Line-by-Line Jave port needed? Michael, I agree with everything you said. My point in saying whoever commits code wins was to illustrate the reality of how and why the project has the current form. Building consensus is difficult. It is an essential first step before we can do something like make a list of bit-sized pieces of work that others can work on. This is why my real message of Let's find a way to accommodate both is so important. It allows us to build consensus, so that we can settle on a direction and structure our work. Until we accomplish that, it really is whoever commits code wins, and that is an unhealthy and unmaintainable way to operate. From a technical perspective, your statements about the unit tests are completely accurate. 
They really need a LOT of reworking. That's the very first step before making any significant changes. Part of the problem is that the tests themselves are not well written. The other part is that the Lucene object model was not designed for testability, and it makes writing good tests more difficult, and certain tests might not be possible. It will be difficult to write good unit tests without re-structuring. The biggest issue is the use of abstract classes with base behaviour vs interfaces or fully abstracted classes. Makes mocking tough. This is the direction I was going when I started the Lucere project. I'd like to start in on that work after the 2.9.4g release. Thanks, Troy On Thu, Jun 30, 2011 at 12:04 PM, Michael Herndon mhern...@wickedsoftware.net wrote: I'd say that is all the more
RE: [Lucene.Net] Is a Lucene.Net Line-by-Line Jave port needed?
Although there are a lot of people using Lucene.Net, this is our contribution report for the past 5 years.

https://issues.apache.org/jira/secure/ConfigureReport.jspa?atl_token=A5KQ-2QAV-T4JA-FDED|3204f7e696067a386144705075c074e991db2a2b|lin&versionId=-1&issueStatus=all&selectedProjectId=12310290&reportKey=com.sourcelabs.jira.plugin.report.contributions%3Acontributionreport&Next=Next

DIGY

-----Original Message-----
From: Ayende Rahien [mailto:aye...@ayende.com]
Sent: Thursday, June 30, 2011 8:16 PM
To: Rory Plaire; lucene-net-...@lucene.apache.org
Subject: RE: [Lucene.Net] Is a Lucene.Net Line-by-Line Jave port needed?

As someone from the NHibernate project: we stopped following Hibernate a while ago, and haven't regretted it. We have more features, fewer bugs and a better code base.

Sent from my Windows Phone

From: Rory Plaire
Sent: Thursday, June 30, 2011 19:58
To: lucene-net-...@lucene.apache.org
Subject: Re: [Lucene.Net] Is a Lucene.Net Line-by-Line Jave port needed?

I don't want to drag this out much longer, but I am curious about the people who hold the line-by-line sentiment - are you NHibernate users?

-r

On Thu, Jun 30, 2011 at 2:39 AM, Noel Lysaght lysag...@hotmail.com wrote:

Can I just plug in my bit and say I agree 100% with what Moray has outlined below. If we move away from the line-by-line port, then over time we'll lose out on the momentum that is Lucene and the improvements that they make.

Only if the Lucene.NET community has expertise in search and a deep knowledge of the project, and can guarantee that this knowledge will survive members coming and going, should such a consideration be given. Only when Lucene.NET has stood on its feet for a number of years after it has moved out of Apache incubation should consideration be given to abandoning a line-by-line port.
By all means extend and wrap the libraries in .NET equivalents and .NET goodness like LINQ (we do this internally in our company at the moment); but leave the core of the project on a line-by-line port.

Just my tuppence worth.

Kind Regards
Noel

-----Original Message-----
From: Moray McConnachie
Sent: Thursday, June 30, 2011 10:25 AM
To: lucene-net-u...@lucene.apache.org
Cc: lucene-net-...@incubator.apache.org
Subject: RE: [Lucene.Net] Is a Lucene.Net Line-by-Line Jave port needed?

I don't think I'm as hard core on this as Neal, but remember: the history of the Lucene.NET project is that all the intellectual work, all the understanding of search, all the new features come from the Lucene Java folks. Theirs is an immensely respected project, and I trust them to add new features that will be well-tested and well-researched, and to have a decent roadmap which I can trust they will execute on.

Now I know there's been an influx of capable developers to Lucene.NET who are ready, willing and (I'm going to assume) able to add a lot more value in a generic .NET implementation as they change it. But it'll take a while before I trust a .NET dedicated framework which has significantly diverged from Java in the way I do the line-by-line version. And at what stage is it not just not a line-by-line port, but not a port at all?

At the same time, I recognise that if this project is going to continue, and attract good developers, it has to change in this direction.

So that said, I can see why a line-by-line port might not be sustainable. And most people don't need it. But most of us using Lucene in production systems do need a system that we can trust and rely on. So let me chime in with someone else's plea, to keep the general structure close to Lucene, to keep the same general objects and inheritance set-up, and to keep the same method names, even if you add other methods and classes to provide additional functionality. ABSOLUTELY the same file formats.

End users benefit a lot from a high degree of similarity, with good documentation and help being available from the Java community.

Yours,
Moray

Moray McConnachie
Director of IT    +44 1865 261 600
Oxford Analytica  http://www.oxan.com

-----Original Message-----
From: Granroth, Neal V. [mailto:neal.granr...@thermofisher.com]
Sent: 29 June 2011 20:47
To: lucene-net-u...@lucene.apache.org
Cc: lucene-net-...@incubator.apache.org
Subject: RE: [Lucene.Net] Is a Lucene.Net Line-by-Line Jave port needed?

This has been discussed many times. Lucene.NET is not valid, the code cannot be trusted, if it is not a line-by-line port. It ceases to be Lucene.

- Neal

-----Original Message-----
From: Scott Lombard [mailto:lombardena...@gmail.com]
Sent: Wednesday, June 29, 2011 1:58 PM
To: lucene-net-...@lucene.apache.org; lucene-net-user@lucene.apache
RE: [Lucene.Net] Is a Lucene.Net Line-by-Line Jave port needed?
Hi Rory,

I agree with you in theory. But gathering people to work on a project is not as easy as giving advice. Until now, a line-by-line port has seemed to be the best option with limited human resources.

Would you be willing to work on your approach and maintain newer Lucene.Net releases?

DIGY

-----Original Message-----
From: Rory Plaire [mailto:codekai...@gmail.com]
Sent: Thursday, June 30, 2011 1:06 AM
To: lucene-net-dev@lucene.apache.org
Subject: Re: [Lucene.Net] Is a Lucene.Net Line-by-Line Jave port needed?

For what it's worth, I've participated in a number of projects which have been ported from Java to .Net with varying levels of translation into the native style and functionality of the .Net framework. The largest are NTS, a JTS port, and NHibernate, a Java Hibernate port.

My experience is that a line-by-line port isn't as valuable as people would imagine. Even if we discount the reality that a line-by-line port is really unachievable due to various differences between the frameworks, keeping even identical code in sync will always take some work: full automation on a project this large is infeasible. During manual effort, therefore, making readable changes to the code is really not that much more work.

For update maintenance, porting over code from recent versions of both projects to the .Net versions, and .Nettifying that code, is little trouble. Since both projects use source control, it's easy to see the changes and translate them.

When it comes to debugging issues in NTS or NHibernate, I go to the Java sources, and even if the classes were largely rewritten to take advantage of IEnumerable or generics or structures, running unit tests, tracing the code, and seeing the output of each has always been straightforward.

Since I'm using .Net, I'd want the Lucene.Net project to be more .Net than a line-by-line port of Java, in order to take advantage of the Framework as well as provide a better code base for .Net developers to maintain. If large .Net projects ported from Java do this, and have found considerable success, it is, in my view, a well-proven practice and shouldn't be avoided due to uncertainty about how the resulting code should work. Ultimately, that is what unit tests are for, anyway.
RE: [Lucene.Net] Is a Lucene.Net Line-by-Line Jave port needed?
As a Lucene.Net user I wouldn't care whether it is a line-by-line port or not. But as a contributor, I would prefer parallel code that makes life easier for manual ports of new releases (until this process is automated).

PS: I presume no one is thinking of functional or index-level incompatibility.

DIGY

-----Original Message-----
From: Granroth, Neal V. [mailto:neal.granr...@thermofisher.com]
Sent: Wednesday, June 29, 2011 10:47 PM
To: lucene-net-u...@lucene.apache.org
Cc: lucene-net-...@incubator.apache.org
Subject: RE: [Lucene.Net] Is a Lucene.Net Line-by-Line Jave port needed?

This has been discussed many times. Lucene.NET is not valid, the code cannot be trusted, if it is not a line-by-line port. It ceases to be Lucene.

- Neal

-----Original Message-----
From: Scott Lombard [mailto:lombardena...@gmail.com]
Sent: Wednesday, June 29, 2011 1:58 PM
To: lucene-net-...@lucene.apache.org; lucene-net-u...@lucene.apache.org
Subject: [Lucene.Net] Is a Lucene.Net Line-by-Line Jave port needed?

After the large community response about moving the code base from .Net 2.0 to .Net 4.0, I am trying to figure out what the need is for a line-by-line port. Starting with Digy's excellent work on the conversion to generics, a priority of the 2.9.4g release is that the two packages would not be interchangeable. So faster turnaround from a Java release won't matter to non-line-by-line users; they will have to wait until the updates are made to the non-line-by-line code base.

My question is: is there really a user base for the line-by-line port?

Anyone have a comment?

Scott
[Lucene.Net] [jira] [Closed] (LUCENENET-428) How to do that the results are displayed in the first original tokens and them with synonyms?
[ https://issues.apache.org/jira/browse/LUCENENET-428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Digy closed LUCENENET-428.

Resolution: Invalid

Please post questions to the mailing list, not in JIRA.

How to do that the results are displayed in the first original tokens and them with synonyms?

Key: LUCENENET-428
URL: https://issues.apache.org/jira/browse/LUCENENET-428
Project: Lucene.Net
Issue Type: Wish
Components: Lucene.Net Core
Affects Versions: Lucene.Net 2.9.4
Environment: .net 4.0
Reporter: Vladimir

How can I make the results show matches on the original tokens first, and then matches on the synonyms?

My Analyzer (part):

    public override TokenStream TokenStream(string fieldName, TextReader reader)
    {
        TokenStream result = new StandardTokenizer(reader);
        result = new LowerCaseFilter(result);
        result = new StopFilter(result, stoptable);
        result = new SynonymFilter(result, synonymEngine);
        result = new ExtendedRussianStemFilter(result, charset);
        return result;
    }

My SynonymFilter:

    internal class SynonymFilter : TokenFilter
    {
        private readonly ISynonymEngine engine;
        // Synonyms waiting to be emitted after the token they belong to.
        private readonly Queue<Token> synonymTokenQueue = new Queue<Token>();

        public SynonymFilter(TokenStream tokenStream, ISynonymEngine engine)
            : base(tokenStream)
        {
            this.engine = engine;
        }

        public override Token Next()
        {
            // Drain queued synonyms before reading the next input token.
            if (synonymTokenQueue.Count > 0)
            {
                return synonymTokenQueue.Dequeue();
            }

            Token t = input.Next();
            if (t == null) return null;
            // SYNONYM is a token-type value that is not defined in the posted snippet.
            if (t.Type() == SYNONYM) return t;

            IEnumerable<string> synonyms = engine.GetSynonyms(t.TermText());
            if (synonyms == null)
            {
                return t;
            }

            foreach (string syn in synonyms)
            {
                if (!t.TermText().Equals(syn))
                {
                    var synToken = new Token(syn, t.StartOffset(), t.EndOffset(), SYNONYM);
                    // Increment 0 places the synonym at the same position as the original token.
                    synToken.SetPositionIncrement(0);
                    synonymTokenQueue.Enqueue(synToken);
                }
            }
            return t;
        }
    }

Thanks!
RE: [Lucene.Net] Is a Lucene.Net Line-by-Line Jave port needed?
Hi Scott,

Please avoid crossposting (as I am doing now). When I reply to your email, the reply goes to one of the lists and the thread is split in two. Crossposting may be good for announcements but not for discussions.

DIGY

-----Original Message-----
From: Scott Lombard [mailto:lombardena...@gmail.com]
Sent: Wednesday, June 29, 2011 9:58 PM
To: lucene-net-...@lucene.apache.org; lucene-net-u...@lucene.apache.org
Subject: [Lucene.Net] Is a Lucene.Net Line-by-Line Jave port needed?

After the large community response about moving the code base from .Net 2.0 to .Net 4.0, I am trying to figure out what the need is for a line-by-line port. Starting with Digy's excellent work on the conversion to generics, a priority of the 2.9.4g release is that the two packages would not be interchangeable. So faster turnaround from a Java release won't matter to non-line-by-line users; they will have to wait until the updates are made to the non-line-by-line code base.

My question is: is there really a user base for the line-by-line port?

Anyone have a comment?

Scott
RE: [Lucene.Net] Is a Lucene.Net Line-by-Line Jave port needed?
I do not know if too much emphasis should be placed on user vs. contributor.

I am sorry for this misunderstanding. What I meant by contributor (not committer) was the people who work on the Lucene.Net source code, not the ones who just consume it.

DIGY

-----Original Message-----
From: Granroth, Neal V. [mailto:neal.granr...@thermofisher.com]
Sent: Wednesday, June 29, 2011 11:23 PM
To: lucene-net-...@lucene.apache.org
Subject: RE: [Lucene.Net] Is a Lucene.Net Line-by-Line Jave port needed?

I do not know if too much emphasis should be placed on user vs. contributor. The project needs to also consider those of us who use Lucene.NET source releases only. It is much easier to locally patch/fix the source when I can compare it directly to Lucene core.

- Neal

-----Original Message-----
From: Digy [mailto:digyd...@gmail.com]
Sent: Wednesday, June 29, 2011 2:58 PM
To: lucene-net-...@lucene.apache.org
Subject: RE: [Lucene.Net] Is a Lucene.Net Line-by-Line Jave port needed?

As a Lucene.Net user I wouldn't care whether it is a line-by-line port or not. But as a contributor, I would prefer parallel code that makes life easier for manual ports of new releases (until this process is automated).

PS: I presume no one is thinking of functional or index-level incompatibility.

DIGY

-----Original Message-----
From: Granroth, Neal V. [mailto:neal.granr...@thermofisher.com]
Sent: Wednesday, June 29, 2011 10:47 PM
To: lucene-net-u...@lucene.apache.org
Cc: lucene-net-...@incubator.apache.org
Subject: RE: [Lucene.Net] Is a Lucene.Net Line-by-Line Jave port needed?

This has been discussed many times. Lucene.NET is not valid, the code cannot be trusted, if it is not a line-by-line port. It ceases to be Lucene.

- Neal

-----Original Message-----
From: Scott Lombard [mailto:lombardena...@gmail.com]
Sent: Wednesday, June 29, 2011 1:58 PM
To: lucene-net-...@lucene.apache.org; lucene-net-u...@lucene.apache.org
Subject: [Lucene.Net] Is a Lucene.Net Line-by-Line Jave port needed?

After the large community response about moving the code base from .Net 2.0 to .Net 4.0, I am trying to figure out what the need is for a line-by-line port. Starting with Digy's excellent work on the conversion to generics, a priority of the 2.9.4g release is that the two packages would not be interchangeable. So faster turnaround from a Java release won't matter to non-line-by-line users; they will have to wait until the updates are made to the non-line-by-line code base.

My question is: is there really a user base for the line-by-line port?

Anyone have a comment?

Scott
[Lucene.Net] [jira] [Created] (LUCENENET-429) Corrupted segment file not detected and wipes index contents (LUCENE-3255)
Corrupted segment file not detected and wipes index contents (LUCENE-3255)

Key: LUCENENET-429
URL: https://issues.apache.org/jira/browse/LUCENENET-429
Project: Lucene.Net
Issue Type: New Feature
Reporter: Digy
Priority: Minor
Fix For: Lucene.Net 2.9.4g

https://issues.apache.org/jira/browse/LUCENE-3255
[Lucene.Net] [jira] [Updated] (LUCENENET-429) Corrupted segment file not detected and wipes index contents (LUCENE-3255)
[ https://issues.apache.org/jira/browse/LUCENENET-429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Digy updated LUCENENET-429:

Attachment: LUCENENET-429.patch

Corrupted segment file not detected and wipes index contents (LUCENE-3255)

Key: LUCENENET-429
URL: https://issues.apache.org/jira/browse/LUCENENET-429
Project: Lucene.Net
Issue Type: New Feature
Reporter: Digy
Priority: Minor
Fix For: Lucene.Net 2.9.4g
Attachments: LUCENENET-429.patch

https://issues.apache.org/jira/browse/LUCENE-3255
[Lucene.Net] [jira] [Updated] (LUCENENET-427) Provide limit on phrase analysis in FastVectorHighlighter (LUCENE-3234)
[ https://issues.apache.org/jira/browse/LUCENENET-427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Digy updated LUCENENET-427:

Attachment: FastVectorHighlighter.patch

Provide limit on phrase analysis in FastVectorHighlighter (LUCENE-3234)

Key: LUCENENET-427
URL: https://issues.apache.org/jira/browse/LUCENENET-427
Project: Lucene.Net
Issue Type: Improvement
Affects Versions: Lucene.Net 2.9.2, Lucene.Net 2.9.4, Lucene.Net 2.9.4g
Reporter: Digy
Priority: Minor
Fix For: Lucene.Net 2.9.4g
Attachments: FastVectorHighlighter.patch

https://issues.apache.org/jira/browse/LUCENE-3234
[Lucene.Net] [jira] [Resolved] (LUCENENET-427) Provide limit on phrase analysis in FastVectorHighlighter (LUCENE-3234)
[ https://issues.apache.org/jira/browse/LUCENENET-427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Digy resolved LUCENENET-427.

Resolution: Fixed

Committed

Provide limit on phrase analysis in FastVectorHighlighter (LUCENE-3234)

Key: LUCENENET-427
URL: https://issues.apache.org/jira/browse/LUCENENET-427
Project: Lucene.Net
Issue Type: Improvement
Affects Versions: Lucene.Net 2.9.2, Lucene.Net 2.9.4, Lucene.Net 2.9.4g
Reporter: Digy
Priority: Minor
Fix For: Lucene.Net 2.9.4g
Attachments: FastVectorHighlighter.patch

https://issues.apache.org/jira/browse/LUCENE-3234
[jira] [Commented] (LUCENE-3234) Provide limit on phrase analysis in FastVectorHighlighter
[ https://issues.apache.org/jira/browse/LUCENE-3234?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13054935#comment-13054935 ]

Digy commented on LUCENE-3234:

I am not sure how closely it is related to this issue, but there was a similar issue in Lucene.Net.
https://issues.apache.org/jira/browse/LUCENENET-350

Provide limit on phrase analysis in FastVectorHighlighter

Key: LUCENE-3234
URL: https://issues.apache.org/jira/browse/LUCENE-3234
Project: Lucene - Java
Issue Type: Improvement
Affects Versions: 2.9.4, 3.0.3, 3.1, 3.2, 3.3
Reporter: Mike Sokolov
Assignee: Koji Sekiguchi
Fix For: 3.4, 4.0
Attachments: LUCENE-3234.patch, LUCENE-3234.patch, LUCENE-3234.patch

With larger documents, FVH can spend a lot of time trying to find the best-scoring snippet as it examines every possible phrase formed from matching terms in the document. If one is willing to accept less-than-perfect scoring by limiting the number of phrases that are examined, substantial speedups are possible. This is analogous to the Highlighter limit on the number of characters to analyze.

The patch includes an artificial test case that shows 1000x speedup. In a more normal test environment, with English documents and random queries, I am seeing speedups of around 3-10x when setting phraseLimit=1, which has the effect of selecting the first possible snippet in the document. Most of our sites operate in this way (just show the first snippet), so this would be a big win for us.

With phraseLimit = -1, you get the existing FVH behavior. At larger values of phraseLimit, you may not get substantial speedup in the normal case, but you do get the benefit of protection against blow-up in pathological cases.
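The phraseLimit idea described in LUCENE-3234 can be sketched in a few lines of C#. This is an illustration only, not the FastVectorHighlighter code or the attached patch: PhraseLimiter, Limit and the string candidates are hypothetical names. The only behaviour taken from the issue text is that a limit of -1 keeps the existing (unlimited) behaviour, while a positive limit caps how many candidate phrases are examined.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper; not part of Lucene or Lucene.Net.
public static class PhraseLimiter
{
    // Returns at most phraseLimit candidates, in their original order.
    // phraseLimit == -1 means "no limit", mirroring the issue description.
    public static IEnumerable<T> Limit<T>(IEnumerable<T> candidates, int phraseLimit)
    {
        if (candidates == null) throw new ArgumentNullException("candidates");
        if (phraseLimit < -1) throw new ArgumentOutOfRangeException("phraseLimit");
        return LimitIterator(candidates, phraseLimit);
    }

    private static IEnumerable<T> LimitIterator<T>(IEnumerable<T> candidates, int phraseLimit)
    {
        int examined = 0;
        foreach (T candidate in candidates)
        {
            // Stop as soon as the cap is reached, so later candidates are never touched.
            if (phraseLimit != -1 && examined >= phraseLimit) yield break;
            examined++;
            yield return candidate;
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var phrases = new List<string> { "first match", "second match", "third match" };

        // phraseLimit = 1: only the first candidate is considered.
        foreach (string p in PhraseLimiter.Limit(phrases, 1))
            Console.WriteLine(p);

        // phraseLimit = -1: every candidate is considered.
        foreach (string p in PhraseLimiter.Limit(phrases, -1))
            Console.WriteLine(p);
    }
}
```

Because the iterator stops early, the cost is bounded by the limit rather than by the number of matching phrases, which is the protection against pathological documents that the issue mentions.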
[Lucene.Net] [jira] [Commented] (LUCENENET-426) Mark BaseFragmentsBuilder methods as virtual
[ https://issues.apache.org/jira/browse/LUCENENET-426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13053722#comment-13053722 ]

Digy commented on LUCENENET-426:

10 min. work done.

DIGY

Mark BaseFragmentsBuilder methods as virtual

Key: LUCENENET-426
URL: https://issues.apache.org/jira/browse/LUCENENET-426
Project: Lucene.Net
Issue Type: Improvement
Components: Lucene.Net Contrib
Affects Versions: Lucene.Net 2.9.2, Lucene.Net 2.9.4, Lucene.Net 3.x, Lucene.Net 2.9.4g
Reporter: Itamar Syn-Hershko
Priority: Minor
Fix For: Lucene.Net 2.9.4, Lucene.Net 2.9.4g
Attachments: fvh.patch

Without marking methods in BaseFragmentsBuilder as virtual, it is meaningless to have FragmentsBuilder deriving from a class named Base, since most of its functionality cannot be overridden.

Attached is a patch for marking the important methods virtual.
[Lucene.Net] [jira] [Resolved] (LUCENENET-426) Mark BaseFragmentsBuilder methods as virtual
[ https://issues.apache.org/jira/browse/LUCENENET-426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Digy resolved LUCENENET-426.

Resolution: Fixed
Fix Version/s: Lucene.Net 2.9.4g, Lucene.Net 2.9.4

Thanks Itamar. Fixed in trunk and the 2.9.4g branch.

DIGY

Mark BaseFragmentsBuilder methods as virtual

Key: LUCENENET-426
URL: https://issues.apache.org/jira/browse/LUCENENET-426
Project: Lucene.Net
Issue Type: Improvement
Components: Lucene.Net Contrib
Affects Versions: Lucene.Net 2.9.2, Lucene.Net 2.9.4, Lucene.Net 3.x, Lucene.Net 2.9.4g
Reporter: Itamar Syn-Hershko
Priority: Minor
Fix For: Lucene.Net 2.9.4, Lucene.Net 2.9.4g
Attachments: fvh.patch

Without marking methods in BaseFragmentsBuilder as virtual, it is meaningless to have FragmentsBuilder deriving from a class named Base, since most of its functionality cannot be overridden.

Attached is a patch for marking the important methods virtual.
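The point behind LUCENENET-426 comes from a C# language rule rather than anything specific to Lucene.Net: unlike Java, C# methods are non-virtual by default, so a subclass cannot override a base method unless the base declares it virtual. A minimal sketch follows, with hypothetical class and method names (these are not the BaseFragmentsBuilder members touched by the patch):

```csharp
using System;

// Hypothetical base class; stands in for any ported "Base" class.
public abstract class BaseBuilder
{
    // Non-virtual: a subclass can only hide this with 'new', and callers
    // holding a BaseBuilder reference still get this implementation.
    public string FixedTag() { return "base"; }

    // Virtual: a subclass can replace the behaviour with 'override'.
    public virtual string OverridableTag() { return "base"; }
}

public class CustomBuilder : BaseBuilder
{
    public new string FixedTag() { return "custom"; }
    public override string OverridableTag() { return "custom"; }
}

public static class Program
{
    public static void Main()
    {
        BaseBuilder builder = new CustomBuilder();
        Console.WriteLine(builder.FixedTag());       // prints "base"
        Console.WriteLine(builder.OverridableTag()); // prints "custom"
    }
}
```

Library code normally calls its own methods through the base type, which is why a hidden (non-virtual) method in a derived builder would never be reached and why the methods need to be virtual for customisation to work.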