Re: Lets talk graduation
I concur.
On Fri, Jun 15, 2012 at 1:44 PM, Christopher Currens currens.ch...@gmail.com wrote:
Thanks Prescott. I noticed that we need to elect someone as Chair/VP for the project, someone to be a representative for our collective PMC to the board. Unless I'm mistaken, the job entails the quarterly report to the board and communication with the board when a new PMC member is added or the chair is being changed. Looking through the history of board reports we've done in the incubator, I've noticed that Prescott has largely been the one to put those together in a timely fashion. Since Prescott has always been an active member as well, I think he's an ideal candidate for the role of VP for Lucene.NET. It doesn't seem much different from the role he's already been involved in. Any objections?
Thanks, Christopher
On Fri, Jun 15, 2012 at 1:15 PM, Prescott Nasser geobmx...@hotmail.com wrote:
https://cwiki.apache.org/confluence/display/LUCENENET/Graduation+-+Resolution+Template
Stefan, I think you've more than proved your value. +1
Re: Lets talk graduation
Stefan is VP of EOL Compliance.
On Fri, Jun 15, 2012 at 1:48 PM, Prescott Nasser geobmx...@hotmail.com wrote:
I thought we were going to force Stefan to do it ;) - or was that something else?
Sent from my Windows Phone
From: Christopher Currens
Sent: 6/15/2012 1:45 PM
To: lucene-net-dev@lucene.apache.org
Subject: Re: Lets talk graduation
Thanks Prescott. I noticed that we need to elect someone as Chair/VP for the project, someone to be a representative for our collective PMC to the board. Unless I'm mistaken, the job entails the quarterly report to the board and communication with the board when a new PMC member is added or the chair is being changed. Looking through the history of board reports we've done in the incubator, I've noticed that Prescott has largely been the one to put those together in a timely fashion. Since Prescott has always been an active member as well, I think he's an ideal candidate for the role of VP for Lucene.NET. It doesn't seem much different from the role he's already been involved in. Any objections?
Thanks, Christopher
On Fri, Jun 15, 2012 at 1:15 PM, Prescott Nasser geobmx...@hotmail.com wrote:
https://cwiki.apache.org/confluence/display/LUCENENET/Graduation+-+Resolution+Template
Stefan, I think you've more than proved your value. +1
Re: svn commit: r1350178 - /incubator/lucene.net/trunk/src/core/Store/FSDirectory.cs
+1 on the string vs DirectoryInfo overload, Itamar.
Re: Comments and JVM, I'd suggest editing the .NET comments to remove Java-specific recommendations/concerns. We're not running on the JVM, so consumers of our code don't need that info.
Re: GC pressure on the File/DirectoryInfo objects.. There's so few of them it really doesn't make a difference. That aspect of their usage should not be a concern. A greater concern is that they don't really support the full range of Win32 API file access (e.g. long paths, etc).
Thanks, Troy
On Thu, Jun 14, 2012 at 9:39 AM, Christopher Currens currens.ch...@gmail.com wrote:
That comment is correct. We don't have support for NIOFSDirectory in .NET (and the JVM support for it in windows has a major bug). We also don't support MMapDirectory, because we haven't bothered to support it yet. I think digy ported it once, but it didn't add any speed benefits, and was actually fairly slower than the FSDirectory classes.
It wasn't that long ago that we had the string constructors for Directory classes. I think they were in the java code and then replaced with File (DirectoryInfo for .NET). I've always hated passing in DirectoryInfo, and I don't really understand why it's in the code. It doesn't seem to give much added benefit, if any. You can pass a string that is a path to an existing file and it will still create the DirectoryInfo object. I would think (but don't know) it would be better for performance to *not* use the File and Directory classes (I'm actually not sure how deep the usages of these classes go, so I'm not sure what difference it would make), since it results in added pressure on the GC with the extra object creations.
+1 to this.
On Thu, Jun 14, 2012 at 5:25 AM, Itamar Syn-Hershko ita...@code972.com wrote:
I'd assume so, at least partially I just copy-pasted from one method below
On Thu, Jun 14, 2012 at 2:52 PM, Stefan Bodewig bode...@apache.org wrote:
On 2012-06-14, synhers...@apache.org wrote:
/// <p/>Currently this returns <see cref="SimpleFSDirectory" /> as
/// NIOFSDirectory is currently not supported.
///
/// <p/><b>NOTE</b>: this method may suddenly change which
/// implementation is returned from release to release, in
/// the event that higher performance defaults become
/// possible; if the precise implementation is important to
/// your application, please instantiate it directly,
/// instead. On 64 bit systems, it may also good to
/// return <see cref="MMapDirectory" />, but this is disabled
/// because of officially missing unmap support in Java.
/// For optimal performance you should consider using
/// this implementation on 64 bit JVMs.
Does this apply to the .NET code?
Stefan
Re: Lets talk graduation
I'd say we're ready for graduation. Since I'm not really involved in the coding effort at the moment, I'll work with Prescott on this process.
The only reservation I have about graduation is losing or lessening Stefan Bodewig's involvement. He's been really helpful. Can we keep all of our mentors even if we graduate? :)
Thanks, Troy
On Thu, Jun 14, 2012 at 10:08 AM, Christopher Currens currens.ch...@gmail.com wrote:
I've gone back and forth on whether I think we're ready for graduation or not. I had always felt like we weren't because the project isn't as active as I'd like it to be. However, I think I've been looking at it wrong. We've got a good enough process and we *have* made progress. If anything, graduating might add an urgency to the project when things get slow, since we'd be an official project and more would be expected. I don't think anybody wants the project to end up like it did last time, before we gave it a reboot.
I'm up for starting this process, but I don't want it to take any time away from getting 3.0.3 released.
Thanks, Christopher
On Thu, Jun 14, 2012 at 9:06 AM, Benson Margulies bimargul...@gmail.com wrote:
As a mentor, it's my job to argue with Itamar a bit. It's not just semantics. We don't incubate projects indefinitely. I think that you all are good to go. The transition is not very much work. Please do draft a resolution and conduct a vote in the community, and we can then take it to the incubator PMC.
On Thu, Jun 14, 2012 at 11:35 AM, Itamar Syn-Hershko ita...@code972.com wrote:
IMHO, whatever brings more attention to the project, and I'm not sure graduation is what this project needs right now. In the end it's just semantics. I'd focus those efforts on getting more work done and having more frequent releases. Hence our proposition to sponsor dev, which still stands.
On Thu, Jun 14, 2012 at 6:24 PM, Prescott Nasser geobmx...@hotmail.com wrote:
I think with the addition of two new committers we've made some progress in community growth. I think we'll have 3.0.3 out the door soon - are there any other items we think we need to address before looking to graduate?
~P
Re: Corrupt index
If this is the case, 2328 probably made its way to Lucene.Net since we are using the released sources for porting, and we now need to apply 3418 in the current version.
Itamar: I confirmed that 2328 is in the latest code.
Thanks, Troy
On Wed, Jun 13, 2012 at 5:45 PM, Itamar Syn-Hershko ita...@code972.com wrote:
Mike,
On Wed, Jun 13, 2012 at 7:31 PM, Michael McCandless luc...@mikemccandless.com wrote:
Hi Itamar, One quick question: does Lucene.Net include the fixes done for LUCENE-1044 (to fsync files on commit)? Those are very important for an index to be intact after OS/JVM crash or power loss.
Definitely, as Christopher noted we are about to release a 3.0.3 compatible version, which is a line-by-line port of the Java version.
You shouldn't even have to run CheckIndex ... because (as of LUCENE-1044) we now fsync all segment files before writing the new segments_N file, and then removing old segments_N files (and any segments that are no longer referenced). You do have to remove the write.lock if you aren't using NativeFSLockFactory (but this has been the default lock impl for a while now).
Somewhat unrelated to this thread, but what should I expect to see? From time to time we do see write.lock present after an app-crash or power failure. Also, what are the steps that are expected to be performed in such cases?
Last week I have been playing with rather large indexes and crashed my app while it was indexing. I wasn't able to open the index, and Luke was even kind enough to wipe the index folder clean even though I opened it in read-only mode. I re-ran this, and after another crash running CheckIndex revealed nothing - the index was detected to be an empty one. I am not entirely sure what could be the cause for this, but I suspect it has been corrupted by the crash.
Had no commit completed (no segments file written)? If you don't fsync then all sorts of crazy things are possible...
Ok, so we do have fsync since LUCENE-1044 is present, and there were segments present from previous commits. Any idea what went wrong?
I've been looking at these:
https://issues.apache.org/jira/browse/LUCENE-3418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
https://issues.apache.org/jira/browse/LUCENE-2328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
(And LUCENE-1044 before that ... it was LUCENE-1044 that LUCENE-2328 broke...).
So 2328 broke 1044, and this was fixed only in 3.4, right? So 2328 made it to a 3.0.x release while the fix for it (3418) was only released in 3.4. Am I right? If this is the case, 2328 probably made its way to Lucene.Net since we are using the released sources for porting, and we now need to apply 3418 in the current version. Does it make sense to just port FSDirectory from 3.4 to 3.0.3? Or were there API or other changes that will make our life miserable if we do that?
And it seems like this is what I was experiencing. Mike and Mark will probably be able to tell if this is what they saw or not, but as far as I can tell this is not an expected behavior of a Lucene index.
Definitely not expected behavior: assuming nothing is flipping bits, then on OS/JVM crash or power loss your index should be fine, just reverted to the last successful commit.
What I suspected. Will try to reproduce reliably - any recommendations? Not really feeling like reinventing the wheel here...
MockDirectoryWrapper wasn't ported yet as it appears to only appear in 3.4, and as you said it won't really help here anyway.
What I'm looking for at the moment is some advice on what FSDirectory implementation to use to make sure no corruption can happen. The 3.4 version (which is where LUCENE-3418 was committed to) seems to handle a lot of things the 3.0 doesn't, but on the other hand LUCENE-3418 was introduced by changes made to the 3.0 codebase.
Hopefully it's just that you are missing fsync!
Also, is there any test in the suite checking for those scenarios?
Our test framework has a sneaky MockDirectoryWrapper that, after a test finishes, goes and corrupts any unsync'd files and then verifies the index is still OK... it's good because it'll catch any times we are missing calls to sync, but, it's not low level enough such that if FSDir is failing to actually call fsync (that was the bug in LUCENE-3418) then it won't catch that...
Mike McCandless
http://blog.mikemccandless.com
- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
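The ordering Mike describes in this thread (fsync every segment file first, and only afterwards write the new segments_N file) can be illustrated with a small in-memory model. This is only a sketch under stated assumptions, not Lucene's or Lucene.Net's implementation: `CrashableDirectory`, `commit`, and `latest_commit` are hypothetical names, `sync` merely stands in for a real fsync, and the crash model (files that were never synced simply disappear) is a simplification of what a real file system may do.

```python
class CrashableDirectory:
    """Hypothetical in-memory directory: unsynced files are lost on a crash."""

    def __init__(self):
        self.files = {}    # name -> bytes, as seen by the running process
        self.durable = {}  # name -> bytes, what would survive a crash

    def write(self, name, data):
        self.files[name] = data

    def sync(self, name):
        # Stand-in for fsync: only after this call is the file crash-safe.
        self.durable[name] = self.files[name]

    def crash(self):
        # Simplified crash model: everything that was never synced is gone.
        self.files = dict(self.durable)


def commit(directory, generation, segments):
    """Write and sync all segment files, then publish segments_<generation>."""
    for name, data in segments.items():
        directory.write(name, data)
    for name in segments:
        directory.sync(name)
    # The commit marker is written last, so it never names an unsynced file.
    marker = "segments_%d" % generation
    directory.write(marker, ",".join(sorted(segments)).encode())
    directory.sync(marker)


def latest_commit(directory):
    """Return (generation, segment names) of the newest usable commit, or None."""
    generations = sorted(
        (int(name.split("_")[1])
         for name in directory.files if name.startswith("segments_")),
        reverse=True,
    )
    for generation in generations:
        listing = directory.files["segments_%d" % generation].decode()
        names = [name for name in listing.split(",") if name]
        if all(name in directory.files for name in names):
            return generation, names
    return None


if __name__ == "__main__":
    index = CrashableDirectory()
    commit(index, 1, {"_0.cfs": b"first batch"})
    index.write("_1.cfs", b"second batch, never committed")
    index.crash()
    print(latest_commit(index))
```

In this model a crash in the middle of indexing leaves the index at the last completed commit, which is the behavior described above as expected. Removing the `sync` calls from `commit` makes `latest_commit` return `None` after a crash, roughly the "empty index" symptom reported earlier in the thread.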
Re: svn commit: r1350178 - /incubator/lucene.net/trunk/src/core/Store/FSDirectory.cs
Well, yeah, that's a lot more than I realized.. My assumption was that there wouldn't be more FileInfo objects created than there were files created, which is not that many. Whatever is doing that should be re-written to pass the obj vs recreating it.
-T
On Thu, Jun 14, 2012 at 3:57 PM, Christopher Currens currens.ch...@gmail.com wrote:
They're used more than it looks like they are. One of the largest ways they're used in the Store namespace is *to create FileStream objects*...which is such a waste of an allocation. A small test program I wrote created ~1000 FileInfo objects every minute, most of that thanks to merging. Considering the other object allocations in the library, though, it's unlikely removing them *alone* will do anything measurable to performance. But I disagree that it shouldn't be a concern. That kind of attitude will kill our performance if applied to other areas of the code.
Thanks, Christopher
On Thu, Jun 14, 2012 at 2:23 PM, Troy Howard thowar...@gmail.com wrote:
+1 on the string vs DirectoryInfo overload, Itamar.
Re: Comments and JVM, I'd suggest editing the .NET comments to remove Java-specific recommendations/concerns. We're not running on the JVM, so consumers of our code don't need that info.
Re: GC pressure on the File/DirectoryInfo objects.. There's so few of them it really doesn't make a difference. That aspect of their usage should not be a concern. A greater concern is that they don't really support the full range of Win32 API file access (e.g. long paths, etc).
Thanks, Troy
On Thu, Jun 14, 2012 at 9:39 AM, Christopher Currens currens.ch...@gmail.com wrote:
That comment is correct. We don't have support for NIOFSDirectory in .NET (and the JVM support for it in windows has a major bug). We also don't support MMapDirectory, because we haven't bothered to support it yet. I think digy ported it once, but it didn't add any speed benefits, and was actually fairly slower than the FSDirectory classes.
It wasn't that long ago that we had the string constructors for Directory classes. I think they were in the java code and then replaced with File (DirectoryInfo for .NET). I've always hated passing in DirectoryInfo, and I don't really understand why it's in the code. It doesn't seem to give much added benefit, if any. You can pass a string that is a path to an existing file and it will still create the DirectoryInfo object. I would think (but don't know) it would be better for performance to *not* use the File and Directory classes (I'm actually not sure how deep the usages of these classes go, so I'm not sure what difference it would make), since it results in added pressure on the GC with the extra object creations.
+1 to this.
On Thu, Jun 14, 2012 at 5:25 AM, Itamar Syn-Hershko ita...@code972.com wrote:
I'd assume so, at least partially I just copy-pasted from one method below
On Thu, Jun 14, 2012 at 2:52 PM, Stefan Bodewig bode...@apache.org wrote:
On 2012-06-14, synhers...@apache.org wrote:
/// <p/>Currently this returns <see cref="SimpleFSDirectory" /> as
/// NIOFSDirectory is currently not supported.
///
/// <p/><b>NOTE</b>: this method may suddenly change which
/// implementation is returned from release to release, in
/// the event that higher performance defaults become
/// possible; if the precise implementation is important to
/// your application, please instantiate it directly,
/// instead. On 64 bit systems, it may also good to
/// return <see cref="MMapDirectory" />, but this is disabled
/// because of officially missing unmap support in Java.
/// For optimal performance you should consider using
/// this implementation on 64 bit JVMs.
Does this apply to the .NET code?
Stefan
Re: Welcome Simon Svensson as a new committer
Welcome, Simon!
On Thu, May 24, 2012 at 12:05 AM, Prescott Nasser geobmx...@hotmail.com wrote:
Hey All, Our roster is growing a bit, I'd like to welcome Simon as a new committer. Simon has been quite active on the user mailing list helping answer community questions, he also maintains a C# port of the lucene-hunspell project (java: http://code.google.com/p/lucene-hunspell/, Simon's C# port: https://github.com/sisve/Lucene.Net.Analysis.Hunspell) which is commonly used for spell checking (but has a wide array of purposes).
Please join me in welcoming Simon to the team,
~Prescott
Re: Welcome Itamar Syn-Hershko as a new committer
Welcome to the team, Itamar! Thanks, Troy On Wed, May 23, 2012 at 5:21 AM, Itamar Syn-Hershko ita...@code972.com wrote: Thanks guys On Wed, May 23, 2012 at 1:14 AM, zoolette gaufre...@gmail.com wrote: Welcome in Itamar ! 2012/5/22 Prescott Nasser geobmx...@hotmail.com Hey all, I'd like to officially welcome Itamar as a new committer. I know the community appreciates the work you've been doing with the Spatial contrib project and the past help you've provided on the mailing lists. Please join me in welcoming Itamar, ~Prescott
Re: including external code under apache 2.0
Itamar, The easiest way to submit the CLA is via email. Grab the form here: http://www.apache.org/licenses/icla.txt Print, then sign, then scan and email the image to secret...@apache.org (or do a digital signature on the PDF version... http://www.apache.org/licenses/icla.pdf )
Thanks, Troy
On Sat, Apr 28, 2012 at 2:16 PM, Itamar Syn-Hershko ita...@code972.com wrote:
That mail from Stefan got lost in my inbox, so I never followed up on that. I guess now would be a good chance to tie up all loose ends. How do I do the ICLA?
On Sat, Apr 28, 2012 at 1:11 AM, Troy Howard thowar...@gmail.com wrote:
My understanding is that we can include the code as long as we have an ICLA from Itamar. This was discussed at length in January for another contribution that the same contributor wanted to donate. Stefan (Bodewig, our Incubation mentor) laid out what needed to be done really clearly in that context. Here's the link to the final message on that thread where Stefan recaps the relevant points: http://s.apache.org/HZa
I don't know if Itamar ever followed up after that and filed an ICLA. If he did, we're good and just need to commit the code.. Otherwise, we'll need him to do the ICLA first.
Thanks, Troy
On Fri, Apr 27, 2012 at 2:30 PM, Prescott Nasser geobmx...@hotmail.com wrote:
Hey all, We've had a community member port a library for Lucene.Net that we'd like to include in our contrib packages. The package is under the apache 2.0 license, so we think we are in the clear to include the code into our contrib package - but we weren't sure and just wanted to double check. The contrib project is located here: https://github.com/synhershko/Spatial4n
Thanks for your guidance
~Prescott
[jira] [Commented] (LUCENENET-481) Port Contrib.MemoryIndex
[ https://issues.apache.org/jira/browse/LUCENENET-481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13238584#comment-13238584 ]
Troy Howard commented on LUCENENET-481:
---
I started in on porting this, however this one has a few challenges to replicate exact behaviour. It uses Java wildcard generics in a Comparator<T> implementation, of which there is no .NET equivalent. I could probably replicate behaviour using Reflection but I'm not sure how that will impact performance vs its Java equivalent. May need to redesign a little to accommodate the .NET type system.
Port Contrib.MemoryIndex
Key: LUCENENET-481
URL: https://issues.apache.org/jira/browse/LUCENENET-481
Project: Lucene.Net
Issue Type: New Feature
Affects Versions: Lucene.Net 3.0.3
Reporter: Christopher Currens
We need to port MemoryIndex from contrib, if we want to be able to port a few other contrib libraries.
--
This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: [Lucene.Net] MVP Summit
NO SUMMIT 4 U! On Mon, Feb 27, 2012 at 10:55 AM, Christopher Currens currens.ch...@gmail.com wrote: I would have loved to, especially since I'm relatively close to the Microsoft campus, but I'm not an MVP. :) Thanks, Christopher On Mon, Feb 27, 2012 at 9:27 AM, Simone Chiaretta simone.chiare...@gmail.com wrote: Hi, is anybody attending the MVP Summit? Simone -- Simone Chiaretta Microsoft MVP ASP.NET - ASPInsider Blog: http://codeclimber.net.nz RSS: http://feeds2.feedburner.com/codeclimber twitter: @simonech Any sufficiently advanced technology is indistinguishable from magic Life is short, play hard
Re: [Lucene.Net] Blog Setup
My understanding is that ASF cares about what is hosted at ASF more than they care about what isn't hosted at ASF... So you're free to do whatever you like externally, hence the GitHub mirrors and NuGet and what not.
Regarding readthedocs -- it uses Sphinx/ReStructured Text...So we'd have to convert the XML to that format. There's a project for that called breathe (http://michaeljones.github.com/breathe/index.html). So the build server could add steps that would:
- build to binary
- run doxygen to create xml
- run breathe to convert to RST
It could commit that to some other repo like say, on github.. (since pulling from a repo is how RTD ingests content).. then on the RTD servers it'll pull the repo, look for a conf.py file (very simple python file) and then run sphinx-build against that directory, creating documentation for us. Woot! At least that way it wouldn't bomb ASF's infrastructure.
I agree with Prescott that long term, using SVN for this is just not gonna work out. Docs do change quickly.
-T
On Wed, Feb 15, 2012 at 11:18 AM, Christopher Currens currens.ch...@gmail.com wrote:
That's similar to a suggestion Stefan made in another email:
"The only alternative would be [...] running a dynamic server on a dedicated VM. The latter would be easier to negotiate for a top level project."
Though, his response seems to imply that it would need to stay hosted on Apache servers? I'm really only saying that because it was suggested to negotiate it for a top level project, and the use of an outside server wasn't actually suggested by him. If we can host outside of Apache servers, that does seem like the most straightforward answer to our problem, that would require the least amount of changes to our build process (probably).
In regards to readthedocs.org, I think it only supports restructured text? You have to use Sphinx (python) to output your documentation which uses restructured text. I don't know if we can output that from sandcastle. Maybe there would be another way to export the xmldoc to that format, but idk, probably not worth the amount of effort it would take to put that in our build steps.
Thanks, Christopher
On Wed, Feb 15, 2012 at 10:59 AM, Troy Howard thowar...@gmail.com wrote:
Why not move docs to http://readthedocs.org/
On Wed, Feb 15, 2012 at 10:45 AM, Christopher Currens currens.ch...@gmail.com wrote:
Erg. Was it related to the whole documentation thing? I'm pretty sure it's frowned upon for our documentation to break SVN every time we do a release, so we should probably figure out a solution for that. :P
Thanks, Christopher
On Wed, Feb 15, 2012 at 10:28 AM, Prescott Nasser geobmx...@hotmail.com wrote:
Ick, I'm in the middle of publishing and working with Joe @ site-dev. We've totally f'd up svn. He's asked we stop doing things and he's going to fix it and get back to us... At least we're active.. ;)
~P
Date: Wed, 15 Feb 2012 10:20:29 -0800
From: currens.ch...@gmail.com
To: lucene-net-dev@lucene.apache.org
Subject: Re: [Lucene.Net] Blog Setup
I made some changes to the website in staging. Just a few things regarding the blog list, a little bit of left padding and adding a recent news header above it. I also changed the link style a little bit. Feel free to revert any/all changes I've made. :)
Thanks, Christopher
On Tue, Feb 14, 2012 at 9:05 PM, Prescott Nasser geobmx...@hotmail.com wrote:
Editing the site: http://www.apache.org/dev/cms#usage
Staging: http://lucene.net.staging.apache.org/lucene.net/
Date: Tue, 14 Feb 2012 11:03:58 -0800
From: currens.ch...@gmail.com
To: lucene-net-dev@lucene.apache.org
Subject: Re: [Lucene.Net] Blog Setup
Looks like the blog was successfully set up (that was quick!). You can access it here: https://blogs.apache.org/lucenenet/
I've migrated the whopping 3 new entries we already have on our index/front-page in the news section over to the blog. I would give a shot at integrating it into the website, but I actually have no idea how to edit the website. :)
Thanks, Christopher
On Mon, Feb 13, 2012 at 6:17 PM, Prescott Nasser geobmx...@hotmail.com wrote:
I've submitted a ticket here: https://issues.apache.org/jira/browse/INFRA-4433
I only added my name, I wasn't sure who else would want access - if you do want it, submit a comment to that ticket with your apache username and full name. I'm going to try and integrate this into the site soon. I also have some ideas about how to utilize the blog that I'll outline after I get it up and running
~P
From: bode...@apache.org
To: lucene-net-...@incubator.apache.org
Date: Mon, 13 Feb 2012 13:39:11 +0100
Subject: [Lucene.Net] Blog Setup
Hi all
Re: [Lucene.Net] Blog Setup
My thought was just we should host our documentation externally and RTD that was the first thing that came to mind.. You're right that tool chain is kind of unfortunate. Another option would be to use GitHub pages and just build our HTMl on the build server then push to GitHub to update it. Then we don't have to use any weird intermediate steps and can just take whatever HTML output we like and use that. In order to function atomically, we'd have to add these steps: ... build docs to HTML with Sandcastle ... - git clone g...@github.com:apache/lucenenet.github.com - git add . - git commit -m time to make the docs - git push (configure the project with a deploy key for the account our CI server is using) Then our doc site (which would be http://lucenenet.github.com) would be auto-magically updated. That might be the lowest impact way of hosting docs externallly. On Wed, Feb 15, 2012 at 11:54 AM, Christopher Currens currens.ch...@gmail.com wrote: I guess it wouldn't be terribly difficult to do, we would have to convert all of the shfbproj to whatever doxygen uses. I do think it's a little weird to use the python-style of documentation in a .NET project however. I personally have never been a fan of python documentation in general simply because I'm used the the MSDN style docs, where I have a nice TOC on the left that I can directly browse to anything fairly easily (think the documentation to MongoDb http://api.mongodb.org/csharp/1.3.1/ ). While you do get a TOC on documentation for sphinx style, it limits you to the current topic (and it's all on a single page using named anchors), so I personally find it cluttered. However, that may not matter, since developers would have access to the CHM (though I'd prefer sandcastle's output over doxygen's), so they wouldn't have to use the online docs if they didn't want to. Plus, I may be in the minority here with my opinion of what the docs should look like. Idk, just makes sense to me to use .NET tools for the project. 
Of course, I'd take Sphinx over javadoc any day...It hurts my eyes! :P Thanks, Christopher On Wed, Feb 15, 2012 at 11:36 AM, Troy Howard thowar...@gmail.com wrote: My understanding is that ASF cares about what is hosted at ASF more than they care about what isn't hosted at ASF... So you're free to do whatever you like externally, hence the GitHub mirrors and NuGet and what not. Regarding readthedocs -- it uses Sphinx/ReStructured Text...So we'd have to convert the XML to that format. There's a project for that called breathe (http://michaeljones.github.com/breathe/index.html). So the build server could add steps that would: - build to binary - run doxygen to create xml - run breathe to convert to RST It could commit that to some other repo like say, on github.. (since pulling from a repo is how RTD ingests content).. then on the RTD servers it'll pull the repo, look for a conf.py file (very simple python file) and then run sphinx-build against that directory, creating documentation for us. Woot! At least that way it wouldn't bomb ASF's infrastructure. I agree with Prescott that long term, using SVN for this is just not gonna work out. Docs do change quickly. -T On Wed, Feb 15, 2012 at 11:18 AM, Christopher Currens currens.ch...@gmail.com wrote: That's similar to a suggestion Stefan made in another email: The only alternative would be [...] running a dynamic server on a dedicated VM. The later would be easier to negotiate for a top level project. Though, his response seems to imply that it would need to stay hosted on Apache servers? I'm really only saying that because it was suggested to negotiate it for a top level project, and the use of an outside server wasn't actually suggested by him. If we can host outside of Apache servers, that does seems like the most straightforward answer to our problem, that would require the least amount of changes to our build process (probably). In regards to readthedocs.org, I think it only supports restructured text? 
You have to use Sphinx (python) to output your documentation which uses restructured text. I don't know if we can output that from sandcastle. Maybe there would be another way to export the xmldoc to that format, but idk, probably not worth the amount of effort it would take to put that in our build steps. Thanks, Christopher On Wed, Feb 15, 2012 at 10:59 AM, Troy Howard thowar...@gmail.com wrote: Why not move docs to http://readthedocs.org/ On Wed, Feb 15, 2012 at 10:45 AM, Christopher Currens currens.ch...@gmail.com wrote: Erg. Was it related to the whole documentation thing? I'm pretty sure it's frowned upon for our documentation to break SVN every time we do a release, so we should probably figure out a solution for that. :P Thanks, Christopher On Wed, Feb 15, 2012 at 10:28 AM, Prescott Nasser geobmx...@hotmail.com wrote
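The GitHub Pages steps listed earlier in this thread could be scripted roughly as follows. This is only a sketch under stated assumptions: the function names `run` and `publish_docs`, the `DRY_RUN` switch, and the directory arguments are hypothetical and not part of any existing build script, and the repository URL is left for the caller to supply. It assumes the HTML has already been generated (for example by Sandcastle) and that the CI account has a deploy key.

```shell
#!/bin/sh
# Sketch only: names below (run, publish_docs, DRY_RUN) are illustrative assumptions.

# run CMD...: execute a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# publish_docs HTML_DIR REPO_URL WORK_DIR
# Clone the Pages repository, copy the generated HTML over it, commit, push.
publish_docs() {
    html_dir=$1   # directory holding the generated HTML
    repo_url=$2   # Pages repository (supplied by the caller)
    work_dir=$3   # scratch checkout location

    run git clone "$repo_url" "$work_dir" &&
    run cp -R "$html_dir/." "$work_dir/" &&
    run git -C "$work_dir" add . &&
    # git commit exits non-zero when nothing changed, which stops the chain
    # before the push; a real CI step may want to treat that as success.
    run git -C "$work_dir" commit -m "time to make the docs" &&
    run git -C "$work_dir" push origin HEAD
}
```

Running it once with `DRY_RUN=1` prints the five commands without touching the network, which is a cheap way to check the step before wiring it into the build server.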
Re: [Lucene.Net] Blog Setup
Why not move docs to http://readthedocs.org/ On Wed, Feb 15, 2012 at 10:45 AM, Christopher Currens currens.ch...@gmail.com wrote: Erg. Was it related to the whole documentation thing? I'm pretty sure it's frowned upon for our documentation to break SVN every time we do a release, so we should probably figure out a solution for that. :P Thanks, Christopher On Wed, Feb 15, 2012 at 10:28 AM, Prescott Nasser geobmx...@hotmail.comwrote: Ick, I'm in the middle of publishing and working with Joe @ site-dev. We've totally f'd up svn. He's asked we stop doing things and he's going to fix it and get back to us... At least we're active.. ;) ~P Date: Wed, 15 Feb 2012 10:20:29 -0800 From: currens.ch...@gmail.com To: lucene-net-...@lucene.apache.org Subject: Re: [Lucene.Net] Blog Setup I made some changes to the website in staging. Just a few things regarding the blog list, a little bit of left padding and adding a recent news header above it. I also changed the link style a little bit. Feel free to revert any/all changes I've made. :) Thanks, Christopher On Tue, Feb 14, 2012 at 9:05 PM, Prescott Nasser geobmx...@hotmail.com wrote: Editing the site: http://www.apache.org/dev/cms#usage Staging: http://lucene.net.staging.apache.org/lucene.net/ Date: Tue, 14 Feb 2012 11:03:58 -0800 From: currens.ch...@gmail.com To: lucene-net-...@lucene.apache.org Subject: Re: [Lucene.Net] Blog Setup Looks like the blog was successfully setup (that was quick!). You can access it here: https://blogs.apache.org/lucenenet/ I've migrated the whopping 3 new entries we already have on our index/front-page in the news section over to the blog. I would give a shot at integrating it into the website, but I actually have no idea how to edit the website. 
:) Thanks, Christopher On Mon, Feb 13, 2012 at 6:17 PM, Prescott Nasser geobmx...@hotmail.com wrote: I've submitted a ticket here: https://issues.apache.org/jira/browse/INFRA-4433 I only added my name, I wasn't sure who else would want access - if you do want it, submit a comment to that ticket with your apache username and full name. I'm going to try and integrate this into the site soon. I also have some ideas about how to utilize the blog that I'll outline after I get it up and running ~P From: bode...@apache.org To: lucene-net-...@incubator.apache.org Date: Mon, 13 Feb 2012 13:39:11 +0100 Subject: [Lucene.Net] Blog Setup Hi all, If we want a blog for Lucene.Net on blogs.apache.org, the instructions to request one are at https://cwiki.apache.org/confluence/display/INFRA/Resource+request+FAQs Mainly we should have a list of ASF ids of the initial set of admins and authors. I had a look at how the RSS snippets are added to www.apache.org's index page and it doesn't look too difficult to adapt. In https://svn.apache.org/repos/asf/infrastructure/site/trunk/content/index.html there is {% for e in blog.list %} <h4><a href="{{ e.url }}">{{ e.title }}</a></h4> <div class="section-content">{{ e.content|safe|truncatewords_html:50 }}</div> <hr> {% endfor %} which iterates over a blogs collection created in path.pm via [qr!^/index\.html$!, news_page => { blog => ASF::Value::Blogs->new(blog => "foundation", limit => 3), }, ], and uses the ASF::Value::Blogs package that is part of the CMS. https://svn.apache.org/repos/infra/websites/cms/build/lib/ASF/Value/Blogs.pm So getting content from the blog to the main page is pretty easy. I think the main site is re-created every fifteen minutes to ensure things are fresh, not sure how one would go about this for a page that doesn't change that often (manually triggering buildbot might be an option). We'd need to ask this on the site-dev mailing list which is dedicated to the CMS. Stefan
Re: [Lucene.Net] Graduation
I agree with Neal on both points: - Repeatable, documented process: We need a better, more defined, public and repeatable process for creating and building releases. As Prescott can attest to, figuring all that out at this point is non-trivial and poorly documented. We have a wider footprint now than ever before but we still have a long way to go in terms of solving our problems as a team/community/project. - Committing to our decisions, despite alienating our user base: As Jesse pointed out, there are users out there who will be alienated by our choices, whether it be to use .Net 4.0 vs 2.0, use VS2010 vs VS2008/2005, change the API to make more sense in .NET, or what have you. We are going to have to make choices regarding the project direction, commit to those choices and move forward, even if it does mean alienating a certain portion of our user base. We don't like that consequence but we can't survive as a project without being decisive about controversial issues and moving forward. The question I have for Stefan is, what are the significant criteria for moving a project from the Incubator to a TLP? In my mind, we have the minimum marketable feature set to be a TLP, which is to say, we have an open dialog and an interested community while remaining somewhat productive. I don't think we need to wait to graduate until we have solved every challenge that we face as a project but rather we simply need to prove that we have what it takes to survive and grow in a healthy and productive manner as a community. I think we've achieved that part and just need to continue improving our process. Thanks, Troy On Wed, Feb 1, 2012 at 11:11 AM, Granroth, Neal V. neal.granr...@thermofisher.com wrote: Jesse, Thanks for making that point. I am also in that situation where I must support .NET 2.0 for years into the future. While I can experiment with .NET 4.0, there are a number of reasons that preclude its deployment or anything that depends upon it. 
However, consider what the Lucene.NET developers are up against. I think I am not mistaken that the current version of Java, which the Lucene core project uses, now makes use of features that have no equivalent in .NET 2.0; use of the newer versions of .NET are essential in order to update Lucene.NET to the current version of Lucene. At some point you, I, and others in our situation have to develop migration plans to get our products (and customers) to upgrade to the newer versions of .NET - Neal -Original Message- From: Jean-Sylvain Boige [mailto:jsbo...@aricie.fr] Sent: Wednesday, February 01, 2012 12:44 PM To: lucene-net-...@lucene.apache.org; lucene-net-...@incubator.apache.org Subject: RE: [Lucene.Net] Graduation Hi all, I'm not sure if it's the best moment for that, but here are my 2 cents. I have the feeling that a lot was done recently, and that the project is taking a good direction. To reflect on your impression, the one example of how it could go wrong I'm thinking of, where a few people invest in bursts and in their turn is Sharpmap (http://sharpmap.codeplex.com/) It's been years than a couple of committers are literally throwing thousands of lines of codes at that project, with dozens of branches and each method refactored a couple of time, but not a clean release since then, loads of inertia, and non committers quite at lost. I reckon the effort is better coordinated here, with clear incremental steps. However, when it was announced that the project could collapse, I reflected that we were a quite a few consuming the lib, possibly interested in getting involved, but striving to follow the upgrade path. By that time, v2.4 was the common version around, and with 2.9.2 the upgrade path towards 3.0 by replacing all the obsolete constructs was already a pain. 
I know several integrators could not be bothered, yet we did make those changes, and by the time we were finally ready to move on with the latest upgrades, you guys added a constraint, which resulted in a complete show stopper for us: .Net Framework 4.0. I understand that it feels natural for anything fresh, but with that decision you probably lost those, who like us have their products packaged with Lucene.Net in many existing environments where moving to .Net 4.0 is neither an option nor a decision of ours. Since then, we have kept investing into our 2.9.2 integration, but it will be months at the very least until we can consider imposing .Net 4.0 as a requirement for any further upgrades to our products. I'm pretty sure there are quite a few of us in that situation, which feels a bit similar to when we were many stuck with 2.4.1 constructs while help was requested to upgrade past 2.9.2. I guess you get the idea: it's a good thing if the project moves fast because of a few committers deeply involved, but it's as important to make sure most traditional integrators are following behind. Cheers, Jesse -Message
Re: [Lucene.Net] do we have a board report due?
Looks good. On Tue, Jan 31, 2012 at 10:38 AM, Michael Herndon mhern...@wickedsoftware.net wrote: even though we're not voting on this. +1 On Tue, Jan 31, 2012 at 1:30 PM, Prescott Nasser geobmx...@hotmail.comwrote: I've updated the board report: wiki.apache.org/incubator/February2012 if you guys could take a moment to review. Sent from my Windows Phone From: Stefan Bodewig Sent: 1/30/2012 8:59 PM To: lucene-net-...@incubator.apache.org Subject: Re: [Lucene.Net] do we have a board report due? On 2012-01-31, Prescott Nasser wrote: I didn't see a notice, me neither but something on general caught my eye. http://wiki.apache.org/incubator/ReportingSchedule says: yes. Thank you for reminding us. Personally I'd like to see Lucene.NET graduate over the next three months so it would be good if we could identify what is preventing the project from just doing that. Stefan
Re: [Lucene.Net] [VOTE] Apache-Lucene-2.9.4g-incubating-RC1 Release (take 2)
+1 On Wed, Jan 25, 2012 at 4:25 PM, Digy digyd...@gmail.com wrote: +1 DIGY -Original Message- From: Prescott Nasser [mailto:geobmx...@hotmail.com] Sent: Thursday, January 26, 2012 1:56 AM To: lucene-net-dev@lucene.apache.org Cc: lucene-net-...@incubator.apache.org Subject: RE: [Lucene.Net] [VOTE] Apache-Lucene-2.9.4g-incubating-RC1 Release (take 2) Thanks for the +1, we need one more vote here, then Stefan will be comfortable giving us a plus one, which will give us two plus ones in general, and ill only have to beg for one more :) Sent from my Windows Phone From: Michael Herndon Sent: 1/25/2012 11:15 AM To: lucene-net-dev@lucene.apache.org Cc: lucene-net-...@incubator.apache.org Subject: Re: [Lucene.Net] [VOTE] Apache-Lucene-2.9.4g-incubating-RC1 Release (take 2) verified tests pass and checksums match. so +1 @P, I remember that thread. Those guys stay busy though and devopt mentality is different than a devs. Our needs probably exceed what the svn CMS is meant for due to documentation. I am curious if infra allows for or would allow us to throw up a static mono/asp.net mvc in the future just so that we could dog food the site with search using Lucene.Net and then have it index certain pages or sites (wiki, tutorials, static site, docs). We'll probably need to dig out our CMS options again and weight against short term and long term goals. On Wed, Jan 25, 2012 at 12:31 PM, Prescott Nasser geobmx...@hotmail.comwrote: You know even making a small change to the website like updating the news takes like 30 minutes to run now because of all the files. Its absolutely ridiculous. I got chided by the CMS group, yet when asked how do we put documentation online with the new system there were crickets. 
Sent from my Windows Phone From: Michael Herndon Sent: 1/25/2012 8:26 AM To: lucene-net-dev@lucene.apache.org Cc: lucene-net-...@incubator.apache.org Subject: Re: [Lucene.Net] [VOTE] Apache-Lucene-2.9.4g-incubating-RC1 Release (take 2) I was not able to download the binaries till this morning. The wiki was also having issues. I ran rat on the the released source, that seems fine. did a compare on src zip and the tag. it matches. The only things I saw are nit picks. in the ReadMe the link should point to its respective tag instead of RC3 for just 2_9_4 https://svn.apache.org/repos/asf/incubator/lucene.net/tags/Lucene.Net_2_9_4_RC3/lib/should be when releasing the source in the future, we should either include a script that pulls the lib for the developers who want to compile from source inside a tag when the project is built using the solution. Or we should invest into using something like nuget for dependencies so that the dependencies are automatically fetched somehow and we can remove those from svn/scm altogether. the source currently violates the don't make me think about it principle. I know we all dislike chms, but until we figure out a better way of posting the generated msdn documentation online, we should include that in releases as well. The static website version generates a high number of static html files and our current CMS requires that those files are pushed into SVN which just is not feasible. Committing that all at once will choke infra's setup (and if they hired ninjas to pay us a visit, I probably wouldn't blame them) and doing partial commits is just borderline insanity. Just waiting on the all the tests to finish running. http://xkcd.com/303/ On Wed, Jan 25, 2012 at 6:29 AM, Stefan Bodewig bode...@apache.org wrote: On 2012-01-25, Michael Herndon wrote: Stefan what did you use to check the eof of files for svn? Pretty much a long and boring manual process. I did something like find . 
-name \*.cs -print0 | xargs -0 -e svn ps svn:eol-style native i.e. tried to set the eol-style property on all C# source files. This won't do anything if the property is set and tell you it has changed something in svn status if it the property hasn't been set before. svn will also fail if the file in question contains inconsistent line ends, this is the case for the NUnit doc files and even some of Lucene.NET sources. Repeat for all other file extension that should map to text files. I'm setting up RAT on my local. Are there any other tools that you or ASF recommends in general to validate releases? I think Sebb has a bunch of scripts he uses, but never bothered to look them up. If so, they'd be inside the comitters svn repo. For this release you don't even need to check line-feeds, the properties have not been set on all files. The patch I provided a while ago only applied to trunk. To me this is no reason to stop the release, in particular since most files have Windows line-ends and Prescott built the release on Windows so the files would be the same with
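Stefan's "repeat for all other file extensions" step could be wrapped in a small loop along these lines. This is a sketch, not a script anyone in the thread actually used: the function name `set_eol_native` and the overridable `SVN` variable are hypothetical, and `svn propset` is simply the long form of the `svn ps` call quoted above.

```shell
#!/bin/sh
# Sketch only: set_eol_native and the SVN override are illustrative assumptions.

# SVN can be overridden (e.g. SVN=echo) to preview the calls
# without touching a working copy.
SVN=${SVN:-svn}

# set_eol_native DIR EXT...
# Set svn:eol-style=native on every file under DIR with one of the given
# extensions, skipping Subversion's own .svn metadata directories.
set_eol_native() {
    dir=$1
    shift
    for ext in "$@"; do
        # As noted in the thread, svn refuses files whose line endings are
        # inconsistent, so those have to be normalized by hand first.
        find "$dir" -name .svn -prune -o \
            -type f -name "*.$ext" \
            -exec "$SVN" propset svn:eol-style native {} +
    done
}
```

A preview run such as `SVN=echo; set_eol_native . cs txt` only prints the `propset` calls; afterwards `svn status` shows which files actually had the property changed.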
Re: [Lucene.Net] [VOTE] Apache-Lucene-2.9.4g-incubating-RC1 Release (take 2)
LOL... We just wanted to see you grovel a little bit. ;) On Wed, Jan 25, 2012 at 4:59 PM, Prescott Nasser geobmx...@hotmail.com wrote: Ha - you guys rock Sent from my Windows Phone From: Troy Howard Sent: 1/25/2012 4:37 PM To: lucene-net-dev@lucene.apache.org Subject: Re: [Lucene.Net] [VOTE] Apache-Lucene-2.9.4g-incubating-RC1 Release (take 2) +1 On Wed, Jan 25, 2012 at 4:25 PM, Digy digyd...@gmail.com wrote: +1 DIGY -Original Message- From: Prescott Nasser [mailto:geobmx...@hotmail.com] Sent: Thursday, January 26, 2012 1:56 AM To: lucene-net-dev@lucene.apache.org Cc: lucene-net-...@incubator.apache.org Subject: RE: [Lucene.Net] [VOTE] Apache-Lucene-2.9.4g-incubating-RC1 Release (take 2) Thanks for the +1, we need one more vote here, then Stefan will be comfortable giving us a plus one, which will give us two plus ones in general, and ill only have to beg for one more :) Sent from my Windows Phone From: Michael Herndon Sent: 1/25/2012 11:15 AM To: lucene-net-dev@lucene.apache.org Cc: lucene-net-...@incubator.apache.org Subject: Re: [Lucene.Net] [VOTE] Apache-Lucene-2.9.4g-incubating-RC1 Release (take 2) verified tests pass and checksums match. so +1 @P, I remember that thread. Those guys stay busy though and devopt mentality is different than a devs. Our needs probably exceed what the svn CMS is meant for due to documentation. I am curious if infra allows for or would allow us to throw up a static mono/asp.net mvc in the future just so that we could dog food the site with search using Lucene.Net and then have it index certain pages or sites (wiki, tutorials, static site, docs). We'll probably need to dig out our CMS options again and weight against short term and long term goals. On Wed, Jan 25, 2012 at 12:31 PM, Prescott Nasser geobmx...@hotmail.comwrote: You know even making a small change to the website like updating the news takes like 30 minutes to run now because of all the files. Its absolutely ridiculous. 
Re: [Lucene.Net] I want to help! Also, where are we at?
What Michael said. I would say one thing you may want to do, in order to get familiar with the project, is look over the upcoming 2.9.4g release and try to evaluate it for correctness. This will expose you to a lot of the important bits from the committer's perspective. Building/testing/packaging/documentation/licenses/ASF process/etc. It will also expose you to what the current state of the project is. Perhaps you could assemble release notes, have a look at our online documentation and see what can be improved, update the website/wiki, etc.. IIRC, there's a lot of work to do in the Contrib libraries for 3.0.3 release. And if you didn't get the gist from Michael's tone of voice.. while we care a lot about getting code done (and done right), and keeping the project moving along, we're pretty informal around here. So, jump in wherever you feel comfortable and don't be shy. Thanks, Troy On Tue, Jan 24, 2012 at 3:46 PM, Michael Herndon mhern...@wickedsoftware.net wrote: Where would my effort best go? My advice is to play to strengths or work on things that seem interesting or challenging to you. Looking through the tickets on jira is a good place to start. https://issues.apache.org/jira/browse/LUCENENET Look at recent threads on the mailing lists. http://mail-archives.apache.org/mod_mbox/lucene-lucene-net-dev/ Another option is to pair with a committer looking for help on something in particular in order to get it done. ( you have to admit this was a beautifully placed segue/opener for committers to jump in. ;) ). What are we currently working on? 10+ mana points for using the word we. There is a push to get 2.9.4g released and to work on porting 3.0.3 in trunk. Can I join this mailing list? Anyone is welcome to join the mailing list, unless it's your pet goldfish. They're kinda shady. On Tue, Jan 24, 2012 at 6:48 AM, Sean Newham sean.new...@grantadesign.com wrote: Hey, I want to help port Lucene.net. 
I'm starting to work with it at work (in fact my first job here was to port spellchecker from 3.5.3 to lucene.net 2.3.2 (please don't ask)) and I want to help make it better. Questions: Where would my effort best go? What are we currently working on? Can I join this mailing list? Best wishes, Sean ps. My home email is seansevilt...@gmail.com and I might use this to email in.
Re: [Lucene.Net] Lucene.Net 3 onwards and 2.9.4g
our own tokens in parallel with the word tokens - We have our own parser (using Irony) so that we can extend the syntax (related to the extra tokens) - We have created a wrapper to abstract/hide most of the Lucene API - maps to and from poco objects - it exposes IEnumerableTPoco So much for the background. I agree with much of what is being said here. Particularly, let's make a choice and stop wasting the little resources the project has. I don't care about java. So I don't care that the API changes. I do want 100% index compatibility. I would like things like name capitalization, IDisposable, IEnumerable etc. Though I think that g adds a little confusion by also using other collection types i.e. ICollectionT, ListT unnecessarily. I'd like to get to Lucene 4 as soon as possible. The NRT and Codec bits would solve a lot of the issues we spend a lot of time on. So being able to catch up to java, currently 3.5, is high on my list. I like Troy's list of I wants. I'd like all of that too. The question is, how? I think the ancient argument about line by line or transliteration or lets change the api or complete rewrite can't be easily resolved because it obscures two separate things. 1. How do we port changes in java to the .net version 2. Most (all?) don't like the java api I don't think the project will survive if this cannot be resolved. It barely survived the last time. But I also don't believe that the project can achieve a full rewrite (such as Lucere started). At least, not yet. It would take too much and is too easily divisive. I would like to see something like line by line with formally defined mutations. I want to see Lucene.net on par with java Lucene and not continually a year or more behind. I think the only way to get there is to adhere, mostly, to the same basic code and structures as java. However, I also think that there should be a set of agreed and documented mutations that are applied (both retro fitting and applied to newly migrated changes). 
For example: - Method names are capitalized according to dotnet conventions - A class implementing Close should have IDisposable added according to an agreed template/style - A collection class should implement IEnumerable<T> or ICollection<T> or IList<T> depending on agreed criteria and according to an agreed template/style - Convert to enum - Convert to Func<T>/Action<T> I'm sure there are many more. All of the above need to be formalized into guidelines/criteria/templates/styles and agreed by the core committers. The experience of the g branch should provide a solid start. Please use this or something like this so that we can accelerate towards parity with java Lucene. As the project progresses more mutations can be added as they become apparent. This doesn't mean that other refactoring cannot be done but I would hope that these can be discussed as mini-projects instead of opening this box yet again. My company is in the process of expanding its dev team significantly so I'm hoping that I will be able to devote some time to help. Regards, Andy On 30 December 2011 04:55, Christopher Currens currens.ch...@gmail.com wrote: If we could find a reasonable solution and get people to commit to it, I could see it being done, in fact, I'd like to see it done. If we had more developers and more time to work on it, it would be awesome to see that kind of response and progress on the project. I would hope their enthusiasm wouldn't fade with time; that it wouldn't just be a boost of energy at the beginning of the (welcome) change and then fizzle out if it got to be too much for people. I will admit, I'm sure it's not surprising, but there are a lot of annoyances about Lucene.NET that could be done away with, if we did a re-write of the library. Anyway, I really have no idea who uses Lucene.NET besides those on this project, and StackOverflow. I would hope we could get opinions and advice from those other users as well. 
Thanks, Christopher On Thu, Dec 29, 2011 at 8:47 PM, Troy Howard thowar...@gmail.com wrote: Chris, Regarding release schedule and the amount of work to accomplish porting... What if we had 20 developers working on the project? It's likely that by changing what we're doing, we'll attract more people to work on the project and thus these concerns (which are perfectly valid concerns if you're attempting to port the entire library as a one or two person effort, as George, DIGY and you have done) will no longer be relevant. If the *volume* of work is a problem, then a reasonable solution is to scale up the quantity of devs on the project and get organized enough to keep them all productive. I'm certain that moving away from
Re: [Lucene.Net] Lucene.Net 3 onwards and 2.9.4g
the community wants, since I believe it would allow a faster release schedule than our own interpretation of a Lucene library. Thanks, Christopher On Thu, Dec 29, 2011 at 7:42 PM, Troy Howard thowar...@gmail.com wrote: I completely agree with Michael's mentality on thinking of this as a business and coming from the perspective of what will wow our customers ... I also completely agree with Prescott, that we just need to get down to brass tacks and say what we want it to be specifically and subjectively. Here's what I want: * I want an extremely modern .NET API full of injection points where I can pass lambdas, use IEnumerableT and all that System.Linq provides, interfaces for everything, as well as excellent unit test coverage. * I want to write *very* minimal code to accomplish basic tasks * I want an average .NET developer be able to intuitively understand and use the library with Intellisense as their only documentation * I want performance that meets or exceeds Java Lucene * I want no memory leaks * I want no surprises in general * I want minimal I/O * I want any execution that can be deferred or optimized out to be deferred or optimized out * I want any data that could be large in size to be streamable * I want no pointless unavoidable limitations on scale... and I want to be able to horizontally distribute searching and indexing with ease * I want every feature that Java Lucene's latest version has (and then some) * I want the index formats to be compatible with every other Lucene out there in whatever language and I want the query language to work identically across all of them.. That is to say given query Text X and index Y you will always get result set Z from every implementation of Lucene. Because when I have to get to my data via Python, Java, C++, Ruby or whatever, I want everything to just work. 
* I want to know which clauses in my query caused the result hit and to what degree, without having to incur a huge performance hit * I want real-time updates without having to do a little dance and wave my hands to get it to work * I want to get a new major version of the library roughly once or twice a year and I want to be very impressed by the features in the new version. I want bug fixes rolled out on a quarterly basis (at minimum) between those major versions. * I want to be able to trace or step-debug the execution of a search or indexing process and not think WTF constantly. Some of that code is extremely obtuse. * I want the query parser to be generated from a PEG grammar so that I can easily generate one in other languages ... and much much more. I didn't even get into things like being able to create custom indexes that use something other than strings for the inversion product, decorating my POCO's properties with attributes like [Field(Description)] and just saying Store, better query expansion, and blah blah blah. :) And I agree with Prescott on this one: I don't care *at all* about Java, other than porting code out of it so that it can run on .NET. I hate Java, but I love a lot of the libraries written in it. I feel that the JVM is an inferior runtime to the CLR and the Java language is like C#'s retarded cousin. I'll gladly write a new book on the new API and publish it for free online, so people don't have to read Lucene in Action to learn Lucene.Net. I'll gladly spend the time it takes to understand a changeset from the Java project and the mentally model what they were trying to accomplish by it and then re-engineer the change to apply to our library. Basically, I don't want to limit the project to a line-by-line port at all. I also don't want to piss people off and destroy the project in the process. Soo... I'm flexible as well. 
:) Thanks, Troy On Thu, Dec 29, 2011 at 6:18 PM, Prescott Nasser geobmx...@hotmail.com wrote: Someone has to take a stand and call out what they prefer - rather than shooting out all the alternatives, we need to start voicing our opinions of which direction we need to go. I'll get us started: I want to see something that is more .NET like, I want to see something that can run on the phone, xbox, pc, mono, etc. I want to use the latest and greatest .NET has to offer. I do care that we keep the index files 100% compatible. I also care that we try to keep up with Java in feature set and extras (contrib's). I couldn't care less about keeping the API in line with java. I don't really care about the line by line - but others in the past have said they did. My energy isn't really behind keeping that in line but I'll help maintain it if that is what the community really wants. But I agree with Troy - there are lots of options if you want the Java Lucene available in .Net. That's my feeling - but at the same time, I realize we are a small community, and if we don't really agree with what we want to do, then we are SOL - I'm FLEXIBLE if others really want
Re: [Lucene.Net] Lucene.Net 3 onwards and 2.9.4g
My vote goes to merging the two: Apply the same concepts from 2.9.4g to 3.X development, using generics where possible, Disposable vs Close, and exposing *additional* APIs for generics (but leaving the existing old ones) to enable the underlying performance improvements the generics offer. Also, expose IEnumerable<T> implementations vs Java style enumerables/iterators.

If we are only adding to the existing API and making relatively minor changes to enable generics, updating/maintenance should be relatively easy and it won't break anyone's code.

Thanks,
Troy

On Thu, Dec 29, 2011 at 2:08 PM, Prescott Nasser geobmx...@hotmail.com wrote:

I agree it's a matter of taste. I'd vote to continue with g and evolve it to where we want a .NET version to be. What do others think?

Sent from my Windows Phone

From: Digy
Sent: 12/29/2011 1:16 PM
To: lucene-net-...@lucene.apache.org
Subject: RE: [Lucene.Net] Lucene.Net 3 onwards and 2.9.4g

When I started that g branch, I had no intention of changing the API, but in the end it resulted in a few changes like StopAnalyzer(List<string> stopWords), Query.ExtractTerms(ICollection<string>) etc. But I think a drop-in replacement will work for most Lucene.Net users. (Of course, some contribs have also been modified accordingly.)

Replacing ArrayLists/collections with their generic counterparts, GetEnumerator's with foreach, AnonymousClass's with Func or Action's, and fixing LUCENENET-172 are things most people would not notice.

This g version also includes some other patches that were fixed for .GE. (>=) Lucene 3.1. (Which ones? I would have to go back through my commits.)

So, there isn't much change in the API; there are more changes for developers and more stable code. (At least I think so, since I have been using this g version in a production environment for months without any problem. In short, 2.9.4g is a superset of 2.9.4 at the bugfix level.)

As a result, creating a new branch for a .NET-friendly Lucene.Net or continuing on this branch is just a matter of taste.
DIGY

-----Original Message-----
From: Scott Lombard [mailto:lombardena...@gmail.com]
Sent: Thursday, December 29, 2011 5:05 PM
To: lucene-net-...@lucene.apache.org
Subject: RE: [Lucene.Net] Lucene.Net 3 onwards and 2.9.4g

I don't see the g branch differing all that much from the line-by-line port. All the g branch does is change some data types to generics, but line by line the code is the same once the generics are declared.

I don't see 2.9.4g being any closer to a .NET style version than 2.9.4. While it does use generics for list-style variable types, the underlying classes are still the same, and all of the problems with 2.9.4 not being .NET enough would be true in 2.9.4g.

I would have to defer to Digy on whether it changes how an end user interacts with Lucene.NET. If it does not affect how the end user interacts with Lucene.NET, then I think we should merge it into the trunk and go from there on 3.0.3.

Scott

-----Original Message-----
From: Prescott Nasser [mailto:geobmx...@hotmail.com]
Sent: Wednesday, December 28, 2011 8:28 PM
To: lucene-net-...@lucene.apache.org
Subject: RE: [Lucene.Net] Lucene.Net 3 onwards and 2.9.4g

Any reason we can't continue this g branch and make it more and more .NET like? I was thinking about what we've expressed as goals - we want a line by line port - it's easy to maintain parity with Java and easy to compare. We also want a more .NET version - the g branch gets this started - although it's not as .NET as people want (I think).

What if we used the g branch as our .NET version and continued to make it more .NET like, and kept the trunk as the line by line? The g branch seems like a good start to the more .NET version anyway - we might as well build off of that?
From: digyd...@gmail.com
To: lucene-net-...@lucene.apache.org
Date: Thu, 29 Dec 2011 02:45:23 +0200
Subject: RE: [Lucene.Net] Lucene.Net 3 onwards and 2.9.4g

but I guess the future of 2.9.4g depends on the extent that it is becoming more .NET like

My intention while I was creating that branch was just to make 2.9.4 a little bit more .NET like (+ maybe some performance). I used a lot of code from Java 3.0.3, so it is somewhere between 2.9.4 and 3.0.3. But I didn't think of it as a separate branch that would evolve on its own path. It is (or I think it is) the final version of 2.9.

DIGY

-----Original Message-----
From: Christopher Currens [mailto:currens.ch...@gmail.com]
Sent: Wednesday, December 28, 2011 9:20 PM
To: lucene-net-...@lucene.apache.org
Cc: lucene-net-u...@lucene.apache.org
Subject: Re: [Lucene.Net] Lucene.Net 3 onwards and 2.9.4g

One of the benefits of moving forward with the conversion of the Java Lucene is that they're using more recent versions of Java that support things like generics and enums, so the direct port is getting more and more like .NET, though not in all respects of course. I'm of the
Re: [Lucene.Net] Lucene.Net 3 onwards and 2.9.4g
Apologies upfront: another long email.

My most firm opinion on this topic is that, as a community, we spend too much time on this discussion. We should just simply commit to one or the other path, or both, or some middle ground, or just commit to not discussing it anymore and go with whatever code gets written and works, leaving it up to the discretion of the coder who is actually spending time improving the product. Obviously the last option is the worst of them.

My view of our current roadmap is/was:

1. We'd maintain basic line-by-line consistency through the 2.x releases. But 3.X and beyond were open to changing the API significantly. We are committed to changing the API and internal implementations in order to improve performance and developer experience on .NET, but haven't yet made a plan for that (e.g., no spec for a new API).

2. We'd try to automate the porting process so that it was repeatable and easy to keep up with (or at least easier) and maintain a line-by-line port in a branch. That means the .NET version would ultimately be a very different product than the line-by-line port, and we'd be creating two separate but related products that share code where possible.

Patching the line-by-line product from Java would be easier and faster than patching the .NET product, and so they may end up with different release schedules.

It seems that effort on improving automation of the port has tapered off. As anyone who has done any of the porting from commit patches from Java knows, a good portion of that work can be automated with find/replace, but substantial portions and certain scenarios in the current code definitely cannot be, and probably will never be able to be, fully automated.

While I have been advocating doing both and trying to find a strategy that makes sense for that, another option is to just officially drop any concern for line-by-line consistency with Java. A justification for that is simple: IKVM provides this already.
The licensing allows use in commercial apps and its performance is close to the same, so, AFAIK, it's a viable replacement for a line-by-line version of Lucene.Net in just about any context as long as no one is modifying IKVM itself. I don't think it's unreasonable to suggest to people who want a line-by-line version to use IKVM instead of Lucene.Net.

So, if we use that perspective and say that a .NET-usable line-by-line version of Lucene is already available via IKVM, why would we bother hand-coding another one? It makes more sense to focus our valuable hand-coding work on making something that *improves* upon the .NET development experience. It may cause us to be slow to release, but for good reason.

So it seems to me we have the following primary agenda items to deal with:

1. Make an official decision regarding line-by-line porting, publish it and document our reasoning, so that we can end the ambiguity and circular discussions
2. If line-by-line porting is still part of our plan after we accomplish Agenda Item #1, resume work on improving automation of porting, creating scripts/tools/etc and document the process
3. If having a different API for .NET is still part of our plan after we accomplish Agenda Item #1, spec those API changes and associated internal changes required and publish the spec

And to drive home the point I made in my first sentence: If we had already accomplished those three agenda items, the time I just spent typing this email could have been spent working on Lucene.Net. We need to get to that point if we want to maintain any kind of development velocity.

Thanks,
Troy

On Thu, Dec 29, 2011 at 2:38 PM, Prescott Nasser geobmx...@hotmail.com wrote:

I don't think at the end of the day we want to make just cosmetic changes. We also have the issue of same name, different casing, which needs to be fixed - but it's not clear how to manage that without some large adjustments to the API.
Sent from my Windows Phone

From: Troy Howard
Sent: 12/29/2011 2:19 PM
To: lucene-net-...@lucene.apache.org
Subject: Re: [Lucene.Net] Lucene.Net 3 onwards and 2.9.4g

My vote goes to merging the two: Apply the same concepts from 2.9.4g to 3.X development, using generics where possible, Disposable vs Close, and exposing *additional* APIs for generics (but leaving the existing old ones) to enable the underlying performance improvements the generics offer. Also, expose IEnumerable<T> implementations vs Java style enumerables/iterators.

If we are only adding to the existing API and making relatively minor changes to enable generics, updating/maintenance should be relatively easy and it won't break anyone's code.

Thanks,
Troy

On Thu, Dec 29, 2011 at 2:08 PM, Prescott Nasser geobmx...@hotmail.com wrote:

I agree it's a matter of taste. I'd vote to continue with g and evolve it to where we want a .NET version to be. What do others think?

Sent from my Windows Phone
Re: [Lucene.Net] Lucene.Net 3 onwards and 2.9.4g
Thinking about it, I should make myself more clear regarding why I brought up IKVM again, just so no one gets the wrong idea about my intentions there...

I only mentioned it as a justification for dropping line-by-line compatibility and as an alternative for people who really care about that. As we discussed previously, IKVMed Lucene is not Lucene.Net in a lot of important material ways. We are already deviating significantly from Java Lucene even with the mostly line-by-line approach. Compare Lucene.Net 2.9.4 and IKVMed Java Lucene 2.9.4. They are very different user experiences on a lot of levels (licensing, packaging, data types used, etc).

But it's a *reasonable alternative* when a high degree of consistency with Java Lucene is important to the end user, and by pointing to IKVM as our answer to those users, we are free to move forward without that concern.

That means, supposing we move away from Java significantly, a new end user looking to employ Lucene in their .NET product can choose between IKVM Lucene (identical API to Java, can use the latest Java build, performs well, may have some problems with licensing and packaging) and Lucene.Net (different API, but hopefully one that is more palatable to .NET users so it'd be easy to learn, performs better than IKVM, but has a dev cycle that lags behind Java, possibly by a lot).

Existing users who like Lucene.Net as it is now may feel alienated, because they would be forced to choose between learning the new API and dealing with a slow dev cycle, or adapting to IKVM, which could be very difficult or impossible for them. Either one would require a code change.

But of course, we run this risk with any change we make to what we are doing. I think a greater risk is that the project lacks direction.

Anyway, it's just one idea/talking point towards the end goal of getting the general topic off the table completely.
Thanks,
Troy

On Thu, Dec 29, 2011 at 3:32 PM, Troy Howard thowar...@gmail.com wrote:

Apologies upfront: another long email. My most firm opinion on this topic is that, as a community, we spend too much time on this discussion.
Re: [Lucene.Net] Lucene.Net 3 onwards and 2.9.4g
, community builders, startup entrepreneurs instead of developers for a second. You have limited resources: time, budget, personnel, etc.

What are our two biggest metrics of success for this product? My guess is adoption and customer involvement (contributing patches, tutorials, tweets, etc). Most likely both of those are going to be carried by .NET developers as your inside promoters of Lucene.NET.

So what is going to wow them? Bring them the most value? What can we provide that makes their job easier and more cost effective, and lets them get home faster to their lives or significant others? What is a breakout niche that Lucene.Net could have over Solr/Lucene?

What is going to make an average developer more willing to grow the community and contribute? What would encourage them to give up their free time to do so?

I would approach the answer from this angle rather than continue to talk about it from a developer/committer perspective, as we keep going in circles. You're not going to be able to please everyone, so let's figure out what is going to deliver the most value to .NET developers and go from there.

- michael

On Thu, Dec 29, 2011 at 8:13 PM, Rory Plaire codekai...@gmail.com wrote:

The other option for people not wanting a line-by-line port is to just stick with whichever version was the last to have a line-by-line transliteration done to it. This is done in a number of projects where new versions break compatibility. 2.9.4 is certainly a nice release...

-r

On Thu, Dec 29, 2011 at 4:32 PM, Troy Howard thowar...@gmail.com wrote:

Thinking about it, I should make myself more clear regarding why I brought up IKVM again, just so no one gets the wrong idea about my intentions there... I only mentioned it as a justification for dropping line-by-line compatibility and as an alternative for people who really care about that. As we discussed previously, IKVMed Lucene is not Lucene.Net in a lot of important material ways.
Re: [Lucene.Net] Lucene.net twitter account and chat room
Re: Twitter

Sadly, not a single tweet has been sent out on our twitter account. Really need to remedy that.

Re: IRC/realtime chat

There have been some good reasons expressed by various folks at Apache (and in our team) that realtime chat in channels which are not publicly logged should generally be discouraged. This is because it's all too easy to have a discussion in which only a few members of the community are present, and make decisions without any opportunity for the rest of the community to have input and without the ability to review the reasoning or discourse later. The same holds true for user support, as it's much better to have that public and logged in a mailing list message so that others might find it through searches and use it as a reference.

That said, people do use IRC/IM from time to time, but we prefer to keep most if not all of the communications public and on the Apache mailing lists.

So feel free to set up a chat room and chat with whoever wants to join about whatever topic, but for most things at Apache the philosophy is "mailing list, or it didn't happen". :)

Thanks,
Troy

On Fri, Dec 2, 2011 at 10:03 AM, Prescott Nasser geobmx...@hotmail.com wrote:

I just saw that there is a twitter account for Lucene.net, http://twitter.com/#!/LuceneDotNet - is anybody using it? Login information is in the private repo - does anyone know how I get to that? I can make an announcement
Re: [Lucene.Net] Roadmap
So, if we're getting back to the line by line port discussion... I think either side of this discussion is too extreme. For the case in point Chris just mentioned (which I'm not really sure what part was so difficult, as I ported that library in about 30 minutes from scratch)... anything is a pain if it sticks out in the middle of doing something completely different.

The only reason we are able to do this line by line is due to the general similarity between Java and C#'s language syntax. If we were porting Lucene to a completely different language, one that had a totally different syntax, the process would go like this:

- Look at the original code, understand its intent
- Create similar code in the new language that expresses the same intent

When applying changes:

- Look at the original code diffs, understanding the intent of the change
- Look at the ported code, and apply the changed logic's meaning in that language

So, it's just a different thought process. In my opinion, it's a better process because it forces the developer to actually think about the code instead of blindly converting syntax (possibly slightly incorrectly and introducing regressions).

While there is a large volume of unit tests in Lucene, they are unfortunately not really the right tests and make porting much more difficult, because you can't just rely on the unit tests to verify that your ported code behaves the same. Therefore, it's safer to follow a process that requires the developer to delve deeply into the meaning of the code. Following a line-by-line process is convenient, but doesn't focus on meaning, which I think is more important.

Thanks,
Troy

On Mon, Nov 21, 2011 at 2:23 PM, Christopher Currens currens.ch...@gmail.com wrote:

Digy,

No worries. I wasn't taking them personally. You've been doing this for a lot longer than I have, but I didn't understand your pain until I had to go through it personally. :P

Have you looked at Contrib in a while?
There are a lot of projects in Java's Contrib that are not in Lucene.Net. Is this because there are some that can't easily (if at all) be ported over to .NET, or just because they've been neglected? I'm trying to get a handle on what's important to port and what isn't. I figured someone with experience could help me find a starting point among everything that's missing.

Thanks,
Christopher

On Mon, Nov 21, 2011 at 2:13 PM, Digy digyd...@gmail.com wrote:

Chris,

Sorry if you took my comments about the pain of porting personally. That wasn't my intention.

+1 for all your changes/divergences. I made/could have made them too.

DIGY

-----Original Message-----
From: Christopher Currens [mailto:currens.ch...@gmail.com]
Sent: Monday, November 21, 2011 11:45 PM
To: lucene-net-...@lucene.apache.org
Subject: Re: [Lucene.Net] Roadmap

Digy,

I used 2.9.4 trunk as the base for the 3.0.3 branch, but I looked to the code in 2.9.4g as a reference for many things, particularly the Support classes. We hit many of the same issues, I'm sure. I moved some of the anonymous classes into a base class where you could inject functions, though not all could be replaced, nor did I replace all that could have been.

Some of our code is different. I went for the option of making WeakDictionary completely generic, as in wrapping a generic dictionary with WeakKey<T> instead of wrapping the already existing WeakHashTable in Support. In hindsight, it may have just been easier to convert the WeakHashTable to generic, but alas, I'm only realizing that now. There is a problem with my WeakDictionary, specifically the function that determines when to clean/compact the dictionary and remove the dead keys. I need a better heuristic for deciding when to run the clean. That's a performance issue though.

Regarding the pain of porting, I am a changed man. It's nice, in a sad way, to know that I'm not the only one who experienced those difficulties.
I used to be in the camp that porting code that differed from Java wouldn't be difficult at all. However, now I stand corrected! It threw me a curve-ball, for sure.

I DO think a line-by-line port can definitely include the things talked about below, i.e. the changes to Dispose and the changes to IEnumerable<T>. Those changes, I think, can be made without a heavy impact on the porting process.

There was one fairly large change I opted to use that differed quite a bit from Java, however, and that was the use of the TPL in ParallelMultiSearcher. It was far easier to port this way, and I don't think it affects the porting process too much. Java uses a helper class defined at the bottom of the source file that handles it; I'm simply using a built-in one instead. I just need to be careful about it, as it would be really easy to get carried away with it.

Thanks,
Christopher

On Mon, Nov 21, 2011 at 1:20 PM, Digy digyd...@gmail.com wrote:

Hi
Re: [Lucene.Net] Lucene.Net-2.9.4-incubator-RC2 Documentation
Well, the issue is that Apache's homegrown site CMS (which our site is based on) uses SVN to store/deploy the files for the site... So deployable artifacts for the site need to be in SVN, in a particular directory structure under the site directory, so that they can be picked up by the CMS.

Then there's the documentation we want to deploy with the DLL, including XML docs and CHM, which is stored near the codebase.

I believe, but I could be wrong, that it's possible to update your site deployment script and do something like... store the website docs in a zip file in SVN, then unpack upon deployment, thus reducing the clutter in SVN. Not sure how to go about that off the top of my head though.

-T

On Tue, Nov 8, 2011 at 10:55 AM, Christopher Currens currens.ch...@gmail.com wrote:

I noticed that we have a bit of documentation in both the 'site' directory and the trunk directory. Which one are we using to link to the website, the one in trunk or the site folder? I feel that maybe we could still keep the docs in trunk zipped for each release and tag, then extract that to the site directory when we want to release documentation. At least then it would keep our SVN uncluttered. To be honest, though, I don't really know how documentation is stored in other projects, so I'll admit I'm not even 100% sure if that's a plausible solution.

On Mon, Nov 7, 2011 at 11:55 PM, Prescott Nasser geobmx...@hotmail.com wrote:

My intention is to link it to the website, so we have browsable documentation

Date: Mon, 7 Nov 2011 19:26:23 -0800
From: currens.ch...@gmail.com
To: lucene-net-...@lucene.apache.org
Subject: Re: [Lucene.Net] Lucene.Net-2.9.4-incubator-RC2 Documentation

I don't understand why we have the rendered HTML in the docs. I don't mind having the .chm rendered and put in the repo, but the entire HTML documentation spans 8,000 files and over 100 MB. The CHM comes in at around 15 MB.
I don't think it's necessary to have both in the repo, but if the consensus is to keep them both, I think we should bundle the HTML docs in a zip, instead of adding them as loose files, at least in trunk. I think it's kinda silly the way it is now, and SVN does better at handling 1 large file versus 8,000 smaller ones.

On Mon, Nov 7, 2011 at 6:53 AM, Michael Herndon mhern...@wickedsoftware.net wrote:

@Stefan. I wouldn't worry about taking the blame, you've done plenty to help out and there is no way to catch everything. We'll learn as we go.

As svnpub is the only option, and since we can't run the binary version that uses ASP.NET, we'll probably need to take your suggestion and commit the HTML in smaller chunks. I'll do it manually this time and see if I can write a script that automates it in the future.

@Chris, thanks for the fixes to the build scripts this weekend.

- Michael

On Mon, Nov 7, 2011 at 9:20 AM, Stefan Bodewig bode...@apache.org wrote:

On 2011-11-07, Michael Herndon wrote:

I can rebuild it, but the trick is replacing the version of it in svn so that it does not cause svnsync and cms to choke. Last time I just pushed it into branch/site/docs. However, that is not publicly visible for the incubation website, so Prescott had to do an svn move.

When I recommended doing the svn move I didn't realize we were talking about that many files. I simply didn't check, sorry.

I'm not quite sure how to go about it this time around. I would push it to jira, but it caps uploads at 10 MB.

Then it would still have to go to svn in some way. Personally I'm not a friend of generated documents in svn, but I'm in a minority here.

With svnpubsub being your only option, I think the only thing you can do is split the commit into smaller chunks, committing 100 or 200 files at a time. Maybe infra has better ideas than that, I don't.

Stefan
Re: [Lucene.Net] [VOTE] Apache Lucene.Net-2.9.4-incubating
Prescott,

Good job figuring out the signing and creating the release packages! It's non-trivial to figure out the first time from the docs, for sure. Sorry, I have been so busy this month that I wasn't able to provide help before you figured it out on your own. :)

Some nitpicky details about the release packages:

- Superfluous subfolders: There's an extra layer of subfolders named the same as the zip file, which is unneeded
- Root should be trunk all the time, even in the release packages, to keep relative pathing consistently rooted. So the binary release should have a bin subfolder at its root to match the repo layout
- XML doc files should be included in binary release. We have had users state a desire to have them for VS Intellisense integration. This was an issue that came up during the last release package build cycle
- Various notice files should be included in binary release as well
- I don't know about the .SNK file from lib, maybe that should be in the source package, maybe not. I notice it's also in the core project folder. Which one does the project file reference?
- .svn folder/files should be removed from source package
- Empty subfolders should be left out (\build\vs2008 and \test\demo are the ones I noticed)
- \docs are missing from source package as well

Regarding docs, generally speaking, I think we should make a decision about what we want to provide and set a standard. Some considerations are:

- XML doc files in binary package: Integrate with Visual Studio, providing a rich Intellisense experience. Generated at build time from source code. Where should they go in the folder structure to make it just work with VS from the binary package?
- Hosted HTML "Single Version of the Truth" vs HTML/CHM doc files in binary/source package: On one hand, we could only host docs on the website vs distributing them. We can update as needed, and they are the only reference. Can host docs for multiple versions, etc.
HTML/CHM in packages are good for offline use, but can't be updated. Both can be generated from XML files using Sandcastle.

We could do either one, or both of those. Using Sandcastle, we can include the Apache license in the headers of all generated files, solving a lot of the RAT complaints.

Also, there's a lot of new material in the repo for CI related things... Do we want to include any of them in the source package, to assist our users with setting up their own CI servers? How simple would it be to modify those files to work in a different environment (assuming they are also using Hudkins)?

So all that said, I think there's more work to be done and I'm -1 for these artifacts.

Thanks,
Troy

On Sun, Oct 30, 2011 at 10:08 PM, Prescott Nasser geobmx...@hotmail.com wrote:

Artifacts are located here: http://people.apache.org/~pnasser/Lucene.Net/2.9.4-incubating-RC1/

If the vote passes here, I will move the artifacts to staging and call a vote on the general incubator mailing list.

Please verify the release and cast your vote. The vote will be open for 72 hours.

[ ] +1
[ ] 0
[ ] -1

~Prescott
Re: [Lucene.Net] 2.9.4
I thought it was: 2.9.2 and before are 2.0 compatible; 2.9.4 and before are 3.5 compatible; after 2.9.4 are 4.0 compatible. Thanks, Troy On Wed, Sep 21, 2011 at 10:15 AM, Michael Herndon mhern...@wickedsoftware.net wrote: If that's the case, then we'll need conditional statements for including ThreadLocal<T>. On Wed, Sep 21, 2011 at 12:47 PM, Prescott Nasser geobmx...@hotmail.com wrote: I thought this was after 2.9.4. Sent from my Windows Phone -Original Message- From: Michael Herndon Sent: Wednesday, September 21, 2011 8:30 AM To: lucene-net-dev@lucene.apache.org Cc: lucene-net-...@incubator.apache.org Subject: Re: [Lucene.Net] 2.9.4 @Robert, I believe the overwhelming consensus on the mailing list vote was to move to .NET 4.0 and drop support for previous versions. I'll take care of the build scripts issue while they are being refactored into smaller chunks this week. @Troy, Agreed. On Wed, Sep 21, 2011 at 8:08 AM, Robert Jordan robe...@gmx.net wrote: On 20.09.2011 23:48, Prescott Nasser wrote: Hey all, seems like we are set with 2.9.4? Feedback has been positive and it's been quiet. Do we feel ready to vote for a new release? I don't know if the build infrastructure is part of the release. If yes, then there is an open issue: Contrib doesn't build right now because there are some assembly name mismatches between certain *.csproj files and build/scripts/contrib.targets. 
The following patches should fix the issue: https://github.com/robert-j/lucene.net/commit/c5218bca56c19b3407648224781eec7316994a39 https://github.com/robert-j/lucene.net/commit/50bad187655d59968d51d472b57c2a40e201d663 Also, the fix for [LUCENENET-358] is basically making Lucene.Net.dll a .NET 4.0-only assembly: https://github.com/apache/lucene.net/commit/23ea6f52362fc7dbce48fd012cea129a7350c73c Did we agree about abandoning .NET <= 3.5? Robert
Re: [Lucene.Net] Nuget, Lucene.Net, and Your Thoughts
Michael - Could be wrong, but I think Nick might have gotten you confused with Neal. Regardless, I completely agree with everything you just said. And, Yay for NuGet! Package management is the bomb. -T On Wed, Sep 21, 2011 at 7:43 PM, Michael Herndon mhern...@wickedsoftware.net wrote: Nick, The last e-mail was out of line and out of context. If anything, emails like that can push people into emotional or motivational apathy towards working on a project. 1) Lucene.Net will be getting nuget packages. People can hate on it, grumble, or not use it, but it's a viable distribution vehicle. It's going in. This thread was to gather feedback on how people who would use it see themselves using it. 2) Others might want alternatives to nuget that have not been provided yet. We should be open to providing distribution alternatives if enough people warrant it. It's not apathetic or impassive to think that there might be more than one way to distribute releases. 3) Attack problems. Not people. If you believe a person is the problem, take the issue up with them offline. Those kinds of things are better face to face or through a phone call, or an exceptionally clear e-mail. It's way too easy for people to read into things too much or take things out of context in an e-mail. Attacking people also distracts people from focusing on the actual issue and prevents any actual logic or reason or sound argument from being heard. It's a good way to alienate people that you should actually be trying to persuade. 4) If I was actually apathetic and severely short sighted, I would not be spending my own vacation time this weekend automating nuget packages with the build scripts for Lucene.Net or experimenting with Portable Library Tools for Lucene.Net 4.x to see if we can get it working on mobile. Nor would I have spent my last 4 day weekend setting up jenkins and local builds of Lucene.Net. 
Or put in the hours today to make sure the build scripts are granular enough to implement the smaller packages. 5) If you feel so passionately about all this, why not work towards being a contributor or committer and lead by example ? - Michael Since I'm the one implementing Nuget into the build process and I have not played with the nuget server or creating a package, it just seem wise to gather feedback on how people saw themselves using the contrib packages. On Wed, Sep 21, 2011 at 9:00 PM, Nicholas Paldino [.NET/C# MVP] casper...@caspershouse.com wrote: With all due respect, it's myopic opinions like yours and Michael's (his leans more towards apathy) which will harm the ability to get the project into the hands of people. I think (hope?) it can be agreed upon that the more that people are aware of Lucene.NET, the better it is for the project in general, and most importantly, the more potential that you have that someone will *contribute back* to it (and given what Lucene.NET has gone through in the past year, it desperately needs that participation). The fact of the matter is that Nuget puts packages in the hands of .NET developers, that leads to exposure and regardless of personal opinions on whether or not they *like* Nuget, it can't be denied that it's an *extremely* popular way to get libraries into people's projects. If you want to quibble over the actual numbers (and the definition of extremely popular) then that's fine, but here are the numbers you want: http://stats.nuget.org/ If you want to just tell that audience to take a leap, that's fine, but I think it would be foolish to do so otherwise. Additionally, given that Lucene.NET is already on Nuget, isn't there *any* concern that there isn't an official distro? Aren't you concerned about the integrity of the brand that so many of you fought to keep alive over the past year? 
There's no guarantee that what's on Nuget will be the official releases/builds that come out of this project, and I'm a little surprised there isn't more concern over that aspect either. Just my $0.02 - Nick -Original Message- From: Digy [mailto:digyd...@gmail.com] Sent: Wednesday, September 21, 2011 7:06 PM To: lucene-net-dev@lucene.apache.org Subject: RE: [Lucene.Net] Nuget, Lucene.Net, and Your Thoughts I am not against it, but personally think it as a toy. I am from the generation where people used vi to write codes. DIGY -Original Message- From: Aaron Powell [mailto:m...@aaron-powell.com] Sent: Thursday, September 22, 2011 1:56 AM To: lucene-net-dev@lucene.apache.org Subject: RE: [Lucene.Net] Nuget, Lucene.Net, and Your Thoughts Any particular reason you guys are not interested in NuGet? Aaron Powell MVP - Internet Explorer (Development) | FunnelWeb Team Member http://apowell.me | http://twitter.com/slace | Skype: aaron.l.powell | Github | BitBucket -Original Message- From: Digy [mailto:digyd...@gmail.com] Sent: Thursday, 22 September 2011 7:42 AM To:
Re: [Lucene.Net] Nuget, Lucene.Net, and Your Thoughts
While it may be a bit redundant, why couldn't there be an individual package for each piece of contrib and a Lucene.Net Contrib (All) package that drags them all down? That way users can grab just the bit they need, or if they just want to get the whole thing, grab the All package. Thanks, Troy On Tue, Sep 20, 2011 at 9:11 PM, Aaron Powell m...@aaron-powell.com wrote: I'm going to vote +1 for granular. With the RC you could look at myget and have a Lucene.Net repository on there so people can go for unstable on myget, stables on nuget. Also, I came across this article which explains how to set up a build server to automatically push to nuget/myget, which could be useful to the maintainers: http://brendanforster.com/doing-the-build-server-dance-with-nuget.html Aaron Powell MVP - Internet Explorer (Development) | FunnelWeb Team Member http://apowell.me | http://twitter.com/slace | Skype: aaron.l.powell | Github | BitBucket -Original Message- From: Prescott Nasser [mailto:geobmx...@hotmail.com] Sent: Wednesday, 21 September 2011 2:05 PM To: lucene-net-dev@lucene.apache.org Subject: RE: [Lucene.Net] Nuget, Lucene.Net, and Your Thoughts Right now there are two packages: Lucene and Lucene.Contrib. My question to the community is do you wish to have finer-grained packages, i.e. a package for each contrib project, or continue to keep it simple. +1 Granular, we just need to be good about descriptions. Another topic to converse about is would you like to see an out-of-band project nuget feed for nightly builds, branches with new or experimental features, or stable code snapshots for a projected release? Having a package for the latest RC would probably be a good idea
Re: [Lucene.Net] 2.9.4
Why would we want to do that? Under the /site/docs directory, they need to be served up as loose HTML... IMO the XML files shouldn't be checked into SVN because they are auto-generated. The same goes for Sandcastle files.. However, in the release packages, I think we should include the XML files as well as a fully functional version of the Sandcastle docs that can be viewed locally. I personally thing CHMs are terrible user experience, and I'd much rather have a static HTML site I can browse locally, if we're going to bother including a copy of the documentation in the package, vs just hosting it online and calling that good (this is what most projects these days do). Good thing about hosting online -- searchable via google. Thanks, Troy On Tue, Sep 20, 2011 at 9:48 PM, Michael Herndon mhern...@wickedsoftware.net wrote: Could we store sandcastle docs as a single zip/chm? On Wed, Sep 21, 2011 at 12:37 AM, Troy Howard thowar...@gmail.com wrote: At one time I had a SVN server set up at work that had a post-commit hook set up that would generate a static HTML site from the XML doc files using Sandcastle. So.. It can be done. That was about 3-4 years ago and I don't work at that company any longer, so I don't have access to the details of how that was done. Regarding SVN locations... Autogenerated XML doc files should go in the ~/trunk/doc/component folders. The Sandcastle generated static HTML should go under ~/site/docs/version folders. I'll see if I can't dig up some info on how to generate static HTML with Sandcastle. Thanks, Troy On Tue, Sep 20, 2011 at 9:15 PM, Michael Herndon mhern...@wickedsoftware.net wrote: We have a folder /trunk/docs, shouldn't this be the place for that? We should have a live site for the documentation that people can browse, similar to the parent project's site. http://lucene.apache.org/java/3_4_0/api/all/index.html. It makes it the documentation more accessible. 
The rub is that Sandcastle SHFB generates html files with guid names, xml bin files that map to the html files, and aspx pages which acts as the glue. The aspx pages parses the xml/bin files which creates the document site. Thus we're now required to use a server that knows how to serve up aspx pages. If any one knows a way to generate just vanilla html using Sandcastle SHFB, I could just create a folder per version and push the docs to incubator site. But so far I haven't found an option for this. Keeping the generated help docs inside of source would still require for users to setup a local website just to view the documentation and it adds extra noise in the project. In the release we can provide a zipped file of the site and a generated .chm or even help2/3 files. On Tue, Sep 20, 2011 at 11:38 PM, Prescott Nasser geobmx...@hotmail.com wrote: We should probably fix the ClsCompliance warnings if they have not already been fixed We will have some issues with this - some are marked volatile - which basically have to be a non-CLS compliant type (as far as my research is finding) Anyone have thoughts? I went through and replaced sbyte - Int16, and uint - Int64, but I'm having an issue with this, and I don't think removing the volatile keyword is the right solution. find a place to put the generated documentation. We have a folder /trunk/docs, shouldn't this be the place for that? I remember someone mentioning he/she was unable to access a class from VB.NET. The class had public fields properties with the same names but different casing. The fields should be private. The link in the readme is a dead link: http://lucene.apache.org/lucene.net/docs/2.4.0/ The docs generated by sandcastle SHFB require a server that allows aspx files to be executed. We should either remove the link from the readme or find the docs a new home and update the link. 
We should generate new documentation and update the link. I'll see if I can set up automated Lucene.Net nuget package creation for trunk in the next day or so. - Michael On Tue, Sep 20, 2011 at 5:48 PM, Prescott Nasser geobmx...@hotmail.com wrote: Hey all, seems like we are set with 2.9.4? Feedback has been positive and it's been quiet. Do we feel ready to vote for a new release? -Prescott Sent from my Windows Phone
[Lucene.Net] [jira] [Resolved] (LUCENENET-404) Improve brand logo design
[ https://issues.apache.org/jira/browse/LUCENENET-404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Troy Howard resolved LUCENENET-404. --- Resolution: Fixed Uploaded the artifacts in r1153264 Improve brand logo design - Key: LUCENENET-404 URL: https://issues.apache.org/jira/browse/LUCENENET-404 Project: Lucene.Net Issue Type: Sub-task Components: Project Infrastructure Reporter: Troy Howard Assignee: Troy Howard Priority: Minor Labels: branding, logo Attachments: lucene-alternates.jpg, lucene-medium.png, lucene-net-logo-display.jpg The existing Lucene.Net logo leaves a lot to be desired. We'd like a new logo that is modern and well designed. To implement this, Troy is coordinating with StackOverflow/StackExchange to manage a logo design contest, the results of which will be our new logo design. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: [Lucene.Net] svn commit: r1150205 - /incubator/lucene.net/site/trunk/content/lucene.net/index.mdtext
Looking good Prescott! Love the layout with the new logo. On Sat, Jul 23, 2011 at 1:37 PM, Prescott Nasser geobmx...@hotmail.com wrote: I know :( ...I'm going to be updating the site to something different later today - in progress here http://lucene.net.staging.apache.org/lucene.net/index.html From: digyd...@gmail.com To: lucene-net-dev@lucene.apache.org Date: Sat, 23 Jul 2011 23:31:46 +0300 Subject: RE: [Lucene.Net] svn commit: r1150205 - /incubator/lucene.net/site/trunk/content/lucene.net/index.mdtext Get *invovled* to help shape our future -Original Message- From: pnas...@apache.org [mailto:pnas...@apache.org] Sent: Saturday, July 23, 2011 10:50 PM To: lucene-net-comm...@lucene.apache.org Subject: [Lucene.Net] svn commit: r1150205 - /incubator/lucene.net/site/trunk/content/lucene.net/index.mdtext Author: pnasser Date: Sat Jul 23 19:50:05 2011 New Revision: 1150205 URL: http://svn.apache.org/viewvc?rev=1150205view=rev Log: syntax updates Modified: incubator/lucene.net/site/trunk/content/lucene.net/index.mdtext Modified: incubator/lucene.net/site/trunk/content/lucene.net/index.mdtext URL: http://svn.apache.org/viewvc/incubator/lucene.net/site/trunk/content/lucene.net/index.mdtext?rev=1150205r1=1150204r2=1150205view=diff == --- incubator/lucene.net/site/trunk/content/lucene.net/index.mdtext (original) +++ incubator/lucene.net/site/trunk/content/lucene.net/index.mdtext Sat Jul 23 19:50:05 2011 @@ -5,9 +5,10 @@ Lucene.Net is a port of the Lucene search engine library, written in C# and targeted at .NET runtime users. Lucene.Net has three primary goals: -1. Maintain the existing line-by-line port from Java to C#, fully automating and commoditizing the process such that the project can easily synchronize with the Java Lucene release schedule; -2. Maintaining the high-performance requirements excepted of a first class C# search engine library; -3. Maximize usability and power when used within the .NET runtime. 
To that end, it will present a highly idiomatic, carefully tailored API that takes advantage of many of the special features of the .NET runtime. + +1. Maintain the existing line-by-line port from Java to C#, fully automating and commoditizing the process such that the project can easily synchronize with the Java Lucene release schedule; +2. Maintaining the high-performance requirements excepted of a first class C# search engine library; +3. Maximize usability and power when used within the .NET runtime. To that end, it will present a highly idiomatic, carefully tailored API that takes advantage of many of the special features of the .NET runtime. /h3
Re: [Lucene.Net] Re: Help spread the good news
Thanks so much for spreading the word! You're awesome! On Mon, Jul 18, 2011 at 2:25 PM, Simone Chiaretta simone.chiare...@gmail.com wrote: Given the response and the comments on twitter, it seems like not many people were aware of the new life of Lucene.net. Simone --- Simone Chiaretta @simonech Sent from a tablet On 18 Jul 2011, at 18:58, Simone Chiaretta simone.chiare...@gmail.com wrote: As part of my commitment to the project, here is the long overdue blog post about the new life of Lucene.net http://codeclimber.net.nz/archive/2011/07/18/Lucene-net-is-back-on-track.aspx Please have a look at it, correct me if I wrote something wrong or if you think I missed something important, and help spread the word. Here is the tweet that announces the post in case you want to re-tweet it: https://twitter.com/#!/simonech/status/93000671028199424 Thank you Simone -- Simone Chiaretta Microsoft MVP ASP.NET - ASPInsider Blog: http://codeclimber.net.nz RSS: http://feeds2.feedburner.com/codeclimber twitter: @simonech Any sufficiently advanced technology is indistinguishable from magic Life is short, play hard
[Lucene.Net] [jira] [Resolved] (LUCENENET-437) Port Contrib.Shingle from Java
[ https://issues.apache.org/jira/browse/LUCENENET-437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Troy Howard resolved LUCENENET-437. --- Resolution: Fixed Fix Version/s: (was: Lucene.Net 2.9.4g) Port Contrib.Shingle from Java -- Key: LUCENENET-437 URL: https://issues.apache.org/jira/browse/LUCENENET-437 Project: Lucene.Net Issue Type: Task Components: Lucene.Net Contrib, Lucene.Net Test Affects Versions: Lucene.Net 2.9.4, Lucene.Net 2.9.4g Reporter: Troy Howard Assignee: Troy Howard Priority: Minor Fix For: Lucene.Net 2.9.4 Port Contrib.Shingle from Java -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[Lucene.Net] [jira] [Commented] (LUCENENET-437) Port Contrib.Shingle from Java
[ https://issues.apache.org/jira/browse/LUCENENET-437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13067328#comment-13067328 ] Troy Howard commented on LUCENENET-437: --- I assume you're referring to the issue found here: http://blogs.msdn.com/b/ericlippert/archive/2011/07/12/what-curious-property-does-this-string-have.aspx Port Contrib.Shingle from Java -- Key: LUCENENET-437 URL: https://issues.apache.org/jira/browse/LUCENENET-437 Project: Lucene.Net Issue Type: Task Components: Lucene.Net Contrib, Lucene.Net Test Affects Versions: Lucene.Net 2.9.4, Lucene.Net 2.9.4g Reporter: Troy Howard Assignee: Troy Howard Priority: Minor Fix For: Lucene.Net 2.9.4, Lucene.Net 2.9.4g Port Contrib.Shingle from Java -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[Lucene.Net] [jira] [Issue Comment Edited] (LUCENENET-437) Port Contrib.Shingle from Java
[ https://issues.apache.org/jira/browse/LUCENENET-437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13067328#comment-13067328 ] Troy Howard edited comment on LUCENENET-437 at 7/18/11 9:56 PM: I assume you're referring to the issue found here: http://blogs.msdn.com/b/ericlippert/archive/2011/07/12/what-curious-property-does-this-string-have.aspx I am aware of that issue. However, the implementation in ListComparer is exactly the behaviour of Java. I copied the implementation straight from the Java docs. http://download.oracle.com/javase/1.4.2/docs/api/java/util/List.html#hashCode() The issue you're describing is more of a problem with the .NET implementation of GetHashCode() rather than the correctness of using hashcode for comparison. See (for implementation): http://www.dotnetperls.com/gethashcode Anyhow, I'll update this to use SupportClass.EquatableList since it allows the code to pass unit tests, and also not have that particular problem. was (Author: thoward37): I assume you're referring to the issue found here: http://blogs.msdn.com/b/ericlippert/archive/2011/07/12/what-curious-property-does-this-string-have.aspx Port Contrib.Shingle from Java -- Key: LUCENENET-437 URL: https://issues.apache.org/jira/browse/LUCENENET-437 Project: Lucene.Net Issue Type: Task Components: Lucene.Net Contrib, Lucene.Net Test Affects Versions: Lucene.Net 2.9.4, Lucene.Net 2.9.4g Reporter: Troy Howard Assignee: Troy Howard Priority: Minor Fix For: Lucene.Net 2.9.4, Lucene.Net 2.9.4g Port Contrib.Shingle from Java -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[Lucene.Net] [jira] [Commented] (LUCENENET-437) Port Contrib.Shingle from Java
[ https://issues.apache.org/jira/browse/LUCENENET-437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13067370#comment-13067370 ] Troy Howard commented on LUCENENET-437: ---
From the docs for List.hashCode():
Returns the hash code value for this list. The hash code of a list is defined to be the result of the following calculation:
    hashCode = 1;
    Iterator i = list.iterator();
    while (i.hasNext()) {
        Object obj = i.next();
        hashCode = 31*hashCode + (obj==null ? 0 : obj.hashCode());
    }
This ensures that list1.equals(list2) implies that list1.hashCode()==list2.hashCode() for any two lists, list1 and list2, as required by the general contract of Object.hashCode.
The contract intended for Object.hashCode() specifically includes that object1.equals(object2) == object1.hashCode().equals(object2.hashCode()). This is stated here: http://download.oracle.com/javase/1.4.2/docs/api/java/lang/Object.html#hashCode()
Quoted: public int hashCode() Returns a hash code value for the object. This method is supported for the benefit of hashtables such as those provided by java.util.Hashtable. The general contract of hashCode is:
- Whenever it is invoked on the same object more than once during an execution of a Java application, the hashCode method must consistently return the same integer, provided no information used in equals comparisons on the object is modified. This integer need not remain consistent from one execution of an application to another execution of the same application.
- If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
- It is not required that if two objects are unequal according to the equals(java.lang.Object) method, then calling the hashCode method on each of the two objects must produce distinct integer results. However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hashtables.
As much as is reasonably practical, the hashCode method defined by class Object does return distinct integers for distinct objects. (This is typically implemented by converting the internal address of the object into an integer, but this implementation technique is not required by the Java(TM) programming language.)
Again, this is Java's code contract, which was, in some cases, improperly implemented in the .NET CLR.
Port Contrib.Shingle from Java -- Key: LUCENENET-437 URL: https://issues.apache.org/jira/browse/LUCENENET-437 Project: Lucene.Net Issue Type: Task Components: Lucene.Net Contrib, Lucene.Net Test Affects Versions: Lucene.Net 2.9.4, Lucene.Net 2.9.4g Reporter: Troy Howard Assignee: Troy Howard Priority: Minor Fix For: Lucene.Net 2.9.4, Lucene.Net 2.9.4g Port Contrib.Shingle from Java -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
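For readers following the thread, the quoted List.hashCode() calculation can be checked with a small sketch. This is only an illustration, not code from the Lucene.Net port: the class name ListHashDemo and the helper listHash are made up for this example, and the sample strings are arbitrary.

```java
import java.util.Arrays;
import java.util.List;

public class ListHashDemo {
    // Hypothetical helper: re-implements the calculation quoted from the
    // List.hashCode() documentation, element by element.
    static int listHash(List<?> list) {
        int hashCode = 1;
        for (Object obj : list) {
            hashCode = 31 * hashCode + (obj == null ? 0 : obj.hashCode());
        }
        return hashCode;
    }

    public static void main(String[] args) {
        // Two distinct list instances with the same contents (including a null).
        List<String> a = Arrays.asList("please", "divide", null);
        List<String> b = Arrays.asList("please", "divide", null);

        // Equal lists must report equal hash codes under the contract.
        System.out.println(a.equals(b));
        System.out.println(a.hashCode() == b.hashCode());
        // The hand-written formula should agree with the library's result.
        System.out.println(listHash(a) == a.hashCode());
    }
}
```

The point of the sketch is the direction of the guarantee: equal lists give equal hash codes. It says nothing about unequal lists, which is what the rest of the discussion turns on.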
[Lucene.Net] [jira] [Commented] (LUCENENET-437) Port Contrib.Shingle from Java
[ https://issues.apache.org/jira/browse/LUCENENET-437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13067373#comment-13067373 ] Troy Howard commented on LUCENENET-437: --- Although, according to that definition, .NET's implementation is allowed. It's just not a good idea for exactly the reasons we've been discussing. Port Contrib.Shingle from Java -- Key: LUCENENET-437 URL: https://issues.apache.org/jira/browse/LUCENENET-437 Project: Lucene.Net Issue Type: Task Components: Lucene.Net Contrib, Lucene.Net Test Affects Versions: Lucene.Net 2.9.4, Lucene.Net 2.9.4g Reporter: Troy Howard Assignee: Troy Howard Priority: Minor Fix For: Lucene.Net 2.9.4, Lucene.Net 2.9.4g Port Contrib.Shingle from Java -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[Lucene.Net] [jira] [Commented] (LUCENENET-437) Port Contrib.Shingle from Java
[ https://issues.apache.org/jira/browse/LUCENENET-437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13067384#comment-13067384 ] Troy Howard commented on LUCENENET-437: ---
Well, it does, because of Object.hashCode()'s definition. It ensures equality, but does not ensure inequality. In other words, the contract ensures that for all types, this passes:
    Assert.IsTrue(object1.equals(object2))
    Assert.IsTrue(object1.hashCode().equals(object2.hashCode()))
but does not ensure that this passes:
    Assert.IsFalse(object1.equals(object2))
    Assert.IsFalse(object1.hashCode().equals(object2.hashCode()))
However, the docs state that while unequal objects are not required to produce unequal hash codes, all efforts should be made to ensure both.
The problem we face is that with Hashtable/HashSet, we generally care more about inequality (e.g. Uniqueness) than we do about equality (e.g. Sameness). What we're trying to determine, using hashCode/equals, is the uniqueness of an item in the set, which is a different kind of logic (though clearly related). So, using .hashCode() alone to determine uniqueness of an item in a set is incorrect, although equal items will always produce equal hash codes.
Port Contrib.Shingle from Java -- Key: LUCENENET-437 URL: https://issues.apache.org/jira/browse/LUCENENET-437 Project: Lucene.Net Issue Type: Task Components: Lucene.Net Contrib, Lucene.Net Test Affects Versions: Lucene.Net 2.9.4, Lucene.Net 2.9.4g Reporter: Troy Howard Assignee: Troy Howard Priority: Minor Fix For: Lucene.Net 2.9.4, Lucene.Net 2.9.4g Port Contrib.Shingle from Java -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
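The asymmetry described in the comment above (equal objects share a hash code, but a shared hash code does not prove equality) can be seen in a minimal Java sketch. The class and method names here are hypothetical and chosen only for illustration; the strings "Aa" and "BB" are a well-known pair that collide under String.hashCode().

```java
import java.util.HashSet;
import java.util.Set;

public class HashCollisionDemo {
    // Hypothetical helper: true when two objects share a hash code
    // even though equals() says they are different.
    static boolean sameHashButUnequal(Object a, Object b) {
        return a.hashCode() == b.hashCode() && !a.equals(b);
    }

    public static void main(String[] args) {
        String x = "Aa";
        String y = "BB";

        // Both strings hash to 2112 (65*31 + 97 and 66*31 + 66), yet differ.
        System.out.println(x.hashCode() + " " + y.hashCode());
        System.out.println(sameHashButUnequal(x, y));

        // A HashSet still keeps both, because it falls back to equals()
        // after the hash codes match.
        Set<String> set = new HashSet<>();
        set.add(x);
        set.add(y);
        System.out.println(set.size());
    }
}
```

This is why a comparer that decides uniqueness from the hash code alone can wrongly merge two distinct items, whereas one that also consults equals() does not.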
[Lucene.Net] [jira] [Commented] (LUCENENET-437) Port Contrib.Shingle from Java
[ https://issues.apache.org/jira/browse/LUCENENET-437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13067388#comment-13067388 ] Troy Howard commented on LUCENENET-437: --- Sorry, read too fast. Your last comment is true. Essentially, we're saying the same thing. Port Contrib.Shingle from Java -- Key: LUCENENET-437 URL: https://issues.apache.org/jira/browse/LUCENENET-437 Project: Lucene.Net Issue Type: Task Components: Lucene.Net Contrib, Lucene.Net Test Affects Versions: Lucene.Net 2.9.4, Lucene.Net 2.9.4g Reporter: Troy Howard Assignee: Troy Howard Priority: Minor Fix For: Lucene.Net 2.9.4, Lucene.Net 2.9.4g Port Contrib.Shingle from Java -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[Lucene.Net] [jira] [Created] (LUCENENET-437) Port Contrib.Shingle from Java
Port Contrib.Shingle from Java -- Key: LUCENENET-437 URL: https://issues.apache.org/jira/browse/LUCENENET-437 Project: Lucene.Net Issue Type: Task Components: Lucene.Net Contrib, Lucene.Net Test Affects Versions: Lucene.Net 2.9.4 Reporter: Troy Howard Assignee: Troy Howard Priority: Minor Fix For: Lucene.Net 2.9.4 Port Contrib.Shingle from Java -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[Lucene.Net] [jira] [Work started] (LUCENENET-437) Port Contrib.Shingle from Java
[ https://issues.apache.org/jira/browse/LUCENENET-437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on LUCENENET-437 started by Troy Howard. Port Contrib.Shingle from Java -- Key: LUCENENET-437 URL: https://issues.apache.org/jira/browse/LUCENENET-437 Project: Lucene.Net Issue Type: Task Components: Lucene.Net Contrib, Lucene.Net Test Affects Versions: Lucene.Net 2.9.4 Reporter: Troy Howard Assignee: Troy Howard Priority: Minor Fix For: Lucene.Net 2.9.4 Port Contrib.Shingle from Java -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
[Lucene.Net] [jira] [Resolved] (LUCENENET-437) Port Contrib.Shingle from Java
[ https://issues.apache.org/jira/browse/LUCENENET-437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Troy Howard resolved LUCENENET-437. --- Resolution: Fixed See r1147514 Port Contrib.Shingle from Java -- Key: LUCENENET-437 URL: https://issues.apache.org/jira/browse/LUCENENET-437 Project: Lucene.Net Issue Type: Task Components: Lucene.Net Contrib, Lucene.Net Test Affects Versions: Lucene.Net 2.9.4 Reporter: Troy Howard Assignee: Troy Howard Priority: Minor Fix For: Lucene.Net 2.9.4 Port Contrib.Shingle from Java -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: [Lucene.Net] Lucene.Net 3.x
Hi Ofer! We are close to a release of 2.9.4, and another release following that of 2.9.4g (which diverges from the Java versioning/API somewhat). After that, we'll be looking into a 3.0 release. The schedule we hoped to achieve was a bit ambitious. We haven't had the level of code contribution that we would have liked to have yet. That said, it will happen, just taking longer than expected. We'll announce our progress on the list as things are moving forward. Thanks, Troy On Tue, Jul 12, 2011 at 4:01 AM, Ofer Vugman vugman.o...@gmail.com wrote: Hi, I saw that the expected release date of the 3.x version was at the end of June. Will it be released anytime soon? Thanks
RE: [Lucene.Net] Is a Lucene.Net Line-by-Line Jave port needed?
Yes. But if there are commits to trunk before that happens it's a merge. -T On Jul 2, 2011 1:53 PM, Digy digyd...@gmail.com wrote: Troy, What do you mean by merging? 2.9.4g is a superset of 2.9.4 and has * bug fixes like LUCENENET-414 * new features like LUCENENET-429, MemoryMappedDirectory (although not used yet), PartiallyTrustedAppDomain tests * improvements like LUCENENET-427, LUCENENET-418, Refactoring of SupportClass * API changes like - StopAnalyzer(List<string> stopWords) - Query.ExtractTerms(ICollection<string>) - TopDocs.TotalHits, TopDocs.ScoreDocs * readability changes like replacing some abstract methods with Func, while(XXX.MoveNext())s with foreachs etc. Is it something like creating a 2.9.4 tag and replacing trunk with 2.9.4g? DIGY -Original Message- From: Troy Howard [mailto:thowar...@gmail.com] Sent: Friday, July 01, 2011 12:36 AM To: lucene-net-dev@lucene.apache.org Subject: Re: [Lucene.Net] Is a Lucene.Net Line-by-Line Jave port needed? DIGY - Re: Why do I wait.. That's mostly because I intend to make some deep changes, which would make merging the 2.9.4g branch back to trunk difficult. So, it's easier to merge those changes first. Also, I won't have enough time to make my changes until a little way in the future, but probably do have the time to put together another release, so I'll do that first because it fits with my work/life schedule. Thanks, Troy On Thu, Jun 30, 2011 at 1:19 PM, Digy digyd...@gmail.com wrote: Michael, You interpret the report as "whoever commits code wins"? But when I look at it, I see a lot of talk, no work. The .Net community is not interested in contributing. I really don't understand what hinders people from working on Lucene.Net. As I did for 2.9.4g, grab the code, do whatever you want on it and submit back. If it doesn't fit the project's direction it can still find a place in contrib or in a branch. All of the approaches can live side by side happily in the Lucene.Net repository. 
Troy, I also don't understand why do you wait for 2.9.4g? It is a *branch* and has nothing to do with the trunk. It need not be an offical release and can live in branch as a PoC. As a result, I got bored to listen to this should be done that way. What I want to see is I did it that way, should we continue with this. DIGY -Original Message- From: Troy Howard [mailto:thowar...@gmail.com] Sent: Thursday, June 30, 2011 10:47 PM To: lucene-net-dev@lucene.apache.org Subject: Re: [Lucene.Net] Is a Lucene.Net Line-by-Line Jave port needed? Michael, I agree with everything you said. My point in saying whoever commits code wins was to illustrate the reality of how and why the project has the current form. Building consensus is difficult. It is an essential first step before we can do something like make a list of bit-sized pieces of work that others can work on. This is why my real message of Let's find a way to accommodate both is so important. It allows us to build consensus, so that we can settle on a direction and structure our work. Until we accomplish that, it really is whoever commits code wins, and that is an unhealthy and unmaintainable way to operate. From a technical perspective, your statements about the unit tests are completely accurate. They really need a LOT of reworking. That's the very first step before making any significant changes. Part of the problem is that the tests themselves are not well written. The other part is that the Lucene object model was not designed for testability, and it makes writing good tests more difficult, and certain tests might not be possible. It will be difficult to write good unit tests without re-structuring. The biggest issue is the use of abstract classes with base behaviour vs interfaces or fully abstracted classes. Makes mocking tough. This is the direction I was going when I started the Lucere project. I'd like to start in on that work after the 2.9.4g release. 
Thanks, Troy On Thu, Jun 30, 2011 at 12:04 PM, Michael Herndon mhern...@wickedsoftware.net wrote: I'd say that is all the more reasons that we need to work smarter and not harder. I'd also say thats a good reason to make sure we build consensus rather than just saying whoever commits code wins. And its a damn good reason to focus on the effort of growing the number of contributors and lowing the barrier to submitting patches, breaking things down into pieces that people would feel confident to work on without being overwhelmed by the complexity of Lucene.Net. There is a pretty big gap between Lucene 2.9.x to Lucene 4.0 and the internals and index formats are significantly different including nixing the current vint file format and using byte[] array slices for Terms instead of char[]. So while porting 1 to 1 while require less knowledge or thought, its most likely going to require more hours of work. And Its definitely not going
[Lucene.Net] [jira] [Commented] (LUCENENET-404) Improve brand logo design
[ https://issues.apache.org/jira/browse/LUCENENET-404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13058761#comment-13058761 ]

Troy Howard commented on LUCENENET-404:
---------------------------------------

Just a quick update. The artist is making some final edits before we commit. Will post them soon. I'll attach examples.

Improve brand logo design
-------------------------

Key: LUCENENET-404
URL: https://issues.apache.org/jira/browse/LUCENENET-404
Project: Lucene.Net
Issue Type: Sub-task
Components: Project Infrastructure
Reporter: Troy Howard
Assignee: Troy Howard
Priority: Minor
Labels: branding, logo

The existing Lucene.Net logo leaves a lot to be desired. We'd like a new logo that is modern and well designed. To implement this, Troy is coordinating with StackOverflow/StackExchange to manage a logo design contest, the results of which will be our new logo design.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
[Lucene.Net] [jira] [Work started] (LUCENENET-404) Improve brand logo design
[ https://issues.apache.org/jira/browse/LUCENENET-404?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Work on LUCENENET-404 started by Troy Howard.

Improve brand logo design
-------------------------

Key: LUCENENET-404
URL: https://issues.apache.org/jira/browse/LUCENENET-404
Project: Lucene.Net
Issue Type: Sub-task
Components: Project Infrastructure
Reporter: Troy Howard
Assignee: Troy Howard
Priority: Minor
Labels: branding, logo
Attachments: lucene-alternates.jpg, lucene-medium.png, lucene-net-logo-display.jpg

The existing Lucene.Net logo leaves a lot to be desired. We'd like a new logo that is modern and well designed. To implement this, Troy is coordinating with StackOverflow/StackExchange to manage a logo design contest, the results of which will be our new logo design.
Re: [Lucene.Net] Is a Lucene.Net Line-by-Line Jave port needed?
DIGY - Re: Why do I wait.. That's mostly because I intend to make some deep changes, which would make merging the 2.9.4g branch back to trunk difficult. So, it's easier to merge those changes first. Also, I won't have enough time to make my changes until a little way in the future, but probably do have the time to put together another release, so I'll do that first because it fits with my work/life schedule. Thanks, Troy On Thu, Jun 30, 2011 at 1:19 PM, Digy digyd...@gmail.com wrote: Michael, You interpret the report as whoever commits code wins? But when I look at it, I see a lof of talk, no work. .Net community is not interested in contributing. I really don't understand what hinders people to work on Lucene.Net. As I did for 2.9.4g, grab the code, do whatever you want on it and submit back. If it doesn't fit to the project's direction it can still find a place in contrib or in branch. All of the approaches can live side by side happily in the Lucene.Net repository. Troy, I also don't understand why do you wait for 2.9.4g? It is a *branch* and has nothing to do with the trunk. It need not be an offical release and can live in branch as a PoC. As a result, I got bored to listen to this should be done that way. What I want to see is I did it that way, should we continue with this. DIGY -Original Message- From: Troy Howard [mailto:thowar...@gmail.com] Sent: Thursday, June 30, 2011 10:47 PM To: lucene-net-dev@lucene.apache.org Subject: Re: [Lucene.Net] Is a Lucene.Net Line-by-Line Jave port needed? Michael, I agree with everything you said. My point in saying whoever commits code wins was to illustrate the reality of how and why the project has the current form. Building consensus is difficult. It is an essential first step before we can do something like make a list of bit-sized pieces of work that others can work on. This is why my real message of Let's find a way to accommodate both is so important. 
It allows us to build consensus, so that we can settle on a direction and structure our work. Until we accomplish that, it really is whoever commits code wins, and that is an unhealthy and unmaintainable way to operate. From a technical perspective, your statements about the unit tests are completely accurate. They really need a LOT of reworking. That's the very first step before making any significant changes. Part of the problem is that the tests themselves are not well written. The other part is that the Lucene object model was not designed for testability, and it makes writing good tests more difficult, and certain tests might not be possible. It will be difficult to write good unit tests without re-structuring. The biggest issue is the use of abstract classes with base behaviour vs interfaces or fully abstracted classes. Makes mocking tough. This is the direction I was going when I started the Lucere project. I'd like to start in on that work after the 2.9.4g release. Thanks, Troy On Thu, Jun 30, 2011 at 12:04 PM, Michael Herndon mhern...@wickedsoftware.net wrote: I'd say that is all the more reasons that we need to work smarter and not harder. I'd also say thats a good reason to make sure we build consensus rather than just saying whoever commits code wins. And its a damn good reason to focus on the effort of growing the number of contributors and lowing the barrier to submitting patches, breaking things down into pieces that people would feel confident to work on without being overwhelmed by the complexity of Lucene.Net. There is a pretty big gap between Lucene 2.9.x to Lucene 4.0 and the internals and index formats are significantly different including nixing the current vint file format and using byte[] array slices for Terms instead of char[]. So while porting 1 to 1 while require less knowledge or thought, its most likely going to require more hours of work. And Its definitely not going to guarantee the stability of the code or that its great code. 
I'd have to say that its not as stable as most would believe at the moment. Most of the tests avoid anything that remotely looks like it knows about the DRY principle and there is a static constructor in the core test case that throws an exception if it doesn't find an environment variable TEMP which will fail 90% of the tests and nunit will be unable to give you a clear reason why. Just to name a few issues I came across working towards getting Lucene.Net into CI. I haven't even started really digging in under the covers of the code yet. So my suggestion is to chew on this a bit more and build consensus, avoid fracturing people into sides. Be open to reservations and concerns that others have and continue to address them. - Michael On Thu, Jun 30, 2011 at 2:10 PM, Digy digyd...@gmail.com wrote: Although there are a lot of people using Lucene.Net, this is our contribution report
Re: [Lucene.Net] Is a Lucene.Net Line-by-Line Jave port needed?
Scott - The idea of the automated port is still worth doing. Perhaps it makes sense for someone more passionate about the line-by-line idea to do that work? I would say, focus on what makes sense to you. Being productive, regardless of the specific direction, is what will be most valuable. Once you start, others will join and momentum will build. That is how these things work.

I like DIGY's approach too, but the problem with it is that it is a never-ending manual task. The theory behind the automated port is that it may reduce the manual work. It is complicated, but once it's built and works, it will save a lot of future development hours. If it's built in a sufficiently general manner, it could be useful for other projects like Lucene.Net that want to automate a port from Java to C#. It might make sense for that to be a separate project from Lucene.Net though.

-T

On Thu, Jun 30, 2011 at 2:13 PM, Scott Lombard lombardena...@gmail.com wrote:

Ok I think I asked the wrong question. I am trying to figure out where to put my time. I was thinking about working on the automated porting system, but when I saw the response to the .NET 4.0 discussions I started to question if that is the right direction. The community seemed to be more interested in the .NET features. The complexity of the automated tool is going to become very high and will probably end up with a line-for-line style port. So I keep asking myself: is the automated tool worth it? I don't think it is. I like the method Digy has been using for porting the code. So I guess for me the real question is: Digy, where did you see 2.9.4g going next and what do you need help on?

Scott

-----Original Message-----
From: Digy [mailto:digyd...@gmail.com]
Sent: Thursday, June 30, 2011 4:20 PM
To: lucene-net-dev@lucene.apache.org
Subject: RE: [Lucene.Net] Is a Lucene.Net Line-by-Line Jave port needed?

Michael, You interpret the report as "whoever commits code wins"? But when I look at it, I see a lot of talk, no work.
.Net community is not interested in contributing. I really don't understand what hinders people to work on Lucene.Net. As I did for 2.9.4g, grab the code, do whatever you want on it and submit back. If it doesn't fit to the project's direction it can still find a place in contrib or in branch. All of the approaches can live side by side happily in the Lucene.Net repository. Troy, I also don't understand why do you wait for 2.9.4g? It is a *branch* and has nothing to do with the trunk. It need not be an offical release and can live in branch as a PoC. As a result, I got bored to listen to this should be done that way. What I want to see is I did it that way, should we continue with this. DIGY -Original Message- From: Troy Howard [mailto:thowar...@gmail.com] Sent: Thursday, June 30, 2011 10:47 PM To: lucene-net-dev@lucene.apache.org Subject: Re: [Lucene.Net] Is a Lucene.Net Line-by-Line Jave port needed? Michael, I agree with everything you said. My point in saying whoever commits code wins was to illustrate the reality of how and why the project has the current form. Building consensus is difficult. It is an essential first step before we can do something like make a list of bit-sized pieces of work that others can work on. This is why my real message of Let's find a way to accommodate both is so important. It allows us to build consensus, so that we can settle on a direction and structure our work. Until we accomplish that, it really is whoever commits code wins, and that is an unhealthy and unmaintainable way to operate. From a technical perspective, your statements about the unit tests are completely accurate. They really need a LOT of reworking. That's the very first step before making any significant changes. Part of the problem is that the tests themselves are not well written. The other part is that the Lucene object model was not designed for testability, and it makes writing good tests more difficult, and certain tests might not be possible. 
It will be difficult to write good unit tests without re-structuring. The biggest issue is the use of abstract classes with base behaviour vs interfaces or fully abstracted classes. Makes mocking tough. This is the direction I was going when I started the Lucere project. I'd like to start in on that work after the 2.9.4g release. Thanks, Troy On Thu, Jun 30, 2011 at 12:04 PM, Michael Herndon mhern...@wickedsoftware.net wrote: I'd say that is all the more reasons that we need to work smarter and not harder. I'd also say thats a good reason to make sure we build consensus rather than just saying whoever commits code wins. And its a damn good reason to focus on the effort of growing the number of contributors and lowing the barrier to submitting patches, breaking things down into pieces
Re: [Lucene.Net] Is a Lucene.Net Line-by-Line Jave port needed?
Michael - If you bring those changes from git into a branch in SVN, we can help with it. It doesn't have to be complete to be committed. :)

Regarding A (angering people)/B (being rejected)/C (feeling comfortable)/D (getting over it)...

a) Making progress is more important than keeping everyone happy.
b) Our goal is to accept things, not reject them. That said, if something gets rejected due to quality issues, don't be afraid of that; it's a learning experience for everyone, and it's a good thing. We can work together to get to something everyone is happy with and learn in the process.
c) Commit to a branch. Merge when things are right. No one expects branches to build or be finished. It's OK. I get worried when I merge to trunk or when I make a release. But I don't do that until I'm pretty sure it's all legit.
d) The best way to get over it is to start doing it.

I know you probably already realize all of this, but I wanted to respond, so that, in case anyone else out there is struggling with the same set of fears, they can see that fears that prevent action are more problematic than any action they might take without those fears.

Thanks, Troy

On Thu, Jun 30, 2011 at 1:57 PM, Michael Herndon mhern...@wickedsoftware.net wrote:

@Troy, I've already started working towards fixing unit testing issues, and prototyping some things that should DRY up the testing just so that I can get the tests running on Mono. Those changes are currently in a private git repo; however, since we don't have a CI, I need to make some time to manually test those on at least 3 different OSes (Windows, OS X, and Ubuntu) before putting those back into the 2.9.4g branch. The reason being I need those in working order so that I can do a write-up on pulling those from source and at least running the build script to compile everything and run the tests for you. I don't know about everyone else, but that's a starting point I look for when I go to work on something or commit something back.
They should make their way back sometime this month. I think the next thing I'll do is put my money where my mouth is, spend time breaking down the rest of the CI tasks, then see how much stuff I can get documented into the wiki. The simple faceted search is a decent starting template.

@Digy I agree with the "talk, no work". Though coming from the outside in, I still cringe when I make any commits at the moment (even that little .gitignore file).

A) I don't want to commit anything that's going to piss a lot of people off.
B) I don't want to spend time/waste time on modifications that are going to be rejected.
C) It took a good deal of going through things before I felt comfortable even making a commit.
D) Yes, I know I just need to get over it and so does everyone else (hence the obsession with the unit tests at the moment).

And I think a key to relaying people to get over it, including myself, is to make the point you had more clear across the board: *If it doesn't fit to the project's direction it can still find a place in contrib or in branch. All of the approaches can live side by side happily in the Lucene.Net repository.* +1, because that makes me feel there is more leeway to experiment and any decent effort will at least go somewhere to live and not be wasted.

On Thu, Jun 30, 2011 at 4:19 PM, Digy digyd...@gmail.com wrote:

Michael, You interpret the report as "whoever commits code wins"? But when I look at it, I see a lot of talk, no work. The .Net community is not interested in contributing. I really don't understand what hinders people from working on Lucene.Net. As I did for 2.9.4g, grab the code, do whatever you want on it and submit back. If it doesn't fit the project's direction it can still find a place in contrib or in a branch. All of the approaches can live side by side happily in the Lucene.Net repository.

Troy, I also don't understand why you wait for 2.9.4g. It is a *branch* and has nothing to do with the trunk.
It need not be an offical release and can live in branch as a PoC. As a result, I got bored to listen to this should be done that way. What I want to see is I did it that way, should we continue with this. DIGY -Original Message- From: Troy Howard [mailto:thowar...@gmail.com] Sent: Thursday, June 30, 2011 10:47 PM To: lucene-net-dev@lucene.apache.org Subject: Re: [Lucene.Net] Is a Lucene.Net Line-by-Line Jave port needed? Michael, I agree with everything you said. My point in saying whoever commits code wins was to illustrate the reality of how and why the project has the current form. Building consensus is difficult. It is an essential first step before we can do something like make a list of bit-sized pieces of work that others can work on. This is why my real message of Let's find a way to accommodate both is so important. It allows us to build consensus, so that we can settle
Re: [Lucene.Net] Is a Lucene.Net Line-by-Line Jave port needed?
Michael, I agree with everything you said. My point in saying whoever commits code wins was to illustrate the reality of how and why the project has the current form. Building consensus is difficult. It is an essential first step before we can do something like make a list of bit-sized pieces of work that others can work on. This is why my real message of Let's find a way to accommodate both is so important. It allows us to build consensus, so that we can settle on a direction and structure our work. Until we accomplish that, it really is whoever commits code wins, and that is an unhealthy and unmaintainable way to operate. From a technical perspective, your statements about the unit tests are completely accurate. They really need a LOT of reworking. That's the very first step before making any significant changes. Part of the problem is that the tests themselves are not well written. The other part is that the Lucene object model was not designed for testability, and it makes writing good tests more difficult, and certain tests might not be possible. It will be difficult to write good unit tests without re-structuring. The biggest issue is the use of abstract classes with base behaviour vs interfaces or fully abstracted classes. Makes mocking tough. This is the direction I was going when I started the Lucere project. I'd like to start in on that work after the 2.9.4g release. Thanks, Troy On Thu, Jun 30, 2011 at 12:04 PM, Michael Herndon mhern...@wickedsoftware.net wrote: I'd say that is all the more reasons that we need to work smarter and not harder. I'd also say thats a good reason to make sure we build consensus rather than just saying whoever commits code wins. And its a damn good reason to focus on the effort of growing the number of contributors and lowing the barrier to submitting patches, breaking things down into pieces that people would feel confident to work on without being overwhelmed by the complexity of Lucene.Net. 
There is a pretty big gap between Lucene 2.9.x to Lucene 4.0 and the internals and index formats are significantly different including nixing the current vint file format and using byte[] array slices for Terms instead of char[]. So while porting 1 to 1 while require less knowledge or thought, its most likely going to require more hours of work. And Its definitely not going to guarantee the stability of the code or that its great code. I'd have to say that its not as stable as most would believe at the moment. Most of the tests avoid anything that remotely looks like it knows about the DRY principle and there is a static constructor in the core test case that throws an exception if it doesn't find an environment variable TEMP which will fail 90% of the tests and nunit will be unable to give you a clear reason why. Just to name a few issues I came across working towards getting Lucene.Net into CI. I haven't even started really digging in under the covers of the code yet. So my suggestion is to chew on this a bit more and build consensus, avoid fracturing people into sides. Be open to reservations and concerns that others have and continue to address them. - Michael On Thu, Jun 30, 2011 at 2:10 PM, Digy digyd...@gmail.com wrote: Although there are a lot of people using Lucene.Net, this is our contribution report for the past 5 years. https://issues.apache.org/jira/secure/ConfigureReport.jspa?atl_token=A5KQ-2Q AV-T4JA-FDED|3204f7e696067a386144705075c074e991db2a2b|linversionId=-1issue Status=allselectedProjectId=12310290reportKey=com.sourcelabs.jira.plugin.r eport.contributions%3AcontributionreportNext=Next DIGY -Original Message- From: Ayende Rahien [mailto:aye...@ayende.com] Sent: Thursday, June 30, 2011 8:16 PM To: Rory Plaire; lucene-net-...@lucene.apache.org Subject: RE: [Lucene.Net] Is a Lucene.Net Line-by-Line Jave port needed? 
As someone from the NHibernate project: we stopped following Hibernate a while ago, and haven't regretted it. We have more features, fewer bugs and a better code base.

Sent from my Windows Phone

From: Rory Plaire
Sent: Thursday, June 30, 2011 19:58
To: lucene-net-...@lucene.apache.org
Subject: Re: [Lucene.Net] Is a Lucene.Net Line-by-Line Jave port needed?

I don't want to drag this out much longer, but I am curious about people who hold the line-by-line sentiment - are you NHibernate users?

-r

On Thu, Jun 30, 2011 at 2:39 AM, Noel Lysaght lysag...@hotmail.com wrote:

Can I just plug in my bit and say I agree 100% with what Moray has outlined below. If we move away from the line by line port then over time we'll lose out on the momentum that is Lucene and the improvements that they make. It is only if the Lucene.NET community has expertise in search, a deep knowledge of the project and the community can guarantee that the knowledge will survive members coming and going should such a
Re: [Lucene.Net] Is a Lucene.Net Line-by-Line Jave port needed?
I pretty much agree with Rory. And as others have said, this issue has been discussed many times. What is most important about the fact that it has been discussed many times is that it has not been resolved, even though it has been discussed so many times. That means that both the developer community that contributes to the project and the user community that uses the library have an interest in *both*. I think we have enough interest and support from the community to develop both of these at the same time.

Some key points:
- Being a useful index/search library is the goal of any implementation of Lucene. Being useful is more important than being identical to one another. Don't forget that Java Lucene has bugs, design problems, and may not always be the best implementation of Lucene.
- Unit tests should validate the code's correctness in terms of functionality/bugs.
- The library can contain multiple APIs for the same tasks. Fluent? LINQ? Just like Java? Just like pylucene? All of the above?
- Implementation details between .NET and Java are *very* significant and often account for a lot of the bugs that are Lucene.Net only. Our attempt to be a line-by-line port is what is introducing bugs, not the other way around.
- The only reason we are having this discussion is because C# and Java are very similar languages. If this was an F# port or a VB.NET port, we wouldn't even be discussing this. Instead we'd say "make it work the way that makes the most sense in {{insert language here}}".

That said, DIGY has a very good point. Continued development on the library is the most important part of the project's goals. A dead project helps no one. If the current active contributors are writing a line-by-line port, then that's what it will be. If they are writing a complete re-write, then that is what it will be. Some might find it easier to write line-by-line, but others might find that task daunting. The opposite is also true.
It depends on the person, how much time they have, and what they consider easy or manageable or worth doing. As always, if you want the code base to be something specific, submit a patch for that, and it will be. If not, then you need to convince someone else to write that patch. And just so it's clear, *anyone* can write and submit a patch and be a contributor, not just the project committers.

Thanks, Troy

On Wed, Jun 29, 2011 at 3:06 PM, Rory Plaire codekai...@gmail.com wrote:

For what it's worth, I've participated in a number of projects which have been ported from Java to .Net with varying levels of translation into the native style and functionality of the .Net framework. The largest are NTS, a JTS port, and NHibernate, a Java Hibernate port. My experience is that a line-by-line port isn't as valuable as people would imagine.

Even if we discount the reality that a line-by-line port is really unachievable due to various differences between the frameworks, keeping even identical code in sync will always take some work: full automation on this large of a project is infeasible. During manual effort, therefore, making readable changes to the code is really not that much more work.

For update maintenance, porting over code from recent versions of both projects to the .Net versions, and .Nettifying that code, is little trouble. Since both projects use source control, it's easy to see the changes and translate them. When it comes to debugging issues, in NTS or NHibernate, I go to the Java sources, and even if the classes were largely rewritten to take advantage of IEnumerable or generics or structures, running unit tests, tracing the code, and seeing the output of each has always been straightforward.

Since I'm using .Net, I'd want the Lucene.Net project to be more .Net than a line-by-line port of Java, in order to take advantage of the Framework as well as provide a better code base for .Net developers to maintain.
If large .Net projects ported from Java do this, and have found considerable success, it is, in my view, a well-proven practice and shouldn't be avoided due to uncertainty of how the resulting code should work. Ultimately, that is what unit tests are for, anyway.
[Lucene.Net] [jira] [Commented] (LUCENENET-404) Improve brand logo design
[ https://issues.apache.org/jira/browse/LUCENENET-404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13046757#comment-13046757 ]

Troy Howard commented on LUCENENET-404:
---------------------------------------

I will post the artifacts here, and commit them to the repo. We need to get an SGA submitted from StackOverflow as well.

Improve brand logo design
-------------------------

Key: LUCENENET-404
URL: https://issues.apache.org/jira/browse/LUCENENET-404
Project: Lucene.Net
Issue Type: Sub-task
Components: Project Infrastructure
Reporter: Troy Howard
Assignee: Troy Howard
Priority: Minor
Labels: branding, logo

The existing Lucene.Net logo leaves a lot to be desired. We'd like a new logo that is modern and well designed. To implement this, Troy is coordinating with StackOverflow/StackExchange to manage a logo design contest, the results of which will be our new logo design.
[Lucene.Net] Fwd: Travel Assistance applications now open for ApacheCon NA 2011
---------- Forwarded message ----------
From: Gavin McDonald ga...@16degrees.com.au
Date: Jun 6, 2011 1:02 AM
Subject: Travel Assistance applications now open for ApacheCon NA 2011
To: committ...@apache.org

The Apache Software Foundation (ASF)'s Travel Assistance Committee (TAC) is now accepting applications for ApacheCon North America 2011, 7-11 November in Vancouver BC, Canada.

The TAC is seeking individuals from the Apache community at-large --users, developers, educators, students, Committers, and Members-- who would like to attend ApacheCon, but need some financial support in order to be able to get there. There are limited places available, and all applicants will be scored on their individual merit.

Financial assistance is available to cover flights/trains, accommodation and entrance fees either in part or in full, depending on circumstances. However, the support available for those attending only the BarCamp (7-8 November) is less than that for those attending the entire event (Conference + BarCamp 7-11 November).

The Travel Assistance Committee aims to support all official ASF events, including cross-project activities; as such, it may be prudent for those in Asia and Europe to wait for an event geographically closer to them.

More information can be found at http://www.apache.org/travel/index.html including a link to the online application and detailed instructions for submitting.

Applications will close on 8 July 2011 at 22:00 BST (UTC/GMT +1).

We wish good luck to all those who will apply, and thank you in advance for tweeting, blogging, and otherwise spreading the word.

Regards,
The Travel Assistance Committee
[Lucene.Net] [jira] [Commented] (LUCENENET-417) implement streams as field values
[ https://issues.apache.org/jira/browse/LUCENENET-417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13039443#comment-13039443 ]

Troy Howard commented on LUCENENET-417:
---------------------------------------

Chris's goal here is to prevent large blobs from being placed in memory either as binary data or as string data. This is to prevent OOM exceptions on very large documents. Using Stream semantics, you can avoid this.

The limitation of TextReader value types not being stored is due to the TextReader type being forward-only, which is based around how Encodings work, not due to some kind of fundamental mismatch with Lucene's business rules. There is no reason you should not be able to provide a resettable Stream and an Encoding and perform the same operations, but reset the stream between the tokenization and value storage stages. The only issue would be multi-threading: if tokenization and value storage were happening at the same time, they could not operate against the same stream.

implement streams as field values
---------------------------------

Key: LUCENENET-417
URL: https://issues.apache.org/jira/browse/LUCENENET-417
Project: Lucene.Net
Issue Type: New Feature
Components: Lucene.Net Core
Reporter: Christopher Currens
Attachments: BinaryStream.patch

Adding binary values to a field is an expensive operation, as the whole binary data must be loaded into memory and then written to the index. Adding the ability to use a stream instead of a byte array could not only speed up the indexing process, but reduce the memory footprint as well.

-Java lucene has the ability to use a TextReader to both analyze and store text in the index.- Lucene.NET lacks the ability to store string data in the index via streams. This should be a feature added into Lucene.NET as well. My thoughts are to add another Field constructor, that is Field(string name, System.IO.Stream stream, System.Text.Encoding encoding), that will allow the text to be analyzed and stored into the index.
Comments about this approach are greatly appreciated. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
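For readers following LUCENENET-417, here is a rough sketch of the "resettable Stream plus Encoding" idea from Troy's comment: one pass reads the stream as text for tokenization, the stream is rewound, and a second pass copies the raw bytes for storage. This is only an illustration under stated assumptions. `ResettableStreamSketch`, `Tokenize`, `Store` and `TokenizeThenStore` are hypothetical names, the whitespace splitter stands in for a real analyzer, and none of this is the attached BinaryStream.patch or any Lucene.Net API.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

// Hypothetical helper for illustration only; not part of Lucene.Net.
public static class ResettableStreamSketch
{
    // Pass 1: decode the stream with the given encoding and split on whitespace.
    // A stand-in for an analyzer; it reads in small buffers instead of loading everything.
    public static List<string> Tokenize(Stream stream, Encoding encoding)
    {
        var tokens = new List<string>();
        var current = new StringBuilder();
        // Deliberately not disposed: disposing the reader would close the caller's stream.
        var reader = new StreamReader(stream, encoding);
        var buffer = new char[256];
        int read;
        while ((read = reader.Read(buffer, 0, buffer.Length)) > 0)
        {
            for (int i = 0; i < read; i++)
            {
                if (char.IsWhiteSpace(buffer[i]))
                {
                    if (current.Length > 0)
                    {
                        tokens.Add(current.ToString());
                        current.Length = 0;
                    }
                }
                else
                {
                    current.Append(buffer[i]);
                }
            }
        }
        if (current.Length > 0) tokens.Add(current.ToString());
        return tokens;
    }

    // Pass 2: copy the raw bytes to the destination in fixed-size chunks.
    public static long Store(Stream source, Stream destination)
    {
        var buffer = new byte[4096];
        long total = 0;
        int read;
        while ((read = source.Read(buffer, 0, buffer.Length)) > 0)
        {
            destination.Write(buffer, 0, read);
            total += read;
        }
        return total;
    }

    // Runs both stages one after the other on the same stream, rewinding in between.
    public static long TokenizeThenStore(Stream source, Encoding encoding,
                                         Stream destination, out List<string> tokens)
    {
        if (!source.CanSeek)
            throw new ArgumentException("A resettable (seekable) stream is required.", "source");

        long start = source.Position;
        tokens = Tokenize(source, encoding);
        source.Seek(start, SeekOrigin.Begin); // reset between tokenization and storage
        return Store(source, destination);
    }
}
```

The two stages run sequentially here, which sidesteps the multi-threading concern raised in the comment: two concurrent consumers could not share one stream position.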
[Lucene.Net] [jira] [Closed] (LUCENENET-397) Resolution of the legal issues
[ https://issues.apache.org/jira/browse/LUCENENET-397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Troy Howard closed LUCENENET-397.
---
Resolution: Not A Problem

This version of Luke.Net is dependent on WPF which conflicts with our desire for this to be cross platform. Legal clearance is not necessary.

Resolution of the legal issues
---
Key: LUCENENET-397
URL: https://issues.apache.org/jira/browse/LUCENENET-397
Project: Lucene.Net
Issue Type: Sub-task
Components: Lucene.Net Contrib
Reporter: Scott Lombard
Assignee: Troy Howard
Priority: Blocker
Labels: Luke.Net
Fix For: Lucene.Net 2.9.4

Resolution of the legal issues around ingesting the code into Lucene.Net. Coordinate with Aaron Powell to obtain software grant paperwork.

Per Stefan Bodewig (Incubating Mentor): All it takes is:
* attach the code to a JIRA ticket.
* have software grants signed by all contributors to the original code base.
* write a single page for the Incubator site
* start a vote on Incubator general and wait for 72 hours.
[Lucene.Net] [jira] [Closed] (LUCENENET-398) Prepare the code for ingestion
[ https://issues.apache.org/jira/browse/LUCENENET-398?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Troy Howard closed LUCENENET-398.
---
Resolution: Not A Problem

Not including this code due to WPF

Prepare the code for ingestion
---
Key: LUCENENET-398
URL: https://issues.apache.org/jira/browse/LUCENENET-398
Project: Lucene.Net
Issue Type: Sub-task
Components: Lucene.Net Contrib
Reporter: Scott Lombard
Assignee: Sergey Mirvoda
Labels: Luke.Net
Fix For: Lucene.Net 2.9.4

Prepare source to be imported in the Lucene.Net repository. Staging area is a bitbucket fork at: https://bitbucket.org/thoward/luke.net-incbuating from original codebase at: https://bitbucket.org/slace/luke.net

See tasks on bitbucket site (forked) for source-code related issues that need to be addressed prior to ingesting into Lucene.Net codebase.
Re: [Lucene.Net] Lucene.Net Hackathon (5/13-5/16)
Thanks Michael! One quick question -- the Wiki seems to be really locked down for public editing. That's kind of strange. Anyone should be able to log in and whip up a new page or edit an existing one, committer or otherwise. I didn't have access until just the other day, and Chris Currens doesn't have access now (I had to add him to the page manually). Can we open up the permissions on our wiki? Thanks, Troy

On Wed, May 11, 2011 at 11:51 AM, Michael Herndon mhern...@wickedsoftware.net wrote: You never know. Personally I generally have most tech people on a list rather than directly following them. But thanks.

On Wed, May 11, 2011 at 2:43 PM, Wyatt Barnett wyatt.barn...@gmail.com wrote: Retweeted. Though I doubt any of the ~100 people following me aren't in the 36 following him . . .

On 5/11/11 2:39 PM, Michael Herndon mhern...@wickedsoftware.net wrote: If any of you follow Hanselman on twitter, please take a second and retweet his tweet on the lucene.net hackathon listed below or even send a thanks. Wanna get involved in Open Source? Why not help with the Lucene.NET HackAThon? http://hnsl.mn/lucenehackathon Cheers, - Michael
Re: [Lucene.Net] Lucene.Net Hackathon (5/13-5/16)
No problem. I set up the permissions such that any user account can edit/add pages in the wiki. This should make things a lot easier on us. Thanks, Troy

On Wed, May 11, 2011 at 12:50 PM, Michael Herndon mhern...@wickedsoftware.net wrote: Troy, Confluence admin is not my forte, but I can look at the privileges tonight and see if we can change that. You and Prescott also have admin privileges as of right now. I'm pretty much giving all committers who have forwarded their username those privileges. I've also added a snippet to the page for people to e-mail me in the meantime if they are unable to edit the page to add to the table on the hack-a-thon page. (And there are some who may just not want to join yet another wiki). Do keep an eye out for spam once we elevate privileges. - Michael
Re: [Lucene.Net] Lucene.Net Hackathon (5/13-5/16)
Re: I'm not sure if there is a coding difference between the C# stuff and the other directory stuff. There are a few minor code changes in the new branch vs the C# branch, but those are things like framework target, copyright notices, etc. I didn't change code significantly, and unit tests still pass.

Re: we can probably branch C# to something like pre_NewStructure I made a tag right before committing the directory changes for this exact purpose. It's here: https://svn.apache.org/repos/asf/incubator/lucene.net/tags/pre-layout-change

Regarding the hackathon next week, I'd like to put together a list of tasks specifically for this weekend to give people some focus on where they can contribute. Some of these will be major tasks with high priority (like finishing up the 2.9.4 release) and others will be of lower priority like working on the samples/wiki/website... Those with great skills in creating GUI apps, but less experience writing back-end libraries, might want to contribute to Luke.Net, even if it's not a high priority.

I agree with Michael that we should tweet/blog/wiki/mailing list the details of the event. I would make a wiki page on the topic, but it seems I don't have sufficient privileges on our Confluence wiki to do that. Can whoever the admin is give me rights to add/edit wiki pages? My login is 'thoward'. Thanks, Troy

On Mon, May 9, 2011 at 1:15 AM, Prescott Nasser geobmx...@hotmail.com wrote: I think Troy has the structure ready to roll - I'm not sure if there is a coding difference between the C# stuff and the other directory stuff. If there isn't then we can probably branch C# to something like pre_NewStructure (someone help me with a better name), then remove it from the trunk. Troy, I believe, was investigating the legal task - perhaps he can update us if he ever got an answer. If you want to jump into a smaller task take a look at https://issues.apache.org/jira/browse/LUCENENET-372 (currently assigned to me). I updated a ton of the analyzers, but I believe them to be out of date from the java 2.9.4 branch because I used the attached files from Pasha without paying attention to the age of them. So those could use a review. I also never ported the test cases, which we definitely should have.

Date: Mon, 9 May 2011 10:04:03 +0200 From: ma...@rotselleri.com To: lucene-net-dev@lucene.apache.org Subject: Re: [Lucene.Net] Lucene.Net Hackathon (5/13-5/16)

On Mon, May 9, 2011 at 1:12 AM, Prescott Nasser wrote: +1 to getting 2.9.4 ready to roll + the changes to the directory structure we have going

+1 for 2.9.4 and directory structure. To make that happen, I'd like to know what needs to be done and in what way I could be of any help. There are 10 open issues for 2.9.4, and (apart from the Luke issues mentioned below) none of them makes me feel that I can grab it and start coding.

-Sharpen stuff - I haven't had time to get it really working (not to mention I don't know eclipse from a hole in the ground). I haven't heard from Alex in a while, who I think is the most knowledgeable on the subject. Also most important to get closer to the java version. -.NET syntax.

+1, the API often feels quite awkward to use.

That said, I think Luke is important. If we left it at the idea that you could run Luke in java just fine, we could also just say use lucene/solr and the api provided, no need for the Lucene.Net project. (I know it's a bit different). That said, I don't think it's top priority, but it would be nice to have a .net implementation.

Agree, it would be nice to have.

Sergey was working on a port of this in WPF - can he perhaps provide an update on what's going on with that? I think it was located at bit bucket at one point, and then I lost track..

The WPF track was abandoned due to absent WPF support in mono. I adopted code attached to LUCENENET-391 by Pasha Bizhan and it is continued on https://github.com/mammo/LukeSharp (mirror at https://bitbucket.org/mammo/lukesharp). Testing and reporting of broken or missing features would be most appreciated. I am not sure how to resolve the Luke legal sub-task LUCENENET-397, is it enough that Pasha has attached the code or is more paper work required? /amanuel
Re: [Lucene.Net] var
Yes, sorry -- I didn't mean to conflate the two issues. 'var' is just syntactic sugar. I'm more concerned with the framework support issue, which is not directly related to the use of var, but is tied in with the discussion. Thanks, Troy

On Mon, May 9, 2011 at 1:18 PM, Digy digyd...@gmail.com wrote: I'll start a more official vote thread to finalize our stance. I think the general consensus is yes to var, but that might just be my bias talking. Maybe I am missing something, but var is just syntactic sugar and changes nothing at the IL level. So, I don't see a case to vote. If you think the code will be easier to read, use it. If not, don't. DIGY

-Original Message- From: Troy Howard [mailto:thowar...@gmail.com] Sent: Monday, May 09, 2011 10:54 PM To: lucene-net-dev@lucene.apache.org Subject: Re: [Lucene.Net] var I'll start a more official vote thread to finalize our stance. I think the general consensus is yes to var, but that might just be my bias talking. Re: Government projects and new tech.. There is nothing stopping conservative organizations from using our previous releases. Building from source or using the bleeding edge is not a smart tactic for anyone who cares about stability, government or otherwise. -T

On Mon, May 9, 2011 at 10:58 AM, Michael Herndon mhern...@wickedsoftware.net wrote: Let me know once this is a concrete answer. It needs to go on the wiki and tweeted and even blogged about. There will most likely be some push back, especially if anyone is using Lucene.Net inside of government projects. They always take forever in letting you develop with the latest stable technologies. - Michael

On Sat, May 7, 2011 at 11:09 AM, Digy digyd...@gmail.com wrote: The new C# features are committed only to 2.9.4g branch. 2.9.4 can still be built targeting .NET 2.0. We can continue to support both versions in parallel (in terms of bug fixes such as LUCENENET-172, LUCENENET-413, maybe LUCENENET-266) and declare that 2.9.4 will be the last version supporting 2.0 framework. DIGY

-Original Message- From: Troy Howard [mailto:thowar...@gmail.com] Sent: Saturday, May 07, 2011 12:06 PM To: lucene-net-dev@lucene.apache.org Subject: Re: [Lucene.Net] var Using var is wonderful and great. We'll hopefully be doing a lot of refactoring in the near future. var makes refactoring easier. I think we've committed fairly strongly to moving past 2.0 support. AFAIK the current trunk won't build under 2.0 anyhow (or am I mistaken? DIGY used HashSet<T> in a recent patch, which is 3.5 or higher, and all the solutions I committed in the recent directory updates were VS2010, and all the csproj files updated to target 4.0). So, I don't see any reason to maintain 2.0 compatibility... The 4.0 runtime offers so many benefits over previous versions that, IMO, everyone who writes .NET apps should be working hard to migrate forward to 4.0 if they aren't already there. We can help the community along by giving them one more good reason to switch to a better runtime. Thanks, Troy

On Sat, May 7, 2011 at 12:41 AM, Aaron Powell m...@aaron-powell.com wrote: Yes it's a C# 3 feature, but the C# 3 compiler (shipped in VS 2008) can compile C# 2.0 and C# 3.0 assemblies. Quick test: http://www.aaron-powell.com/get/var-csharp-2.PNG I don't have VS 2008 though, this test was done with VS 2010 using the multi-targeting features. Aaron Powell MVP - Internet Explorer (Development) | Umbraco Core Team Member | FunnelWeb Team Member http://apowell.me | http://twitter.com/slace | Skype: aaron.l.powell | MSN: aaz...@hotmail.com

-Original Message- From: Prescott Nasser [mailto:geobmx...@hotmail.com] Sent: Saturday, 7 May 2011 5:32 PM To: lucene-net-dev@lucene.apache.org Subject: RE: [Lucene.Net] var It's a 3.0 keyword, can't be used pre C# 3.0 ~Prescott Nasser

From: m...@aaron-powell.com To: lucene-net-dev@lucene.apache.org Date: Sat, 7 May 2011 07:28:36 + Subject: RE: [Lucene.Net] var My understanding of the 'var' keyword is that it is just C# syntactic sugar, which the compiler will translate into the actual CLR type for variable assignment, so the compiler is capable of compiling CLR 2.0 assemblies anyway. Aaron Powell

-Original Message- From: Michael Herndon [mailto:mhern...@wickedsoftware.net] Sent: Saturday, 7 May 2011 3:56 PM To: lucene-net-dev@lucene.apache.org Subject: Re
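To make the "var is just syntactic sugar" point from Digy and Aaron concrete, here is a small sketch showing that a `var` local gets exactly the same compile-time type as an explicitly typed one. `VarSketch`, `StaticTypeOf` and `InferredTypeMatches` are made-up names for illustration and do not come from the Lucene.Net code base.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical demo class; names are illustrative only.
public static class VarSketch
{
    // Returns the compile-time (static) type the compiler inferred for the argument.
    public static Type StaticTypeOf<T>(T value)
    {
        return typeof(T);
    }

    public static bool InferredTypeMatches()
    {
        // Explicitly typed local.
        Dictionary<string, List<int>> explicitlyTyped = new Dictionary<string, List<int>>();

        // 'var' local: the compiler infers the same static type from the initializer.
        var inferred = new Dictionary<string, List<int>>();

        return StaticTypeOf(inferred) == StaticTypeOf(explicitlyTyped)
            && StaticTypeOf(inferred) == typeof(Dictionary<string, List<int>>);
    }
}
```

Since the inference happens entirely in the compiler, the choice between the two spellings is a readability question, which matches Digy's "If you think the code will be easier to read, use it."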
Re: [Lucene.Net] Lucene.Net Hackathon (5/13-5/16)
Michael, That worked! I'm in the process of making a wiki page for the event now. Thanks, Troy

On Mon, May 9, 2011 at 1:38 PM, Michael Herndon mhern...@wickedsoftware.net wrote: log out and log back in and verify permission changes.
Re: [Lucene.Net] VOTE: .NET 2.0 Framework Support After Apache Lucene.Net 2.9.4
That makes sense, however my suggestion of using 2.9.5 is for the same purpose. Since the code base is now diverging from the Java library, it makes sense that the version numbers would diverge as well. The fact that there is no Java version 2.9.5 will make that Lucene.Net version stand out as having features/code which are different from the Java library. 2.9.4g sounds like a bug fix version for 2.9.4. Thanks, Troy

On Mon, May 9, 2011 at 2:02 PM, Digy digyd...@gmail.com wrote: I chose the name 2.9.4g, since 2.9.5 may give the impression that lucene.java 2.9.5 exists. 2.9.4g is somewhere between 2.9.4 and 3.0.3 (closer to 3.0.3). DIGY

-Original Message- From: Troy Howard [mailto:thowar...@gmail.com] Sent: Monday, May 09, 2011 11:54 PM To: lucene-net-dev@lucene.apache.org Subject: Re: [Lucene.Net] VOTE: .NET 2.0 Framework Support After Apache Lucene.Net 2.9.4 We could specify a new version starting with 2.9.4g and call it 2.9.5 ... Let 2.9.4 be 2.0 compatible, and let 2.9.5 not be. 2.9.5 would include the changes to generic collections, etc. Thanks, Troy

On Mon, May 9, 2011 at 1:49 PM, Digy digyd...@gmail.com wrote: Before 2.9.4g, I would surely say drop support for 2.0 completely. But now we have two versions (2.9.4 and 2.9.4g) and one can continue to support 2.0 till its death (2.9.4g may be used as base for future versions, but this is not true for 2.9.4). DIGY

-Original Message- From: Troy Howard [mailto:thowar...@gmail.com] Sent: Monday, May 09, 2011 11:05 PM To: lucene-net-dev@lucene.apache.org; lucene-net-u...@lucene.apache.org Subject: [Lucene.Net] VOTE: .NET 2.0 Framework Support After Apache Lucene.Net 2.9.4

All, Please cast your votes regarding the topic of .Net Framework support. The question on the table is: Should Apache Lucene.Net 2.9.4 be the last release which supports the .Net 2.0 Framework? Some options are:

[+1] - Yes, move forward to the latest .Net Framework version, and drop support for 2.0 completely. New features and performance are more important than backwards compatibility.

[0] - Yes, focus on the latest .Net Framework, but also include patches and/or preprocessor directives and conditional compilation blocks to include support for 2.0 when needed. New features, performance, and backwards compatibility are all equally important and it's worth the additional complexity and coding work to meet all of those goals.

[-1] - No, .Net Framework 2.0 should remain our target platform. Backwards compatibility is more important than new features and performance.

This vote is not limited to the Apache Lucene.Net IPMC. All users/contributors/committers/mailing list lurkers are welcome to cast their votes with an equal weight. This has been cross posted to both the dev and user mailing lists. Thanks, Troy
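As an illustration of what the [0] option would mean in practice, here is a minimal sketch of a conditional compilation block that uses HashSet<T> (3.5 or higher, as noted in the var thread) on newer targets and falls back to a Dictionary on 2.0. The `NET20` symbol and the `TermSet` class are assumptions made up for this example. The thread does not say which symbols or types the project would actually use.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical wrapper for illustration; not a Lucene.Net type.
// NET20 is an assumed compilation symbol that a 2.0-targeted build would define.
public sealed class TermSet
{
#if NET20
    // .NET 2.0 has no HashSet<T>, so emulate a set with a dictionary.
    private readonly Dictionary<string, bool> items = new Dictionary<string, bool>();

    public bool Add(string term)
    {
        if (items.ContainsKey(term)) return false;
        items.Add(term, true);
        return true;
    }

    public bool Contains(string term)
    {
        return items.ContainsKey(term);
    }
#else
    // 3.5 and later: use the framework's generic set directly.
    private readonly HashSet<string> items = new HashSet<string>();

    public bool Add(string term)
    {
        return items.Add(term);
    }

    public bool Contains(string term)
    {
        return items.Contains(term);
    }
#endif

    // Both backing collections expose Count, so this member is shared.
    public int Count
    {
        get { return items.Count; }
    }
}
```

Every such block has to be written and tested for both targets, which is the "additional complexity and coding work" the [0] option refers to.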
Re: [Lucene.Net] VOTE: .NET 2.0 Framework Support After Apache Lucene.Net 2.9.4
Indeed... 2.9.4g it is! G for Generics should be easy to remember.

On Mon, May 9, 2011 at 2:27 PM, Digy digyd...@gmail.com wrote: It is used already. https://issues.apache.org/jira/browse/LUCENE/fixforversion/12315914 DIGY
Re: [Lucene.Net] VOTE: .NET 2.0 Framework Support After Apache Lucene.Net 2.9.4
My reason for moving forward to .Net 4.0 specifically is that 4.0 brings major improvements to the .NET GC, which, as we have already found in our company's testing, significantly improve Lucene.Net's memory management and overall speed. This is without any code changes, just compiling for .Net 4.0 framework target vs 2.0 or 3.5... Thanks, Troy

On Mon, May 9, 2011 at 2:40 PM, Aaron Powell m...@aaron-powell.com wrote: +1 PS: If you are supporting .NET 3.5 then you get .NET 2.0 support anyway, you just have to bin-deploy the .NET 3.5 dependencies (System.Core, etc) since they are all the same CLR. Aaron Powell
Re: [Lucene.Net] Lucene.Net Hackathon (5/13-5/16)
Here's the wiki page: https://cwiki.apache.org/confluence/x/Go6OAQ Thanks, Troy On Mon, May 9, 2011 at 1:59 PM, Troy Howard thowar...@gmail.com wrote: Michael, That worked! I'm in the process of making a wiki page for the event now. Thanks, Troy On Mon, May 9, 2011 at 1:38 PM, Michael Herndon mhern...@wickedsoftware.net wrote: log out and log back in and verify permission changes. On Mon, May 9, 2011 at 4:22 PM, Troy Howard thowar...@gmail.com wrote: Re: I'm not sure if there is a coding difference between the C# stuff and the other directory stuff. There are a few minor code changes in the new branch vs the C# branch, but those are things like framework target, copyright notices, etc.. I didn't change code significantly, and unit tests still pass. Re: we can probably branch C# to something like pre_NewStructure I made a tag right before committing the directory changes for this exact purpose. It's here: https://svn.apache.org/repos/asf/incubator/lucene.net/tags/pre-layout-change Regarding the hackathon next week, I'd like to put together a list of tasks specifically for this weekend to give people some focus on where they can contribute. Some of these will be major tasks with high priority (like finishing up the 2.9.4 release) and others will be of lower priority like working on the samples/wiki/website... Those with great skills in creating GUI apps, but less skill with writing back-end libraries, might want to contribute to Luke.Net, even if it's not a high priority. I agree with Michael that we should tweet/blog/wiki/mailing list the details of the event. I would make a wiki page on the topic, but it seems I don't have sufficient privileges on our Confluence wiki to do that. Can whoever the admin is give me rights to add/edit wiki pages? My login is 'thoward'. 
Thanks, Troy On Mon, May 9, 2011 at 1:15 AM, Prescott Nasser geobmx...@hotmail.com wrote: I think Troy has the structure ready to roll - I'm not sure if there is a coding difference between the C# stuff and the other directory stuff. If there isn't then we can probably branch C# to something like pre_NewStructure (someone help me with a better name), then remove it from the trunk. Troy I believe was investigating the legal task - perhaps he can update us if he ever got an answer. If you want to jump into a smaller task take a look at https://issues.apache.org/jira/browse/LUCENENET-372 (currently assigned to me). I updated a ton of the analyzers, but I believe them to be out of date from the java 2.9.4 branch because I used the attached files from Pasha without paying attention to the age of them. So those could use a review. I also never ported the test cases, which we definitely should have. Date: Mon, 9 May 2011 10:04:03 +0200 From: ma...@rotselleri.com To: lucene-net-dev@lucene.apache.org Subject: Re: [Lucene.Net] Lucene.Net Hackathon (5/13-5/16) On Mon, May 9, 2011 at 1:12 AM, Prescott Nasser wrote: +1 to getting 2.9.4 ready to roll + the changes to the directory structure we have going +1 for 2.9.4 and directory structure. To make that happen, I'd like to know what needs to be done and in what way I could be of any help. There are 10 open issues for 2.9.4, and (apart from the Luke issues mentioned below) none of them makes me feel that I can grab it and start coding. -Sharpen stuff - I haven't had time to get it really working (not to mention I don't know eclipse from a hole in the ground). I haven't heard from Alex in a while, who I think is the most knowledgeable on the subject. Also most important to get closer to the java version. -.NET syntax. +1, the API often feels quite awkward to use. That said, I think Luke is important. 
If we left it at the idea that you could run Luke in java just fine, we could also just say use lucene/solr and the api provided, no need for the Lucene.Net project. (I know it's a bit different). That said, I don't think it's top priority, but it would be nice to have a .net implementation. Agree, it would be nice to have. Sergey was working on a port of this in WPF - can he perhaps provide an update on what's going on with that? I think it was located at bit bucket at one point, and then I lost track.. The WPF track was abandoned due to absent WPF support in mono. I adopted code attached to LUCENENET-391 by Pasha Bizhan and it is continued on https://github.com/mammo/LukeSharp (mirror at https://bitbucket.org/mammo/lukesharp). Testing and reporting of broken or missing features would be most appreciated. I am not sure how to resolve the Luke legal sub-task LUCENENET-397, is it enough that Pasha has
[Lucene.Net] VOTE: .NET 2.0 Framework Support After Apache Lucene.Net 2.9.4
All, Please cast your votes regarding the topic of .Net Framework support. The question on the table is: Should Apache Lucene.Net 2.9.4 be the last release which supports the .Net 2.0 Framework? Some options are: [+1] - Yes, move forward to the latest .Net Framework version, and drop support for 2.0 completely. New features and performance are more important than backwards compatibility. [0] - Yes, focus on the latest .Net Framework, but also include patches and/or preprocessor directives and conditional compilation blocks to include support for 2.0 when needed. New features, performance, and backwards compatibility are all equally important and it's worth the additional complexity and coding work to meet all of those goals. [-1] No, .Net Framework 2.0 should remain our target platform. Backwards compatibility is more important than new features and performance. This vote is not limited to the Apache Lucene.Net IPMC. All users/contributors/committers/mailing list lurkers are welcome to cast their votes with an equal weight. This has been cross posted to both the dev and user mailing lists. Thanks, Troy
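For readers unfamiliar with what option [0] would involve, here is a minimal C# sketch of a conditional compilation block. It is only an illustration: the `NET20` symbol and the `TermSet` class are hypothetical names invented for this example, not anything taken from the Lucene.Net source, and the symbol is assumed to be defined only in a build that targets the 2.0 framework (for example through the project's conditional compilation symbols).

```csharp
using System;
using System.Collections.Generic;

// Hypothetical example class; not part of Lucene.Net.
public sealed class TermSet
{
#if NET20
    // Assumption: NET20 is defined only when building for the 2.0 framework.
    // .NET 2.0 has no HashSet<T>, so a set is emulated with dictionary keys.
    private readonly Dictionary<string, bool> items = new Dictionary<string, bool>();

    public bool Add(string term)
    {
        if (items.ContainsKey(term)) return false;
        items.Add(term, true);
        return true;
    }

    public bool Contains(string term) { return items.ContainsKey(term); }
#else
    // HashSet<T> lives in System.Core, which ships with .NET 3.5 and later.
    private readonly HashSet<string> items = new HashSet<string>();

    public bool Add(string term) { return items.Add(term); }

    public bool Contains(string term) { return items.Contains(term); }
#endif

    // Shared by both branches: each collection type exposes Count.
    public int Count { get { return items.Count; } }
}

public static class Program
{
    public static void Main()
    {
        TermSet terms = new TermSet();
        terms.Add("lucene");
        terms.Add("lucene"); // duplicate, rejected in both builds
        Console.WriteLine(terms.Count);
    }
}
```

Both branches expose the same members, so calling code does not change between targets. The cost, as the option text says, is that every such block has to be written, built, and tested once per framework.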
Re: [Lucene.Net] VOTE: .NET 2.0 Framework Support After Apache Lucene.Net 2.9.4
We could specify a new version starting with 2.9.4g and call it 2.9.5 ... Let 2.9.4 be 2.0 compatible, and let 2.9.5 not be. 2.9.5 would include the changes to generic collections, etc.. Thanks, Troy On Mon, May 9, 2011 at 1:49 PM, Digy digyd...@gmail.com wrote: Before 2.9.4g, I would surely say drop support for 2.0 completely. But now we have two versions (2.9.4 & 2.9.4g) and one can continue to support 2.0 till its death (2.9.4g may be used as base for future versions, but this is not true for 2.9.4) DIGY -Original Message- From: Troy Howard [mailto:thowar...@gmail.com] Sent: Monday, May 09, 2011 11:05 PM To: lucene-net-...@lucene.apache.org; lucene-net-u...@lucene.apache.org Subject: [Lucene.Net] VOTE: .NET 2.0 Framework Support After Apache Lucene.Net 2.9.4 All, Please cast your votes regarding the topic of .Net Framework support. The question on the table is: Should Apache Lucene.Net 2.9.4 be the last release which supports the .Net 2.0 Framework? Some options are: [+1] - Yes, move forward to the latest .Net Framework version, and drop support for 2.0 completely. New features and performance are more important than backwards compatibility. [0] - Yes, focus on the latest .Net Framework, but also include patches and/or preprocessor directives and conditional compilation blocks to include support for 2.0 when needed. New features, performance, and backwards compatibility are all equally important and it's worth the additional complexity and coding work to meet all of those goals. [-1] No, .Net Framework 2.0 should remain our target platform. Backwards compatibility is more important than new features and performance. This vote is not limited to the Apache Lucene.Net IPMC. All users/contributors/committers/mailing list lurkers are welcome to cast their votes with an equal weight. This has been cross posted to both the dev and user mailing lists. Thanks, Troy
Re: [Lucene.Net] VOTE: .NET 2.0 Framework Support After Apache Lucene.Net 2.9.4
Yes, if you can't use a later framework, then you won't get the benefits that come with that. One of the benefits that you may not get is the latest version of the code with the least bugs. These are all factors that an organization must take into account when considering such policies. It's a tough choice to make, but even the most conservative organizations need to move forward at some point. This is the same issue that we all suffered through moving from 1.1 to 2.0... Or moving from 32bit to 64bit... etc. If there is a real technical limitation (as opposed to a 'business decision/policy'), then the best option is to branch from a previous 2.0 compatible revision, and update the code to resolve whatever issues you are encountering. Backporting from 3.5/4.0 code to 2.0 code is not that difficult, especially when we have Mono available to work from. Also, 2.9.4 (2.0 compatible) should have all the features of 2.9.4g (4.0 compatible)... That is accomplished by setting the target framework to 2.0, and using Mono implementations of HashSet/SortedSet in the SupportClass.cs. So, until we get to Lucene.Net 3.X (next version after 2.9.4), there will be support for 2.0 framework for all changes/features. For those with a situation similar to Neal's, I would consider option [0] in the vote. This option proposes maintaining 2.0 compatibility with patches/ifdef blocks, but still considering 4.0 as the primary target framework. This seems like it would be ideal for those stuck with limitations about framework support. It is, unfortunately, the option that requires the most coding work and the most code complexity. In general, I don't think we should consider targeting 3.5. One of the problems with 3.5 compatibility is that depending on what version of 3.5 you have (service packs, etc) you may get different results (eg, can't compile with certain builds). So if we say 3.5 is our target -- which 3.5? 
4.0 may end up the same, but at the moment, it doesn't have this problem. Perhaps we should work up a "For the boss" page which explains, in detail, the cost/benefit analysis of choosing a version of Lucene.Net (and its associated framework dependencies). This will assist folks who are trying to justify a particular perspective (either for/against using a particular version). Benchmarks, known bugs/bug fixes/features list, etc.. Thanks, Troy On Mon, May 9, 2011 at 2:52 PM, Granroth, Neal V. neal.granr...@thermofisher.com wrote: That only works if you are *allowed* to deploy a new or updated .NET framework on the target system, which is not always true. But the problem is not really about deployment; it is really more for those of us who must compile from source and who are not permitted to upgrade our development toolset. - Neal -Original Message- From: Aaron Powell [mailto:m...@aaron-powell.com] Sent: Monday, May 09, 2011 4:41 PM To: lucene-net-...@lucene.apache.org; lucene-net-u...@lucene.apache.org Subject: RE: [Lucene.Net] VOTE: .NET 2.0 Framework Support After Apache Lucene.Net 2.9.4 +1 PS: If you are supporting .NET 3.5 then you get .NET 2.0 support anyway, you just have to bin-deploy the .NET 3.5 dependencies (System.Core, etc) since they are all the same CLR Aaron Powell MVP - Internet Explorer (Development) | Umbraco Core Team Member | FunnelWeb Team Member http://apowell.me | http://twitter.com/slace | Skype: aaron.l.powell | MSN: aaz...@hotmail.com -Original Message- From: Troy Howard [mailto:thowar...@gmail.com] Sent: Tuesday, 10 May 2011 6:05 AM To: lucene-net-...@lucene.apache.org; lucene-net-u...@lucene.apache.org Subject: [Lucene.Net] VOTE: .NET 2.0 Framework Support After Apache Lucene.Net 2.9.4 All, Please cast your votes regarding the topic of .Net Framework support. The question on the table is: Should Apache Lucene.Net 2.9.4 be the last release which supports the .Net 2.0 Framework? 
Some options are: [+1] - Yes, move forward to the latest .Net Framework version, and drop support for 2.0 completely. New features and performance are more important than backwards compatibility. [0] - Yes, focus on the latest .Net Framework, but also include patches and/or preprocessor directives and conditional compilation blocks to include support for 2.0 when needed. New features, performance, and backwards compatibility are all equally important and it's worth the additional complexity and coding work to meet all of those goals. [-1] No, .Net Framework 2.0 should remain our target platform. Backwards compatibility is more important than new features and performance. This vote is not limited to the Apache Lucene.Net IPMC. All users/contributors/committers/mailing list lurkers are welcome to cast their votes with an equal weight. This has been cross posted to both the dev and user mailing lists. Thanks, Troy
Re: [Lucene.Net] svn commit: r1100639 - /incubator/lucene.net/branches/Lucene.Net_2_9_4g/src/core/Support/EquatableList.cs
I was waiting for a round of testing/consensus regarding the trunk before deleting the old directory tree under C#. I suppose it's been enough time, and I'll just delete/commit that. Thanks, Troy On Sun, May 8, 2011 at 12:16 AM, Prescott Nasser geobmx...@hotmail.com wrote: I'd be happy to - but I haven't looked at the trunk in a while - and I took a quick look at it today to get the latest, and well it was a bit confusing. We have a trunk/C# directory which seems to have the old .NET after everything, and then we have what looks like the new directory structure right in trunk/ In other words, two copies of everything in a slightly different format or am I missing something? From: digyd...@gmail.com To: lucene-net-dev@lucene.apache.org Date: Sun, 8 May 2011 01:23:49 +0300 Subject: RE: [Lucene.Net] svn commit: r1100639 - /incubator/lucene.net/branches/Lucene.Net_2_9_4g/src/core/Support/EquatableList.cs Hi Prescott, Can you apply these fixes also to trunk? They don't diverge from java 2.9.4 and can be compiled targeting .NET 2.0. 
LUCENENET-361 Workaround for a Mono C# compiler issue
LUCENENET-330 Search.Regex Minimal Port
LUCENENET-371 Unit test for Search.Regex port
DIGY
-Original Message-
From: pnas...@apache.org [mailto:pnas...@apache.org]
Sent: Sunday, May 08, 2011 1:05 AM
To: lucene-net-comm...@lucene.apache.org
Subject: [Lucene.Net] svn commit: r1100639 - /incubator/lucene.net/branches/Lucene.Net_2_9_4g/src/core/Support/EquatableList.cs
Author: pnasser
Date: Sat May 7 22:04:43 2011
New Revision: 1100639
URL: http://svn.apache.org/viewvc?rev=1100639&view=rev
Log: LUCENENET-361 Workaround for a Mono C# Compiler issue
Modified: incubator/lucene.net/branches/Lucene.Net_2_9_4g/src/core/Support/EquatableList.cs

Modified: incubator/lucene.net/branches/Lucene.Net_2_9_4g/src/core/Support/EquatableList.cs
URL: http://svn.apache.org/viewvc/incubator/lucene.net/branches/Lucene.Net_2_9_4g/src/core/Support/EquatableList.cs?rev=1100639&r1=1100638&r2=1100639&view=diff
==
--- incubator/lucene.net/branches/Lucene.Net_2_9_4g/src/core/Support/EquatableList.cs (original)
+++ incubator/lucene.net/branches/Lucene.Net_2_9_4g/src/core/Support/EquatableList.cs Sat May 7 22:04:43 2011
@@ -230,12 +230,16 @@ namespace Lucene.Net.Support
             return GetHashCode(this);
         }
 
+#if __MonoCS__
+        public static int GetHashCode(System.Collections.Generic.IEnumerable source)
+#else
         /// Gets the hash code for the list.
-        /// The
+        /// The
         /// implementation which will have all the contents hashed.
         /// The hash code value.
         public static int GetHashCode(System.Collections.Generic.IEnumerable source)
-        {
+#endif
+        {
             // If source is null, then return 0.
             if (source == null) return 0;
[Lucene.Net] Lucene.Net Hackathon (5/13-5/16)
All, I'll be attending the Apache Retreat in Knockree next week (5/13-5/16). I was thinking it would be fun to arrange a Lucene.Net Hackathon for the same time frame. Even if you can't go to the event, we can hack together in a distributed fashion. ;) Hackathons are usually focused on particular features/tasks. Some outstanding tasks are: - Release 2.9.4 - Work on Sharpen/fully automated port (initial implementation) - Luke.Net - .Net idiomatic API (initial design) - Improve website documentation, how tos, samples, etc - Miscellaneous alternative framework support (Compact, Silverlight) - ??? Does this sound like a fun idea? Got any other tasks we could try to tackle that didn't occur to me as I typed up this email? Even if no one else wants to get in on the Hackathon, I'll be working on Lucene.Net all weekend, so which ones of these do you think is most urgent/interesting? Thanks, Troy
Re: [Lucene.Net] separating project file from actual code files
Are you working with VS2010 Express? Or one of the full versions? If you're using Express, unfortunately, its default settings are a bit confusing. In the create new project dialog, the 'normal' field for project directory is not displayed, only one for name. To enable this, you need to go to 'Tools->Options', make sure 'Show all settings' is enabled, select 'Projects and Solutions->General', and check 'Save new projects when created'. This will get it to behave like VS2008. If you tweak those settings (or use a paid version which has those defaults already set), simply specify the correct directory when you make your project, and the .csproj and any other .cs files will all be placed in the right spot. The solution files can be anywhere, and when you create/add the project it will just keep track of the relative paths for you. As a side note: You can get a MSDN subscription from MS via ASF if you are a committer. I requested that, and got a subscription. Now I have a full install of VS2010 Ultimate that I can use when working on Lucene.Net. JetBrains also gives out licenses for ASF committers for Resharper (and many other tools). I suggest investigating this and getting yourself setup with these licensed products, as it will make this kind of thing *much* easier. Thanks, Troy On Sat, May 7, 2011 at 12:10 PM, Prescott Nasser geobmx...@hotmail.com wrote: This seems like it should be obvious, but I'm trying to create a new Contrib.Regex project for the Regex stuff. However, when I do this, VS wants to put all the .cs files into the build directory where the project file is. How do I tell the project file that the actual files for the project should be located in another folder? I've taken a look at some of the other projects but I can't see where this is designated. ~P
Re: [Lucene.Net] new structure
Only thing I would suggest is keeping .cmd/bin files in the build folder. The bin folder is meant for the compiled artifacts. Otherwise, everything else sounds great. Thanks, Troy On Fri, Apr 29, 2011 at 1:08 PM, Michael Herndon mhern...@wickedsoftware.net wrote: If you think it would be beneficial to have the scripts in the branch, I can do that. On Fri, Apr 29, 2011 at 3:50 PM, Digy digyd...@gmail.com wrote: Would you add the same stuff to 2.9.4g branch too? DIGY -Original Message- From: Michael Herndon [mailto:mhern...@wickedsoftware.net] Sent: Friday, April 29, 2011 10:28 PM To: lucene-net-...@lucene.apache.org Subject: Re: [Lucene.Net] new structure I'm going to move ahead with this stuff this weekend unless anyone objects. On Sun, Apr 24, 2011 at 4:42 PM, Michael Herndon mhern...@wickedsoftware.net wrote: if you celebrate Easter, Happy Easter, if not, then Happy lucene.net clean up day. couple of questions. would it be cool if I can add a .gitignore to the root folder? also would it upset anyone if I add .cmd .sh files to the /bin folder and .xml/.build files to the /build folder ? and sand castle and shfb to the /lib folder? - Michael On Sat, Apr 23, 2011 at 7:57 AM, Digy digyd...@gmail.com wrote: Everything seems to be OK. +1 for removing old directory structure. Thanks Troy DIGY -Original Message- From: Troy Howard [mailto:thowar...@gmail.com] Sent: Saturday, April 23, 2011 3:07 AM To: lucene-net-...@lucene.apache.org Subject: Re: [Lucene.Net] new structure I guess by 'today' I meant 'In about 6 days'. Anyhow, I completed the commit of the new directory structure.. I did not delete the OLD directory structure, because they can live side-by-side. Also, please note that I only created vs2010 solutions and upgraded the projects to same. Please pull down the latest revision and validate these changes. If all goes well, I'll delete the old directory structure (everything under the 'C#' directory). 
Thanks, Troy On Sat, Apr 16, 2011 at 3:42 PM, Troy Howard thowar...@gmail.com wrote: Apologies. I got a bit derailed. Will be committing today. On Apr 16, 2011 2:20 PM, Prescott Nasser geobmx...@hotmail.com wrote: Hey Troy any status update on the new structure? I'm hesitant to do updates since I know you're going to be modifying it all shortly ~P
Re: [Lucene.Net] new structure
I guess by 'today' I meant 'In about 6 days'. Anyhow, I completed the commit of the new directory structure.. I did not delete the OLD directory structure, because they can live side-by-side. Also, please note that I only created vs2010 solutions and upgraded the projects to same. Please pull down the latest revision and validate these changes. If all goes well, I'll delete the old directory structure (everything under the 'C#' directory). Thanks, Troy On Sat, Apr 16, 2011 at 3:42 PM, Troy Howard thowar...@gmail.com wrote: Apologies. I got a bit derailed. Will be committing today. On Apr 16, 2011 2:20 PM, Prescott Nasser geobmx...@hotmail.com wrote: Hey Troy any status update on the new structure? I'm hesitant to do updates since I know you're going to be modifying it all shortly ~P
Re: [Lucene.Net] Joining the Revolution
I considered going, but the price was a bit high. I AM going to the Apache Knockree retreat in May... My request for travel assistance was granted! Yippee! Anyone else going? Thanks, Troy On Thu, Apr 21, 2011 at 1:37 PM, Wyatt Barnett wyatt.barn...@gmail.com wrote: Comrade? Seems like there is a Lucene conference: http://www.lucenerevolution.com/ Anyone planning on attending?
Re: [Lucene.Net] Link to binary distribution is broken..
Thanks for the report, Ryan. I fixed the issue, and it should be updating shortly, -Troy On Tue, Apr 19, 2011 at 6:43 AM, Ryan Barrett r...@seldonplan.com wrote: I guess this is the right place to mention this? The link to the latest binary distribution of DotNet Lucene here: http://incubator.apache.org/lucene.net/conversation.html Points towards: <li><a href="http://http://www.apache.org/dist/incubator/lucene.net/binaries/">Binary</a></li> Double http://, thus. Guess someone on this list can fix it? Or can pass the message on? -- Ryan
Re: [Lucene.Net] release 2.9.4
Yes. Once we're ready to call this revision an RC, it should be tagged as such. Wyatt: Thanks for helping to test! Looking forward to your results. Thanks, Troy On Tue, Apr 5, 2011 at 11:37 AM, Granroth, Neal V. neal.granr...@thermofisher.com wrote: No, the URL in DIGY's email appears correct and the SVN revision appears to be 1086410. Question: Should there be a tag for Lucene.Net_2_9_4 as there are for previous release candidates? - Neal -Original Message- From: Wyatt Barnett [mailto:wyatt.barn...@gmail.com] Sent: Tuesday, April 05, 2011 12:15 PM To: lucene-net-dev@lucene.apache.org Cc: digy digy Subject: Re: [Lucene.Net] release 2.9.4 Thanks. For anyone watching, the corrected clickable link is https://svn.apache.org/repos/asf/incubator/lucene.net/trunk/C%23/. Also, just to make sure we are looking at this right, the revision we should be using is 1089138 -- main thing is I've been in and out of town, not caught up on anything and I'd hate to start building stuff against the wrong version . . On Tue, Apr 5, 2011 at 1:10 PM, digy digy digyd...@gmail.com wrote: Sorry, no binaries. You can download the source from https://svn.apache.org/repos/asf/incubator/lucene.net/trunk/C#/src/Lucene.Net DIGY On Tue, Apr 5, 2011 at 12:12 AM, Wyatt Barnett wyatt.barn...@gmail.com wrote: Actually about to dive into a big search tweaking spike in a certain project here, happy to do it on 2.9.4. Got binaries? On Mon, Apr 4, 2011 at 12:27 PM, Troy Howard thowar...@gmail.com wrote: We don't have any sort of QA report on the latest build. DIGY called for testing, but I haven't seen anyone respond to that request indicating successful testing. So, how do we want to manage this? In the business world, we'd never think of making a release without extensive QA first. In my other open source projects, either we've managed QA ourselves by 'switching hats' for a couple weeks prior to release, or just crossed our fingers because the user base was too small. 
Lucene.Net is a fairly high-profile project, with a large user base. I think it would not be responsible to make a release without a formal QA process. We do have extensive unit tests, but do you think those are sufficient to cover our QA needs? Should we try to find community members with a specialty in software testing that would be willing to fulfill this role on our project? Should we just swap hats? I didn't worry about this issue with the latest 2.9.2 release because it was QAed by the user base for a long time before it was an 'official release'. Maybe this is an effective tactic? Release first, and let the user base roll in bug reports fixing them on yet later minor maintenance releases? This seems to be the method a lot of projects use (i.e. no specific QA process, but rather an organic process of 'try our best then deal with bug reports later'). What do we think about this? Thanks, Troy On Sun, Apr 3, 2011 at 11:59 PM, Prescott Nasser geobmx...@hotmail.com wrote: Hey all, I know we have a number of outstanding JIRA issues, but I think most of them have been handled for the 2.9.4 release? Do we have anything outstanding that is holding back a new release? ~P
Re: [Lucene.Net] release 2.9.4
We don't have any sort of QA report on the latest build. DIGY called for testing, but I haven't seen anyone respond to that request indicating successful testing. So, how do we want to manage this? In the business world, we'd never think of making a release without extensive QA first. In my other open source projects, either we've managed QA ourselves by 'switching hats' for a couple weeks prior to release, or just crossed our fingers because the user base was too small. Lucene.Net is a fairly high-profile project, with a large user base. I think it would not be responsible to make a release without a formal QA process. We do have extensive unit tests, but do you think those are sufficient to cover our QA needs? Should we try to find community members with a specialty in software testing that would be willing to fulfill this role on our project? Should we just swap hats? I didn't worry about this issue with the latest 2.9.2 release because it was QAed by the user base for a long time before it was an 'official release'. Maybe this is an effective tactic? Release first, and let the user base roll in bug reports fixing them on yet later minor maintenance releases? This seems to be the method a lot of projects use (i.e. no specific QA process, but rather an organic process of 'try our best then deal with bug reports later'). What do we think about this? Thanks, Troy On Sun, Apr 3, 2011 at 11:59 PM, Prescott Nasser geobmx...@hotmail.com wrote: Hey all, I know we have a number of outstanding JIRA issues, but I think most of them have been handled for the 2.9.4 release? Do we have anything outstanding that is holding back a new release? ~P
Re: [Lucene.Net] [VOTE] New Directory Layout for Project
Looks like we have a 'lazy consensus', in that, no one has raised any significant objections, a few minor modifications have been suggested (which sound totally reasonable), and those who did vote were positive. Barring any objections, this vote passes. Since DIGY and Scott seem to have gotten the bulk of the work on 2.9.4 finished, I think now is a good time to start the directory layout changes, and it won't be too intrusive to any active commits. I'll start on that this week. If you have any pending commits that would be totally screwed up by this directory change, please finalize those as soon as possible! Otherwise I'll be moving things around and your patches/commits might not be able to find the appropriate files. Thanks, Troy On Sun, Mar 20, 2011 at 12:44 AM, Prescott Nasser geobmx...@hotmail.com wrote: Any more thoughts on the directory structure? Quick Recap: We have Troy's original proposal here: http://people.apache.org/~thoward/Lucene.Net/directory-structure-example/ bin/ build/ (various solution and project files) vs2008/ vs2010/ doc/ lib/ - third party libraries to make it easy to pull down the source and go src/ contrib/ core/ demo/ test/ contrib/ core/ demo/ From here, I further suggested cleaning up the contrib folder - because we have extra folders: src/contrib/contrib.net/contrib.net/ - src/contrib/contrib.net/ src/contrib/snowball/snowball.net/ - src/contrib/Snowball.net/ Digy further suggested dropping the .net in all those folders above, and finding a better name for contrib.net. Date: Thu, 10 Mar 2011 09:41:17 +0200 From: digyd...@gmail.com To: lucene-net-...@lucene.apache.org Subject: Re: [Lucene.Net] [VOTE] New Directory Layout for Project Well, not really core. Codes under Analyzer(by DIGY) can be moved to /src/contrib/analyzers (but they are not ports from java). 
The others(by M.GARSKI) are extensions to the core(something like Lucene.Net.Core.Extensions) DIGY On Thu, Mar 10, 2011 at 1:36 AM, Troy Howard wrote: Yeah -- I also changed the Contrib.Net project folder name to ~/src/contrib/core ... IMO we should just roll these into the main library if they are solid, tested and useful.. This is keeping in line with our new philosophy about allowing .NET specific changes, even if it means diverging from Java Lucene to do it. Thanks, Troy On Wed, Mar 9, 2011 at 12:56 PM, Prescott Nasser wrote: Actually what IS contrib.net? It looks like it replaces certain files in Lucene.Net core - are they files better suited to .net? What are they? If they are plugins / additional contributions like snowball, etc - why not just break it out and include the appropriate stuff in contrib? Do we need to specify that they are not available in the java version? Date: Wed, 9 Mar 2011 22:18:22 +0200 From: digyd...@gmail.com To: lucene-net-...@lucene.apache.org Subject: Re: [Lucene.Net] [VOTE] New Directory Layout for Project 0 .Nets seem to be redundant under /src/contrib/ . It could be something like Analyzers Highlighter Similarity ... (Maybe, we should find a different name for contrib.net. It contains contributions specific to Lucene.Net which are not available in Lucene.java) DIGY On Wed, Mar 9, 2011 at 9:08 PM, Prescott Nasser wrote: Probably just a miss - but under the src/contrib folder you also have a number of tests in there... Also, is it necessary to have all the sub folders? For the most part the stuff in contrib.net is contrib.net - why the secondary folder? Unless that is a requirement of NUnit to have the structure that way it seems a bit cluttered. 
I would think something like src/contrib/contrib.net/ src/contrib/Snowball.net/ instead of src/contrib/contrib.net/contrib.net/ src/contrib/snowball/snowball.net/ I don't know how people feel about that ~P Date: Wed, 9 Mar 2011 13:31:34 -0500 From: mhern...@wickedsoftware.net To: lucene-net-...@lucene.apache.org CC: thowar...@gmail.com Subject: Re: [Lucene.Net] [VOTE] New Directory Layout for Project +1 just a question though. for cmd/bat/sh files that let people execute the build or just execute other tools from the command line, would those have a place in /bin or somewhere else? This is so that someone can just export PATH = / SET PATH= to that one folder and then be able to execute those commands from one location? On Sun, Mar 6, 2011 at 11:27 PM, Troy Howard wrote: All, We'd like to update the project directory structure/layout. See below
Re: [Lucene.Net] [VOTE] New Directory Layout for Project
Sounds good. I'll make a tag prior to starting the directory changes, but I'll commit changes to trunk. Thanks, Troy On Tue, Mar 29, 2011 at 11:55 AM, digy digy digyd...@gmail.com wrote: +1. No pending commits. A copy of the current trunk somewhere else(tag, branches etc.) would be good too. DIGY. On Tue, Mar 29, 2011 at 9:38 PM, Troy Howard thowar...@gmail.com wrote: Looks like we have a 'lazy consensus', in that, no one has raised any significant objections, a few minor modifications have been suggested (which sound totally reasonable), and those who did vote were positive. Barring any objections, this vote passes. Since DIGY and Scott seem to have gotten the bulk of the work on 2.9.4 finished, I think now is a good time to start the directory layout changes, and it won't be too intrusive to any active commits. I'll start on that this week. If you have any pending commits that would be totally screwed up by this directory change, please finalize those as soon as possible! Otherwise I'll be moving things around and your patches/commits might not be able to find the appropriate files. Thanks, Troy On Sun, Mar 20, 2011 at 12:44 AM, Prescott Nasser geobmx...@hotmail.com wrote: Any more thoughts on the directory structure? Quick Recap: We have Troy's original proposal here: http://people.apache.org/~thoward/Lucene.Net/directory-structure-example/ bin/ build/ (various solution and project files) vs2008/ vs2010/ doc/ lib/ - third party libraries to make it easy to pull down the source and go src/ contrib/ core/ demo/ test/ contrib/ core/ demo/ From here, I further suggested cleaning up the contrib folder - because we have extra folders: src/contrib/contrib.net/contrib.net/ - src/contrib/contrib.net/ src/contrib/snowball/snowball.net/ - src/contrib/Snowball.net/ Digy further suggested dropping the .net in all those folders above, and finding a better name for contrib.net. 
Date: Thu, 10 Mar 2011 09:41:17 +0200 From: digyd...@gmail.com To: lucene-net-...@lucene.apache.org Subject: Re: [Lucene.Net] [VOTE] New Directory Layout for Project Well, not really core. Codes under Analyzer(by DIGY) can be moved to /src/contrib/analyzers (but they are not ports from java). The others(by M.GARSKI) are extensions to the core(something like Lucene.Net.Core.Extensions) DIGY On Thu, Mar 10, 2011 at 1:36 AM, Troy Howard wrote: Yeah -- I also changed the Contrib.Net project folder name to ~/src/contrib/core ... IMO we should just roll these into the main library if they are solid, tested and useful.. This is keeping in line with our new philosophy about allowing .NET specific changes, even if it means diverging from Java Lucene to do it. Thanks, Troy On Wed, Mar 9, 2011 at 12:56 PM, Prescott Nasser wrote: Actually what IS contrib.net? It looks like it replaces certain files in Lucene.Net core - are they files better suited to .net? What are they? If they are plugins / additional contributions like snowball, etc - why not just break it out and include the appropriate stuff in contrib? Do we need to specify that they are not avaliable in the java version? Date: Wed, 9 Mar 2011 22:18:22 +0200 From: digyd...@gmail.com To: lucene-net-...@lucene.apache.org Subject: Re: [Lucene.Net] [VOTE] New Directory Layout for Project 0 .Nets seem to be redundant under /src/contrib/ . It could be something like Analyzers Highlighter Similarity ... (Maybe, we should find a different name for contrib.net. It contains contributions specific to Lucene.Net which are not available in Lucene.java) DIGY On Wed, Mar 9, 2011 at 9:08 PM, Prescott Nasser wrote: Probably just a miss - but under the src/contrib folder you also have a number of tests in there... Also, is it necessary to have all the sub folders? For the most part the stuff in contrib.net is contrib.net - why the secondary folder? 
Unless that is a requirement of NUnit to have the structure that way it seems a bit cluttered. I would think something like src/contrib/contrib.net/ src/contrib/Snowball.net/ instead of src/contrib/contrib.net/contrib.net/ src/contrib/snowball/snowball.net/ I don't know how people feel about that ~P Date: Wed, 9 Mar 2011 13:31:34 -0500 From: mhern...@wickedsoftware.net To: lucene-net-...@lucene.apache.org CC: thowar...@gmail.com Subject: Re: [Lucene.Net] [VOTE] New Directory Layout for Project +1
Re: [Lucene.Net] [VOTE] New Directory Layout for Project
Sounds good to me. I have done this previously in a local branch and noticed massive performance improvements. Removing all the casting in the library makes for dramatic speedups. As a side note: Chris Currens is in the process of benchmarking Lucene.Net running under .NET 4.0 vs 3.5 vs 2.0... This benchmarking is to prove what we found in our production deployments... Compiling and deploying as a .NET 4.0 assembly results in major improvements in both speed and correct memory handling (memory leaks magically disappear). We want to prove this with benchmarks before publishing a definitive statement about this however. If this is the case, there might be a very compelling reason to move forward to 4.0 runtime for Lucene.Net. Thanks, Troy On Tue, Mar 29, 2011 at 12:23 PM, digy digy digyd...@gmail.com wrote: After this directory layout changes; what about replacing ArrayLists, Hashtables etc, with appropriate Generics? This would bring us very close to lucene 3.0.3 (and not hard to do with the help of VS). DIGY On Tue, Mar 29, 2011 at 10:02 PM, Troy Howard thowar...@gmail.com wrote: Sounds good. I'll make a tag prior to starting the directory changes, but I'll commit changes to trunk. Thanks, Troy On Tue, Mar 29, 2011 at 11:55 AM, digy digy digyd...@gmail.com wrote: +1. No pending commits. A copy of the current trunk somewhere else(tag, branches etc.) would be good too. DIGY. On Tue, Mar 29, 2011 at 9:38 PM, Troy Howard thowar...@gmail.com wrote: Looks like we have a 'lazy consensus', in that, no one has raised any significant objections, a few minor modifications have been suggested (which sound totally reasonable), and those who did vote were positive. Barring any objections, this vote passes. Since DIGY and Scott seem to have gotten the bulk of the work on 2.9.4 finished, I think now is a good time to start the directory layout changes, and it won't be too intrusive to any active commits. I'll start on that this week. 
If you have any pending commits that would be totally screwed up by this directory change, please finalize those as soon as possible! Otherwise I'll be moving things around and your patches/commits might not be able to find the appropriate files. Thanks, Troy On Sun, Mar 20, 2011 at 12:44 AM, Prescott Nasser geobmx...@hotmail.com wrote: Any more thoughts on the directory structure? Quick Recap: We have Troy's original proposal here: http://people.apache.org/~thoward/Lucene.Net/directory-structure-example/ bin/ build/ (various solution and project files) vs2008/ vs2010/ doc/ lib/ - third party libraries to make it easy to pull down the source and go src/ contrib/ core/ demo/ test/ contrib/ core/ demo/ From here, I further suggested cleaning up the contrib folder - because we have extra folders: src/contrib/contrib.net/contrib.net/ - src/contrib/contrib.net/ src/contrib/snowball/snowball.net/ - src/contrib/Snowball.net/ Digy further suggested dropping the .net in all those folders above, and finding a better name for contrib.net. Date: Thu, 10 Mar 2011 09:41:17 +0200 From: digyd...@gmail.com To: lucene-net-...@lucene.apache.org Subject: Re: [Lucene.Net] [VOTE] New Directory Layout for Project Well, not really core. Codes under Analyzer(by DIGY) can be moved to /src/contrib/analyzers (but they are not ports from java). The others(by M.GARSKI) are extensions to the core(something like Lucene.Net.Core.Extensions) DIGY On Thu, Mar 10, 2011 at 1:36 AM, Troy Howard wrote: Yeah -- I also changed the Contrib.Net project folder name to ~/src/contrib/core ... IMO we should just roll these into the main library if they are solid, tested and useful.. This is keeping in line with our new philosophy about allowing .NET specific changes, even if it means diverging from Java Lucene to do it. Thanks, Troy On Wed, Mar 9, 2011 at 12:56 PM, Prescott Nasser wrote: Actually what IS contrib.net? 
It looks like it replaces certain files in Lucene.Net core - are they files better suited to .net? What are they? If they are plugins / additional contributions like snowball, etc - why not just break it out and include the appropriate stuff in contrib? Do we need to specify that they are not available in the java version? Date: Wed, 9 Mar 2011 22:18:22 +0200 From: digyd...@gmail.com To: lucene-net-...@lucene.apache.org Subject: Re: [Lucene.Net] [VOTE] New Directory Layout for Project 0 .Nets seem to be redundant under /src/contrib/ . It could be something like Analyzers
Re: [Lucene.Net] Arithmetic Overflow
I agree that it should not throw an overflow exception. That's not very helpful for debugging. The valid range of that parameter is essentially [0, Int32.MaxValue - 1]. That's because it needs one extra slot in its internal array, beyond the max size, for juggling values during sort. The exception is thrown when it tries to set up the array to be the parameter value + 1 to accommodate that. So the best we could do would be to throw a nicer exception when someone passes an invalid value. Until a nicer exception can be worked into the code, the documentation should be updated to reflect that limit. Thanks, Troy On Mon, Mar 21, 2011 at 12:15 PM, steven.h...@pattersoncompanies.com wrote: While I understand that and thank you for taking the time to respond, my concern was that passing a perfectly valid parameter leads to an exception. So this is bad design. Regards, Steve Hoff From: Digy digyd...@gmail.com To: lucene-net-dev@lucene.apache.org Date: 03/21/2011 01:53 PM Subject: RE: [Lucene.Net] Arithmetic Overflow This is expected behaviour. You get arithmetic overflow since an array[MaxInt] is created in PriorityQueue to store the results. (Lucene doesn't know the result-count before iterating over the results). If you want to collect all results (say, not the top 10 or 100), you can use HitCollector. DIGY -Original Message- From: steven.h...@pattersoncompanies.com [mailto:steven.h...@pattersoncompanies.com] Sent: Monday, March 21, 2011 7:10 PM To: lucene-net-dev@lucene.apache.org Subject: RE: [Lucene.Net] Arithmetic Overflow I'm not sure how to report this or if this is working as intended; however, using the recommended Search overload of IndexSearcher and passing Int32.MaxValue results in an arithmetic overflow exception in PriorityQueue.Initialize(int maxSize). I tested this out on several platforms, including Windows 7 - 32, 64, Server 2003, Server 2008.
Perhaps the parameter that is passed to the Search method should be updated to reflect a more realistic value? Regards, Steve Hoff NOTICE: This email transmission and any attachments that accompany it may contain information that is confidential or otherwise exempt from disclosure under applicable law and is intended solely for the use of the individual(s) to whom it was intended to be addressed. If you have received this email by mistake, or you are not the intended recipient, any disclosure, dissemination, distribution, copying or other use or retention of this communication or its substance is prohibited. If you have received this communication in error, please immediately report to the author via email that you received this message by mistake and also permanently destroy printed copies and delete the original and all copies of this email and any attachments from your computer.
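For readers following the thread, the range check discussed here can be sketched roughly as below. This is only an illustration in Python, not Lucene.Net code: `validate_max_size` and `to_int32` are hypothetical helper names, and the only facts taken from the thread are that the queue allocates an array of the parameter value + 1 and that the usable range is therefore 0 to Int32.MaxValue - 1.

```python
INT32_MAX = 2**31 - 1  # numeric value of Int32.MaxValue in .NET


def to_int32(value: int) -> int:
    """Wrap an integer to signed 32-bit, the way unchecked int arithmetic would."""
    return (value + 2**31) % 2**32 - 2**31


def validate_max_size(max_size: int) -> int:
    """Return the internal array length (max_size + 1) or raise a descriptive error.

    Hypothetical helper for illustration only; it mirrors the idea of rejecting
    'always invalid' values up front instead of letting the addition overflow.
    """
    if max_size < 0:
        raise ValueError(f"max_size must not be negative, got {max_size}")
    if max_size > INT32_MAX - 1:
        raise ValueError(
            f"max_size must be at most {INT32_MAX - 1} because one extra "
            f"array slot is needed, got {max_size}"
        )
    return max_size + 1


if __name__ == "__main__":
    # Without a check, Int32.MaxValue + 1 does not fit in 32 bits.
    print(to_int32(INT32_MAX + 1))           # wraps to a negative number
    print(validate_max_size(INT32_MAX - 1))  # largest accepted value
```

The point of the sketch is only that the caller gets a message naming the parameter and the limit, rather than an arithmetic error raised from inside the queue.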
[Lucene.Net] [jira] [Commented] (LUCENENET-391) Luke.Net for Lucene.Net
[ https://issues.apache.org/jira/browse/LUCENENET-391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13009268#comment-13009268 ]
Troy Howard commented on LUCENENET-391:
---
Amanuel: +1 Agreed

Luke.Net for Lucene.Net
---
Key: LUCENENET-391
URL: https://issues.apache.org/jira/browse/LUCENENET-391
Project: Lucene.Net
Issue Type: New Feature
Components: Lucene.Net Contrib
Reporter: Pasha Bizhan
Assignee: Sergey Mirvoda
Priority: Minor
Labels: Luke.Net
Fix For: Lucene.Net 2.9.4
Attachments: luke-net-bin.zip, luke-net-src.zip

Create a port of Java Luke to .NET for use with Lucene.Net. See attachments for a 1.4 compatible version or https://bitbucket.org/thoward/luke.net-incbuating for a partial implementation that is 2.9.2 compatible. The attached version was contributed by Pasha Bizhan, and the bitbucket version was contributed by Aaron Powell (above version is a fork, original at https://bitbucket.org/slace/luke.net). If source code from either is used, a software grant must be provided from the original authors. The final version should be 2.9.4 compatible and implement most or all features of Java Luke 1.0.1 (see http://code.google.com/p/luke/ ).

-- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira
Re: [Lucene.Net] Arithmetic Overflow
That's a good point. There are a lot of scenarios where exceptions could still occur, even with parameter validation. The difference is, in some situations, MaxInt/2 is a valid value. However, Int32.MaxValue is never a valid value due to the internal sorting logic requiring an additional array slot beyond the parameter value. Another example is any negative value. Negative values are always invalid for this parameter. The parameter values should be checked for known 'always invalid' conditions, and ArgumentException should be thrown explaining the problem. Thanks, Troy On Mon, Mar 21, 2011 at 12:58 PM, Digy digyd...@gmail.com wrote: Is, for example, MaxInt/2 a valid value? You can easily get an OOM exception (and the worst case would be getting sporadic OOM when search frequency increases). DIGY -Original Message- From: Troy Howard [mailto:thowar...@gmail.com] Sent: Monday, March 21, 2011 9:43 PM To: lucene-net-...@lucene.apache.org Subject: Re: [Lucene.Net] Arithmetic Overflow I agree that it should not throw an overflow exception. That's not very helpful for debugging. The valid range of that parameter is essentially [0, Int32.MaxValue - 1]. That's because it needs one extra slot in its internal array, beyond the max size, for juggling values during sort. The exception is thrown when it tries to set up the array to be the parameter value + 1 to accommodate that. So the best we could do would be to throw a nicer exception when someone passes an invalid value. Until a nicer exception can be worked into the code, the documentation should be updated to reflect that limit. Thanks, Troy On Mon, Mar 21, 2011 at 12:15 PM, steven.h...@pattersoncompanies.com wrote: While I understand that and thank you for taking the time to respond, my concern was that passing a perfectly valid parameter leads to an exception. So this is bad design.
Regards, Steve Hoff From: Digy digyd...@gmail.com To: lucene-net-...@lucene.apache.org Date: 03/21/2011 01:53 PM Subject: RE: [Lucene.Net] Arithmetic Overflow This is an expected behaviour. You get arithmetic overflow since an array[MaxInt] is created in PriorityQueue to store the results. (Lucene doesn't know the result-count before iterating over the results). If you want to collect all results( say,not the top 10 or 100), you can use HitCollector. DIGY -Original Message- From: steven.h...@pattersoncompanies.com [mailto:steven.h...@pattersoncompanies.com] Sent: Monday, March 21, 2011 7:10 PM To: lucene-net-...@lucene.apache.org Subject: RE: [Lucene.Net] Arithmetic Overflow I'm not sure how to report this or if this is working as intended, however using the recommended Search overload of IndexSearcher and passing Int32.MaxValue, results in an arithmetic overflow exception in PriorityQueue.Initialize(int maxSize). I tested this out on several platforms, including Windows 7 - 32, 64, Server 2003, Server 2008. Perhaps the parameter that is passed to the Search method should be updated to reflect a more realistic value? Regards, Steve Hoff NOTICE: This email transmission and any attachments that accompany it may contain information that is confidential or otherwise exempt from disclosure under applicable law and is intended solely for the use of the individual(s) to whom it was intended to be addressed. If you have received this email by mistake, or you are not the intended recipient, any disclosure, dissemination, distribution, copying or other use or retention of this communication or its substance is prohibited. If you have received this communication in error, please immediately report to the author via email that you received this message by mistake and also permanently destroy printed copies and delete the original and all copies of this email and any attachments from your computer. 
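To put a rough number on the out-of-memory concern raised above, the arithmetic can be worked through as follows. This is a back-of-the-envelope sketch in Python, not a measurement: it assumes the queue holds max size + 1 object references (as described in the thread), assumes 4-byte references on a 32-bit runtime and 8-byte references on a 64-bit runtime, and ignores array headers and the objects the references point to. `queue_array_bytes` is a hypothetical name.

```python
INT32_MAX = 2**31 - 1  # numeric value of Int32.MaxValue in .NET


def queue_array_bytes(max_size: int, reference_bytes: int) -> int:
    """Bytes needed for max_size + 1 object references (array header ignored)."""
    return (max_size + 1) * reference_bytes


# "MaxInt/2" from the thread, using integer division.
half_max = INT32_MAX // 2

for label, reference_bytes in (("32-bit", 4), ("64-bit", 8)):
    gib = queue_array_bytes(half_max, reference_bytes) / 2**30
    print(f"{label}: {gib:.1f} GiB")
# prints:
# 32-bit: 4.0 GiB
# 64-bit: 8.0 GiB
```

Under those assumptions, a value that passes the range check can still ask for several gigabytes per search, which is why validation of the parameter and protection against memory exhaustion are separate questions.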
Re: [Lucene.Net] [VOTE] New Directory Layout for Project
I agree that the names are redundant. I definitely want to drop the .Net from the project names. At some point I'd like to revise the namespaces in the same manner, removing the superfluous '.Net' elements: Apache.Lucene.Core.xxx, Apache.Lucene.Contrib.xxx, Apache.Lucene.Demo.xxx etc... I also agree that there's no need for additional sub-directories under the projects folders... I've updated the example directory structure under contrib to remove the test projects I accidentally left in there, remove the extra depth of foldering, and remove the '.Net' designation from the folders. Project files would be updated as well (but I didn't do that). Michael: Regarding build scripts, those could go in ~/build directly, or maybe make a subdir there ~/build/scripts ... Thanks, Troy On Wed, Mar 9, 2011 at 12:18 PM, digy digy digyd...@gmail.com wrote: 0 .Nets seem to be redundant under /src/contrib/ . It could be something like Analyzers Highlighter Similarity ... (Maybe, we should find a different name for contrib.net. It contains contributions specific to Lucene.Net which are not available in Lucene.java) DIGY On Wed, Mar 9, 2011 at 9:08 PM, Prescott Nasser geobmx...@hotmail.comwrote: Probably just a miss - but under the src/contrib folder you also have a number of tests in there... Also, is it necessary to have all the sub folders? For the most part the stuff in contrib.net is contrib.net - why the secondary folder? Unless that is a requirement of NUnit to have the structure that way it seems a bit cluttered. I would think something like src/contrib/contrib.net/ src/contrib/Snowball.net/ instead of src/contrib/contrib.net/contrib.net/ src/contrib/snowball/snowball.net/ I don't know how people feel about that ~P Date: Wed, 9 Mar 2011 13:31:34 -0500 From: mhern...@wickedsoftware.net To: lucene-net-dev@lucene.apache.org CC: thowar...@gmail.com Subject: Re: [Lucene.Net] [VOTE] New Directory Layout for Project +1 just a question though. 
For cmd/bat/sh files that let people execute the build or other tools from the command line, would those have a place in /bin or somewhere else? That way someone can just export PATH= / SET PATH= to that one folder and then be able to execute those commands from one location. On Sun, Mar 6, 2011 at 11:27 PM, Troy Howard wrote: All, We'd like to update the project directory structure/layout. See below for a proposed layout. I've also uploaded an example which you can navigate at: http://people.apache.org/~thoward/Lucene.Net/directory-structure-example NOTE: This will not build!! I just put things in the appropriate places without updating the solution/project files to show how we might lay things out. Also, I included NUnit as an example of a third-party dependency that we might include in the repository under 'lib'. We of course will *not* be distributing NUnit in this manner, due to licensing restrictions. Ok, disclaimer over... Please vote on this layout, or suggest a modification or alternative layout. Voting will be open for 72 hours.

[ ] +1 Use this directory structure exactly as described, or with a minor modification
[ ] 0 Use a different structure (described in response)
[ ] -1 Do not change the directory structure at all

Text description of directory schema:

Build Files:
\build
\build\VS2008
\build\VS2010

Source Projects:
\src
\src\contrib
\src\core
\src\demo
\src\contrib\
\src\core\
\src\demo\

Test Projects:
\test
\test\contrib
\test\core
\test\demo
\test\contrib\
\test\core\
\test\demo\

Product Documentation:
\doc
\doc\contrib
\doc\core
\doc\demo
\doc\contrib\
\doc\core\
\doc\demo\

Third-Party Dependencies:
\lib
\lib\
\lib\\
\lib\\\

Binary Builds:
\bin
\bin\contrib
\bin\core
\bin\demo
\bin\contrib\
\bin\core\
\bin\demo\
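As a quick way to try the proposed layout locally, the top-level skeleton from the vote text can be generated with a short script. This is a minimal sketch in Python, not part of the project: `proposed_layout` and `create_layout` are hypothetical names, only the directories spelled out in the proposal are created, and the per-project and per-library subfolders whose names are elided in the message are left out.

```python
import tempfile
from pathlib import Path
from typing import List

# Top-level sections that each get contrib/core/demo subfolders in the proposal.
SECTIONS = ("src", "test", "doc", "bin")
AREAS = ("contrib", "core", "demo")


def proposed_layout() -> List[str]:
    """Relative paths of the directories named in the proposed schema."""
    paths = ["build/VS2008", "build/VS2010", "lib"]
    for section in SECTIONS:
        for area in AREAS:
            paths.append(f"{section}/{area}")
    return sorted(paths)


def create_layout(root: Path) -> None:
    """Create the skeleton under root; existing directories are left alone."""
    for relative in proposed_layout():
        (root / relative).mkdir(parents=True, exist_ok=True)


if __name__ == "__main__":
    # Build the skeleton in a throwaway directory and list what was created.
    with tempfile.TemporaryDirectory() as tmp:
        create_layout(Path(tmp))
        for relative in proposed_layout():
            print(relative)
```

Creating the tree in a scratch directory first makes it easy to compare against the uploaded example before anything is moved in the repository.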
Re: [Lucene.Net] Google Ranking with Lucene
Another good way to put this: Google is an application which crawls and indexes the internet and provides search functionality to end users. It uses domain specific logic to perform its ranking to improve relevance. Lucene (and Lucene.Net) is a library, which allows end users to build full text search into applications. It has no domain-specific logic built into it, but rather provides the building blocks that can be used to build any kind of domain specific search logic. You certainly could implement something like Google's PageRank algorithm using Lucene's scoring system, but it would be a custom, application/domain specific implementation. Not something that would be appropriate for a library like Lucene. This might be better posed to the Apache Nutch project which uses Lucene and Solr as part of a Google-like web crawler/indexer. You can read about Apache Nutch here: http://nutch.apache.org/index.html Thanks, Troy On Mon, Mar 7, 2011 at 6:35 PM, Prescott Nasser geobmx...@hotmail.com wrote: Hello Björn. In short, no. What Google does to their algorithmic search is extremely secretive. It is also a very limited subset of the type of data Lucene.Net might store. It uses lots of signals from around the web, such as how many people link to a particular page, to gauge the importance of a particular search result. This becomes irrelevant if you are indexing something such as research documents. People don't link to other documents in the traditional sense. Further, the recent changes they made really focused on downranking content farms where data is cheaply put together and lots of advertising is added to generate advertising revenues. Their changes don't really apply to Lucene. That would more apply to how you decide to index your content, how you deal with strengtheners, etc.
Hope that helps, ~Prescott Nasser Date: Mon, 7 Mar 2011 11:12:17 +0100 From: b...@patorg.de To: lucene-net-dev@lucene.apache.org Subject: [Lucene.Net] Google Ranking with Lucene Hello, I have just read that Google has optimised its ranking. Google now shows more relevant results on the first page than before. Is there a chance to take advantage of this ranking algorithm with Lucene? Thank You Björn
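To illustrate what "a custom, application specific implementation" of a link-based signal might look like, here is a minimal sketch in Python. It is a simplified textbook PageRank power iteration, not Google's actual ranking and not anything in the Lucene or Lucene.Net API. The function names, the damping value of 0.85, the toy link graph and the idea of multiplying a text relevance score by the link rank are all assumptions made for the example.

```python
from typing import Dict, List


def pagerank(links: Dict[str, List[str]], damping: float = 0.85,
             iterations: int = 50) -> Dict[str, float]:
    """Simplified PageRank: links maps each page to the pages it links to."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page starts each round with the 'random jump' share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outgoing links spreads its rank evenly.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


def combined_score(text_score: float, link_rank: float) -> float:
    """One possible (assumed) way to fold a link signal into a text score."""
    return text_score * link_rank


if __name__ == "__main__":
    # Toy graph: pages a and b both link to c, and c links back to a.
    ranks = pagerank({"a": ["c"], "b": ["c"], "c": ["a"]})
    for page in sorted(ranks, key=ranks.get, reverse=True):
        print(page, round(ranks[page], 3))
```

As noted in the thread, a signal like this only helps when the indexed documents actually link to each other; for collections such as research documents it would contribute little.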
Re: [Lucene.Net] help with a couple of resources
Prescott, I started a thread called 'Proposed Roadmap' a while back. It's here: http://s.apache.org/9Ov We just finished a new official release, and I will be announcing it either today or tomorrow (I was waiting for the mirror sites to get updated before announcing it). It's located here: http://www.apache.org/dist/incubator/lucene.net/ Thanks, Troy On Sun, Mar 6, 2011 at 4:09 PM, Prescott Nasser geobmx...@hotmail.com wrote: Hey guys, I'm trying to update the website. But I'm having some trouble with a few items. 1. Do we have a binary release? (if so where?) 2. I thought I saw a roadmap document of some kind, but I'm having trouble finding it. Thanks, ~P
Re: [Lucene.Net] releasedate for Lucene.NET 3.0
Björn, As far as a release schedule for a 3.X release goes, realistically it's not likely to be ready until sometime in the latter half of this year. As you found in the blog article mentioned, the project was going through some difficult times last year. Since that time, the project has seen a lot of improvement. We've had a fresh start with a new group of committers and are back on track to start developing new releases. Unfortunately, there is a lot of cleaning up and catching up that we have to do before we can get back in sync with Java Lucene's releases. If you'd like to get involved with developing Lucene.Net, we can still use a lot of help. Some of the things we're focused on now are setting up a more automated way of creating the line-by-line ports, getting caught up to match the Java Lucene releases (3.X), and designing a new API and refining the existing implementation to fully take advantage of the .NET runtime and the C# language. There are a lot of ways someone could help out right now. We need everything from core code contributions to unit tests, documentation, HTML/CSS work on the project website, infrastructure coding for a CI server, and basic administrative and management tasks like keeping up with the issue tracker. The first step to getting involved would be to read over the history of the development mailing list for the past month or so, and review the JIRA issue tracker for current open issues. If there's not an issue for something that you think should be, feel free to enter it in. As usual feel free to post any questions to the list. Thanks, Troy 2011/3/2 Björn Kremer b...@patorg.de: Hello Michael, Does this help with what you were looking for? Yes, thank you for the answer. I have just read http://codeclimber.net.nz/archive/2010/11/01/Lucene-Net-needs-your-help-or-it-will-die.aspx Are you still missing programmers?
Björn On 01.03.2011 15:43, Michael Herndon wrote: Hi Björn Kremer, There is currently not a release date set for Lucene.Net 3.0. A roadmap that covers a Lucene.Net release will most likely be developed when we've made progress on the 2.9.x releases and infrastructure changes that are currently being discussed on the mailing list. I could be wrong, but incubator status should not have a bearing on whether a binary release is official, but rather the status of the project and community behind it. It will most likely be a decent time gap before Lucene.Net makes it back out of incubation. However a release for Lucene.Net 3.0 is one of our priorities and goals. Does this help with what you were looking for? - Michael 2011/3/1 Björn Kremer b...@patorg.de Hi, is there a release date for Lucene.NET 3.0 or a roadmap? When will Lucene.NET be released officially (without incubator status)? Thank You Björn
Re: [Lucene.Net] Knockree Apache Retreat
This sounds like a great opportunity. However airfare, roundtrip from Portland, OR is ~$900; A little expensive. So, I'm a solid 'maybe'. Thanks, Troy On Wed, Mar 2, 2011 at 9:19 PM, Stefan Bodewig bode...@apache.org wrote: Hi all, I think most people around here are US based, but is anybody going to Knockree in May? https://sites.google.com/site/apacheretreatknockree/ Stefan
Re: [Lucene.Net] Knockree Apache Retreat
I went ahead and applied for travel assistance. On Wed, Mar 2, 2011 at 10:12 PM, Troy Howard thowar...@gmail.com wrote: This sounds like a great opportunity. However airfare, roundtrip from Portland, OR is ~$900; A little expensive. So, I'm a solid 'maybe'. Thanks, Troy On Wed, Mar 2, 2011 at 9:19 PM, Stefan Bodewig bode...@apache.org wrote: Hi all, I think most people around here are US based, but is anybody going to Knockree in May? https://sites.google.com/site/apacheretreatknockree/ Stefan
Re: [Lucene.Net] CI Task Update: Hudkins
I've been following builds@ for the past couple of days. Looks like they just finished the migration to Jenkins. Michael - Have you had a chance to contact them and find out what tools are available out of our list? Want me to do that? Thanks, Troy On Mon, Feb 28, 2011 at 9:19 PM, Scott Lombard slomb...@theta.net wrote: +1 Scott On Mon, Feb 28, 2011 at 5:18 AM, Stefan Bodewig bode...@apache.org wrote: On 2011-02-28, Troy Howard wrote: One quick concern I have, is how much of the things listed are already available on the Apache hudson server? builds@apache is the place to ask. A lot of this is .NET specific, so unlikely that it will already be available. Well, the DotCMIS build seems to be using Sandcastle Helpfile Builder, judging by the console output. We'll have to request that ASF Infra team install these tools for us, and they may not agree, or there might be licensing issues, etc.. Not sure. I'd start the conversation with them now to suss this out. Really, go to the builds list. License issues usually don't show up for build tools. It may be good if anybody of the team could volunteer time helping administrate the Windows slave. - Mono is going to be a requirement moving forward This could be done on a non-Windows slave just to be completely sure it works. This may require installing a newer Mono (or just pulling in a different Debian package source for Mono) than is installed by default. - Project structure was being discussed on the LUCENENET-377 thread. As a quick note, in general we prefer the mailing list over JIRA for discussions around the ASF. Stefan
Re: [Lucene.Net] [VOTE] Release Apache Lucene.Net 2.9.2-incubating-RC2
+1 for RC release. On Fri, Feb 25, 2011 at 9:37 AM, Troy Howard thowar...@gmail.com wrote: All, I updated the .src zip and associated checksums/signatures at: http://people.apache.org/~thoward/Lucene.Net/2.9.2-incubating-RC2/dist/ Hopefully this one will pass the licensing validation! I think I got them all this time.. Thanks, Troy
Re: [Lucene.Net] [VOTE] Release Apache Lucene.Net 2.9.2-incubating-RC2
All, A quick voting reminder... This [VOTE] thread will only be active for another 4 hours (72 hours total). So far, we have two +1 votes in. After this vote, the release will be proposed to the Incubator PMC, and will have another 72 hour vote for acceptance there. Assuming that passes, it will become an official ASF Incubator release. Thanks, Troy On Fri, Feb 25, 2011 at 12:29 PM, Stefan Bodewig bode...@apache.org wrote: On 2011-02-25, Troy Howard wrote: I updated the .src zip and associated checksums/signatures at: I have verified the bin zip is still the same that I checked. All signatures and hashes are fine, RAT is reasonably happy with the src zip (I've updated http://people.apache.org/~bodewig/Lucene.NET/src-rat.log). +1 from me for the release. Of course I haven't performed any technical tests, just verified the artifacts meet the Incubator requirements (at least all I know of). Hopefully this one will pass the licensing validation! I think I got them all this time.. Thanks a lot. Stefan
Re: [Lucene.Net] [VOTE] Release Apache Lucene.Net 2.9.2-incubating-RC2
Thanks everyone for voting. Voting is now closed. The Apache Lucene.Net 2.9.2-incubating RC2 release passes the PPMC vote with six +1 votes. I'll move on and propose it for official release on the Incubator general mailing list today. Thanks, Troy On Mon, Feb 28, 2011 at 6:58 AM, Michael Herndon mhern...@o19s.com wrote: +1 On Mon, Feb 28, 2011 at 9:50 AM, Prescott Nasser geobmx...@hotmail.com wrote: +1 +1 On Mon, Feb 28, 2011 at 2:44 PM, digy digy wrote: +1 DIGY On Mon, Feb 28, 2011 at 11:04 AM, Glyn Darkin wrote: +1 On 28 Feb 2011, at 08:39, Troy Howard wrote: All, A quick voting reminder... This [VOTE] thread will only be active for another 4 hours (72 hours total). So far, we have two +1 votes in. After this vote, the release will be proposed to the Incubator PMC, and will have another 72 hour vote for acceptance there. Assuming that passes, it will become an official ASF Incubator release. Thanks, Troy On Fri, Feb 25, 2011 at 12:29 PM, Stefan Bodewig wrote: On 2011-02-25, Troy Howard wrote: I updated the .src zip and associated checksums/signatures at: I have verified the bin zip is still the same that I checked. All signatures and hashes are fine, RAT is reasonably happy with the src zip (I've updated ). +1 from me for the release. Of course I haven't performed any technical tests, just verified the artifacts meet the Incubator requirements (at least all I know of). Hopefully this one will pass the licensing validation! I think I got them all this time.. Thanks a lot. Stefan Glyn Darkin Darkin Systems Ltd Mob: 07961815649 Fax: 08717145065 Web: www.darkinsystems.com Company No: 6173001 VAT No: 906350835 -- --Regards, Sergey Mirvoda -- Michael Herndon Senior Developer (mhern...@o19s.com) 804.767.0083 [connect online] http://www.opensourceconnections.com http://www.amptools.net http://www.linkedin.com/pub/michael-herndon/4/893/23 http://www.facebook.com/amptools.net http://www.twitter.com/amptools-net
Re: [Lucene.Net] [VOTE] Release Apache Lucene.Net 2.9.2-incubating-RC1
Stefan, I'm pretty close to finishing a second release candidate... Been busy today/yesterday. Thanks, Troy On Fri, Feb 25, 2011 at 1:25 AM, Stefan Bodewig bode...@apache.org wrote: On 2011-02-23, Stefan Bodewig wrote: Not only cosmetic: * The NOTICE file contains a bad copyright year and doesn't talk about Lucene.NET at all. Make that Lucene.NET rather than Lucene and 2006-2011. * LICENSE talks about src/java/org/apache/lucene/util/UnicodeUtil.java and src/java/org/apache/lucene/util/ArrayUtil.java that certainly don't exist while there are files with different names that the corresponding license entry applies to. * Quite a few files that could contain the ASF license don't. I've run RAT[1] over the distribution archives and the results are here http://people.apache.org/~bodewig/Lucene.NET/ I don't think the .txt files need a license, but the .html, .cs, .xml (at least the ones that are not generated), .config, .nunit and .resources files can and should. One could even argue the .sln and .c[ds]proj files should (the build.xml or pom.xml files of Java projects also do). * some snowball files need to get relicensed under Apache Software License 2.0 (they are still at 1.1). These are so straightforward to fix that even I can do it ;-) Unless anybody yells I'll put some time aside today to create a patch that fixes the issues in trunk and should hopefully be easy to merge to the 2.9.2 tag/branch and will attach it to LUCENENET-381. Stefan
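For anyone wanting a quick local pass before re-running RAT, the missing-header check described above could be approximated with a short Python script. This is only a rough sketch and not a substitute for RAT: the function name files_missing_header, the marker string, and the choice to inspect only the first 4096 bytes of each file are all assumptions made for illustration, and the extension list simply mirrors the file types named in the message.

```python
import os

# Extensions named in the message above; adjust as needed (assumption).
CHECKED_EXTENSIONS = {".html", ".cs", ".xml", ".config", ".nunit", ".resources"}

# Assumed marker: a phrase that appears in the standard ASF source header.
HEADER_MARKER = "Apache License, Version 2.0"


def files_missing_header(root, head_bytes=4096):
    """Return sorted paths (relative to root) whose first head_bytes lack the marker."""
    missing = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Skip file types the message does not ask to be licensed.
            if os.path.splitext(name)[1].lower() not in CHECKED_EXTENSIONS:
                continue
            path = os.path.join(dirpath, name)
            # Read as bytes and decode leniently so binary files do not raise.
            with open(path, "rb") as handle:
                head = handle.read(head_bytes).decode("utf-8", errors="ignore")
            if HEADER_MARKER not in head:
                missing.append(os.path.relpath(path, root))
    return sorted(missing)
```

Calling files_missing_header on an unpacked source archive returns the candidate files to patch; generated .xml files would still need to be excluded by hand, since the sketch cannot tell them apart.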
Re: [Lucene.Net] svn commit: r1074470 - in /incubator/lucene.net/tags/Lucene.Net_2_9_2_RC1: contrib/WordNet.Net/WordNet.Net/SynExpand/ contrib/WordNet.Net/WordNet.Net/SynLookup/ contrib/WordNet.Net/Wo
No problem. Updated them, and a new release candidate is ready (posting now). Thanks, Troy On Fri, Feb 25, 2011 at 3:17 AM, Stefan Bodewig bode...@apache.org wrote: On 2011-02-25, thow...@apache.org wrote: Added app.configs (missing from previous commit) They need licenses as well. Sorry to be a PITA. Stefan