RE: [Lucene.Net] Is a Lucene.Net Line-by-Line Java port needed?
I don't think I'm as hard-core on this as Neal, but remember: the history of the Lucene.NET project is that all the intellectual work, all the understanding of search, and all the new features come from the Lucene Java folks. Theirs is an immensely respected project, and I trust them to add new features that will be well-tested and well-researched, and to have a decent roadmap which I can trust they will execute on.

Now, I know there's been an influx of capable developers to Lucene.NET who are ready, willing and (I'm going to assume) able to add a lot more value in a generic .NET implementation as they change it. But it'll take a while before I trust a dedicated .NET framework that has significantly diverged from Java in the way I trust the line-by-line version. And at what stage is it not just not a line-by-line port, but not a port at all?

At the same time, I recognise that if this project is going to continue, and attract good developers, it has to change in this direction. So, that said, I can see why a line-by-line port might not be sustainable. And most people don't need it. But most of us using Lucene in production systems do need a system that we can trust and rely on.

So let me chime in with someone else's plea: keep the general structure close to Lucene, keep the same general objects and inheritance set-up, and keep the same method names, even if you add other methods and classes to provide additional functionality. ABSOLUTELY the same file formats. End users benefit a lot from a high degree of similarity, with good documentation and help being available from the Java community.

Yours,
Moray

---
Moray McConnachie
Director of IT, +44 1865 261 600
Oxford Analytica  http://www.oxan.com

-----Original Message-----
From: Granroth, Neal V. [mailto:neal.granr...@thermofisher.com]
Sent: 29 June 2011 20:47
To: lucene-net-u...@lucene.apache.org
Cc: lucene-net-...@incubator.apache.org
Subject: RE: [Lucene.Net] Is a Lucene.Net Line-by-Line Java port needed?
This has been discussed many times. Lucene.NET is not valid, the code cannot be trusted, if it is not a line-by-line port. It ceases to be Lucene.

- Neal

-----Original Message-----
From: Scott Lombard [mailto:lombardena...@gmail.com]
Sent: Wednesday, June 29, 2011 1:58 PM
To: lucene-net-dev@lucene.apache.org; lucene-net-u...@lucene.apache.org
Subject: [Lucene.Net] Is a Lucene.Net Line-by-Line Java port needed?

After the large community response about moving the code base from .NET 2.0 to .NET 4.0, I am trying to figure out what the need is for a line-by-line port. Starting with Digy's excellent work on the conversion to generics, a priority of the 2.9.4g release is that the two packages would not be interchangeable. So a faster turnaround from a Java release won't matter to non-line-by-line users; they will have to wait until the updates are made to the non-line-by-line code base.

My question: is there really a user base for the line-by-line port? Anyone have a comment?

Scott

---
Disclaimer: This message and any attachments are confidential and/or privileged. If this has been sent to you in error, please do not use, retain or disclose them, and contact the sender as soon as possible.
Oxford Analytica Ltd, Registered in England: No. 1196703, 5 Alfred Street, Oxford, United Kingdom, OX1 4EH
Re: [Lucene.Net] Is a Lucene.Net Line-by-Line Java port needed?
Can I just plug in my bit and say I agree 100% with what Moray has outlined below. If we move away from the line-by-line port, then over time we'll lose out on the momentum that is Lucene and the improvements that they make. Only if the Lucene.NET community has expertise in search, a deep knowledge of the project, and a community that can guarantee the knowledge will survive members coming and going should such a move be considered. Only when Lucene.NET has stood on its own feet for a number of years, after it has moved out of Apache incubation, should consideration be given to abandoning the line-by-line port.

By all means extend and wrap the libraries in .NET equivalents and .NET goodness like LINQ (we do this internally in our company at the moment); but leave the core of the project on a line-by-line port. Just my tuppence worth.

Kind Regards
Noel

-----Original Message-----
From: Moray McConnachie
Sent: Thursday, June 30, 2011 10:25 AM
To: lucene-net-u...@lucene.apache.org
Cc: lucene-net-...@incubator.apache.org
Subject: RE: [Lucene.Net] Is a Lucene.Net Line-by-Line Java port needed?
RE: [Lucene.Net] Is a Lucene.Net Line-by-Line Java port needed?
As someone from the NHibernate project: we stopped following Hibernate a while ago, and haven't regretted it. We have more features, fewer bugs and a better code base.

Sent from my Windows Phone

-----Original Message-----
From: Rory Plaire
Sent: Thursday, June 30, 2011 19:58
To: lucene-net-dev@lucene.apache.org
Subject: Re: [Lucene.Net] Is a Lucene.Net Line-by-Line Java port needed?

I don't want to drag this out much longer, but I am curious about the people who hold the line-by-line sentiment - are you NHibernate users?

-r
Re: [Lucene.Net] Is a Lucene.Net Line-by-Line Java port needed?
NHibernate has a much bigger community and more active devs, AFAICT.

The proposed changes, as I understand them, are not about changing class structure or APIs; they merely touch chunks of code and rewrite them to use better .NET practices (yield, generics, LINQ, etc.). In conjunction with a move to .NET 4.0, this would increase readability, improve GC behaviour and boost performance. IMO this doesn't have to be a line-by-line port in order to make porting of patches easy - which is what Digy seems to be really worried about (and he's right). As long as the meaning of the code is clear, it shouldn't be a real problem to apply Java patches to the .NET codebase. And as long as the test suite keeps being thorough, there's really nothing to fear.
RE: [Lucene.Net] Is a Lucene.Net Line-by-Line Java port needed?
Michael, you interpret the report as "whoever commits code wins"? But when I look at it, I see a lot of talk, no work. The .NET community is not interested in contributing. I really don't understand what hinders people from working on Lucene.Net. As I did for 2.9.4g: grab the code, do whatever you want with it, and submit it back. If it doesn't fit the project's direction, it can still find a place in contrib or in a branch. All of the approaches can live side by side happily in the Lucene.Net repository.

Troy, I also don't understand why you wait for 2.9.4g. It is a *branch* and has nothing to do with the trunk. It need not be an official release and can live in a branch as a PoC.

As a result, I have got bored of listening to "this should be done that way". What I want to see is "I did it that way; should we continue with this?"

DIGY

-----Original Message-----
From: Troy Howard [mailto:thowar...@gmail.com]
Sent: Thursday, June 30, 2011 10:47 PM
To: lucene-net-dev@lucene.apache.org
Subject: Re: [Lucene.Net] Is a Lucene.Net Line-by-Line Java port needed?

Michael,

I agree with everything you said. My point in saying "whoever commits code wins" was to illustrate the reality of how and why the project has its current form. Building consensus is difficult. It is an essential first step before we can do something like make a list of bite-sized pieces of work that others can take on. This is why my real message of "Let's find a way to accommodate both" is so important. It allows us to build consensus, so that we can settle on a direction and structure our work. Until we accomplish that, it really is "whoever commits code wins", and that is an unhealthy and unmaintainable way to operate.

From a technical perspective, your statements about the unit tests are completely accurate. They really need a LOT of reworking. That's the very first step before making any significant changes. Part of the problem is that the tests themselves are not well written. The other part is that the Lucene object model was not designed for testability, which makes writing good tests more difficult; certain tests might not be possible. It will be difficult to write good unit tests without restructuring. The biggest issue is the use of abstract classes with base behaviour vs interfaces or fully abstracted classes. It makes mocking tough. This is the direction I was going in when I started the Lucere project. I'd like to start on that work after the 2.9.4g release.

Thanks,
Troy

On Thu, Jun 30, 2011 at 12:04 PM, Michael Herndon <mhern...@wickedsoftware.net> wrote:

I'd say that is all the more reason that we need to work smarter, not harder. I'd also say that's a good reason to make sure we build consensus rather than just saying "whoever commits code wins". And it's a damn good reason to focus on growing the number of contributors and lowering the barrier to submitting patches, breaking things down into pieces that people would feel confident working on without being overwhelmed by the complexity of Lucene.Net.

There is a pretty big gap between Lucene 2.9.x and Lucene 4.0; the internals and index formats are significantly different, including nixing the current vint file format and using byte[] array slices for terms instead of char[]. So while porting 1-to-1 may require less knowledge or thought, it's most likely going to require more hours of work. And it's definitely not going to guarantee the stability of the code, or that it's great code. I'd have to say that it's not as stable as most would believe at the moment. Most of the tests avoid anything that remotely looks like the DRY principle, and there is a static constructor in the core test case that throws an exception if it doesn't find an environment variable TEMP, which will fail 90% of the tests; NUnit will be unable to give you a clear reason why. Just to name a few issues I came across while working towards getting Lucene.Net into CI. I haven't even started really digging in under the covers of the code yet.

So my suggestion is to chew on this a bit more and build consensus; avoid fracturing people into sides. Be open to the reservations and concerns that others have and continue to address them.

- Michael

On Thu, Jun 30, 2011 at 2:10 PM, Digy <digyd...@gmail.com> wrote:

Although there are a lot of people using Lucene.Net, this is our contribution report for the past 5 years:
https://issues.apache.org/jira/secure/ConfigureReport.jspa?atl_token=A5KQ-2QAV-T4JA-FDED|3204f7e696067a386144705075c074e991db2a2b|linversionId=-1issueStatus=allselectedProjectId=12310290reportKey=com.sourcelabs.jira.plugin.report.contributions%3AcontributionreportNext=Next

DIGY

-----Original Message-----
From: Ayende Rahien [mailto:aye...@ayende.com]
Sent: Thursday, June 30, 2011 8:16 PM
To: Rory Plaire; lucene-net-dev@lucene.apache.org
Subject: RE: [Lucene.Net] Is a Lucene.Net Line-by-Line Java port needed?
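Troy's testability point is easy to see in miniature. When a dependency is an abstract class carrying concrete base behaviour, every test double must extend it and drags that behaviour (and its I/O paths) along; an interface can be replaced wholesale by a trivial fake. A hedged sketch in Java (the types below are invented purely for illustration and are not actual Lucene or Lucene.Net classes):

```java
// An interface-shaped dependency: trivially fakeable in a test.
interface DocStore {
    int docCount();
}

// An abstract-class-shaped dependency with baked-in base behaviour:
// any test double must extend it and inherits the caching logic,
// which in turn forces the real openAndCount() path to exist.
abstract class BaseDirectoryStore {
    protected int cachedCount = -1;
    public int docCount() {
        if (cachedCount < 0) {
            cachedCount = openAndCount(); // real I/O in a unit test
        }
        return cachedCount;
    }
    protected abstract int openAndCount();
}

public class MockingSketch {
    // Code under test depends only on the interface.
    static boolean isEmpty(DocStore store) {
        return store.docCount() == 0;
    }

    public static void main(String[] args) {
        // A one-line fake stands in for the whole store.
        DocStore fake = () -> 0;
        System.out.println(isEmpty(fake)); // prints "true"
    }
}
```

The same fake against `BaseDirectoryStore` would need a subclass, a constructor, and care that the inherited caching does not interfere, which is exactly why mocking abstract classes with behaviour is "tough".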
RE: [Lucene.Net] Is a Lucene.Net Line-by-Line Java port needed?
A) I don't want to commit anything that's going to piss a lot of people off; B) I don't want to spend time/waste time on modifications that are going to be rejected.

What I've learnt from the Apache Way: create a JIRA issue if you are hesitant. If no one answers in a reasonable time (mostly), then commit.

DIGY

-----Original Message-----
From: Michael Herndon [mailto:mhern...@wickedsoftware.net]
Sent: Thursday, June 30, 2011 11:58 PM
To: lucene-net-dev@lucene.apache.org
Subject: Re: [Lucene.Net] Is a Lucene.Net Line-by-Line Java port needed?

@Troy, I've already started working towards fixing unit testing issues, and prototyping some things that should DRY up the testing, just so that I can get the tests running on Mono. Those changes are currently in a private git repo; however, since we don't have a CI, I need to make some time to manually test those on at least 3 different OSes (Windows, OS X, and Ubuntu) before putting them back into the 2.9.4g branch. The reason being, I need those in working order so that I can do a write-up on pulling them from source and at least running the build script to compile everything and run the tests. I don't know about everyone else, but that's the starting point I look for when I go to work on something or commit something back. They should make their way back sometime this month. I think the next thing I'll do is put my money where my mouth is, spend time breaking down the rest of the CI tasks, then see how much I can get documented into the wiki. The simple faceted search is a decent starting template.

@Digy, I agree with the "talk, no work". Though coming from the outside in, I still cringe when I make any commits at the moment (even that little .gitignore file). A) I don't want to commit anything that's going to piss a lot of people off; B) I don't want to spend time/waste time on modifications that are going to be rejected; C) it took a good deal of going through things before I felt comfortable even making a commit; D) yes, I know I just need to get over it, and so does everyone else (hence the obsession with the unit tests at the moment). And I think a key to getting people to get over it, including myself, is to make the point you made more clear across the board: *If it doesn't fit the project's direction, it can still find a place in contrib or in a branch. All of the approaches can live side by side happily in the Lucene.Net repository.* +1, because that makes me feel there is more leeway to experiment, and any decent effort will at least go somewhere to live and not be wasted.
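Michael's goal of getting the tests running on Mono across OSes runs straight into the TEMP environment-variable problem raised earlier in the thread: a static initializer that throws when TEMP is absent fails most of the suite with no useful diagnostic, and non-Windows systems often set TMPDIR or nothing at all. A minimal sketch of the fallback idea, in Java syntax since the thread concerns the Java codebase (the class and method names here are hypothetical, not Lucene.Net's actual test code):

```java
// Sketch: resolve a scratch directory for tests without hard-failing
// when the TEMP environment variable is missing (as it typically is
// on Linux/OS X under Mono, where TMPDIR or nothing may be set).
public class TempDirResolver {
    public static String resolveTempDir() {
        // Try the Windows-style variable first, then the POSIX one,
        // then fall back to the runtime's own notion of the temp dir.
        String dir = System.getenv("TEMP");
        if (dir == null || dir.isEmpty()) {
            dir = System.getenv("TMPDIR");
        }
        if (dir == null || dir.isEmpty()) {
            dir = System.getProperty("java.io.tmpdir");
        }
        return dir;
    }

    public static void main(String[] args) {
        System.out.println(resolveTempDir());
    }
}
```

Because `java.io.tmpdir` (like .NET's `Path.GetTempPath()`) is always defined, the resolver never throws, so a missing environment variable degrades to a sensible default instead of failing 90% of the tests from a static constructor.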
Re: [Lucene.Net] Is a Lucene.Net Line-by-Line Java port needed?
DIGY -

Re: why do I wait... That's mostly because I intend to make some deep changes, which would make merging the 2.9.4g branch back to trunk difficult. So it's easier to merge those changes first. Also, I won't have enough time to make my changes until a little way into the future, but I probably do have the time to put together another release, so I'll do that first because it fits my work/life schedule.

Thanks,
Troy
Re: [Lucene.Net] Is a Lucene.Net Line-by-Line Java port needed?
Scott - The idea of the automated port is still worth doing. Perhaps it makes sense for someone more passionate about the line-by-line idea to do that work? I would say, focus on what makes sense to you. Being productive, regardless of the specific direction, is what will be most valuable. Once you start, others will join and momentum will build. That is how these things work. I like DIGY's approach too, but the problem with it is that it is a never-ending manual task. The theory behind the automated port is that it may reduce the manual work. It is complicated, but once it's built and works, it will save a lot of future development hours. If it's built in a sufficiently general manner, it could be useful for other projects like Lucene.Net that want to automate a port from Java to C#. It might make sense for that to be a separate project from Lucene.Net, though. -T On Thu, Jun 30, 2011 at 2:13 PM, Scott Lombard lombardena...@gmail.com wrote: Ok, I think I asked the wrong question. I am trying to figure out where to put my time. I was thinking about working on the automated porting system, but when I saw the response to the .NET 4.0 discussions I started to question if that is the right direction. The community seemed to be more interested in the .NET features. The complexity of the automated tool is going to become very high and will probably end up with a line-for-line style port. So I keep asking myself whether the automated tool is worth it. I don't think it is. I like the method Digy has been using for porting the code. So I guess for me the real question is: Digy, where do you see 2.9.4g going next, and what do you need help on? Scott -Original Message- From: Digy [mailto:digyd...@gmail.com] Sent: Thursday, June 30, 2011 4:20 PM To: lucene-net-dev@lucene.apache.org Subject: RE: [Lucene.Net] Is a Lucene.Net Line-by-Line Java port needed? Michael, You interpret the report as "whoever commits code wins"? But when I look at it, I see a lot of talk, no work.
The .Net community is not interested in contributing. I really don't understand what hinders people from working on Lucene.Net. As I did for 2.9.4g, grab the code, do whatever you want on it, and submit it back. If it doesn't fit the project's direction it can still find a place in contrib or in a branch. All of the approaches can live side by side happily in the Lucene.Net repository. Troy, I also don't understand why you are waiting for 2.9.4g. It is a *branch* and has nothing to do with the trunk. It need not be an official release and can live in a branch as a PoC. As a result, I got bored of listening to "this should be done that way." What I want to see is "I did it that way; should we continue with this?" DIGY -Original Message- From: Troy Howard [mailto:thowar...@gmail.com] Sent: Thursday, June 30, 2011 10:47 PM To: lucene-net-dev@lucene.apache.org Subject: Re: [Lucene.Net] Is a Lucene.Net Line-by-Line Java port needed? Michael, I agree with everything you said. My point in saying "whoever commits code wins" was to illustrate the reality of how and why the project has its current form. Building consensus is difficult. It is an essential first step before we can do something like make a list of bite-sized pieces of work that others can work on. This is why my real message of "Let's find a way to accommodate both" is so important. It allows us to build consensus, so that we can settle on a direction and structure our work. Until we accomplish that, it really is "whoever commits code wins," and that is an unhealthy and unmaintainable way to operate. From a technical perspective, your statements about the unit tests are completely accurate. They really need a LOT of reworking. That's the very first step before making any significant changes. Part of the problem is that the tests themselves are not well written. The other part is that the Lucene object model was not designed for testability, which makes writing good tests more difficult, and certain tests might not be possible.
It will be difficult to write good unit tests without restructuring. The biggest issue is the use of abstract classes with base behaviour versus interfaces or fully abstract classes, which makes mocking tough. This is the direction I was going in when I started the Lucere project. I'd like to start in on that work after the 2.9.4g release. Thanks, Troy On Thu, Jun 30, 2011 at 12:04 PM, Michael Herndon mhern...@wickedsoftware.net wrote: I'd say that is all the more reason that we need to work smarter and not harder. I'd also say that's a good reason to make sure we build consensus rather than just saying "whoever commits code wins". And it's a damn good reason to focus on the effort of growing the number of contributors and lowering the barrier to submitting patches, breaking things down into pieces
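[Editor's note: Troy's point about abstract classes with base behaviour versus interfaces can be illustrated with a small, hypothetical Java sketch. None of the class names below come from Lucene; they are invented purely to show why baked-in base behaviour makes test doubles hard to write, while a pure interface lets a test supply a fully controlled fake.]

```java
import java.util.ArrayList;
import java.util.List;

// Hard to mock: any test double MUST inherit the caching behaviour below,
// because readDocument() is final and owns the control flow.
abstract class AbstractDocumentReader {
    private final List<String> cache = new ArrayList<>();

    public final String readDocument(int id) {
        if (id < cache.size()) {
            return cache.get(id); // test doubles get this path whether they want it or not
        }
        String doc = loadFromDisk(id);
        cache.add(doc);
        return doc;
    }

    protected abstract String loadFromDisk(int id);
}

// Easy to mock: the contract is pure, so a fake can implement it directly.
interface DocumentReader {
    String readDocument(int id);
}

// Code under test depends only on the interface.
class SearchService {
    private final DocumentReader reader;

    SearchService(DocumentReader reader) {
        this.reader = reader;
    }

    boolean contains(int id, String term) {
        return reader.readDocument(id).contains(term);
    }
}

class MockingDemo {
    public static void main(String[] args) {
        // A hand-rolled fake: trivial because DocumentReader carries no base behaviour.
        DocumentReader fake = id -> "doc-" + id + " lucene search";
        SearchService service = new SearchService(fake);

        System.out.println(service.contains(1, "lucene")); // true
        System.out.println(service.contains(1, "solr"));   // false
    }
}
```

With the abstract class, a fake subclass cannot bypass the cache or observe calls cleanly; with the interface, the test owns every byte of behaviour, which is the restructuring Troy is arguing for.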
Re: [Lucene.Net] Is a Lucene.Net Line-by-Line Java port needed?
Michael - If you bring those changes from git into a branch in SVN, we can help with it. It doesn't have to be complete to be committed. :) Regarding A (angering people)/B (being rejected)/C (feeling comfortable)/D (getting over it)... a) Making progress is more important than keeping everyone happy. b) Our goal is to accept things, not reject them. That said, if something gets rejected due to quality issues, don't be afraid of that; it's a learning experience for everyone, and it's a good thing. We can work together to get to something everyone is happy with and learn in the process. c) Commit to a branch. Merge when things are right. No one expects branches to build or be finished. It's OK. I get worried when I merge to trunk or when I make a release. But I don't do that until I'm pretty sure it's all legit. d) The best way to get over it is to start doing it. I know you probably already realize all of this, but I wanted to respond so that, in case anyone else out there is struggling with the same set of fears, they can see that fears that prevent action are more problematic than any action they might take without those fears. Thanks, Troy On Thu, Jun 30, 2011 at 1:57 PM, Michael Herndon mhern...@wickedsoftware.net wrote: @Troy, I've already started working towards fixing unit testing issues, and prototyping some things that should DRY up the testing, just so that I can get the tests running on Mono. Those changes are currently in a private git repo; however, since we don't have a CI server, I need to make some time to manually test those on at least 3 different OSes (Windows, OS X, and Ubuntu) before putting those back into the 2.9.4g branch. The reason being, I need those in working order so that I can do a write-up on pulling them from source and at least running the build script to compile everything and run the tests for you. I don't know about everyone else, but that's the starting point I look for when I go to work on something or commit something back.
They should make their way back sometime this month. I think the next thing I'll do is put my money where my mouth is: spend time breaking down the rest of the CI tasks, then see how much stuff I can get documented in the wiki. The simple faceted search is a decent starting template. @Digy I agree with the "talk, no work" point. Though, coming from the outside in, I still cringe when I make any commits at the moment (even that little .gitignore file). A) I don't want to commit anything that's going to piss a lot of people off. B) I don't want to spend time/waste time on modifications that are going to be rejected. C) It took a good deal of going through things before I felt comfortable even making a commit. D) Yes, I know I just need to get over it, and so does everyone else (hence the obsession with the unit tests at the moment). And I think a key to helping people get over it, including myself, is to make the point you made more clear across the board: *If it doesn't fit the project's direction it can still find a place in contrib or in a branch. All of the approaches can live side by side happily in the Lucene.Net repository.* +1, because that makes me feel there is more leeway to experiment, and any decent effort will at least go somewhere to live and not be wasted. On Thu, Jun 30, 2011 at 4:19 PM, Digy digyd...@gmail.com wrote: Michael, You interpret the report as "whoever commits code wins"? But when I look at it, I see a lot of talk, no work. The .Net community is not interested in contributing. I really don't understand what hinders people from working on Lucene.Net. As I did for 2.9.4g, grab the code, do whatever you want on it, and submit it back. If it doesn't fit the project's direction it can still find a place in contrib or in a branch. All of the approaches can live side by side happily in the Lucene.Net repository. Troy, I also don't understand why you are waiting for 2.9.4g. It is a *branch* and has nothing to do with the trunk.
It need not be an official release and can live in a branch as a PoC. As a result, I got bored of listening to "this should be done that way." What I want to see is "I did it that way; should we continue with this?" DIGY -Original Message- From: Troy Howard [mailto:thowar...@gmail.com] Sent: Thursday, June 30, 2011 10:47 PM To: lucene-net-dev@lucene.apache.org Subject: Re: [Lucene.Net] Is a Lucene.Net Line-by-Line Java port needed? Michael, I agree with everything you said. My point in saying "whoever commits code wins" was to illustrate the reality of how and why the project has its current form. Building consensus is difficult. It is an essential first step before we can do something like make a list of bite-sized pieces of work that others can work on. This is why my real message of "Let's find a way to accommodate both" is so important. It allows us to build consensus, so that we can settle on
RE: [Lucene.Net] Is a Lucene.Net Line-by-Line Java port needed?
I cannot say I like this approach, but till we find an automated way (with good results), it seems to be the only one we can use. DIGY -Original Message- From: Troy Howard [mailto:thowar...@gmail.com] Sent: Friday, July 01, 2011 12:43 AM To: lucene-net-dev@lucene.apache.org Subject: Re: [Lucene.Net] Is a Lucene.Net Line-by-Line Java port needed? Scott - The idea of the automated port is still worth doing. Perhaps it makes sense for someone more passionate about the line-by-line idea to do that work? I would say, focus on what makes sense to you. Being productive, regardless of the specific direction, is what will be most valuable. Once you start, others will join and momentum will build. That is how these things work. I like DIGY's approach too, but the problem with it is that it is a never-ending manual task. The theory behind the automated port is that it may reduce the manual work. It is complicated, but once it's built and works, it will save a lot of future development hours. If it's built in a sufficiently general manner, it could be useful for other projects like Lucene.Net that want to automate a port from Java to C#. It might make sense for that to be a separate project from Lucene.Net, though. -T On Thu, Jun 30, 2011 at 2:13 PM, Scott Lombard lombardena...@gmail.com wrote: Ok, I think I asked the wrong question. I am trying to figure out where to put my time. I was thinking about working on the automated porting system, but when I saw the response to the .NET 4.0 discussions I started to question if that is the right direction. The community seemed to be more interested in the .NET features. The complexity of the automated tool is going to become very high and will probably end up with a line-for-line style port. So I keep asking myself whether the automated tool is worth it. I don't think it is. I like the method Digy has been using for porting the code. So I guess for me the real question is: Digy, where do you see 2.9.4g going next, and what do you need help on?
Scott -Original Message- From: Digy [mailto:digyd...@gmail.com] Sent: Thursday, June 30, 2011 4:20 PM To: lucene-net-dev@lucene.apache.org Subject: RE: [Lucene.Net] Is a Lucene.Net Line-by-Line Java port needed? Michael, You interpret the report as "whoever commits code wins"? But when I look at it, I see a lot of talk, no work. The .Net community is not interested in contributing. I really don't understand what hinders people from working on Lucene.Net. As I did for 2.9.4g, grab the code, do whatever you want on it, and submit it back. If it doesn't fit the project's direction it can still find a place in contrib or in a branch. All of the approaches can live side by side happily in the Lucene.Net repository. Troy, I also don't understand why you are waiting for 2.9.4g. It is a *branch* and has nothing to do with the trunk. It need not be an official release and can live in a branch as a PoC. As a result, I got bored of listening to "this should be done that way." What I want to see is "I did it that way; should we continue with this?" DIGY -Original Message- From: Troy Howard [mailto:thowar...@gmail.com] Sent: Thursday, June 30, 2011 10:47 PM To: lucene-net-dev@lucene.apache.org Subject: Re: [Lucene.Net] Is a Lucene.Net Line-by-Line Java port needed? Michael, I agree with everything you said. My point in saying "whoever commits code wins" was to illustrate the reality of how and why the project has its current form. Building consensus is difficult. It is an essential first step before we can do something like make a list of bite-sized pieces of work that others can work on. This is why my real message of "Let's find a way to accommodate both" is so important. It allows us to build consensus, so that we can settle on a direction and structure our work. Until we accomplish that, it really is "whoever commits code wins," and that is an unhealthy and unmaintainable way to operate. From a technical perspective, your statements about the unit tests are completely accurate.
They really need a LOT of reworking. That's the very first step before making any significant changes. Part of the problem is that the tests themselves are not well written. The other part is that the Lucene object model was not designed for testability, which makes writing good tests more difficult, and certain tests might not be possible. It will be difficult to write good unit tests without restructuring. The biggest issue is the use of abstract classes with base behaviour versus interfaces or fully abstract classes, which makes mocking tough. This is the direction I was going in when I started the Lucere project. I'd like to start in on that work after the 2.9.4g release. Thanks, Troy On Thu, Jun 30, 2011 at 12:04 PM, Michael Herndon mhern...@wickedsoftware.net wrote: I'd say that is all the more
Re: [Lucene.Net] Is a Lucene.Net Line-by-Line Java port needed?
So, veering towards action - are there concrete tasks written up anywhere for the unit tests? If a poor schlep like me wanted to dig in and start to improve them, where would I get the understanding of what is good and what needs help? -r On Thu, Jun 30, 2011 at 3:29 PM, Digy digyd...@gmail.com wrote: I cannot say I like this approach, but till we find an automated way (with good results), it seems to be the only one we can use. DIGY -Original Message- From: Troy Howard [mailto:thowar...@gmail.com] Sent: Friday, July 01, 2011 12:43 AM To: lucene-net-dev@lucene.apache.org Subject: Re: [Lucene.Net] Is a Lucene.Net Line-by-Line Java port needed? Scott - The idea of the automated port is still worth doing. Perhaps it makes sense for someone more passionate about the line-by-line idea to do that work? I would say, focus on what makes sense to you. Being productive, regardless of the specific direction, is what will be most valuable. Once you start, others will join and momentum will build. That is how these things work. I like DIGY's approach too, but the problem with it is that it is a never-ending manual task. The theory behind the automated port is that it may reduce the manual work. It is complicated, but once it's built and works, it will save a lot of future development hours. If it's built in a sufficiently general manner, it could be useful for other projects like Lucene.Net that want to automate a port from Java to C#. It might make sense for that to be a separate project from Lucene.Net, though. -T On Thu, Jun 30, 2011 at 2:13 PM, Scott Lombard lombardena...@gmail.com wrote: Ok, I think I asked the wrong question. I am trying to figure out where to put my time. I was thinking about working on the automated porting system, but when I saw the response to the .NET 4.0 discussions I started to question if that is the right direction. The community seemed to be more interested in the .NET features.
The complexity of the automated tool is going to become very high and will probably end up with a line-for-line style port. So I keep asking myself whether the automated tool is worth it. I don't think it is. I like the method Digy has been using for porting the code. So I guess for me the real question is: Digy, where do you see 2.9.4g going next, and what do you need help on? Scott -Original Message- From: Digy [mailto:digyd...@gmail.com] Sent: Thursday, June 30, 2011 4:20 PM To: lucene-net-dev@lucene.apache.org Subject: RE: [Lucene.Net] Is a Lucene.Net Line-by-Line Java port needed? Michael, You interpret the report as "whoever commits code wins"? But when I look at it, I see a lot of talk, no work. The .Net community is not interested in contributing. I really don't understand what hinders people from working on Lucene.Net. As I did for 2.9.4g, grab the code, do whatever you want on it, and submit it back. If it doesn't fit the project's direction it can still find a place in contrib or in a branch. All of the approaches can live side by side happily in the Lucene.Net repository. Troy, I also don't understand why you are waiting for 2.9.4g. It is a *branch* and has nothing to do with the trunk. It need not be an official release and can live in a branch as a PoC. As a result, I got bored of listening to "this should be done that way." What I want to see is "I did it that way; should we continue with this?" DIGY -Original Message- From: Troy Howard [mailto:thowar...@gmail.com] Sent: Thursday, June 30, 2011 10:47 PM To: lucene-net-dev@lucene.apache.org Subject: Re: [Lucene.Net] Is a Lucene.Net Line-by-Line Java port needed? Michael, I agree with everything you said. My point in saying "whoever commits code wins" was to illustrate the reality of how and why the project has its current form. Building consensus is difficult. It is an essential first step before we can do something like make a list of bite-sized pieces of work that others can work on.
This is why my real message of "Let's find a way to accommodate both" is so important. It allows us to build consensus, so that we can settle on a direction and structure our work. Until we accomplish that, it really is "whoever commits code wins," and that is an unhealthy and unmaintainable way to operate. From a technical perspective, your statements about the unit tests are completely accurate. They really need a LOT of reworking. That's the very first step before making any significant changes. Part of the problem is that the tests themselves are not well written. The other part is that the Lucene object model was not designed for testability, which makes writing good tests more difficult, and certain tests might not be possible. It will be difficult to write good unit tests without
Re: code to call arbitrary function on Python modules, and eval()
On Jul 1, 2011, at 0:49, Bill Janssen jans...@parc.com wrote: Here's some code implementing a class called PythonModule, Hmm, no code was received here... Andi.. which allows Java code to invoke arbitrary module-level functions, and allows use of Python's eval built-in. The Python code is a bit tricky; is there a better way to cast a Java scalar type to its Python equivalent value? Bill
[jira] [Commented] (LUCENE-3241) Remove Lucene core's FunctionQuery impls
[ https://issues.apache.org/jira/browse/LUCENE-3241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057635#comment-13057635 ] Chris Male commented on LUCENE-3241: I will re-evaluate the tests and port what I can. Remove Lucene core's FunctionQuery impls Key: LUCENE-3241 URL: https://issues.apache.org/jira/browse/LUCENE-3241 Project: Lucene - Java Issue Type: Sub-task Components: core/search Reporter: Chris Male Assignee: Chris Male Fix For: 4.0 Attachments: LUCENE-3241.patch As part of the consolidation of FunctionQuerys, we want to remove Lucene core's impls. Included in this work, we will make sure that all the functionality provided by the core impls is also provided by the new module. Any tests will be ported across too, to increase the test coverage. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-3261) Faceting module userguide
[ https://issues.apache.org/jira/browse/LUCENE-3261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shai Erera updated LUCENE-3261: --- Attachment: facet-userguide.pdf Attaching the userguide from LUCENE-3079. Faceting module userguide - Key: LUCENE-3261 URL: https://issues.apache.org/jira/browse/LUCENE-3261 Project: Lucene - Java Issue Type: Improvement Components: modules/facet Reporter: Shai Erera Assignee: Shai Erera Priority: Minor Attachments: facet-userguide.pdf In LUCENE-3079 I've uploaded a userguide for the faceting module. I'd like to discuss where the best place to include it is. We include it with the code (in our SVN), so that it's always attached to some branch (or in other words, a release). That way we can have versions of it per release that reflect API changes. This document is like the file format document, or any other document we put under site-versioned. So we have two places: * facet/docs * site/userguides Unlike the site, whose PDFs are built automatically by Forrest, we cannot convert ODT to PDF with it, so it's a challenge to put it there. What we do today (in our SVN) is whoever updates the userguide creates a PDF too; that's easy from OpenOffice. I'll upload the file later when I'm in front of it. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3264) crank up faceting module tests
[ https://issues.apache.org/jira/browse/LUCENE-3264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057668#comment-13057668 ] Shai Erera commented on LUCENE-3264: Patch looks very good. All tests pass for me (I've applied on trunk only). A few things I've noticed: * Previously the tests took 1m20s to run; now they take 2m55s. I guess it's because previously we only created RAMDirs, while now newDirectory picks FSDir from time to time (10%?). * FacetTestUtils.close*() can be removed and calls replaced by IOUtils.closeSafely. This is not critical, just removes redundant code. * You added a TODO to CategoryListIteratorTest about the test failing if TieredMP is used. In general TieredMP is not good for the taxonomy index, which relies on Lucene doc IDs, and therefore segments must be merged in order. LTW uses LMP specifically because of that. I will look into the test to understand why it would care about doc IDs, since it doesn't use the taxonomy index at all. * There are a few places with code like: assertTrue("Would like to test this with deletions!", indexReader.hasDeletions()) and assertTrue("Would like to test this with deletions!", indexReader.numDeletedDocs() > 0) which you removed. Any reason? * You added a TODO to TestScoredDocIDsUtils (about reader is read-only) -- you're right, the comment can be deleted. While I reviewed, I was thinking that RandomIndexWriter is used to replace the IndexWriter for content indexing. While this is good, it does not cover the 'taxonomy' indexing. So I wonder if we should have, under facet/test/o.a.l.utils, a RandomTaxonomyWriter which opens RIW internally? This is very impressive progress, Robert - thanks for doing it! I am +1 to commit, after we resolve the tiny issues I raised above. We can add RandomTaxonomyWriter as a follow-on commit.
crank up faceting module tests -- Key: LUCENE-3264 URL: https://issues.apache.org/jira/browse/LUCENE-3264 Project: Lucene - Java Issue Type: Test Components: modules/facet Reporter: Robert Muir Assignee: Robert Muir Fix For: 3.4, 4.0 Attachments: LUCENE-3264.patch The faceting module has a large set of good tests. Let's switch them over to use all of our test infra (randomindexwriter, random iwconfig, mockanalyzer, newDirectory, ...). I don't want to address multipliers and atLeast() etc. on this issue; I think we should follow up with that on a separate issue that also looks at speed and making sure the nightly build is exhaustive. For now, let's just get the coverage in; it will be good to do before any refactoring. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
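[Editor's note: the review above suggests replacing FacetTestUtils.close*() with IOUtils.closeSafely. The sketch below is written from scratch to illustrate the general idea of such a helper - it is not Lucene's actual IOUtils implementation, and the names SafeClose/closeSafely are invented here. The point is to close every resource even when one of them throws, rather than abort on the first failure.]

```java
import java.io.Closeable;
import java.io.IOException;

// A from-scratch sketch of a closeSafely-style helper: attempt to close
// every argument, remember the first IOException, and (optionally) rethrow
// it only after all resources have been given a chance to close.
class SafeClose {

    static void closeSafely(boolean suppressExceptions, Closeable... objects) throws IOException {
        IOException first = null;
        for (Closeable c : objects) {
            try {
                if (c != null) {
                    c.close(); // always attempted, regardless of earlier failures
                }
            } catch (IOException e) {
                if (first == null) {
                    first = e; // remember only the first exception
                }
            }
        }
        if (!suppressExceptions && first != null) {
            throw first;
        }
    }

    public static void main(String[] args) throws IOException {
        final StringBuilder log = new StringBuilder();
        Closeable failing = () -> { log.append("A"); throw new IOException("boom"); };
        Closeable healthy = () -> log.append("B");

        // Both resources get closed even though the first one throws.
        closeSafely(true, failing, healthy);
        System.out.println(log); // AB
    }
}
```

A naive loop of plain close() calls would have stopped at the first exception and leaked the second resource; centralizing the pattern in one helper is why the review can delete the per-test close*() methods.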
[jira] [Commented] (LUCENE-3241) Remove Lucene core's FunctionQuery impls
[ https://issues.apache.org/jira/browse/LUCENE-3241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057694#comment-13057694 ] Chris Male commented on LUCENE-3241: Command for patch: {code} svn move lucene/src/java/org/apache/lucene/search/function/NumericIndexDocValueSource.java modules/queries/src/java/org/apache/lucene/queries/function/valuesource/ svn move lucene/src/test/org/apache/lucene/search/function/TestFieldScoreQuery.java modules/queries/src/test/org/apache/lucene/queries/function/ svn move lucene/src/test/org/apache/lucene/search/function/TestOrdValues.java modules/queries/src/test/org/apache/lucene/queries/function/ svn --force delete lucene/src/java/org/apache/lucene/search/function svn --force delete lucene/src/test/org/apache/lucene/search/function {code} Remove Lucene core's FunctionQuery impls Key: LUCENE-3241 URL: https://issues.apache.org/jira/browse/LUCENE-3241 Project: Lucene - Java Issue Type: Sub-task Components: core/search Reporter: Chris Male Assignee: Chris Male Fix For: 4.0 Attachments: LUCENE-3241.patch, LUCENE-3241.patch As part of the consolidation of FunctionQuerys, we want to remove Lucene core's impls. Included in this work, we will make sure that all the functionality provided by the core impls is also provided by the new module. Any tests will be ported across too, to increase the test coverage. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-3241) Remove Lucene core's FunctionQuery impls
[ https://issues.apache.org/jira/browse/LUCENE-3241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Male updated LUCENE-3241: --- Attachment: LUCENE-3241.patch New patch which incorporates Robert's suggestions. I have salvaged some of the tests, but there's definitely a big TODO with regard to the test coverage. Command coming up. Remove Lucene core's FunctionQuery impls Key: LUCENE-3241 URL: https://issues.apache.org/jira/browse/LUCENE-3241 Project: Lucene - Java Issue Type: Sub-task Components: core/search Reporter: Chris Male Assignee: Chris Male Fix For: 4.0 Attachments: LUCENE-3241.patch, LUCENE-3241.patch As part of the consolidation of FunctionQuerys, we want to remove Lucene core's impls. Included in this work, we will make sure that all the functionality provided by the core impls is also provided by the new module. Any tests will be ported across too, to increase the test coverage. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (LUCENE-3079) Faceting module
[ https://issues.apache.org/jira/browse/LUCENE-3079?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shai Erera resolved LUCENE-3079. Resolution: Fixed Faceting module in 3.x and trunk, tests pass, opened follow-up issues. I think we can close this. Thanks to everyone for helping get this in so quickly! Faceting module --- Key: LUCENE-3079 URL: https://issues.apache.org/jira/browse/LUCENE-3079 Project: Lucene - Java Issue Type: Improvement Components: modules/facet Reporter: Michael McCandless Assignee: Shai Erera Fix For: 3.4, 4.0 Attachments: LUCENE-3079-dev-tools.patch, LUCENE-3079.patch, LUCENE-3079.patch, LUCENE-3079.patch, LUCENE-3079.patch, LUCENE-3079_4x.patch, LUCENE-3079_4x_broken.patch, TestPerformanceHack.java, facet-userguide.pdf Faceting is a hugely important feature, available in Solr today but not [easily] usable by Lucene-only apps. We should fix this, by creating a shared faceting module. Ideally, we factor out Solr's faceting impl, and maybe poach/merge from other impls (eg Bobo browse). Hoss describes some important challenges we'll face in doing this (http://markmail.org/message/5w35c2fr4zkiwsz6), copied here: {noformat} To look at faceting as a concrete example, there are big reasons faceting works so well in Solr: Solr has total control over the index, knows exactly when the index has changed to rebuild caches, has a strict schema so it can make sense of field types and pick faceting algos accordingly, has a multi-phase distributed search approach to get exact counts efficiently across multiple shards, etc... (and there are still a lot of additional enhancements and improvements that can be made to take even more advantage of knowledge Solr has because it owns the index, that no one has had time to tackle) {noformat} This is a great list of the things we face in refactoring.
It's also important because, if Solr needed to be so deeply intertwined with caching, schema, etc., other apps that want to facet will have the same needs and so we really have to address them in creating the shared module. I think we should get a basic faceting module started, but should not cut Solr over at first. We should iterate on the module, fold in improvements, etc., and then, once we can fully verify that cutting over doesn't hurt Solr (ie lose functionality or performance) we can later cutover. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-3216) Store DocValues per segment instead of per field
[ https://issues.apache.org/jira/browse/LUCENE-3216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer updated LUCENE-3216: Attachment: LUCENE-3216.patch We are getting closer to the overall target here. This patch enables each codec to decide whether to use CFS for DocValues or write individual files. To configure this and more stuff per codec, I introduced a CodecConfig (just like IWC) that holds configuration for core codecs and is passed to each codec on creation. I added test cases for the config and for nested CFS, in case IW or SegmentMerger decides to use CFS too, so DocValues can still safely open the CFS. For test coverage I added a static newCodecConfig() to LuceneTestCase that randomly configures a codec per file to use CFS or individual files for DocValues, and other stuff I figured makes sense in the CodecConfig. All tests pass and there is no nocommit left. I think it's close. Review is appreciated. Store DocValues per segment instead of per field Key: LUCENE-3216 URL: https://issues.apache.org/jira/browse/LUCENE-3216 Project: Lucene - Java Issue Type: Improvement Components: core/index Affects Versions: 4.0 Reporter: Simon Willnauer Assignee: Simon Willnauer Fix For: 4.0 Attachments: LUCENE-3216.patch, LUCENE-3216.patch, LUCENE-3216.patch, LUCENE-3216.patch, LUCENE-3216.patch, LUCENE-3216_floats.patch Currently we are storing docvalues per field, which results in at least one file per field that uses docvalues (or at most two per field per segment, depending on the impl.). Yet, we should try to by default pack docvalues into a single file if possible. To enable this we need to hold all docvalues in memory during indexing and write them to disk once we flush a segment. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-3216) Store DocValues per segment instead of per field
[ https://issues.apache.org/jira/browse/LUCENE-3216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer updated LUCENE-3216: Attachment: LUCENE-3239.patch Since the vote has passed, here is a patch to cut over the build and references to 1.6. Store DocValues per segment instead of per field Key: LUCENE-3216 URL: https://issues.apache.org/jira/browse/LUCENE-3216 Project: Lucene - Java Issue Type: Improvement Components: core/index Affects Versions: 4.0 Reporter: Simon Willnauer Assignee: Simon Willnauer Fix For: 4.0 Attachments: LUCENE-3216.patch, LUCENE-3216.patch, LUCENE-3216.patch, LUCENE-3216.patch, LUCENE-3216.patch, LUCENE-3216_floats.patch, LUCENE-3239.patch Currently we are storing docvalues per field, which results in at least one file per field that uses docvalues (or at most two per field per segment, depending on the impl.). Yet, we should try to by default pack docvalues into a single file if possible. To enable this we need to hold all docvalues in memory during indexing and write them to disk once we flush a segment. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (LUCENE-3142) benchmark/stats package is obsolete and unused - remove it
[ https://issues.apache.org/jira/browse/LUCENE-3142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Doron Cohen resolved LUCENE-3142. - Resolution: Fixed r1141465: trunk r1141468: 3x benchmark/stats package is obsolete and unused - remove it -- Key: LUCENE-3142 URL: https://issues.apache.org/jira/browse/LUCENE-3142 Project: Lucene - Java Issue Type: Bug Components: modules/benchmark Reporter: Doron Cohen Assignee: Doron Cohen Priority: Minor This seems like a leftover from the original benchmark implementation and can thus be removed. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-3216) Store DocValues per segment instead of per field
[ https://issues.apache.org/jira/browse/LUCENE-3216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer updated LUCENE-3216: Comment: was deleted (was: since the vote has passed here is a patch to cut over the build and references to 1.6) Store DocValues per segment instead of per field Key: LUCENE-3216 URL: https://issues.apache.org/jira/browse/LUCENE-3216 Project: Lucene - Java Issue Type: Improvement Components: core/index Affects Versions: 4.0 Reporter: Simon Willnauer Assignee: Simon Willnauer Fix For: 4.0 Attachments: LUCENE-3216.patch, LUCENE-3216.patch, LUCENE-3216.patch, LUCENE-3216.patch, LUCENE-3216.patch, LUCENE-3216_floats.patch currently we are storing docvalues per field which results in at least one file per field that uses docvalues (or at most two per field per segment depending on the impl.). Yet, we should try to by default pack docvalues into a single file if possible. To enable this we need to hold all docvalues in memory during indexing and write them to disk once we flush a segment. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-3216) Store DocValues per segment instead of per field
[ https://issues.apache.org/jira/browse/LUCENE-3216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer updated LUCENE-3216: Attachment: (was: LUCENE-3239.patch)
[jira] [Updated] (LUCENE-3239) drop java 5 support
[ https://issues.apache.org/jira/browse/LUCENE-3239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer updated LUCENE-3239: Attachment: LUCENE-3239.patch this patch moves the build and metadata to 1.6 drop java 5 support - Key: LUCENE-3239 URL: https://issues.apache.org/jira/browse/LUCENE-3239 Project: Lucene - Java Issue Type: Task Reporter: Robert Muir Attachments: LUCENE-3239.patch It's been discussed here and there, but I think we need to drop Java 5 support, for these reasons: * It's totally untested by any continuous build process. Testing Java 5 only when there is a release candidate ready is not enough. If we are to claim support, then we need a Hudson actually running the tests with Java 5. * It's now unmaintained, so bugs have to either be hacked around, tests disabled, or warnings placed, but some things simply cannot be fixed... we cannot actually support something that is no longer maintained: we do find JRE bugs (http://wiki.apache.org/lucene-java/SunJavaBugs) and it's important that bugs actually get fixed: we cannot do everything with hacks. * Because of its limitations, we do things like accept 20% slower grouping speed. I find it hard to believe we are sacrificing performance for this. So, in summary: because we don't test it at all, because it's buggy and unmaintained, and because we are sacrificing performance, I think we need to cut the build system over for the next release to require Java 6.
[jira] [Commented] (LUCENE-3239) drop java 5 support
[ https://issues.apache.org/jira/browse/LUCENE-3239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057765#comment-13057765 ] Uwe Schindler commented on LUCENE-3239: --- Patch looks fine, Jenkins already moved.
[jira] [Commented] (LUCENE-3239) drop java 5 support
[ https://issues.apache.org/jira/browse/LUCENE-3239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057772#comment-13057772 ] Simon Willnauer commented on LUCENE-3239: - I just committed that patch; I will continue on all the *.java TODOs
[jira] [Updated] (LUCENE-3239) drop java 5 support
[ https://issues.apache.org/jira/browse/LUCENE-3239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer updated LUCENE-3239: Attachment: LUCENE-3239.patch Here is a patch that fixes almost all TODOs except for the one in NativeFSLock; I think for that we should open a separate issue. I didn't convert all the ArrayUtils yet; I think we can do that later in a follow-up too.
[jira] [Commented] (LUCENE-3260) need a test that uses termsenum.seekExact() (which returns true), then calls next()
[ https://issues.apache.org/jira/browse/LUCENE-3260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057778#comment-13057778 ] Michael McCandless commented on LUCENE-3260: Thanks Shai! The 200+ iterations are exceptionally fast since they only do 1 TermsEnum op per iter (it's the indexing that'll be slow in this test -- for that I do numDocs = atLeast(10)). Also, this bug only happens when seekExact is followed by next, only on certain terms, and only on a multi-seg index. So it seems an OK investment of CPU for test coverage ;) need a test that uses termsenum.seekExact() (which returns true), then calls next() --- Key: LUCENE-3260 URL: https://issues.apache.org/jira/browse/LUCENE-3260 Project: Lucene - Java Issue Type: Bug Reporter: Robert Muir Assignee: Michael McCandless Attachments: LUCENE-3260.patch I tried to do some seekExact (where the result must exist) followed by next()ing in the faceting module, and it seems like there could be a bug here. I think we should add a test that mixes seekExact/seekCeil/next like this, to ensure that if seekExact returns true, the enum is properly positioned.
[jira] [Commented] (LUCENE-3239) drop java 5 support
[ https://issues.apache.org/jira/browse/LUCENE-3239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057780#comment-13057780 ] Uwe Schindler commented on LUCENE-3239: --- +1 as a start
[jira] [Commented] (LUCENE-3239) drop java 5 support
[ https://issues.apache.org/jira/browse/LUCENE-3239?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057781#comment-13057781 ] Simon Willnauer commented on LUCENE-3239: - bq. +1 as a start alright, I'll kick it in... we are on 1.6, YAY!
[jira] [Assigned] (LUCENE-3239) drop java 5 support
[ https://issues.apache.org/jira/browse/LUCENE-3239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer reassigned LUCENE-3239: --- Assignee: Simon Willnauer
[jira] [Created] (LUCENE-3265) Cut over to Java 6 API where needed / possible
Cut over to Java 6 API where needed / possible -- Key: LUCENE-3265 URL: https://issues.apache.org/jira/browse/LUCENE-3265 Project: Lucene - Java Issue Type: Task Affects Versions: 4.0 Reporter: Simon Willnauer Priority: Minor Fix For: 4.0 Since we are on 1.6 on trunk, we should try to reduce duplications like those in ArrayUtils and cut over to the Java 6 API.
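As a concrete illustration of the duplication this issue targets: Java 6's Arrays.copyOf / Arrays.copyOfRange subsume the kind of hand-rolled resize helpers that utility classes needed on Java 5. The sketch below is a minimal JDK-only illustration; the helper name is made up and is not Lucene's actual ArrayUtil code.

```java
import java.util.Arrays;

public class CopyOfDemo {
    // Pre-Java-6 style: a hand-rolled resize helper, the kind of
    // duplication ArrayUtils-style classes accumulated.
    static int[] growManually(int[] src, int newLength) {
        int[] dst = new int[newLength];
        System.arraycopy(src, 0, dst, 0, Math.min(src.length, newLength));
        return dst;
    }

    public static void main(String[] args) {
        int[] src = {1, 2, 3};
        // The Java 6 one-liners that replace the helper above:
        int[] grown = Arrays.copyOf(src, 5);         // pads with zeros
        int[] slice = Arrays.copyOfRange(src, 1, 3); // half-open range
        System.out.println(Arrays.toString(grown));  // [1, 2, 3, 0, 0]
        System.out.println(Arrays.toString(slice));  // [2, 3]
        System.out.println(Arrays.equals(grown, growManually(src, 5))); // true
    }
}
```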
[jira] [Created] (LUCENE-3266) Improve FileLocking based on Java 1.6
Improve FileLocking based on Java 1.6 -- Key: LUCENE-3266 URL: https://issues.apache.org/jira/browse/LUCENE-3266 Project: Lucene - Java Issue Type: Improvement Components: core/store Affects Versions: 4.0 Reporter: Simon Willnauer Priority: Minor Fix For: 4.0 Snippet from NativeFSLockFactory: {noformat} /* * The javadocs for FileChannel state that you should have * a single instance of a FileChannel (per JVM) for all * locking against a given file (locks are tracked per * FileChannel instance in Java 1.4/1.5). Even using the same * FileChannel instance is not completely thread-safe with Java * 1.4/1.5 though. To work around this, we have a single (static) * HashSet that contains the file paths of all currently * locked locks. This protects against possible cases * where different Directory instances in one JVM (each * with their own NativeFSLockFactory instance) have set * the same lock dir and lock prefix. However, this will not * work when LockFactorys are created by different * classloaders (eg multiple webapps). * * TODO: Java 1.6 tracks system wide locks in a thread safe manner * (same FileChannel instance or not), so we may want to * change this when Lucene moves to Java 1.6. */ {noformat} Since we are on 1.6, we should improve this if possible.
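The Java 1.6 behavior the TODO refers to can be sketched with plain JDK NIO: since 1.6 the JVM tracks OS file locks system-wide, so overlap is detected even across distinct FileChannel instances, which is what makes the static-HashSet workaround obsolete. This is a hedged illustration, not Lucene's NativeFSLockFactory implementation; the file name is made up, and try-with-resources (Java 7+) is used only for brevity.

```java
import java.io.File;
import java.io.RandomAccessFile;
import java.nio.channels.FileChannel;
import java.nio.channels.FileLock;

public class LockDemo {
    // Try to take (and immediately release) an exclusive OS-level lock.
    static boolean tryExclusiveLock(File f) throws Exception {
        try (RandomAccessFile raf = new RandomAccessFile(f, "rw");
             FileChannel channel = raf.getChannel()) {
            // tryLock() returns null if another process holds the lock, and
            // throws OverlappingFileLockException if this JVM already holds
            // an overlapping lock -- even via a different FileChannel.
            FileLock lock = channel.tryLock();
            if (lock == null) {
                return false;
            }
            lock.release();
            return true;
        }
    }

    public static void main(String[] args) throws Exception {
        File f = File.createTempFile("write", ".lock");
        f.deleteOnExit();
        System.out.println(tryExclusiveLock(f) ? "acquired" : "busy");
    }
}
```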
[jira] [Resolved] (LUCENE-3239) drop java 5 support
[ https://issues.apache.org/jira/browse/LUCENE-3239?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer resolved LUCENE-3239. - Resolution: Fixed Fix Version/s: 4.0 Lucene Fields: [New, Patch Available] (was: [New]) Moving out here; created LUCENE-3265 and LUCENE-3266 as follow-up issues.
[jira] [Commented] (LUCENE-3264) crank up faceting module tests
[ https://issues.apache.org/jira/browse/LUCENE-3264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057792#comment-13057792 ] Robert Muir commented on LUCENE-3264: - {quote} Previously the tests took 1m20s to run, now they take 2m55s. I guess it's because previously we only created RAMDirs, while now newDirectory picks FSDir from time to time (10%?). {quote} I don't think it's from FSDir; this is now very, very rarely picked. Anyway, as said in the issue summary, for a number of reasons I don't want to address this on this issue; I want to address the coverage first. {quote} FacetTestUtils.close*() can be removed and calls replaced by IOUtils.closeSafely. This is not critical, just remove redundant code. {quote} Ah, you are right. Let's change this. {quote} You added a TODO to CategoryListIteratorTest about the test failing if TieredMP is used. In general TieredMP is not good for the taxonomy index, which relies on Lucene doc IDs and therefore requires segments to be merged in order. LTW uses LMP specifically because of that. I will look into the test to understand why it would care about doc IDs, since it doesn't use the taxonomy index at all. {quote} Right, as you said, this is for the main index, not the taxonomy index. So I think the test just relies upon Lucene doc IDs, but I didn't want to just change the test without saying why. {quote} There are a few places with code like assertTrue("Would like to test this with deletions!", indexReader.hasDeletions()) and assertTrue("Would like to test this with deletions!", indexReader.numDeletedDocs() > 0) which you removed. Any reason? {quote} Mostly to prevent the tests from failing. RandomIndexWriter randomly optimizes sometimes, so occasionally there are no deletions. I think this is fine (actually better) as far as coverage goes... then the deleted docs are occasionally null, etc.
{quote} You added a TODO to TestScoredDocIDsUtils (about the reader being read-only) – you're right, the comment can be deleted. {quote} OK, I'll nuke this. {quote} We can add RandomTaxonomyWriter as a follow-on commit. {quote} Yes, let's do this separately. crank up faceting module tests -- Key: LUCENE-3264 URL: https://issues.apache.org/jira/browse/LUCENE-3264 Project: Lucene - Java Issue Type: Test Components: modules/facet Reporter: Robert Muir Assignee: Robert Muir Fix For: 3.4, 4.0 Attachments: LUCENE-3264.patch The faceting module has a large set of good tests. Let's switch them over to use all of our test infra (RandomIndexWriter, random IWConfig, MockAnalyzer, newDirectory, ...). I don't want to address multipliers and atLeast() etc. on this issue; I think we should follow up with that on a separate issue that also looks at speed and makes sure the nightly build is exhaustive. For now, let's just get the coverage in; it will be good to do before any refactoring.
[jira] [Commented] (LUCENE-3265) Cut over to Java 6 API where needed / possible
[ https://issues.apache.org/jira/browse/LUCENE-3265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057794#comment-13057794 ] Robert Muir commented on LUCENE-3265: - I think we should be careful here: any performance tests need to also be done on -client!
[JENKINS] Lucene-Solr-tests-only-flexscoring-branch - Build # 66 - Failure
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-flexscoring-branch/66/ 3 tests failed. REGRESSION: org.apache.solr.client.solrj.embedded.MultiCoreExampleJettyTest.testDistributed Error Message: Severe errors in solr configuration. Check your log files for more detailed information on what may be wrong. - java.lang.RuntimeException: java.io.FileNotFoundException: /usr/home/hudson/hudson-slave/workspace/Lucene-Solr-tests-only-flexscoring-branch/checkout/solr/example/multicore/core0/data/index/org.apache.solr.core.RefCntRamDirectory@38ca6cea lockFactory=org.apache.lucene.store.simplefslockfact...@6af2da21-write.lock (No such file or directory) at org.apache.solr.core.SolrCore.initIndex(SolrCore.java:378) at org.apache.solr.core.SolrCore.init(SolrCore.java:501) at org.apache.solr.core.CoreContainer.create(CoreContainer.java:653) at org.apache.solr.core.CoreContainer.load(CoreContainer.java:406) at org.apache.solr.core.CoreContainer.load(CoreContainer.java:291) at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:240) at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:93) at org.mortbay.jetty.servlet.FilterHolder.doStart(FilterHolder.java:97) at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50) at org.mortbay.jetty.servlet.ServletHandler.initialize(ServletHandler.java:713) at org.mortbay.jetty.servlet.ServletHandler.updateMappings(ServletHandler.java:1104) at org.mortbay.jetty.servlet.ServletHandler.setFilterMappings(ServletHandler.java:1140) at org.mortbay.jetty.servlet.ServletHandler.addFilterWithMapping(ServletHandler.java:940) at org.mortbay.jetty.servlet.ServletHandler.addFilterWithMapping(ServletHandler.java:895) at org.mortbay.jetty.servlet.Context.addFilter(Context.java:207) at org.apache.solr.client.solrj.embedded.JettySolrRunner$1.lifeCycleStarted(JettySolrRunner.java:98) at org.mortbay.component.AbstractLifeCycle.setStarted(AbstractLifeCycle.java:140) at 
org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:52) at org.apache.solr.client.sol request: http://localhost:15720/example/core0/update?commit=true&waitFlush=true&waitSearcher=true&wt=javabin&version=2 Stack Trace: request: http://localhost:15720/example/core0/update?commit=true&waitFlush=true&waitSearcher=true&wt=javabin&version=2 at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:435) at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:244) at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105) at
Re: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java
Hmm, are you concerned about the extra Math.min that happens in the copyOf method? I don't see how that relates to intrinsics and Java 1.7. I don't have strong feelings here, just checking whether you mixed something up in the comment you put there... I am happy to keep the old and now-current code. simon On Thu, Jun 30, 2011 at 2:42 PM, rm...@apache.org wrote: Author: rmuir Date: Thu Jun 30 12:42:17 2011 New Revision: 1141510 URL: http://svn.apache.org/viewvc?rev=1141510&view=rev Log: LUCENE-3239: remove use of slow Arrays.copyOf Modified: lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java Modified: lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java URL: http://svn.apache.org/viewvc/lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java?rev=1141510&r1=1141509&r2=1141510&view=diff == --- lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java (original) +++ lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java Thu Jun 30 12:42:17 2011 @@ -2,7 +2,6 @@ package org.apache.lucene.util; import java.io.IOException; import java.io.OutputStream; -import java.util.Arrays; /** * Licensed to the Apache Software Foundation (ASF) under one or more @@ -72,7 +71,11 @@ public class UnsafeByteArrayOutputStream } private void grow(int newLength) { - buffer = Arrays.copyOf(buffer, newLength); + // It actually should be: (Java 1.7, when its intrinsic on all machines) + // buffer = Arrays.copyOf(buffer, newLength); + byte[] newBuffer = new byte[newLength]; + System.arraycopy(buffer, 0, newBuffer, 0, buffer.length); + buffer = newBuffer; }
RE: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java
Hi Robert, you reverted a use of Arrays.copyOf() on native types which is *exactly* implemented like this in Arrays.java code! The slow ones are T T[] copyOf(T[] array, int newlen) (because they use reflection). Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de -Original Message- From: rm...@apache.org [mailto:rm...@apache.org] Sent: Thursday, June 30, 2011 2:42 PM To: comm...@lucene.apache.org Subject: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/Unsafe ByteArrayOutputStream.java Author: rmuir Date: Thu Jun 30 12:42:17 2011 New Revision: 1141510 URL: http://svn.apache.org/viewvc?rev=1141510view=rev Log: LUCENE-3239: remove use of slow Arrays.copyOf Modified: lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeB yteArrayOutputStream.java Modified: lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeB yteArrayOutputStream.java URL: http://svn.apache.org/viewvc/lucene/dev/trunk/modules/facet/src/java/or g/apache/lucene/util/UnsafeByteArrayOutputStream.java?rev=1141510r1 =1141509r2=1141510view=diff == --- lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeB yteArrayOutputStream.java (original) +++ lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/Unsaf +++ eByteArrayOutputStream.java Thu Jun 30 12:42:17 2011 @@ -2,7 +2,6 @@ package org.apache.lucene.util; import java.io.IOException; import java.io.OutputStream; -import java.util.Arrays; /** * Licensed to the Apache Software Foundation (ASF) under one or more @@ -72,7 +71,11 @@ public class UnsafeByteArrayOutputStream } private void grow(int newLength) { -buffer = Arrays.copyOf(buffer, newLength); +// It actually should be: (Java 1.7, when its intrinsic on all machines) +// buffer = Arrays.copyOf(buffer, newLength); +byte[] newBuffer = new byte[newLength]; +System.arraycopy(buffer, 0, newBuffer, 0, buffer.length); +buffer = newBuffer; } /** - To unsubscribe, 
e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
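For comparison with the copyOf variants discussed above, the primitive and generic overloads look roughly like this (a simplified sketch modeled on the OpenJDK Arrays.java source, not a verbatim copy; the class name CopyOfSketch is ours for illustration). The byte[] overload is exactly the allocate-plus-System.arraycopy pattern the commit reintroduced, while the generic overload must create the target array reflectively, which is the slow path Uwe refers to:

```java
import java.lang.reflect.Array;

public class CopyOfSketch {

    // Primitive overload: a plain allocate-and-arraycopy, as in OpenJDK.
    // Math.min is needed because copyOf also supports shrinking the array.
    public static byte[] copyOf(byte[] original, int newLength) {
        byte[] copy = new byte[newLength];
        System.arraycopy(original, 0, copy, 0,
                         Math.min(original.length, newLength));
        return copy;
    }

    // Generic overload: the target array must be created reflectively
    // from the runtime component type -- this is the slow variant.
    @SuppressWarnings("unchecked")
    public static <T> T[] copyOf(T[] original, int newLength) {
        T[] copy = (T[]) Array.newInstance(
                original.getClass().getComponentType(), newLength);
        System.arraycopy(original, 0, copy, 0,
                         Math.min(original.length, newLength));
        return copy;
    }
}
```

Both overloads copy the overlapping prefix and leave any extra slots at their default values; only the allocation strategy differs.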
Re: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java
because on windows 32bit at least, -client is still the default on most jres out there. i realize people don't care about -client, but i will police foo[].clone() / arrays.copyOf etc to prevent problems. There are comments about this stuff on the relevant bug reports (oracle's site is down, sorry) linked to this issue. https://issues.apache.org/jira/browse/LUCENE-2674

Sorry, I don't think we should use foo[].clone() / arrays.copyOf, I think we should always use arraycopy.

On Thu, Jun 30, 2011 at 8:55 AM, Simon Willnauer simon.willna...@googlemail.com wrote: hmm, are you concerned about the extra Math.min that happens in the copyOf method? I don't see how that relates to intrinsics and Java 1.7. I don't have strong feelings here, just checking if you mixed something up in the comment you put there... I am happy to keep the old and now current code. simon

[...quoted commit r1141510 elided...]

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java
Robert, I agree, but doesn't that apply to Arrays.copyOf(Object[], int) only? Here we use a specialized primitive version?

simon

On Thu, Jun 30, 2011 at 3:05 PM, Robert Muir rcm...@gmail.com wrote: because on windows 32bit at least, -client is still the default on most jres out there. [...rest of quoted thread elided...]

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java
Arrays.copyOf(primitive) is actually System.arraycopy by default. If intrinsics are used it can only get faster. For object types it will probably be a bit slower for -client because of a runtime check for the component type.

Dawid

On Thu, Jun 30, 2011 at 3:05 PM, Robert Muir rcm...@gmail.com wrote: because on windows 32bit at least, -client is still the default on most jres out there. [...rest of quoted thread elided...]

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
RE: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java
Robert, as noted in my other eMail, it's only slow for the generic Object[] method (as it uses j.l.reflect.Array.newInstance(Class componentType)). We are talking here about byte[], and the Arrays method is implemented with the same 3 lines of code the commit now uses. The only difference is a Math.min(), which is intrinsic (it is needed because Arrays.copyOf supports shrinking the size, so the System.arraycopy() needs an upper limit to not throw an ArrayIndexOutOfBoundsException).

Uwe

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de

-----Original Message-----
From: Robert Muir [mailto:rcm...@gmail.com]
Sent: Thursday, June 30, 2011 3:05 PM
To: dev@lucene.apache.org; simon.willna...@gmail.com
Cc: comm...@lucene.apache.org
Subject: Re: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java

[...quoted thread elided...]

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
RE: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java
We had an issue about this with the FST's array growing in Mike's code; in fact it's *much* slower for the generic Arrays <T> T[] copyOf(T[]...), with T extends Object (it uses slow reflection). For primitives it can only get faster in later JVMs; this is why we want to change all ArrayUtils.grow() methods to use this (and we don't have a generic one there, for the above reason).

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de

-----Original Message-----
From: dawid.we...@gmail.com [mailto:dawid.we...@gmail.com] On Behalf Of Dawid Weiss
Sent: Thursday, June 30, 2011 3:11 PM
To: dev@lucene.apache.org
Subject: Re: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java

[...quoted thread elided...]

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
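The ArrayUtils.grow() pattern mentioned above can be sketched as follows (a minimal illustration with our own class name, not Lucene's actual ArrayUtil code, which computes the oversized target with more care about alignment and overflow): grow to at least the requested minimum, oversize a bit so repeated appends stay amortized O(1), and copy with System.arraycopy per the preference in this thread.

```java
public class GrowSketch {

    /**
     * Return an array holding at least {@code minSize} bytes, preserving
     * the existing contents. Oversizes by roughly 1/8th beyond the request
     * so a sequence of growth calls does amortized O(1) work per element.
     */
    public static byte[] grow(byte[] array, int minSize) {
        if (array.length >= minSize) {
            return array; // already large enough, no copy needed
        }
        int newSize = minSize + (minSize >>> 3); // oversize by ~12.5%
        byte[] newArray = new byte[newSize];
        System.arraycopy(array, 0, newArray, 0, array.length);
        return newArray;
    }
}
```

Callers must use the returned reference, since growth allocates a fresh array; when no growth is needed the same array comes back untouched.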
Re: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java
On Thu, Jun 30, 2011 at 3:26 PM, Uwe Schindler u...@thetaphi.de wrote: We had an issue about this with FST's array growing in Mike's code; in fact it's *much* slower for the generic Arrays <T> T[] copyOf(T[]...), with T extends Object (uses slow reflection). For primitives it can only get faster in later JVMs; this is why we want to change all ArrayUtils.grow() to use this (and we don't have a generic one there for the above reason).

+1 - I don't see why this would be any slower... if we can get improvements we should go for it. The issues and bug reports are all for non-primitive copyOf methods, so I don't see how this should affect us.

simon

[...rest of quoted thread elided...]

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-2565) Prevent IW#close and cut over to IW#commit
[ https://issues.apache.org/jira/browse/SOLR-2565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057829#comment-13057829 ]

Mark Miller commented on SOLR-2565:
---

Committed - there is still some wiki work to do.

Prevent IW#close and cut over to IW#commit
--

Key: SOLR-2565
URL: https://issues.apache.org/jira/browse/SOLR-2565
Project: Solr
Issue Type: Improvement
Components: update
Affects Versions: 4.0
Reporter: Simon Willnauer
Fix For: 4.0
Attachments: SOLR-2565.patch

Spinoff from SOLR-2193. We already have a branch to work on this issue here: https://svn.apache.org/repos/asf/lucene/dev/branches/solr2193
The main goal here is to prevent Solr from closing the IW and use IW#commit instead. AFAIK the main issues here are: The update handler needs an overhaul. A few goals I think we might want to look at:
1. Expose the SolrIndexWriter in the api or add the proper abstractions to get done what we now do with special casing:
2. Stop closing the IndexWriter and start using commit (still lazy IW init though).
3. Drop iwAccess, iwCommit locks and sync mostly at the Lucene level.
4. Address the current issues we face because multiple original/'reloaded' cores can have a different IndexWriter on the same index.
Eventually this is a preparation for NRT support in Solr which I will create a followup issue for.

--
This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Reopened] (SOLR-2193) Re-architect Update Handler
[ https://issues.apache.org/jira/browse/SOLR-2193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Miller reopened SOLR-2193:
---
Assignee: Mark Miller (was: Robert Muir)

Re-architect Update Handler
---

Key: SOLR-2193
URL: https://issues.apache.org/jira/browse/SOLR-2193
Project: Solr
Issue Type: Improvement
Reporter: Mark Miller
Assignee: Mark Miller
Fix For: 4.0
Attachments: SOLR-2193.patch, SOLR-2193.patch, SOLR-2193.patch, SOLR-2193.patch, SOLR-2193.patch, SOLR-2193.patch

The update handler needs an overhaul. A few goals I think we might want to look at:
1. Cleanup - drop DirectUpdateHandler(2) line - move to something like UpdateHandler, DefaultUpdateHandler
2. Expose the SolrIndexWriter in the api or add the proper abstractions to get done what we now do with special casing: if (directupdatehandler2) success else failish
3. Stop closing the IndexWriter and start using commit (still lazy IW init though).
4. Drop iwAccess, iwCommit locks and sync mostly at the Lucene level.
5. Keep NRT support in mind.
6. Keep microsharding in mind (maintain logical index as multiple physical indexes)
7. Address the current issues we face because multiple original/'reloaded' cores can have a different IndexWriter on the same index.

--
This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-2193) Re-architect Update Handler
[ https://issues.apache.org/jira/browse/SOLR-2193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057833#comment-13057833 ]

Mark Miller commented on SOLR-2193:
---

This issue is superceded by: SOLR-2565 Prevent IW#close and cut over to IW#commit

[...auto-appended issue description elided...]

--
This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (SOLR-2193) Re-architect Update Handler
[ https://issues.apache.org/jira/browse/SOLR-2193?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057834#comment-13057834 ]

Mark Miller commented on SOLR-2193:
---

bq. Curious; why is the resolution status invalid?

Dunno - it's not invalid. I've re-resolved as duplicate

[...auto-appended issue description elided...]

--
This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (SOLR-2193) Re-architect Update Handler
[ https://issues.apache.org/jira/browse/SOLR-2193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Miller resolved SOLR-2193.
---
Resolution: Duplicate

[...auto-appended issue description elided...]

--
This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Assigned] (SOLR-2565) Prevent IW#close and cut over to IW#commit
[ https://issues.apache.org/jira/browse/SOLR-2565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mark Miller reassigned SOLR-2565:
---
Assignee: Mark Miller

[...auto-appended issue description elided...]

--
This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[Lucene.Net] Possible bug in Lucene with Prefix Search and Danish Locale
I think that the code here shows a bug in Lucene.NET, see http://gist.github.com/1056231. This happens when using 2.9.2. After some digging I think that it's due to the way it does a prefix search. The main problem is shown by this code: http://gist.github.com/1056242. If the Locale is Danish, this returns FALSE, weird eh!!

"daab".StartsWith("da") // false

But this works as expected:

"daab".StartsWith("da", StringComparison.InvariantCulture) // true

The line of code that has this problem is the TermCompare(..) function in PrefixTermEnum.cs, see http://svn.apache.org/repos/asf/incubator/lucene.net/trunk/src/core/Search/PrefixTermEnum.cs
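The underlying cause is culture-sensitive comparison: under Danish collation the digraph "aa" is traditionally treated as the letter "å", which sorts after "z", so a collation-based prefix or range check can place "daab" outside the ["da", "db") range entirely. The same tailoring can be observed from Java's java.text.Collator (a sketch only; the collator's exact ordering depends on the JDK's collation tables, and the class name is ours):

```java
import java.text.Collator;
import java.util.Locale;

public class DanishPrefixDemo {
    public static void main(String[] args) {
        // Ordinal/code-point comparison: "daab" plainly starts with "da",
        // and sorts before "dz" because 'a' < 'z'.
        System.out.println("daab".startsWith("da"));     // true
        System.out.println("daab".compareTo("dz") < 0);  // true

        // Culture-sensitive comparison with a Danish collator: "aa" is
        // tailored toward the letter "å" (which sorts after "z"), so the
        // relative order of "daab" and "dz" can flip.
        Collator danish = Collator.getInstance(new Locale("da", "DK"));
        System.out.println(danish.compare("daab", "dz"));
    }
}
```

This is why a term enumerator that uses a culture-sensitive StartsWith/CompareTo, as the ported TermCompare(..) does, can silently skip matching terms, while an ordinal (invariant) comparison behaves as expected.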
[JENKINS-MAVEN] Lucene-Solr-Maven-trunk #166: POMs out of sync
Build: https://builds.apache.org/job/Lucene-Solr-Maven-trunk/166/ No tests ran. Build Log (for compile errors): [...truncated 10375 lines...] - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: [Lucene.Net] Possible bug in Lucene with Prefix Search and Danish Locale
Hey Matt,

This is issue 420: https://issues.apache.org/jira/browse/LUCENENET-420

I think the theory so far has been that the user should manage the culture rather than Lucene. If you disagree could you post on that issue ticket?

Thanks,
-Ben

----- Original Message -----
From: Matt Warren mattd...@gmail.com
To: lucene-net-...@lucene.apache.org
Sent: Thursday, June 30, 2011 9:28 AM
Subject: [Lucene.Net] Possible bug in Lucene with Prefix Search and Danish Locale

[...quoted message elided; see the full text above...]
Re: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java
I think Dawid is correct here... so we should change this back? still personally when I see array clone() or copyOf() it makes me concerned, I know these are as fast as arraycopy in recent versions, but depending on which variant is used, and whether you use -server, can be slower... in general I just don't want us to have performance regressions on say windows 32bit over this stuff, personally I think arraycopy is a sure fire bet every time, but I'll concede the point that copyOf might not be slower for the primitive versions... I think in jdk7 we will not have this issue as -client sorta goes away in favor of the tiered thing? anyway, I think we should proceed with caution here as far as moving things over to copyOf, I don't see any evidence that it's ever faster, but it's definitely sometimes slower. On Jun 30, 2011 9:12 AM, Dawid Weiss dawid.we...@cs.put.poznan.pl wrote: Arrays.copyOf(primitive) is actually System.arraycopy by default. If intrinsics are used it can only get faster. For object types it will probably be a bit slower for -client because of a runtime check for the component type. Dawid On Thu, Jun 30, 2011 at 3:05 PM, Robert Muir rcm...@gmail.com wrote: because on windows 32bit at least, -client is still the default on most jres out there. i realize people don't care about -client, but i will police foo[].clone() / arrays.copyOf etc to prevent problems. There are comments about this stuff on the relevant bug reports (oracle's site is down, sorry) linked to this issue. https://issues.apache.org/jira/browse/LUCENE-2674 Sorry, I don't think we should use foo[].clone() / arrays.copyOf, I think we should always use arraycopy. On Thu, Jun 30, 2011 at 8:55 AM, Simon Willnauer simon.willna...@googlemail.com wrote: hmm are you concerned about the extra Math.min that happens in the copyOf method? I don't know how that relates to intrinsics and java 1.7 I don't have strong feelings here, just checking if you mixed something up in the comment you put there...
I am happy to keep the old and now current code simon On Thu, Jun 30, 2011 at 2:42 PM, rm...@apache.org wrote: Author: rmuir Date: Thu Jun 30 12:42:17 2011 New Revision: 1141510 URL: http://svn.apache.org/viewvc?rev=1141510&view=rev Log: LUCENE-3239: remove use of slow Arrays.copyOf Modified: lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java Modified: lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java URL: http://svn.apache.org/viewvc/lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java?rev=1141510&r1=1141509&r2=1141510&view=diff == --- lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java (original) +++ lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java Thu Jun 30 12:42:17 2011 @@ -2,7 +2,6 @@ package org.apache.lucene.util; import java.io.IOException; import java.io.OutputStream; -import java.util.Arrays; /** * Licensed to the Apache Software Foundation (ASF) under one or more @@ -72,7 +71,11 @@ public class UnsafeByteArrayOutputStream } private void grow(int newLength) { -buffer = Arrays.copyOf(buffer, newLength); +// It actually should be: (Java 1.7, when its intrinsic on all machines) +// buffer = Arrays.copyOf(buffer, newLength); +byte[] newBuffer = new byte[newLength]; +System.arraycopy(buffer, 0, newBuffer, 0, buffer.length); +buffer = newBuffer; } /** - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
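For reference, the two growth idioms being debated in this thread can be put side by side. This is only an illustrative sketch (the GrowDemo class and method names are made up for the example, not the committed Lucene code); functionally the two are equivalent, the disagreement is purely about JIT behaviour on -client VMs:

```java
import java.util.Arrays;

public class GrowDemo {
    // Idiom 1: Arrays.copyOf -- delegates to System.arraycopy internally,
    // and may be intrinsified by the JIT (the fast path Dawid describes).
    static byte[] growWithCopyOf(byte[] buffer, int newLength) {
        return Arrays.copyOf(buffer, newLength);
    }

    // Idiom 2: explicit allocation plus System.arraycopy -- the "sure fire
    // bet" variant the commit switched to, avoiding -client uncertainty.
    static byte[] growWithArraycopy(byte[] buffer, int newLength) {
        byte[] newBuffer = new byte[newLength];
        System.arraycopy(buffer, 0, newBuffer, 0, buffer.length);
        return newBuffer;
    }

    public static void main(String[] args) {
        byte[] buffer = {1, 2, 3};
        // Both produce the old bytes followed by zero padding.
        System.out.println(Arrays.toString(growWithCopyOf(buffer, 5)));    // [1, 2, 3, 0, 0]
        System.out.println(Arrays.toString(growWithArraycopy(buffer, 5))); // [1, 2, 3, 0, 0]
    }
}
```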
Re: [Lucene.Net] Possible bug in Lucene with Prefix Search and Danish Locale
Thanks for the info, after reading issue 420 it makes sense now On 30 June 2011 15:38, Ben West bwsithspaw...@yahoo.com wrote: Hey Matt, This is issue 420: https://issues.apache.org/jira/browse/LUCENENET-420 I think the theory so far has been that the user should manage the culture rather than Lucene. If you disagree could you post on that issue ticket? Thanks, -Ben - Original Message - From: Matt Warren mattd...@gmail.com To: lucene-net-...@lucene.apache.org Cc: Sent: Thursday, June 30, 2011 9:28 AM Subject: [Lucene.Net] Possible bug in Lucene with Prefix Search and Danish Locale I think that the code here shows a bug in Lucene.NET, see http://gist.github.com/1056231. This happens when using 2.9.2. After some digging I think that it's due to the way it does a Prefix search. The main problem is shown by this code http://gist.github.com/1056242. If the Locale is Danish, this returns FALSE, weird eh!! daab.StartsWith(da) //false But this works as expected daab.StartsWith(da, StringComparison.InvariantCulture) //true The line of code that has this problem is the TermCompare(..) function in PrefixTermEnum.cs, see http://svn.apache.org/repos/asf/incubator/lucene.net/trunk/src/core/Search/PrefixTermEnum.cs
[jira] [Updated] (LUCENE-3216) Store DocValues per segment instead of per field
[ https://issues.apache.org/jira/browse/LUCENE-3216?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Simon Willnauer updated LUCENE-3216: Attachment: LUCENE-3216.patch one more iteration adding a NestedCompoundDirectory that uses the parents openInputSlice method for efficiency. Store DocValues per segment instead of per field Key: LUCENE-3216 URL: https://issues.apache.org/jira/browse/LUCENE-3216 Project: Lucene - Java Issue Type: Improvement Components: core/index Affects Versions: 4.0 Reporter: Simon Willnauer Assignee: Simon Willnauer Fix For: 4.0 Attachments: LUCENE-3216.patch, LUCENE-3216.patch, LUCENE-3216.patch, LUCENE-3216.patch, LUCENE-3216.patch, LUCENE-3216.patch, LUCENE-3216_floats.patch currently we are storing docvalues per field which results in at least one file per field that uses docvalues (or at most two per field per segment depending on the impl.). Yet, we should try to by default pack docvalues into a single file if possible. To enable this we need to hold all docvalues in memory during indexing and write them to disk once we flush a segment. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3260) need a test that uses termsenum.seekExact() (which returns true), then calls next()
[ https://issues.apache.org/jira/browse/LUCENE-3260?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13057896#comment-13057896 ] Shai Erera commented on LUCENE-3260: I see. Thanks for the clarification. +1 to commit. need a test that uses termsenum.seekExact() (which returns true), then calls next() --- Key: LUCENE-3260 URL: https://issues.apache.org/jira/browse/LUCENE-3260 Project: Lucene - Java Issue Type: Bug Reporter: Robert Muir Assignee: Michael McCandless Attachments: LUCENE-3260.patch i tried to do some seekExact (where the result must exist) then next()ing in the faceting module, and it seems like there could be a bug here. I think we should add a test that mixes seekExact/seekCeil/next like this, to ensure that if seekExact returns true, that the enum is properly positioned. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3216) Store DocValues per segment instead of per field
[ https://issues.apache.org/jira/browse/LUCENE-3216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13057894#comment-13057894 ] Michael McCandless commented on LUCENE-3216: Looks great! So this means, if you use default StandardCodec, and 3 fields store doc values, and main CFS is off but doc values CFS is on, you'll see a cfs file holding the 3-6 sub-files that your docvalues created, right? But eg if some fields use another codec, then that codec will have its own CFS for any fields it has with docvalues (this is the TODO)? That's seems fine for starters. I like CodecConfig, but I'm not sure it should hold things specific only to 1 codec, like the Pulsing cutoff? The other settings seem more widely applicable... though I guess even terms cache size is not used by various codecs, but it is by enough to have it in CodecConfig, I think? CodecConfig needs @experimental? For the nested test... couldn't you createCompoundOutput directly from an opened CompoundFileDirectory? (Vs creating externally copying in). Store DocValues per segment instead of per field Key: LUCENE-3216 URL: https://issues.apache.org/jira/browse/LUCENE-3216 Project: Lucene - Java Issue Type: Improvement Components: core/index Affects Versions: 4.0 Reporter: Simon Willnauer Assignee: Simon Willnauer Fix For: 4.0 Attachments: LUCENE-3216.patch, LUCENE-3216.patch, LUCENE-3216.patch, LUCENE-3216.patch, LUCENE-3216.patch, LUCENE-3216.patch, LUCENE-3216_floats.patch currently we are storing docvalues per field which results in at least one file per field that uses docvalues (or at most two per field per segment depending on the impl.). Yet, we should try to by default pack docvalues into a single file if possible. To enable this we need to hold all docvalues in memory during indexing and write them to disk once we flush a segment. -- This message is automatically generated by JIRA. 
For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3216) Store DocValues per segment instead of per field
[ https://issues.apache.org/jira/browse/LUCENE-3216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13057900#comment-13057900 ] Simon Willnauer commented on LUCENE-3216: - {quote} So this means, if you use default StandardCodec, and 3 fields store doc values, and main CFS is off but doc values CFS is on, you'll see a cfs file holding the 3-6 sub-files that your docvalues created, right?{quote} Correct! {quote} But eg if some fields use another codec, then that codec will have its own CFS for any fields it has with docvalues (this is the TODO)? That's seems fine for starters.{quote} again correct. So what I have in mind is a global cfs that a codec can pull via PerDocWriteState or something that holds all of them but for now having this per codec is fine IMO. I will create a follow up for this. bq. For the nested test... couldn't you createCompoundOutput directly from an opened CompoundFileDirectory? (Vs creating externally copying in). Yes I could but this functionality is tricky and not needed currently so I left it out for now. {quote}I like CodecConfig, but I'm not sure it should hold things specific only to 1 codec, like the Pulsing cutoff? The other settings seem more widely applicable... though I guess even terms cache size is not used by various codecs, but it is by enough to have it in CodecConfig, I think?{quote} I am not sure here, I had the same thought but when you look at Solr and other high level users they need to configure stuff somehow so I put all reasonable core stuff in there. I think its ok to have this for only one codec. Thoughts? 
Store DocValues per segment instead of per field Key: LUCENE-3216 URL: https://issues.apache.org/jira/browse/LUCENE-3216 Project: Lucene - Java Issue Type: Improvement Components: core/index Affects Versions: 4.0 Reporter: Simon Willnauer Assignee: Simon Willnauer Fix For: 4.0 Attachments: LUCENE-3216.patch, LUCENE-3216.patch, LUCENE-3216.patch, LUCENE-3216.patch, LUCENE-3216.patch, LUCENE-3216.patch, LUCENE-3216_floats.patch currently we are storing docvalues per field which results in at least one file per field that uses docvalues (or at most two per field per segment depending on the impl.). Yet, we should try to by default pack docvalues into a single file if possible. To enable this we need to hold all docvalues in memory during indexing and write them to disk once we flush a segment. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3264) crank up faceting module tests
[ https://issues.apache.org/jira/browse/LUCENE-3264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13057901#comment-13057901 ] Robert Muir commented on LUCENE-3264: - {quote} I don't understand. I thought that you said so regarding introducing atLeast and iterations, and I'm ok with that. I was just asking, since all you've done is move to use newDir, newIWC and RandomIW, how come the tests running time got that much longer? If it's not FSDir, do you have any idea what can cause that? Will RandomIW stall indexing randomly, or maybe it's newIWC which chooses to flush more often? {quote} I think the slowdown is basically linear (the tests run 2x or 3x as slow). Let me explain some of the reasons why you have this slowdown over just normal indexing without using randomiw/mockdirectorywrapper/etc: # we call checkIndex on every directory we create after its closed. I think this is the right thing to do always... it does slow down the tests a bit. # we do sometimes get crappy indexing params, crazy merge params, ridiculous IndexReader/Writer params (e.g. termIndexInterval=1). I think sometimes these non-optimal params slow things down. # occasionally we do things like randomly fully or partially optimize, yield(), etc. So while Lucene's defaults are pretty good, we are testing a bunch of non-default parameters and doing a bunch of other crazy things... so these slow down the tests! That being said, I'm working on the speed issue at least a little here, because I really want to get this test improvements in, although I really didn't want to work on this here (I think 1 minute extra *temporarily* to the build is no big deal for the additional coverage). 
crank up faceting module tests -- Key: LUCENE-3264 URL: https://issues.apache.org/jira/browse/LUCENE-3264 Project: Lucene - Java Issue Type: Test Components: modules/facet Reporter: Robert Muir Assignee: Robert Muir Fix For: 3.4, 4.0 Attachments: LUCENE-3264.patch The faceting module has a large set of good tests. lets switch them over to use all of our test infra (randomindexwriter, random iwconfig, mockanalyzer, newDirectory, ...) I don't want to address multipliers and atLeast() etc on this issue, I think we should follow up with that on a separate issue, that also looks at speed and making sure the nightly build is exhaustive. for now, lets just get the coverage in, it will be good to do before any refactoring. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-2793) Directory createOutput and openInput should take an IOContext
[ https://issues.apache.org/jira/browse/LUCENE-2793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13057903#comment-13057903 ] Simon Willnauer commented on LUCENE-2793: - Varun this patch looks great. I am about to commit it. Can you now work through the nocommits, fix them or post questions here? simon Directory createOutput and openInput should take an IOContext - Key: LUCENE-2793 URL: https://issues.apache.org/jira/browse/LUCENE-2793 Project: Lucene - Java Issue Type: Improvement Components: core/store Reporter: Michael McCandless Assignee: Varun Thacker Labels: gsoc2011, lucene-gsoc-11, mentor Attachments: LUCENE-2793-nrt.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch Today for merging we pass down a larger readBufferSize than for searching because we get better performance. I think we should generalize this to a class (IOContext), which would hold the buffer size, but then could hold other flags like DIRECT (bypass OS's buffer cache), SEQUENTIAL, etc. Then, we can make the DirectIOLinuxDirectory fully usable because we would only use DIRECT/SEQUENTIAL during merging. This will require fixing how IW pools readers, so that a reader opened for merging is not then used for searching, and vice/versa. Really, it's only all the open file handles that need to be different -- we could in theory share del docs, norms, etc, if that were somehow possible. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3264) crank up faceting module tests
[ https://issues.apache.org/jira/browse/LUCENE-3264?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13057906#comment-13057906 ] Shai Erera commented on LUCENE-3264: Thanks Robert. This makes sense to me. bq. although I really didn't want to work on this here (I think 1 minute extra temporarily to the build is no big deal for the additional coverage) I apologize if that caused you to do that work here. I really only wanted to understand. By all means, commit the changes. The explanation makes sense and I'm ok with it. We can speed up things later. crank up faceting module tests -- Key: LUCENE-3264 URL: https://issues.apache.org/jira/browse/LUCENE-3264 Project: Lucene - Java Issue Type: Test Components: modules/facet Reporter: Robert Muir Assignee: Robert Muir Fix For: 3.4, 4.0 Attachments: LUCENE-3264.patch, LUCENE-3264.patch The faceting module has a large set of good tests. lets switch them over to use all of our test infra (randomindexwriter, random iwconfig, mockanalyzer, newDirectory, ...) I don't want to address multipliers and atLeast() etc on this issue, I think we should follow up with that on a separate issue, that also looks at speed and making sure the nightly build is exhaustive. for now, lets just get the coverage in, it will be good to do before any refactoring. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (LUCENE-3260) need a test that uses termsenum.seekExact() (which returns true), then calls next()
[ https://issues.apache.org/jira/browse/LUCENE-3260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael McCandless resolved LUCENE-3260. Resolution: Fixed Fix Version/s: 4.0 need a test that uses termsenum.seekExact() (which returns true), then calls next() --- Key: LUCENE-3260 URL: https://issues.apache.org/jira/browse/LUCENE-3260 Project: Lucene - Java Issue Type: Bug Reporter: Robert Muir Assignee: Michael McCandless Fix For: 4.0 Attachments: LUCENE-3260.patch i tried to do some seekExact (where the result must exist) then next()ing in the faceting module, and it seems like there could be a bug here. I think we should add a test that mixes seekExact/seekCeil/next like this, to ensure that if seekExact returns true, that the enum is properly positioned. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3216) Store DocValues per segment instead of per field
[ https://issues.apache.org/jira/browse/LUCENE-3216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13057935#comment-13057935 ] Robert Muir commented on LUCENE-3216: - {quote} I am not sure here, I had the same thought but when you look at Solr and other high level users they need to configure stuff somehow so I put all reasonable core stuff in there. I think its ok to have this for only one codec. Thoughts? {quote} I don't like CodecConfig actually. It doesn't make sense that it contains all these codec-specific parameters, which should be private to the codec. I think lucene's codecs should just be APIs and have ordinary ctors. As far as higher-level stuff like Solr, we can improve it there so its easier for users to configure this stuff, for example the Solr codec configuration allows you to specify a codecproviderfactory that takes arbitrary nested xml and parses it however you want. The only problem is we don't have a *concrete* (e.g. non-mock/test) implementation in Solr that actually exposes all of what lucene can offer... I would prefer we instead just do this, and make a SolrCodecProviderFactory that lets you configure skip intervals, pulsing cutoffs, and all these other codec-specific options in a type-safe way. Store DocValues per segment instead of per field Key: LUCENE-3216 URL: https://issues.apache.org/jira/browse/LUCENE-3216 Project: Lucene - Java Issue Type: Improvement Components: core/index Affects Versions: 4.0 Reporter: Simon Willnauer Assignee: Simon Willnauer Fix For: 4.0 Attachments: LUCENE-3216.patch, LUCENE-3216.patch, LUCENE-3216.patch, LUCENE-3216.patch, LUCENE-3216.patch, LUCENE-3216.patch, LUCENE-3216_floats.patch currently we are storing docvalues per field which results in at least one file per field that uses docvalues (or at most two per field per segment depending on the impl.). Yet, we should try to by default pack docvalues into a single file if possible. 
To enable this we need to hold all docvalues in memory during indexing and write them to disk once we flush a segment. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (SOLR-1674) improve analysis tests, cut over to new API
[ https://issues.apache.org/jira/browse/SOLR-1674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mark Miller updated SOLR-1674: -- Assignee: Robert Muir (was: Mark Miller) improve analysis tests, cut over to new API --- Key: SOLR-1674 URL: https://issues.apache.org/jira/browse/SOLR-1674 Project: Solr Issue Type: Test Components: Schema and Analysis Reporter: Robert Muir Assignee: Robert Muir Fix For: 4.0 Attachments: SOLR-1674.patch, SOLR-1674.patch, SOLR-1674_speedup.patch This patch * converts all analysis tests to use the new tokenstream api * converts most tests to use the more stringent assertion mechanisms from lucene * adds new tests to improve coverage Most bugs found by more stringent testing have been fixed, with the exception of SynonymFilter. The problems with this filter are more serious, the previous tests were essentially a no-op. The new tests for SynonymFilter test the current behavior, but have FIXMEs with what I think the old test wanted to expect in the comments. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS-MAVEN] Lucene-Solr-Maven-trunk #167: POMs out of sync
Build: https://builds.apache.org/job/Lucene-Solr-Maven-trunk/167/ No tests ran. Build Log (for compile errors): [...truncated 7426 lines...] - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
RE: [Lucene.Net] Is a Lucene.Net Line-by-Line Java port needed?
Although there are a lot of people using Lucene.Net, this is our contribution report for the past 5 years. https://issues.apache.org/jira/secure/ConfigureReport.jspa?atl_token=A5KQ-2QAV-T4JA-FDED|3204f7e696067a386144705075c074e991db2a2b|linversionId=-1issueStatus=allselectedProjectId=12310290reportKey=com.sourcelabs.jira.plugin.report.contributions%3AcontributionreportNext=Next DIGY -Original Message- From: Ayende Rahien [mailto:aye...@ayende.com] Sent: Thursday, June 30, 2011 8:16 PM To: Rory Plaire; lucene-net-...@lucene.apache.org Subject: RE: [Lucene.Net] Is a Lucene.Net Line-by-Line Java port needed? As someone from the NHibernate project: we stopped following Hibernate a while ago, and haven't regretted it. We have more features, fewer bugs and a better code base. Sent from my Windows Phone From: Rory Plaire Sent: Thursday, June 30, 2011 19:58 To: lucene-net-...@lucene.apache.org Subject: Re: [Lucene.Net] Is a Lucene.Net Line-by-Line Java port needed? I don't want to drag this out much longer, but I am curious with people who hold the line-by-line sentiment - are you NHibernate users? -r On Thu, Jun 30, 2011 at 2:39 AM, Noel Lysaght lysag...@hotmail.com wrote: Can I just plug in my bit and say I agree 100% with what Moray has outlined below. If we move away from the line-by-line port then over time we'll lose out on the momentum that is Lucene and the improvements that they make. Only if the Lucene.NET community has expertise in search, a deep knowledge of the project, and a guarantee that the knowledge will survive members coming and going should such a consideration be given. Only when Lucene.NET has stood on its own feet for a number of years, after it has moved out of Apache incubation, should consideration be given to abandoning a line-by-line port.
By all means extend and wrap the libraries in .NET equivalents and .NET goodness like LINQ (we do this internally in our company at the moment); but leave the core of the project on a line-by-line port. Just my tuppence worth. Kind Regards Noel -Original Message- From: Moray McConnachie Sent: Thursday, June 30, 2011 10:25 AM To: lucene-net-u...@lucene.apache.org Cc: lucene-net-...@incubator.apache.org Subject: RE: [Lucene.Net] Is a Lucene.Net Line-by-Line Java port needed? I don't think I'm as hard core on this as Neal, but remember: the history of the Lucene.NET project is that all the intellectual work, all the understanding of search, all the new features come from the Lucene Java folks. Theirs is an immensely respected project, and I trust them to add new features that will be well-tested and well-researched, and to have a decent roadmap which I can trust they will execute on. Now I know there's been an influx of capable developers to Lucene.NET who are ready, willing and (I'm going to assume) able to add a lot more value in a generic .NET implementation as they change it. But it'll take a while before I trust a dedicated .NET framework which has significantly diverged from Java in the way I do the line-by-line version. And at what stage is it not just not a line-by-line port, but not a port at all? At the same time, I recognise that if this project is going to continue, and attract good developers, it has to change in this direction. So that said, I can see why a line-by-line port might not be sustainable. And most people don't need it. But most of us using Lucene in production systems do need a system that we can trust and rely on. So let me chime in with someone else's plea, to keep the general structure close to Lucene, to keep the same general objects and inheritance set-up, and to keep the same method names, even if you add other methods and classes to provide additional functionality.
ABSOLUTELY the same file formats. End users benefit a lot from a high degree of similarity, with good documentation and help being available from the Java community. Yours, Moray ------- Moray McConnachie Director of IT +44 1865 261 600 Oxford Analytica http://www.oxan.com -Original Message- From: Granroth, Neal V. [mailto:neal.granr...@thermofisher.com] Sent: 29 June 2011 20:47 To: lucene-net-u...@lucene.apache.org Cc: lucene-net-...@incubator.apache.org Subject: RE: [Lucene.Net] Is a Lucene.Net Line-by-Line Java port needed? This has been discussed many times. Lucene.NET is not valid, the code cannot be trusted, if it is not a line-by-line port. It ceases to be Lucene. - Neal -Original Message- From: Scott Lombard [mailto:lombardena...@gmail.com] Sent: Wednesday, June 29, 2011 1:58 PM To: lucene-net-...@lucene.apache.org;
[jira] [Commented] (LUCENE-3216) Store DocValues per segment instead of per field
[ https://issues.apache.org/jira/browse/LUCENE-3216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13057967#comment-13057967 ] Simon Willnauer commented on LUCENE-3216: - I will back out the config stuff and make it default to CFS. Somehow somebody who needs it eventually will figure it out how to make it non-private whatever. Store DocValues per segment instead of per field Key: LUCENE-3216 URL: https://issues.apache.org/jira/browse/LUCENE-3216 Project: Lucene - Java Issue Type: Improvement Components: core/index Affects Versions: 4.0 Reporter: Simon Willnauer Assignee: Simon Willnauer Fix For: 4.0 Attachments: LUCENE-3216.patch, LUCENE-3216.patch, LUCENE-3216.patch, LUCENE-3216.patch, LUCENE-3216.patch, LUCENE-3216.patch, LUCENE-3216_floats.patch currently we are storing docvalues per field which results in at least one file per field that uses docvalues (or at most two per field per segment depending on the impl.). Yet, we should try to by default pack docvalues into a single file if possible. To enable this we need to hold all docvalues in memory during indexing and write them to disk once we flush a segment. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java
On Thu, Jun 30, 2011 at 4:44 PM, Robert Muir rcm...@gmail.com wrote: I think Dawid is correct here... so we should change this back? still personally when I see array clone() or copyOf() it makes me concerned, I know these are as fast as arraycopy in recent versions, but depending on which variant is used, and whether you use -server, can be slower... in general I just don't want us to have performance regressions on say windows 32bit over this stuff, personally I think arraycopy is a sure fire bet every time, but I'll concede the point that copyOf might not be slower for the primitive versions... I think in jdk7 we will not have this issue as -client sorta goes away in favor of the tiered thing? anyway, I think we should proceed with caution here as far as moving things over to copyOf, I don't see any evidence that it's ever faster, but it's definitely sometimes slower. I don't see any evidence that this is any slower though. simon On Jun 30, 2011 9:12 AM, Dawid Weiss dawid.we...@cs.put.poznan.pl wrote: Arrays.copyOf(primitive) is actually System.arraycopy by default. If intrinsics are used it can only get faster. For object types it will probably be a bit slower for -client because of a runtime check for the component type. Dawid On Thu, Jun 30, 2011 at 3:05 PM, Robert Muir rcm...@gmail.com wrote: because on windows 32bit at least, -client is still the default on most jres out there. i realize people don't care about -client, but i will police foo[].clone() / arrays.copyOf etc to prevent problems. There are comments about this stuff on the relevant bug reports (oracle's site is down, sorry) linked to this issue. https://issues.apache.org/jira/browse/LUCENE-2674 Sorry, I don't think we should use foo[].clone() / arrays.copyOf, I think we should always use arraycopy. On Thu, Jun 30, 2011 at 8:55 AM, Simon Willnauer simon.willna...@googlemail.com wrote: hmm are you concerned about the extra Math.min that happens in the copyOf method?
I don't see how that relates to intrinsics and Java 1.7. I don't have strong feelings here, just checking if you mixed something up in the comment you put there... I am happy to keep the old and now current code

simon

On Thu, Jun 30, 2011 at 2:42 PM, rm...@apache.org wrote:

Author: rmuir
Date: Thu Jun 30 12:42:17 2011
New Revision: 1141510
URL: http://svn.apache.org/viewvc?rev=1141510&view=rev

Log: LUCENE-3239: remove use of slow Arrays.copyOf

Modified: lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java
URL: http://svn.apache.org/viewvc/lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java?rev=1141510&r1=1141509&r2=1141510&view=diff
==============
--- lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java (original)
+++ lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java Thu Jun 30 12:42:17 2011
@@ -2,7 +2,6 @@
 package org.apache.lucene.util;

 import java.io.IOException;
 import java.io.OutputStream;
-import java.util.Arrays;

 /**
  * Licensed to the Apache Software Foundation (ASF) under one or more
@@ -72,7 +71,11 @@ public class UnsafeByteArrayOutputStream
   }

   private void grow(int newLength) {
-    buffer = Arrays.copyOf(buffer, newLength);
+    // It actually should be: (Java 1.7, when its intrinsic on all machines)
+    // buffer = Arrays.copyOf(buffer, newLength);
+    byte[] newBuffer = new byte[newLength];
+    System.arraycopy(buffer, 0, newBuffer, 0, buffer.length);
+    buffer = newBuffer;
   }

 /**

- To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
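For anyone reading the flattened diff above: the two grow variants are behaviorally identical for a byte[]; the disagreement is purely about how well each JITs under -client. A self-contained sketch (the class and method names here are mine, not Lucene's):

```java
import java.util.Arrays;

public class GrowDemo {
    // Variant A: explicit allocation + System.arraycopy (what the commit switched to)
    static byte[] growWithArraycopy(byte[] buffer, int newLength) {
        byte[] newBuffer = new byte[newLength];
        System.arraycopy(buffer, 0, newBuffer, 0, buffer.length);
        return newBuffer;
    }

    // Variant B: Arrays.copyOf (what the commit removed)
    static byte[] growWithCopyOf(byte[] buffer, int newLength) {
        return Arrays.copyOf(buffer, newLength);
    }

    public static void main(String[] args) {
        byte[] buf = {1, 2, 3};
        byte[] a = growWithArraycopy(buf, 6);
        byte[] b = growWithCopyOf(buf, 6);
        // both produce {1, 2, 3, 0, 0, 0}
        System.out.println(Arrays.equals(a, b)); // true
    }
}
```

Both copies extend the array and zero-fill the tail; any observed difference is a JIT/intrinsics question, not a semantic one.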
[jira] [Resolved] (LUCENE-3264) crank up faceting module tests
[ https://issues.apache.org/jira/browse/LUCENE-3264?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Robert Muir resolved LUCENE-3264. - Resolution: Fixed ok, committed and backported. I think we should open followup issue(s): * speed up the top-k sampling tests (but make sure they are thorough on nightly etc still) * make a RandomTaxonomyWriter * look at any hardcoded constants like #docs etc and see if we can in general add randomization. crank up faceting module tests -- Key: LUCENE-3264 URL: https://issues.apache.org/jira/browse/LUCENE-3264 Project: Lucene - Java Issue Type: Test Components: modules/facet Reporter: Robert Muir Assignee: Robert Muir Fix For: 3.4, 4.0 Attachments: LUCENE-3264.patch, LUCENE-3264.patch The faceting module has a large set of good tests. lets switch them over to use all of our test infra (randomindexwriter, random iwconfig, mockanalyzer, newDirectory, ...) I don't want to address multipliers and atLeast() etc on this issue, I think we should follow up with that on a separate issue, that also looks at speed and making sure the nightly build is exhaustive. for now, lets just get the coverage in, it will be good to do before any refactoring. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3246) Invert IR.getDelDocs -> IR.getLiveDocs
[ https://issues.apache.org/jira/browse/LUCENE-3246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13057979#comment-13057979 ]

Uwe Schindler commented on LUCENE-3246:
---

Hi Mike, as we now have both variants to read/write BitVectors, would it not be a good idea to automatically use the old encoding for liveDocs if more than 50% of all bits are unset? This would save disk space if a segment has more deletions than live docs. Not sure if this can easily be implemented and is worth the complexity (that we already have because of both versions)? The patch looks fine!

Invert IR.getDelDocs -> IR.getLiveDocs
--
Key: LUCENE-3246
URL: https://issues.apache.org/jira/browse/LUCENE-3246
Project: Lucene - Java
Issue Type: Improvement
Components: core/index
Reporter: Michael McCandless
Assignee: Michael McCandless
Fix For: 4.0
Attachments: LUCENE-3246-IndexSplitters.patch, LUCENE-3246.patch, LUCENE-3246.patch

Spinoff from LUCENE-1536, where we need to fix the low-level filtering we do for deleted docs to match Filters (i.e., a set bit means the doc is accepted) so that filters can be pushed all the way down to the enums when possible/appropriate. This change also inverts the meaning of the first arg to TermsEnum.docs/AndPositions (renamed from skipDocs to liveDocs).
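Uwe's 50% heuristic amounts to a one-line size comparison between the two encodings at write time. A hypothetical sketch of that decision (the byte costs and names below are illustrative assumptions, not the actual BitVector on-disk format):

```java
public class SparseOrDense {
    // Pick the cheaper on-disk representation for a liveDocs bit set.
    // Assumed cost model (illustrative only): a dense bitmap costs
    // numBits/8 bytes; a sparse list of the set bits costs roughly
    // 4 bytes per set bit.
    static boolean useSparse(int setBits, int numBits) {
        long denseBytes = (numBits + 7) / 8;
        long sparseBytes = 4L * setBits;
        return sparseBytes < denseBytes;
    }

    public static void main(String[] args) {
        // Segment with heavy deletions: only 10 of 1M docs live -> sparse wins.
        System.out.println(useSparse(10, 1000000));     // true
        // Mostly-live segment: dense bitmap is far smaller.
        System.out.println(useSparse(900000, 1000000)); // false
    }
}
```

The real complexity Uwe mentions is that readers then have to detect which encoding was written, which is why he asks whether the disk savings justify it.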
[jira] [Commented] (LUCENE-3179) OpenBitSet.prevSetBit()
[ https://issues.apache.org/jira/browse/LUCENE-3179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13057980#comment-13057980 ] Uwe Schindler commented on LUCENE-3179: --- Any other comments/microbenchmarks from other committers? Dawid and Paul? I would like to commit this if nobody objects! What should we do with the then obsolete BitUtils methods? OpenBitSet.prevSetBit() --- Key: LUCENE-3179 URL: https://issues.apache.org/jira/browse/LUCENE-3179 Project: Lucene - Java Issue Type: Improvement Reporter: Paul Elschot Assignee: Paul Elschot Priority: Minor Fix For: 3.3, 4.0 Attachments: LUCENE-3179-fix.patch, LUCENE-3179-fix.patch, LUCENE-3179-long-ntz.patch, LUCENE-3179-long-ntz.patch, LUCENE-3179.patch, LUCENE-3179.patch, LUCENE-3179.patch, TestBitUtil.java, TestOpenBitSet.patch Find a previous set bit in an OpenBitSet. Useful for parent testing in nested document query execution LUCENE-2454 . -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
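For readers of the issue summary: a previous-set-bit scan over a packed long[] can be built on Long.numberOfLeadingZeros. This is only a standalone sketch of the idea over a raw long[], not Paul's actual patch and not OpenBitSet's instance API:

```java
public class PrevSetBit {
    // Index of the last set bit at or before index, or -1 if none.
    static int prevSetBit(long[] bits, int index) {
        int i = index >> 6;                                   // word holding index
        long word = bits[i] & (-1L >>> (63 - (index & 63)));  // zero the bits above index
        while (word == 0) {                                   // walk back to a non-empty word
            if (i == 0) return -1;
            word = bits[--i];
        }
        return (i << 6) + 63 - Long.numberOfLeadingZeros(word);
    }

    public static void main(String[] args) {
        long[] bits = new long[2];
        bits[0] = (1L << 3) | (1L << 10); // set bits 3 and 10
        bits[1] = 1L << 6;                // set bit 70
        System.out.println(prevSetBit(bits, 127)); // 70
        System.out.println(prevSetBit(bits, 69));  // 10
        System.out.println(prevSetBit(bits, 9));   // 3
        System.out.println(prevSetBit(bits, 2));   // -1
    }
}
```

The nested-document use case from LUCENE-2454 is exactly this backward scan: given a child docID, find the preceding parent bit.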
Re: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java
I don't see any evidence that this is any slower though.

You need to run with -client (if the machine is a beast this is tricky because x64 will pick -server regardless of the command-line setting) and you need to be copying generic arrays. I think this can be shown -- a caliper benchmark would be perfect to demonstrate this in isolation; I may write one if I find a spare moment.

Dawid
Re: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java
I think it's important Lucene keeps good performance on ordinary machines/envs. It's really quite dangerous that the active Lucene devs all use beasts for development/testing. We draw false conclusions. So we really should be testing with -client, and if indeed generified Arrays.copyOf (and anything else) is risky in such envs we should not use it when System.arraycopy works more consistently.

Mike McCandless http://blog.mikemccandless.com

On Thu, Jun 30, 2011 at 2:50 PM, Dawid Weiss dawid.we...@cs.put.poznan.pl wrote: I don't see any evidence that this is any slower though. You need to run with -client (if the machine is a beast this is tricky because x64 will pick -server regardless of the command-line setting) and you need to be copying generic arrays. I think this can be shown -- a caliper benchmark would be perfect to demonstrate this in isolation; I may write one if I find a spare moment. Dawid
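Dawid's proposed caliper benchmark would be the right tool for this; as a rough stand-in, a crude nanoTime harness (entirely my sketch, timings indicative only) can at least be run under both -client and -server for comparison:

```java
import java.util.Arrays;

public class CopyBench {
    // Copy via explicit allocation + System.arraycopy.
    static long benchArraycopy(byte[] src, int iters) {
        long sink = 0; // accumulate something so the JIT can't eliminate the work
        for (int i = 0; i < iters; i++) {
            byte[] dst = new byte[src.length];
            System.arraycopy(src, 0, dst, 0, src.length);
            sink += dst[0];
        }
        return sink;
    }

    // Copy via Arrays.copyOf.
    static long benchCopyOf(byte[] src, int iters) {
        long sink = 0;
        for (int i = 0; i < iters; i++) {
            byte[] dst = Arrays.copyOf(src, src.length);
            sink += dst[0];
        }
        return sink;
    }

    public static void main(String[] args) {
        byte[] src = new byte[1024];
        src[0] = 1;
        // warmup so the JIT compiles both loops before timing
        benchArraycopy(src, 100000);
        benchCopyOf(src, 100000);
        long t0 = System.nanoTime();
        benchArraycopy(src, 1000000);
        long t1 = System.nanoTime();
        benchCopyOf(src, 1000000);
        long t2 = System.nanoTime();
        System.out.println("arraycopy: " + (t1 - t0) / 1000000 + " ms");
        System.out.println("copyOf:    " + (t2 - t1) / 1000000 + " ms");
    }
}
```

This is exactly the kind of micro-timing caliper exists to do properly (interleaving, statistical runs); treat any single run of the above as a hint, not evidence.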
Re: [Lucene.Net] Is a Lucene.Net Line-by-Line Java port needed?
I'd say that is all the more reason that we need to work smarter and not harder. I'd also say that's a good reason to make sure we build consensus rather than just saying whoever commits code wins. And it's a damn good reason to focus on the effort of growing the number of contributors and lowering the barrier to submitting patches, breaking things down into pieces that people would feel confident to work on without being overwhelmed by the complexity of Lucene.Net. There is a pretty big gap between Lucene 2.9.x and Lucene 4.0, and the internals and index formats are significantly different, including nixing the current vint file format and using byte[] array slices for Terms instead of char[]. So while porting 1 to 1 will require less knowledge or thought, it's most likely going to require more hours of work. And it's definitely not going to guarantee the stability of the code or that it's great code. I'd have to say that it's not as stable as most would believe at the moment. Most of the tests avoid anything that remotely looks like it knows about the DRY principle, and there is a static constructor in the core test case that throws an exception if it doesn't find an environment variable TEMP, which will fail 90% of the tests, and NUnit will be unable to give you a clear reason why. Just to name a few issues I came across working towards getting Lucene.Net into CI. I haven't even started really digging in under the covers of the code yet. So my suggestion is to chew on this a bit more and build consensus, avoid fracturing people into sides. Be open to reservations and concerns that others have and continue to address them. - Michael On Thu, Jun 30, 2011 at 2:10 PM, Digy digyd...@gmail.com wrote: Although there are a lot of people using Lucene.Net, this is our contribution report for the past 5 years.
https://issues.apache.org/jira/secure/ConfigureReport.jspa?atl_token=A5KQ-2QAV-T4JA-FDED|3204f7e696067a386144705075c074e991db2a2b|lin&versionId=-1&issueStatus=all&selectedProjectId=12310290&reportKey=com.sourcelabs.jira.plugin.report.contributions%3Acontributionreport&Next=Next

DIGY

-----Original Message----- From: Ayende Rahien [mailto:aye...@ayende.com] Sent: Thursday, June 30, 2011 8:16 PM To: Rory Plaire; lucene-net-...@lucene.apache.org Subject: RE: [Lucene.Net] Is a Lucene.Net Line-by-Line Java port needed?

As someone from the NHibernate project: we stopped following Hibernate a while ago, and haven't regretted it. We have more features, fewer bugs and a better code base. Sent from my Windows Phone

From: Rory Plaire Sent: Thursday, June 30, 2011 19:58 To: lucene-net-...@lucene.apache.org Subject: Re: [Lucene.Net] Is a Lucene.Net Line-by-Line Java port needed?

I don't want to drag this out much longer, but I am curious with people who hold the line-by-line sentiment - are you NHibernate users? -r

On Thu, Jun 30, 2011 at 2:39 AM, Noel Lysaght lysag...@hotmail.com wrote:

Can I just plug in my bit and say I agree 100% with what Moray has outlined below. If we move away from the line-by-line port then over time we'll lose out on the momentum that is Lucene and the improvements that they make. Only if the Lucene.NET community has expertise in search, a deep knowledge of the project, and the community can guarantee that the knowledge will survive members coming and going should such a consideration be given. When Lucene.NET has stood on its feet for a number of years, after it has moved out of Apache incubation, should consideration be given to abandoning a line-by-line port. By all means extend and wrap the libraries in .NET equivalents and .NET goodness like LINQ (we do this internally in our company at the moment); but leave the core of the project on a line-by-line port. Just my tu-pence worth.
Kind Regards
Noel

-----Original Message----- From: Moray McConnachie Sent: Thursday, June 30, 2011 10:25 AM To: lucene-net-u...@lucene.apache.org Cc: lucene-net-...@incubator.apache.org Subject: RE: [Lucene.Net] Is a Lucene.Net Line-by-Line Java port needed?

I don't think I'm as hard core on this as Neal, but remember: the history of the Lucene.NET project is that all the intellectual work, all the understanding of search, all the new features come from the Lucene Java folks. Theirs is an immensely respected project, and I trust them to add new features that will be well-tested and well-researched, and to have a decent roadmap which I can trust they will execute on. Now I know there's been an influx of capable developers to Lucene.NET who are ready, willing and (I'm going to assume) able to add a lot more value in a generic .NET implementation as they change it. But it'll take a while before I trust a .NET dedicated framework which is significantly diverged from Java in the way I do the line-by-line version. And at what
Re: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java
I think it's important Lucene keeps good performance on ordinary machines/envs. Not that this voice will help in anything, but I think the above is virtually impossible to achieve unless you have a bunch of machines, OSs and VMs to continually test on and a consistent set of benchmarks plotted over time... and of course check every single commit for regression over all these combinations. And even then you'd always find a case of something being faster or slower on some combination of hardware/ software; optimizing for these differences makes little sense to me (people struggling with performance on some weird software/hardware combination can always change the VM vendor or a VM switch). Sorry for being so pessimistically unconstructive... :( Dawid - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (LUCENE-2341) explore morfologik integration
[ https://issues.apache.org/jira/browse/LUCENE-2341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dawid Weiss resolved LUCENE-2341. - Resolution: Fixed In trunk. Long live 1.6 support. explore morfologik integration -- Key: LUCENE-2341 URL: https://issues.apache.org/jira/browse/LUCENE-2341 Project: Lucene - Java Issue Type: New Feature Components: modules/analysis Reporter: Robert Muir Assignee: Dawid Weiss Fix For: 4.0 Attachments: LUCENE-2341.diff, LUCENE-2341.diff, LUCENE-2341.diff, LUCENE-2341.diff, LUCENE-2341.patch, LUCENE-2341.patch, morfologik-fsa-1.5.2.jar, morfologik-polish-1.5.2.jar, morfologik-stemming-1.5.2.jar Dawid Weiss mentioned on LUCENE-2298 that there is another Polish stemmer available: http://sourceforge.net/projects/morfologik/ This works differently than LUCENE-2298, and ideally would be another option for users. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-Solr-tests-only-3.x - Build # 9208 - Failure
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-3.x/9208/

1 tests failed.
REGRESSION: org.apache.lucene.facet.util.TestScoredDocIDsUtils.testWithDeletions

Error Message: Wrong number of (live) documents expected:65 but was:64

Stack Trace:
junit.framework.AssertionFailedError: Wrong number of (live) documents expected:65 but was:64
 at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1277)
 at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1195)
 at org.apache.lucene.facet.util.TestScoredDocIDsUtils.testWithDeletions(TestScoredDocIDsUtils.java:142)

Build Log (for compile errors): [...truncated 8816 lines...]
Re: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java
Fair enough, and I agree. Though the least we could do is rotate in a Windows env, where Java runs with -client, to our Jenkins. But simple-to-follow rules like Don't use Arrays.copyOf; use System.arraycopy instead (if indeed System.arraycopy seems to generally not be slower) seem like a no-brainer. Why risk Arrays.copyOf, anytime? Shouldn't we never use it...? Mike McCandless http://blog.mikemccandless.com On Thu, Jun 30, 2011 at 3:09 PM, Dawid Weiss dawid.we...@cs.put.poznan.pl wrote: I think it's important Lucene keeps good performance on ordinary machines/envs. Not that this voice will help in anything, but I think the above is virtually impossible to achieve unless you have a bunch of machines, OSs and VMs to continually test on and a consistent set of benchmarks plotted over time... and of course check every single commit for regression over all these combinations. And even then you'd always find a case of something being faster or slower on some combination of hardware/ software; optimizing for these differences makes little sense to me (people struggling with performance on some weird software/hardware combination can always change the VM vendor or a VM switch). Sorry for being so pessimistically unconstructive... :( Dawid - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: [JENKINS] Lucene-Solr-tests-only-3.x - Build # 9208 - Failure
this one reproduces, and just beasting the test, looks like this test fails ~ 2% of the time on trunk and branch_3x On Thu, Jun 30, 2011 at 3:24 PM, Apache Jenkins Server jenk...@builds.apache.org wrote: Build: https://builds.apache.org/job/Lucene-Solr-tests-only-3.x/9208/ 1 tests failed. REGRESSION: org.apache.lucene.facet.util.TestScoredDocIDsUtils.testWithDeletions Error Message: Wrong number of (live) documents expected:65 but was:64 Stack Trace: junit.framework.AssertionFailedError: Wrong number of (live) documents expected:65 but was:64 at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1277) at org.apache.lucene.util.LuceneTestCase$LuceneTestCaseRunner.runChild(LuceneTestCase.java:1195) at org.apache.lucene.facet.util.TestScoredDocIDsUtils.testWithDeletions(TestScoredDocIDsUtils.java:142) Build Log (for compile errors): [...truncated 8816 lines...] - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[JENKINS] Lucene-Solr-tests-only-trunk - Build # 9205 - Failure
Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/9205/ All tests passed Build Log (for compile errors): [...truncated 10841 lines...]
Re: [JENKINS] Lucene-Solr-tests-only-trunk - Build # 9205 - Failure
javadocs failed. I'll fix it. Dawid On Thu, Jun 30, 2011 at 9:35 PM, Apache Jenkins Server jenk...@builds.apache.org wrote: Build: https://builds.apache.org/job/Lucene-Solr-tests-only-trunk/9205/ All tests passed Build Log (for compile errors): [...truncated 10841 lines...]
Re: [Lucene.Net] Is a Lucene.Net Line-by-Line Java port needed?
Michael, I agree with everything you said. My point in saying whoever commits code wins was to illustrate the reality of how and why the project has the current form. Building consensus is difficult. It is an essential first step before we can do something like make a list of bit-sized pieces of work that others can work on. This is why my real message of Let's find a way to accommodate both is so important. It allows us to build consensus, so that we can settle on a direction and structure our work. Until we accomplish that, it really is whoever commits code wins, and that is an unhealthy and unmaintainable way to operate. From a technical perspective, your statements about the unit tests are completely accurate. They really need a LOT of reworking. That's the very first step before making any significant changes. Part of the problem is that the tests themselves are not well written. The other part is that the Lucene object model was not designed for testability, and it makes writing good tests more difficult, and certain tests might not be possible. It will be difficult to write good unit tests without re-structuring. The biggest issue is the use of abstract classes with base behaviour vs interfaces or fully abstracted classes. Makes mocking tough. This is the direction I was going when I started the Lucere project. I'd like to start in on that work after the 2.9.4g release. Thanks, Troy On Thu, Jun 30, 2011 at 12:04 PM, Michael Herndon mhern...@wickedsoftware.net wrote: I'd say that is all the more reasons that we need to work smarter and not harder. I'd also say thats a good reason to make sure we build consensus rather than just saying whoever commits code wins. And its a damn good reason to focus on the effort of growing the number of contributors and lowing the barrier to submitting patches, breaking things down into pieces that people would feel confident to work on without being overwhelmed by the complexity of Lucene.Net. 
There is a pretty big gap between Lucene 2.9.x to Lucene 4.0 and the internals and index formats are significantly different including nixing the current vint file format and using byte[] array slices for Terms instead of char[]. So while porting 1 to 1 while require less knowledge or thought, its most likely going to require more hours of work. And Its definitely not going to guarantee the stability of the code or that its great code. I'd have to say that its not as stable as most would believe at the moment. Most of the tests avoid anything that remotely looks like it knows about the DRY principle and there is a static constructor in the core test case that throws an exception if it doesn't find an environment variable TEMP which will fail 90% of the tests and nunit will be unable to give you a clear reason why. Just to name a few issues I came across working towards getting Lucene.Net into CI. I haven't even started really digging in under the covers of the code yet. So my suggestion is to chew on this a bit more and build consensus, avoid fracturing people into sides. Be open to reservations and concerns that others have and continue to address them. - Michael On Thu, Jun 30, 2011 at 2:10 PM, Digy digyd...@gmail.com wrote: Although there are a lot of people using Lucene.Net, this is our contribution report for the past 5 years. https://issues.apache.org/jira/secure/ConfigureReport.jspa?atl_token=A5KQ-2Q AV-T4JA-FDED|3204f7e696067a386144705075c074e991db2a2b|linversionId=-1issue Status=allselectedProjectId=12310290reportKey=com.sourcelabs.jira.plugin.r eport.contributions%3AcontributionreportNext=Next DIGY -Original Message- From: Ayende Rahien [mailto:aye...@ayende.com] Sent: Thursday, June 30, 2011 8:16 PM To: Rory Plaire; lucene-net-...@lucene.apache.org Subject: RE: [Lucene.Net] Is a Lucene.Net Line-by-Line Jave port needed? 
As someone from the nhibernate project We stopped following hibernate a while ago, and haven't regretted it We have mire features, less bugs and better code base Sent from my Windows Phone From: Rory Plaire Sent: Thursday, June 30, 2011 19:58 To: lucene-net-...@lucene.apache.org Subject: Re: [Lucene.Net] Is a Lucene.Net Line-by-Line Jave port needed? I don't want to drag this out much longer, but I am curious with people who hold the line-by-line sentiment - are you NHibernate users? -r On Thu, Jun 30, 2011 at 2:39 AM, Noel Lysaght lysag...@hotmail.com wrote: Can I just plug in my bit and say I agree 100% with what Moray has outlined below. If we move away from the line by line port then over time we'll loose out on the momentum that is Lucene and the improvements that they make. It is only if the Lucene.NET community has expertise in search, a deep knowledge of the project and the community can guarantee that the knowledge will survive members coming and going should such a
Re: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java
On Thu, Jun 30, 2011 at 8:50 PM, Dawid Weiss dawid.we...@cs.put.poznan.pl wrote:

I don't see any evidence that this is any slower though.

You need to run with -client (if the machine is a beast this is tricky because x64 will pick -server regardless of the command-line setting) and you need to be copying generic arrays. I think this can be shown -- a caliper benchmark would be perfect to demonstrate this in isolation; I may write one if I find a spare moment.

Dawid

This is what I want to see. I don't want to discuss based on some bug reported for a non-primitive version of copyOf, that's all. It's pointless to discuss if there is no evidence, which I don't see. I am happy with arraycopy; I would just have appreciated a discussion before backing the change out.

simon
[jira] [Commented] (LUCENE-1879) Parallel incremental indexing
[ https://issues.apache.org/jira/browse/LUCENE-1879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13058072#comment-13058072 ]

hao yan commented on LUCENE-1879:
-

Hi Michael, is there any recent progress on this topic? I am very interested in this!

Parallel incremental indexing
-
Key: LUCENE-1879
URL: https://issues.apache.org/jira/browse/LUCENE-1879
Project: Lucene - Java
Issue Type: New Feature
Components: core/index
Reporter: Michael Busch
Assignee: Michael Busch
Fix For: 4.0
Attachments: parallel_incremental_indexing.tar

A new feature that allows building parallel indexes and keeping them in sync on a docID level, independent of the choice of the MergePolicy/MergeScheduler. Find details on the wiki page for this feature: http://wiki.apache.org/lucene-java/ParallelIncrementalIndexing Discussion on java-dev: http://markmail.org/thread/ql3oxzkob7aqf3jd
RE: [Lucene.Net] Is a Lucene.Net Line-by-Line Java port needed?
Ok, I think I asked the wrong question. I am trying to figure out where to put my time. I was thinking about working on the automated porting system, but when I saw the response to the .NET 4.0 discussions I started to question if that is the right direction. The community seemed to be more interested in the .NET features. The complexity of the automated tool is going to become very high and will probably end up with a line-for-line style port. So I keep asking myself: is the automated tool worth it? I don't think it is. I like the method Digy has been using for porting the code. So I guess for me the real question is: Digy, where did you see 2.9.4g going next and what do you need help on?

Scott

-----Original Message----- From: Digy [mailto:digyd...@gmail.com] Sent: Thursday, June 30, 2011 4:20 PM To: lucene-net-...@lucene.apache.org Subject: RE: [Lucene.Net] Is a Lucene.Net Line-by-Line Java port needed?

Michael, you interpret the report as whoever commits code wins? But when I look at it, I see a lot of talk, no work. The .NET community is not interested in contributing. I really don't understand what hinders people from working on Lucene.Net. As I did for 2.9.4g: grab the code, do whatever you want on it and submit back. If it doesn't fit the project's direction it can still find a place in contrib or in a branch. All of the approaches can live side by side happily in the Lucene.Net repository. Troy, I also don't understand why you wait for 2.9.4g. It is a *branch* and has nothing to do with the trunk. It need not be an official release and can live in the branch as a PoC. As a result, I got bored of listening to this should be done that way. What I want to see is I did it that way, should we continue with this?

DIGY

-----Original Message----- From: Troy Howard [mailto:thowar...@gmail.com] Sent: Thursday, June 30, 2011 10:47 PM To: lucene-net-...@lucene.apache.org Subject: Re: [Lucene.Net] Is a Lucene.Net Line-by-Line Java port needed?

Michael, I agree with everything you said.
My point in saying whoever commits code wins was to illustrate the reality of how and why the project has the current form. Building consensus is difficult. It is an essential first step before we can do something like make a list of bit-sized pieces of work that others can work on. This is why my real message of Let's find a way to accommodate both is so important. It allows us to build consensus, so that we can settle on a direction and structure our work. Until we accomplish that, it really is whoever commits code wins, and that is an unhealthy and unmaintainable way to operate. From a technical perspective, your statements about the unit tests are completely accurate. They really need a LOT of reworking. That's the very first step before making any significant changes. Part of the problem is that the tests themselves are not well written. The other part is that the Lucene object model was not designed for testability, and it makes writing good tests more difficult, and certain tests might not be possible. It will be difficult to write good unit tests without re-structuring. The biggest issue is the use of abstract classes with base behaviour vs interfaces or fully abstracted classes. Makes mocking tough. This is the direction I was going when I started the Lucere project. I'd like to start in on that work after the 2.9.4g release. Thanks, Troy On Thu, Jun 30, 2011 at 12:04 PM, Michael Herndon mhern...@wickedsoftware.net wrote: I'd say that is all the more reasons that we need to work smarter and not harder. I'd also say thats a good reason to make sure we build consensus rather than just saying whoever commits code wins. And its a damn good reason to focus on the effort of growing the number of contributors and lowing the barrier to submitting patches, breaking things down into pieces that people would feel confident to work on without being overwhelmed by the complexity of Lucene.Net. 
There is a pretty big gap between Lucene 2.9.x to Lucene 4.0 and the internals and index formats are significantly different including nixing the current vint file format and using byte[] array slices for Terms instead of char[]. So while porting 1 to 1 while require less knowledge or thought, its most likely going to require more hours of work. And Its definitely not going to guarantee the stability of the code or that its great code. I'd have to say that its not as stable as most would believe at the moment. Most of the tests avoid anything that remotely looks like it knows about the DRY principle and there is a static constructor in the core test case that throws an exception if it doesn't find an environment variable TEMP which will fail 90% of the tests and nunit will be unable to give you a clear reason why. Just to name a few issues I came across working towards getting
Re: managing CHANGES.txt?
: There's no sense in CHANGES being a 'rolling list', when someone looks : at 4.0 they should be able to see what's DIFFERENT aka what CHANGED : from the past release. I agree completely; the disagreement is *which* past release the list should be relative to. I don't know how many more ways I can say it: I believe that the list of changes for 4.0 should be labeled (and contain) "Changes since 3.0" -- because that is the most recent past release with a common development history. When we only had a single trunk and the 3.0 release branch was forked from the same place as the 2.9 release branch, it made sense to think of the 3.0 changes list as "Changes since 2.9" because they were genuine successors of each other -- any code in 2.9 was by definition in 3.0 unless it was modified/removed by something listed in the 3.0 changes. That is not going to be true for 3.3 and 4.0 (or 3.4 and 4.0, or 3.7 and 4.0, or whatever our last 3.x release is before 4.0). The list of changes for a release should always make it clear *exactly* what is different between that release and the previous release with common lineage of source code -- it may sound weird, but it's what I believe, and it's consistent with how we've done bug fix releases in the past -- they've referred to changes since their parent release, not since the last calendar release. Since no one seems to agree with me on this, I've tried to let this go (twice!) by stating my position and conceding that it's not consensus -- but if you keep reviving the argument, I'll happily keep restating my beliefs. -Hoss - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
Re: svn commit: r1141510 - /lucene/dev/trunk/modules/facet/src/java/org/apache/lucene/util/UnsafeByteArrayOutputStream.java
On Thu, Jun 30, 2011 at 4:45 PM, Simon Willnauer simon.willna...@googlemail.com wrote: On Thu, Jun 30, 2011 at 8:50 PM, Dawid Weiss dawid.we...@cs.put.poznan.pl wrote: I don't see any evidence that this is any slower though. You need to run with -client (if the machine is a beast this is tricky because x64 will pick -server regardless of the command-line setting) and you need to be copying generic arrays. I think this can be shown -- a caliper benchmark would be perfect to demonstrate this in isolation; I may write one if I find a spare moment. This is what I want to see. I don't want to discuss based on some bug reported for a non-primitive version of copyOf, that's all. It's pointless to discuss if there is no evidence, which I don't see. I am happy with arraycopy; I would just have appreciated a discussion before backing the change out. I think the burden of proof here is on Arrays.copyOf. Ie, until we can prove (through benchmarking in different envs) that it can be trusted, we should just stick with System.arraycopy to reduce the risk. Mike - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
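For readers following along: the two idioms being debated are functionally equivalent for the common grow-an-array case; the disagreement is purely about performance on -client VMs, which only a real benchmark (e.g. caliper, as Dawid suggests) can settle. A small sketch of the equivalence, not a benchmark:

```java
import java.util.Arrays;

// Both approaches produce identical results when growing an array; the
// thread above is about which compiles to faster code on a -client JVM.
public class CopyDemo {
    static int[] growWithArraycopy(int[] src, int newLen) {
        int[] dst = new int[newLen];
        System.arraycopy(src, 0, dst, 0, Math.min(src.length, newLen));
        return dst;
    }

    static int[] growWithCopyOf(int[] src, int newLen) {
        // Arrays.copyOf allocates and pads with zeros beyond src.length.
        return Arrays.copyOf(src, newLen);
    }

    public static void main(String[] args) {
        int[] src = {1, 2, 3};
        int[] a = growWithArraycopy(src, 5);
        int[] b = growWithCopyOf(src, 5);
        if (!Arrays.equals(a, b)) throw new AssertionError();
        System.out.println(Arrays.toString(a)); // [1, 2, 3, 0, 0]
    }
}
```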
[jira] [Commented] (LUCENE-3246) Invert IR.getDelDocs - IR.getLiveDocs
[ https://issues.apache.org/jira/browse/LUCENE-3246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13058107#comment-13058107 ] Michael McCandless commented on LUCENE-3246: bq. As we have now both variants to read/write BitVectors, would it be not a good idea to automatically use the old encoding for liveDocs, if more than 50% of all bits are unset? That seems like a good idea? Ie, handle both sparse set and sparse unset compactly? Though it should be unusual that you have so many deletes against a segment (esp. because TMP now targets such segs more aggressively). We should do this under a new issue (the old code also didn't handle the many-deletions case sparsely either, just the few-deletions case). Invert IR.getDelDocs - IR.getLiveDocs -- Key: LUCENE-3246 URL: https://issues.apache.org/jira/browse/LUCENE-3246 Project: Lucene - Java Issue Type: Improvement Components: core/index Reporter: Michael McCandless Assignee: Michael McCandless Fix For: 4.0 Attachments: LUCENE-3246-IndexSplitters.patch, LUCENE-3246.patch, LUCENE-3246.patch Spinoff from LUCENE-1536, where we need to fix the low level filtering we do for deleted docs to match Filters (ie, a set bit means the doc is accepted) so that filters can be pushed all the way down to the enums when possible/appropriate. This change also inverts the meaning of the first arg to TermsEnum.docs/AndPositions (renames from skipDocs to liveDocs). -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
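The 50% idea discussed above amounts to a simple density check: store whichever side (set or unset bits) is in the minority sparsely. An illustrative sketch only — this is not Lucene's actual BitVector code:

```java
import java.util.BitSet;

// Illustrative density check mirroring the comment: encode the minority
// side of the bit vector sparsely, otherwise fall back to a dense encoding.
public class DensityDemo {
    static String chooseEncoding(BitSet bits, int size) {
        int set = bits.cardinality();
        int unset = size - set;
        if (set * 2 < size) return "sparse-set";     // few live docs survive
        if (unset * 2 < size) return "sparse-unset"; // few deletions
        return "dense";
    }

    public static void main(String[] args) {
        BitSet live = new BitSet(100);
        live.set(0, 90); // 90 of 100 docs still live
        System.out.println(chooseEncoding(live, 100)); // sparse-unset
    }
}
```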
[jira] [Commented] (SOLR-2623) Solr JMX MBeans do not survive core reloads
[ https://issues.apache.org/jira/browse/SOLR-2623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13058112#comment-13058112 ] Hoss Man commented on SOLR-2623: Alexey: at first glance, I think I would prefer Shalin's suggestion over your patch. My main hesitation about your approach is the parameterized close method -- if we really go that route I'd much rather see something like a SolrCore.preCloseToReleaseResources() method. But more fundamentally, if we unregister the MBeans before creating the new core, there is a window of time when the old core is responding to requests but can't be monitored (and if anything goes wrong with creating the new core, the old one will continue to handle requests indefinitely but be totally unmonitorable). That said: I suspect the fix might even be easier than what Shalin proposed (which would require making SolrCore pass itself into the JmxMonitoredMap) ... can't we essentially change JmxMonitoredMap.unregister(String,SolrInfoMBean) to have pseudo code like this:
{code}
if (server.isRegistered(name)) {
  MBean existing = server.getMBean(name);
  if (existing instanceof SolrDynamicMBean
      && ((SolrDynamicMBean) existing).getSolrInfoMBean() == this.get(name)) {
    server.unregisterMBean(name);
  } else {
    // :NOOP: MBean is not ours
  }
}
{code}
...adding a package protected SolrDynamicMBean.getSolrInfoMBean() seems less invasive than passing the SolrCore to another class Solr JMX MBeans do not survive core reloads --- Key: SOLR-2623 URL: https://issues.apache.org/jira/browse/SOLR-2623 Project: Solr Issue Type: Bug Components: multicore Affects Versions: 1.4, 1.4.1, 3.1, 3.2 Reporter: Alexey Serba Assignee: Shalin Shekhar Mangar Priority: Minor Attachments: SOLR-2623.patch, SOLR-2623.patch, SOLR-2623.patch Solr JMX MBeans do not survive core reloads {noformat:title=Steps to reproduce}
sh cd example
sh vi multicore/core0/conf/solrconfig.xml # enable jmx
sh java -Dcom.sun.management.jmxremote -Dsolr.solr.home=multicore -jar start.jar
sh echo 'open 8842 # 8842 is java pid domain solr/core0 beans ' | java -jar jmxterm-1.0-alpha-4-uber.jar
solr/core0:id=core0,type=core
solr/core0:id=org.apache.solr.handler.StandardRequestHandler,type=org.apache.solr.handler.StandardRequestHandler
solr/core0:id=org.apache.solr.handler.StandardRequestHandler,type=standard
solr/core0:id=org.apache.solr.handler.XmlUpdateRequestHandler,type=/update
solr/core0:id=org.apache.solr.handler.XmlUpdateRequestHandler,type=org.apache.solr.handler.XmlUpdateRequestHandler
...
solr/core0:id=org.apache.solr.search.SolrIndexSearcher,type=searcher
solr/core0:id=org.apache.solr.update.DirectUpdateHandler2,type=updateHandler
sh curl 'http://localhost:8983/solr/admin/cores?action=RELOAD&core=core0'
sh echo 'open 8842 # 8842 is java pid domain solr/core0 beans ' | java -jar jmxterm-1.0-alpha-4-uber.jar
# there's only one bean left after Solr core reload
solr/core0:id=org.apache.solr.search.SolrIndexSearcher,type=Searcher@2e831a91 main
{noformat}
The root cause of this is Solr core reload behavior:
# create new core (which overwrites existing registered MBeans)
# register new core and close old one (we remove/un-register MBeans on oldCore.close)
The correct sequence is:
# unregister MBeans from old core
# create and register new core
# close old core without touching MBeans
-- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
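The buggy vs. corrected ordering described in the report can be sketched with placeholder steps. This is purely illustrative; none of these method names are Solr's actual API:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the two reload orderings from the report. In the buggy order,
// registering the new core overwrites the old core's MBeans, and closing
// the old core then unregisters the NEW core's MBeans. The fixed order
// unregisters first, so the old core's close never touches the new beans.
public class ReloadOrderDemo {
    static List<String> run(boolean fixed) {
        List<String> log = new ArrayList<>();
        if (fixed) {
            log.add("unregister old core's MBeans");
            log.add("create and register new core");
            log.add("close old core (no MBean work)");
        } else {
            log.add("create and register new core (overwrites old MBeans)");
            log.add("close old core (unregisters the new core's MBeans!)");
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run(false));
        System.out.println(run(true));
    }
}
```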
[jira] [Commented] (LUCENE-2793) Directory createOutput and openInput should take an IOContext
[ https://issues.apache.org/jira/browse/LUCENE-2793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13058116#comment-13058116 ] Michael McCandless commented on LUCENE-2793: To address the nocommits about losing the larger buffer size during merging, should we add set/getMergeBufferSize and set/getDefaultBufferSize to those Dir impls that do buffering? (And default to what they are today on trunk, I think 1 KB and 4 KB?) Directory createOutput and openInput should take an IOContext - Key: LUCENE-2793 URL: https://issues.apache.org/jira/browse/LUCENE-2793 Project: Lucene - Java Issue Type: Improvement Components: core/store Reporter: Michael McCandless Assignee: Varun Thacker Labels: gsoc2011, lucene-gsoc-11, mentor Attachments: LUCENE-2793-nrt.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch, LUCENE-2793.patch Today for merging we pass down a larger readBufferSize than for searching because we get better performance. I think we should generalize this to a class (IOContext), which would hold the buffer size, but then could hold other flags like DIRECT (bypass OS's buffer cache), SEQUENTIAL, etc. Then, we can make the DirectIOLinuxDirectory fully usable because we would only use DIRECT/SEQUENTIAL during merging. This will require fixing how IW pools readers, so that a reader opened for merging is not then used for searching, and vice/versa. Really, it's only all the open file handles that need to be different -- we could in theory share del docs, norms, etc, if that were somehow possible. -- This message is automatically generated by JIRA. 
For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
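A rough sketch of what such an IOContext might carry, based only on the issue text above; the field and flag names are guesses, not the committed Lucene API:

```java
import java.util.EnumSet;

// Hypothetical IOContext holding a read buffer size plus access hints,
// per the issue description: merging gets a larger buffer and
// DIRECT/SEQUENTIAL hints, while searching keeps a smaller default.
public class IOContextDemo {
    enum Hint { DIRECT, SEQUENTIAL }

    static final class IOContext {
        final int readBufferSize;
        final EnumSet<Hint> hints;
        IOContext(int readBufferSize, EnumSet<Hint> hints) {
            this.readBufferSize = readBufferSize;
            this.hints = hints;
        }
    }

    public static void main(String[] args) {
        // Buffer sizes chosen to match the 1 KB / 4 KB defaults mentioned
        // in the comment; a Directory impl would pick the context per call.
        IOContext merge = new IOContext(4096, EnumSet.of(Hint.DIRECT, Hint.SEQUENTIAL));
        IOContext search = new IOContext(1024, EnumSet.noneOf(Hint.class));
        System.out.println(merge.readBufferSize + " " + search.readBufferSize);
    }
}
```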
[jira] [Created] (LUCENE-3267) check-legal-lucene always checks contrib/queries/lib
check-legal-lucene always checks contrib/queries/lib Key: LUCENE-3267 URL: https://issues.apache.org/jira/browse/LUCENE-3267 Project: Lucene - Java Issue Type: Bug Components: general/build Reporter: Chris Male Priority: Minor I've been noticing for a while that check-legal-lucene always checks /contrib/queries/lib, no matter where it is. Consequently it never finds the directory. This seems like a waste in our build, and for the life of me I have no idea why it is necessary. Offending line is: {code} <arg value="${basedir}/contrib/queries/lib" /> {code} in check-legal-lucene. Patch will remove this. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (LUCENE-3241) Remove Lucene core's FunctionQuery impls
[ https://issues.apache.org/jira/browse/LUCENE-3241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Male resolved LUCENE-3241. Resolution: Fixed Committed revision 1141747. Remove Lucene core's FunctionQuery impls Key: LUCENE-3241 URL: https://issues.apache.org/jira/browse/LUCENE-3241 Project: Lucene - Java Issue Type: Sub-task Components: core/search Reporter: Chris Male Assignee: Chris Male Fix For: 4.0 Attachments: LUCENE-3241.patch, LUCENE-3241.patch As part of the consolidation of FunctionQuerys, we want to remove Lucene core's impls. Included in this work, we will make sure that all the functionality provided by the core impls is also provided by the new module. Any tests will be ported across too, to increase the test coverage. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Resolved] (LUCENE-2883) Consolidate Solr Lucene FunctionQuery into modules
[ https://issues.apache.org/jira/browse/LUCENE-2883?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Male resolved LUCENE-2883. Resolution: Fixed Committed revision 1141749. It's done. Finally. Consolidate Solr Lucene FunctionQuery into modules - Key: LUCENE-2883 URL: https://issues.apache.org/jira/browse/LUCENE-2883 Project: Lucene - Java Issue Type: Task Components: core/search Affects Versions: 4.0 Reporter: Simon Willnauer Assignee: Chris Male Labels: gsoc2011, lucene-gsoc-11, mentor Fix For: 4.0 Attachments: LUCENE-2883.patch Spin-off from the [dev list | http://www.mail-archive.com/dev@lucene.apache.org/msg13261.html] -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3267) check-legal-lucene always checks contrib/queries/lib
[ https://issues.apache.org/jira/browse/LUCENE-3267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13058138#comment-13058138 ] Chris Male commented on LUCENE-3267: Woops, committed wrong thing with this issue number. Oh well. check-legal-lucene always checks contrib/queries/lib Key: LUCENE-3267 URL: https://issues.apache.org/jira/browse/LUCENE-3267 Project: Lucene - Java Issue Type: Bug Components: general/build Reporter: Chris Male Priority: Minor Attachments: LUCENE-3267.patch I've been noticing for a while that check-legal-lucene always checks /contrib/queries/lib, no matter where it is. Consequently it never finds the directory. This seems like a waste in our build, and for the life of me I have no idea why it is necessary. Offending line is: {code} <arg value="${basedir}/contrib/queries/lib" /> {code} in check-legal-lucene. Patch will remove this. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Updated] (LUCENE-3267) check-legal-lucene always checks contrib/queries/lib
[ https://issues.apache.org/jira/browse/LUCENE-3267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chris Male updated LUCENE-3267: --- Attachment: LUCENE-3267.patch Actual patch for this issue. Removes the offending /contrib/queries/lib hardcoded check. Everything seems good. I'll commit tomorrow. check-legal-lucene always checks contrib/queries/lib Key: LUCENE-3267 URL: https://issues.apache.org/jira/browse/LUCENE-3267 Project: Lucene - Java Issue Type: Bug Components: general/build Reporter: Chris Male Priority: Minor Attachments: LUCENE-3267.patch I've been noticing for a while that check-legal-lucene always checks /contrib/queries/lib, no matter where it is. Consequently it never finds the directory. This seems like a waste in our build, and for the life of me I have no idea why it is necessary. Offending line is: {code} <arg value="${basedir}/contrib/queries/lib" /> {code} in check-legal-lucene. Patch will remove this. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-2883) Consolidate Solr Lucene FunctionQuery into modules
[ https://issues.apache.org/jira/browse/LUCENE-2883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13058142#comment-13058142 ] Robert Muir commented on LUCENE-2883: - Thanks for all your hard refactoring work here Chris! Consolidate Solr Lucene FunctionQuery into modules - Key: LUCENE-2883 URL: https://issues.apache.org/jira/browse/LUCENE-2883 Project: Lucene - Java Issue Type: Task Components: core/search Affects Versions: 4.0 Reporter: Simon Willnauer Assignee: Chris Male Labels: gsoc2011, lucene-gsoc-11, mentor Fix For: 4.0 Attachments: LUCENE-2883.patch Spin-off from the [dev list | http://www.mail-archive.com/dev@lucene.apache.org/msg13261.html] -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org
[jira] [Commented] (LUCENE-3267) check-legal-lucene always checks contrib/queries/lib
[ https://issues.apache.org/jira/browse/LUCENE-3267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13058145#comment-13058145 ] Robert Muir commented on LUCENE-3267: - +1 check-legal-lucene always checks contrib/queries/lib Key: LUCENE-3267 URL: https://issues.apache.org/jira/browse/LUCENE-3267 Project: Lucene - Java Issue Type: Bug Components: general/build Reporter: Chris Male Priority: Minor Attachments: LUCENE-3267.patch I've been noticing for a while that check-legal-lucene always checks /contrib/queries/lib, no matter where it is. Consequently it never finds the directory. This seems like a waste in our build, and for the life of me I have no idea why it is necessary. Offending line is: {code} <arg value="${basedir}/contrib/queries/lib" /> {code} in check-legal-lucene. Patch will remove this. -- This message is automatically generated by JIRA. For more information on JIRA, see: http://www.atlassian.com/software/jira - To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org For additional commands, e-mail: dev-h...@lucene.apache.org