RE: Lucene 2.9 status (to port to Lucene.Net)
Hi Mike,

Mike wrote: This is great feedback on the new Collector API, Uwe. Thanks!

Likewise.

Mike wrote: It's awesome that you no longer have to warm your searchers... but be careful when a large segment merge commits.

I know this, but in our case (e.g. creating an SQL IN list, collecting measurement parameters from the documents) warming is not really needed. It would only be a problem if this happened very often (the index is updated every 20 minutes) and the whole field cache had to be reloaded (which takes 3-5 seconds on our machine). So a large merge taking 1-2 seconds for cache reloading is no problem (the users have the same delay with sorted results). If our index gets bigger, I will add warming to my search/cache implementation after reopening. For that it would be nice to have the list of reopened segments (I think there was an issue about it, or is there already an implementation?). In our case, most of the time is spent in the query against the SQL data warehouse afterwards, so 1 additional second for building the SQL query is not much.

Mike wrote: Did you hit any snags/problems/etc. that we should fix before releasing 2.9?

Until now, I have not seen any further problems. What I saw before has already been implemented in Lucene, thanks to our active communication across all these issues :-)

I am still waiting for the step of moving trie (and also the new automaton regex query) to core, and for the modularization (hopefully before 2.9, so that we do not create new APIs that change or get deprecated later).

Uwe

- To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org
Re: Lucene 2.9 status (to port to Lucene.Net)
On Tue, Apr 28, 2009 at 8:10 AM, Uwe Schindler u...@thetaphi.de wrote:

Mike wrote: It's awesome that you no longer have to warm your searchers... but be careful when a large segment merge commits.

Uwe wrote: I know this, but in our case (e.g. creating an SQL IN list, collecting measurement parameters from the documents) warming is not really needed. It would only be a problem if this happened very often (the index is updated every 20 minutes) and the whole field cache had to be reloaded (which takes 3-5 seconds on our machine). So a large merge taking 1-2 seconds for cache reloading is no problem (the users have the same delay with sorted results). If our index gets bigger, I will add warming to my search/cache implementation after reopening. For that it would be nice to have the list of reopened segments (I think there was an issue about it, or is there already an implementation?). In our case, most of the time is spent in the query against the SQL data warehouse afterwards, so 1 additional second for building the SQL query is not much.

OK that's great.

Mike wrote: Did you hit any snags/problems/etc. that we should fix before releasing 2.9?

Uwe wrote: Until now, I have not seen any further problems. What I saw before has already been implemented in Lucene, thanks to our active communication across all these issues :-)

Tell me about it... hard to keep them all straight! Lots of great improvements in 2.9...

Uwe wrote: I am still waiting for the step of moving trie (and also the new automaton regex query) to core, and for the modularization (hopefully before 2.9, so that we do not create new APIs that change or get deprecated later).

+1

We need to do something about modularization / moving trie to core before 2.9.

Mike
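The "list of reopened segments" that Uwe asks for can be illustrated with a small sketch: compare the segment names seen before and after a reopen, and warm only the ones that are new. This is plain Java with no Lucene dependency; `SegmentDiff`, `newSegments` and the segment names are hypothetical and chosen only for this example, and how the two lists are obtained from a real reader is left open.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper: not part of Lucene, only illustrates the idea.
class SegmentDiff {
    // Returns the segments present after the reopen that were not present
    // before, preserving their order in the new list.
    static List<String> newSegments(List<String> before, List<String> after) {
        Set<String> known = new HashSet<>(before);
        List<String> fresh = new ArrayList<>();
        for (String segment : after) {
            if (!known.contains(segment)) {
                fresh.add(segment);
            }
        }
        return fresh;
    }

    public static void main(String[] args) {
        List<String> before = Arrays.asList("_a", "_b", "_c");
        // A small flush adds one small segment: cheap to warm.
        System.out.println(newSegments(before, Arrays.asList("_a", "_b", "_c", "_d")));
        // A merge replaces _b and _c with one large segment _e: this is the
        // case Mike warns about, because _e is new and expensive to warm.
        System.out.println(newSegments(before, Arrays.asList("_a", "_e")));
    }
}
```

With such a list, warming after a reopen would only touch the new segments, which is cheap for small flushes and costly only when a large merged segment appears.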
RE: Lucene 2.9 status (to port to Lucene.Net)
Some status update:

Mike wrote: George, did you mean LUCENE-1516 below? (LUCENE-1313 is a further improvement to near real-time search that's still being iterated on.) In general I would say 2.9 seems to be in rather active development still ;) I too would love to hear about production/beta use of 2.9. George, maybe you should re-ask on java-user?

Uwe wrote: Here! I updated www.pangaea.de to Lucene trunk today (because of an incomplete hashcode in TrieRangeQuery)... It works perfectly, but I do not use the realtime parts. I did the same 10 days before, with no problems :-) I am currently rewriting parts of my code to use Collector and move away from HitCollector (without scores, so with optimizations)! reopen() and sorting are fine: almost no time is consumed for sorted searches after reopening indexes every 20 minutes when there are just some new, small segments with changed documents. No extra warming is needed.

I have now rewritten my collectors to use the new API. Even though the number of methods to override in the new Collector is 3 instead of 1, the code got shorter (because the collect methods can now throw IOExceptions, great!!!).

What is also perfect is the way to use a FieldCache: just retrieve the FieldCache array (e.g. getInts()) in the setNextReader() method and use the value array in the collect() method with the docid as index. Now I am able to, e.g., retrieve cached values even after an index reopen without warming (same with sort). In the past you had to use a cache array for the whole index. The docBase is not used in my code, as I directly access the index readers. So users now have both possibilities: use the supplied reader, or use the docBase as an index offset into the searcher/main reader. Really cool! The overhead of score calculation can be left out if not needed, also cool!

One of my collectors is used to retrieve the database ids (integers) for building up an SQL IN (...) list from the field cache based on the collected hits. In the past this was very complicated, because FieldCache was slow after reopening and getting stored fields (the ids) is also very slow (inner search loop). Now it's just 10 lines of code and no score is involved.

The new code is now working in production at PANGAEA.

Uwe wrote: Another change to be done here is to remove Field.Store.COMPRESS and replace it with manually compressed binary stored fields, but this is only to get rid of the deprecation warnings. And this cannot be done without complete reindexing.

Uwe
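As a rough illustration of the pattern Uwe describes (fetch the per-segment value array in setNextReader(), then index it with the segment-relative docid in collect()), here is a minimal sketch in plain Java. It deliberately does not use Lucene: `SegmentCollector`, `setNextSegment`, `IdCollector` and `toSqlInList` are hypothetical stand-ins invented for this example, and the `int[]` arrays play the role of what the field cache would supply for one segment.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the shape of the per-segment collector callbacks.
// This is NOT the Lucene class, only an illustration of the call pattern.
interface SegmentCollector {
    // Called once per segment before any hits from that segment are delivered.
    void setNextSegment(int[] segmentValues, int docBase);

    // Called per hit with a doc id relative to the current segment.
    void collect(int doc);
}

// Collects database ids for matching documents; no score is involved.
class IdCollector implements SegmentCollector {
    private final List<Integer> ids = new ArrayList<>();
    private int[] current; // value array of the segment being collected

    @Override
    public void setNextSegment(int[] segmentValues, int docBase) {
        // docBase is ignored because values are looked up per segment.
        this.current = segmentValues;
    }

    @Override
    public void collect(int doc) {
        ids.add(current[doc]);
    }

    // Builds the text of an SQL IN (...) list from the collected ids.
    String toSqlInList() {
        StringBuilder sb = new StringBuilder("IN (");
        for (int i = 0; i < ids.size(); i++) {
            if (i > 0) {
                sb.append(", ");
            }
            sb.append(ids.get(i));
        }
        return sb.append(")").toString();
    }
}

class IdCollectorDemo {
    static String run() {
        IdCollector collector = new IdCollector();
        // Segment 0 holds three documents; the hits are docs 0 and 2.
        collector.setNextSegment(new int[] {101, 102, 103}, 0);
        collector.collect(0);
        collector.collect(2);
        // Segment 1 holds two documents; the hit is doc 1 (segment-relative).
        collector.setNextSegment(new int[] {201, 202}, 3);
        collector.collect(1);
        return collector.toSqlInList();
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints IN (101, 103, 202)
    }
}
```

Because the collected values are plain integers, concatenating them into the list is safe here. A caller would still need to handle the empty case, since `IN ()` is rejected by most databases.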
Re: Lucene 2.9 status (to port to Lucene.Net)
This is great feedback on the new Collector API, Uwe. Thanks!

It's awesome that you no longer have to warm your searchers... but be careful when a large segment merge commits.

Did you hit any snags/problems/etc. that we should fix before releasing 2.9?

Mike

On Sun, Apr 26, 2009 at 9:54 AM, Uwe Schindler u...@thetaphi.de wrote:

Some status update:

Mike wrote: George, did you mean LUCENE-1516 below? (LUCENE-1313 is a further improvement to near real-time search that's still being iterated on.) In general I would say 2.9 seems to be in rather active development still ;) I too would love to hear about production/beta use of 2.9. George, maybe you should re-ask on java-user?

Uwe wrote: Here! I updated www.pangaea.de to Lucene trunk today (because of an incomplete hashcode in TrieRangeQuery)... It works perfectly, but I do not use the realtime parts. I did the same 10 days before, with no problems :-) I am currently rewriting parts of my code to use Collector and move away from HitCollector (without scores, so with optimizations)! reopen() and sorting are fine: almost no time is consumed for sorted searches after reopening indexes every 20 minutes when there are just some new, small segments with changed documents. No extra warming is needed.

I have now rewritten my collectors to use the new API. Even though the number of methods to override in the new Collector is 3 instead of 1, the code got shorter (because the collect methods can now throw IOExceptions, great!!!).

What is also perfect is the way to use a FieldCache: just retrieve the FieldCache array (e.g. getInts()) in the setNextReader() method and use the value array in the collect() method with the docid as index. Now I am able to, e.g., retrieve cached values even after an index reopen without warming (same with sort). In the past you had to use a cache array for the whole index. The docBase is not used in my code, as I directly access the index readers. So users now have both possibilities: use the supplied reader, or use the docBase as an index offset into the searcher/main reader. Really cool! The overhead of score calculation can be left out if not needed, also cool!

One of my collectors is used to retrieve the database ids (integers) for building up an SQL IN (...) list from the field cache based on the collected hits. In the past this was very complicated, because FieldCache was slow after reopening and getting stored fields (the ids) is also very slow (inner search loop). Now it's just 10 lines of code and no score is involved.

The new code is now working in production at PANGAEA.

Uwe wrote: Another change to be done here is to remove Field.Store.COMPRESS and replace it with manually compressed binary stored fields, but this is only to get rid of the deprecation warnings. And this cannot be done without complete reindexing.

Uwe
Re: Lucene 2.9 status (to port to Lucene.Net)
George, did you mean LUCENE-1516 below? (LUCENE-1313 is a further improvement to near real-time search that's still being iterated on.)

In general I would say 2.9 seems to be in rather active development still ;)

I too would love to hear about production/beta use of 2.9. George, maybe you should re-ask on java-user?

Mike

On Sat, Apr 18, 2009 at 7:12 PM, George Aroush geo...@aroush.net wrote:

Thanks all for your input on this subject.

So, if I decide to grab the current code off the trunk:

1) Is it usable for production use?
2) Is LUCENE-1313 (Realtime search), in the current trunk, stable and ready for use?

Put another way, is anyone using the current trunk code in production, or even as beta?

-- George
RE: Lucene 2.9 status (to port to Lucene.Net)
Mike wrote: George, did you mean LUCENE-1516 below? (LUCENE-1313 is a further improvement to near real-time search that's still being iterated on.) In general I would say 2.9 seems to be in rather active development still ;) I too would love to hear about production/beta use of 2.9. George, maybe you should re-ask on java-user?

Here! I updated www.pangaea.de to Lucene trunk today (because of an incomplete hashcode in TrieRangeQuery)... It works perfectly, but I do not use the realtime parts. I did the same 10 days before, with no problems :-)

I am currently rewriting parts of my code to use Collector and move away from HitCollector (without scores, so with optimizations)! reopen() and sorting are fine: almost no time is consumed for sorted searches after reopening indexes every 20 minutes when there are just some new, small segments with changed documents. No extra warming is needed.

Another change to be done here is to remove Field.Store.COMPRESS and replace it with manually compressed binary stored fields, but this is only to get rid of the deprecation warnings. And this cannot be done without complete reindexing.

Uwe
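The "manually compressed binary stored fields" mentioned above could look roughly like the following sketch, which uses only the JDK's java.util.zip classes. `FieldCompression`, `compress` and `decompress` are hypothetical names for this example. The sketch assumes UTF-8 text and does not try to reproduce whatever byte format Field.Store.COMPRESS wrote, which is consistent with Uwe's remark that a complete reindex is required.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Hypothetical helper: compresses a field value before it is stored as
// binary data, and restores it after the stored bytes are read back.
class FieldCompression {
    static byte[] compress(String text) {
        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
        deflater.setInput(text.getBytes(StandardCharsets.UTF_8));
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[1024];
        while (!deflater.finished()) {
            int n = deflater.deflate(buffer);
            out.write(buffer, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    static String decompress(byte[] data) {
        Inflater inflater = new Inflater();
        inflater.setInput(data);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[1024];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buffer);
                // No progress and not finished means the input is truncated
                // or needs a preset dictionary; stop instead of looping.
                if (n == 0 && !inflater.finished()
                        && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalArgumentException("incomplete compressed data");
                }
                out.write(buffer, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("invalid compressed data", e);
        } finally {
            inflater.end();
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }
}
```

The byte array returned by `compress` is what the binary stored field would hold; the application calls `decompress` itself after loading the document.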
RE: Lucene 2.9 status (to port to Lucene.Net)
Thanks all for your input on this subject.

So, if I decide to grab the current code off the trunk:

1) Is it usable for production use?
2) Is LUCENE-1313 (Realtime search), in the current trunk, stable and ready for use?

Put another way, is anyone using the current trunk code in production, or even as beta?

-- George

From: Jason Rutherglen [mailto:jason.rutherg...@gmail.com]
Sent: Thursday, April 16, 2009 5:13 PM
To: java-dev@lucene.apache.org
Subject: Re: Lucene 2.9 status (to port to Lucene.Net)

LUCENE-1313 relies on LUCENE-1516 which is in trunk. If you have other questions George, feel free to ask.
Re: Lucene 2.9 status (to port to Lucene.Net)
Hi George,

There's been a sudden burst of activity lately on 2.9 development...

I know there are some biggish remaining features we may want to get into 2.9:

* The new field cache (LUCENE-831; still being iterated/mulled),
* Possible major rework of Field / Document index-time vs search-time Document
* Applying filters via random-access API when possible and performant (LUCENE-1536)
* Possible further optimizations to how collection works (LUCENE-1593)
* Maybe breaking core + contrib into a more uniform set of modules (and figuring out how Trie(Numeric)RangeQuery/Filter fits in here) -- the Modularization uber-thread.
* Further improvements to near-realtime search (using RAMDir for small recently flushed segments)
* Many other small things and probably some big ones that I'm forgetting now :)

So things are still in flux, and I'm really not sure on a release date at this point. Late last year, I was hoping for early this year, but it's no longer early this year ;)

Mike

On Wed, Apr 15, 2009 at 9:17 PM, George Aroush geo...@aroush.net wrote:

Hi Folks,

This is George Aroush, I'm one of the committers on Lucene.Net - a port of Java Lucene to C# Lucene.

I'm looking at the current trunk code of the yet-to-be-released Lucene 2.9 and I would like to port it to Lucene.Net. If I do this now, we get the benefit of keeping our code base and release dates much closer to Java Lucene. However, this comes with the cost of carrying over unfinished work and known defects, and I have to keep an eye on new code that gets committed into Java Lucene, which must be ported over in a timely fashion.

To help me determine when is a good time to start the port -- keep in mind, I will be taking the latest code off SVN -- I'd like to hear from the Java Lucene committers (and users who are playing with or using Lucene 2.9 off SVN) about these questions:

1) how stable the current code in the trunk is,
2) do you still have feature work to deliver or just bug fixes, and
3) what's your target date to release Java Lucene 2.9

#1 is important: is anyone using it in production?

Yes, I did look at the current open issues in JIRA, but that doesn't help me answer the above questions.

Regards,

-- George
RE: Lucene 2.9 status (to port to Lucene.Net)
Thanks Mike.

A quick follow-up question. What's the status of http://issues.apache.org/jira/browse/LUCENE-1313? Can this work be applied to Lucene 2.4.1 and still get its benefit, or are there other dependencies / issues with it that prevent us from doing so?

If anyone else knows, I welcome your input.

-- George

-Original Message-
From: Michael McCandless [mailto:luc...@mikemccandless.com]
Sent: Thursday, April 16, 2009 8:36 AM
To: java-dev@lucene.apache.org
Subject: Re: Lucene 2.9 status (to port to Lucene.Net)

Hi George,

There's been a sudden burst of activity lately on 2.9 development...

I know there are some biggish remaining features we may want to get into 2.9:

* The new field cache (LUCENE-831; still being iterated/mulled),
* Possible major rework of Field / Document index-time vs search-time Document
* Applying filters via random-access API when possible and performant (LUCENE-1536)
* Possible further optimizations to how collection works (LUCENE-1593)
* Maybe breaking core + contrib into a more uniform set of modules (and figuring out how Trie(Numeric)RangeQuery/Filter fits in here) -- the Modularization uber-thread.
* Further improvements to near-realtime search (using RAMDir for small recently flushed segments)
* Many other small things and probably some big ones that I'm forgetting now :)

So things are still in flux, and I'm really not sure on a release date at this point. Late last year, I was hoping for early this year, but it's no longer early this year ;)

Mike
Re: Lucene 2.9 status (to port to Lucene.Net)
I wouldn't be surprised if it didn't depend on a couple of other little issues - Jason or Mike would probably have to tell you that.

It does count a bit on LUCENE-1483 if you want to use it with FieldCaches or cached Filters though. It would still work with 1483, but would be much slower in those cases.

- Mark

George Aroush wrote:

Thanks Mike.

A quick follow-up question. What's the status of http://issues.apache.org/jira/browse/LUCENE-1313? Can this work be applied to Lucene 2.4.1 and still get its benefit, or are there other dependencies / issues with it that prevent us from doing so?

If anyone else knows, I welcome your input.

-- George

-- - Mark http://www.lucidimagination.com
Re: Lucene 2.9 status (to port to Lucene.Net)
Whoops - should read:

It should still work *without* 1483 but would be much slower in those cases (reloading the filter/fieldcache per reader rather than per segment).

Mark Miller wrote:

I wouldn't be surprised if it didn't depend on a couple of other little issues - Jason or Mike would probably have to tell you that.

It does count a bit on LUCENE-1483 if you want to use it with FieldCaches or cached Filters though. It would still work with 1483, but would be much slower in those cases.

- Mark

George Aroush wrote:

Thanks Mike.

A quick follow-up question. What's the status of http://issues.apache.org/jira/browse/LUCENE-1313? Can this work be applied to Lucene 2.4.1 and still get its benefit, or are there other dependencies / issues with it that prevent us from doing so?

If anyone else knows, I welcome your input.

-- George

-- - Mark http://www.lucidimagination.com
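The difference Mark points at (reloading per reader rather than per segment) can be made concrete with a toy count of how many segment loads each strategy needs across one reopen. This is only a sketch in plain Java: `PerSegmentCache`, `CacheReloadDemo` and the segment names are hypothetical, and the "load" is a placeholder rather than real field cache work.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustrative cache keyed by segment name; not a Lucene class.
class PerSegmentCache {
    private final Map<String, int[]> valuesBySegment = new HashMap<>();
    private int loads = 0; // counts how many segments had to be loaded

    // Returns the values for one segment, loading them only on first use.
    int[] get(String segment) {
        return valuesBySegment.computeIfAbsent(segment, s -> {
            loads++;
            return new int[0]; // placeholder for the real per-segment values
        });
    }

    int loads() {
        return loads;
    }
}

class CacheReloadDemo {
    // Segment loads across two searches when entries are cached per segment.
    static int perSegmentLoads(List<String> before, List<String> after) {
        PerSegmentCache cache = new PerSegmentCache();
        for (String segment : before) {
            cache.get(segment);
        }
        for (String segment : after) {
            cache.get(segment); // only segments not seen before are loaded
        }
        return cache.loads();
    }

    // Segment loads when each reopened reader rebuilds its cache from scratch.
    static int perReaderLoads(List<String> before, List<String> after) {
        return before.size() + after.size();
    }

    public static void main(String[] args) {
        List<String> before = Arrays.asList("_a", "_b");
        List<String> after = Arrays.asList("_a", "_b", "_c");
        System.out.println(perSegmentLoads(before, after)); // prints 3
        System.out.println(perReaderLoads(before, after));  // prints 5
    }
}
```

In this toy case the per-segment cache loads only the one new segment after the reopen, while the per-reader strategy reloads everything, which is the slowdown described above.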
RE: Lucene 2.9 status (to port to Lucene.Net)
These issues all depend so much on each other, i would suggest to simply try Lucene-2.9-dev trunk (e.g. from downloaded from Hudson). We have this running here without any problems. The problem with unreleased Lucene is more, that if you try new features, there may be non-compatible changes until the release, so you must keep track on changes on the components you try out. In general: If everything works for you, and you have backups of your indexes, you can simply try out. If it works correctly, just use it! Patching the relased version may make it more unstable than using the development tree, that is more tested by all our committers :) - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de -Original Message- From: George Aroush [mailto:geo...@aroush.net] Sent: Thursday, April 16, 2009 5:05 PM To: java-dev@lucene.apache.org Subject: RE: Lucene 2.9 status (to port to Lucene.Net) Thanks Mike. A quick follow up question. What's the status of http://issues.apache.org/jira/browse/LUCENE-1313? Can this work be applied to Lucene 2.4.1 and still get it's benefit or are there other dependency / issues with it that prevents us from doing so? If anyone else knows, I welcome your input. -- George -Original Message- From: Michael McCandless [mailto:luc...@mikemccandless.com] Sent: Thursday, April 16, 2009 8:36 AM To: java-dev@lucene.apache.org Subject: Re: Lucene 2.9 status (to port to Lucene.Net) Hi George, There's been a sudden burst of activity lately on 2.9 development... 
I know there are some biggish remaining features we may want to get into 2.9:

* The new field cache (LUCENE-831; still being iterated/mulled)
* Possible major rework of Field / Document index-time vs search-time Document
* Applying filters via random-access API when possible and performant (LUCENE-1536)
* Possible further optimizations to how collection works (LUCENE-1593)
* Maybe breaking core + contrib into a more uniform set of modules (and figuring out how Trie(Numeric)RangeQuery/Filter fits in here) -- the Modularization uber-thread
* Further improvements to near-realtime search (using RAMDir for small recently flushed segments)
* Many other small things and probably some big ones that I'm forgetting now :)

So things are still in flux, and I'm really not sure on a release date at this point. Late last year, I was hoping for early this year, but it's no longer early this year ;)

Mike

On Wed, Apr 15, 2009 at 9:17 PM, George Aroush geo...@aroush.net wrote:

Hi Folks,

This is George Aroush, I'm one of the committers on Lucene.Net - a port of Java Lucene to C# Lucene. I'm looking at the current trunk code of the yet-to-be-released Lucene 2.9 and I would like to port it to Lucene.Net. If I do this now, we get the benefit of keeping our code base and release dates much closer to Java Lucene. However, this comes with the cost of carrying over unfinished work and known defects, and I have to keep an eye on new code that gets committed into Java Lucene, which must be ported over in a timely fashion.

To help me determine when is a good time to start the port -- keep in mind, I will be taking the latest code off SVN -- I'd like to hear from the Java Lucene committers (and users who are playing with or using Lucene 2.9 off SVN) about these questions:

1) how stable is the current code in the trunk,
2) do you still have feature work to deliver or just bug fixes, and
3) what's your target date to release Java Lucene 2.9?

#1 is important: is anyone using it in production?
Yes, I did look at the current open issues in JIRA, but that doesn't help me answer the above questions.

Regards,
-- George
Re: Lucene 2.9 status (to port to Lucene.Net)
LUCENE-1313 relies on LUCENE-1516, which is in trunk. If you have other questions, George, feel free to ask.

On Thu, Apr 16, 2009 at 8:04 AM, George Aroush geo...@aroush.net wrote:

Thanks Mike. A quick follow up question. What's the status of http://issues.apache.org/jira/browse/LUCENE-1313? Can this work be applied to Lucene 2.4.1 and still get its benefit, or are there other dependencies / issues with it that prevent us from doing so? If anyone else knows, I welcome your input.
-- George

-Original Message-
From: Michael McCandless [mailto:luc...@mikemccandless.com]
Sent: Thursday, April 16, 2009 8:36 AM
To: java-dev@lucene.apache.org
Subject: Re: Lucene 2.9 status (to port to Lucene.Net)

Hi George,

There's been a sudden burst of activity lately on 2.9 development... I know there are some biggish remaining features we may want to get into 2.9:

* The new field cache (LUCENE-831; still being iterated/mulled)
* Possible major rework of Field / Document index-time vs search-time Document
* Applying filters via random-access API when possible and performant (LUCENE-1536)
* Possible further optimizations to how collection works (LUCENE-1593)
* Maybe breaking core + contrib into a more uniform set of modules (and figuring out how Trie(Numeric)RangeQuery/Filter fits in here) -- the Modularization uber-thread
* Further improvements to near-realtime search (using RAMDir for small recently flushed segments)
* Many other small things and probably some big ones that I'm forgetting now :)

So things are still in flux, and I'm really not sure on a release date at this point. Late last year, I was hoping for early this year, but it's no longer early this year ;)

Mike

On Wed, Apr 15, 2009 at 9:17 PM, George Aroush geo...@aroush.net wrote:

Hi Folks,

This is George Aroush, I'm one of the committers on Lucene.Net - a port of Java Lucene to C# Lucene. I'm looking at the current trunk code of the yet-to-be-released Lucene 2.9 and I would like to port it to Lucene.Net.
If I do this now, we get the benefit of keeping our code base and release dates much closer to Java Lucene. However, this comes with the cost of carrying over unfinished work and known defects, and I have to keep an eye on new code that gets committed into Java Lucene, which must be ported over in a timely fashion.

To help me determine when is a good time to start the port -- keep in mind, I will be taking the latest code off SVN -- I'd like to hear from the Java Lucene committers (and users who are playing with or using Lucene 2.9 off SVN) about these questions:

1) how stable is the current code in the trunk,
2) do you still have feature work to deliver or just bug fixes, and
3) what's your target date to release Java Lucene 2.9?

#1 is important: is anyone using it in production?

Yes, I did look at the current open issues in JIRA, but that doesn't help me answer the above questions.

Regards,
-- George
Lucene 2.9 status (to port to Lucene.Net)
Hi Folks,

This is George Aroush, I'm one of the committers on Lucene.Net - a port of Java Lucene to C# Lucene. I'm looking at the current trunk code of the yet-to-be-released Lucene 2.9 and I would like to port it to Lucene.Net. If I do this now, we get the benefit of keeping our code base and release dates much closer to Java Lucene. However, this comes with the cost of carrying over unfinished work and known defects, and I have to keep an eye on new code that gets committed into Java Lucene, which must be ported over in a timely fashion.

To help me determine when is a good time to start the port -- keep in mind, I will be taking the latest code off SVN -- I'd like to hear from the Java Lucene committers (and users who are playing with or using Lucene 2.9 off SVN) about these questions:

1) how stable is the current code in the trunk,
2) do you still have feature work to deliver or just bug fixes, and
3) what's your target date to release Java Lucene 2.9?

#1 is important: is anyone using it in production?

Yes, I did look at the current open issues in JIRA, but that doesn't help me answer the above questions.

Regards,
-- George