RE: Lucene 2.9 status (to port to Lucene.Net)

2009-04-28 Thread Uwe Schindler
Hi Mike,

 This is great feedback on the new Collector API, Uwe.  Thanks!

- Likewise.

 It's awesome that you no longer have to warm your searchers... but be
 careful when a large segment merge commits.

I know this, but in our case (e.g. creating an SQL IN list, collecting
measurement parameters from the documents) warming is not really needed.
It would only be a problem if large merges happened very often (the index is
updated every 20 minutes) and the whole field cache had to be reloaded each
time (which takes 3-5 seconds on our machine). So a large merge that costs
1-2 seconds for cache reloading is no problem (users see the same delay with
sorted results). If our index gets bigger, I will add warming to my
search/cache implementation after reopening; for that it would be nice to
have the list of reopened segments (I think there was an issue about it, or
is there already an implementation?).
In our case, most of the time is spent on the subsequent query in the SQL
data warehouse, so one additional second for building the SQL query is not
much.
 
 Did you hit any snags/problems/etc. that we should fix before releasing
 2.9?

So far I have not seen any further problems. What I ran into before has
already been fixed in Lucene, thanks to our active communication on all
these issues :-)

I am still waiting for the move of trie (and also the new automaton regex
query) to core, and for the modularization (hopefully before 2.9, so that we
do not create new APIs that change or get deprecated later).

Uwe

-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org



Re: Lucene 2.9 status (to port to Lucene.Net)

2009-04-28 Thread Michael McCandless
On Tue, Apr 28, 2009 at 8:10 AM, Uwe Schindler u...@thetaphi.de wrote:

 It's awesome that you no longer have to warm your searchers... but be
 careful when a large segment merge commits.

 I know this, but in our case (e.g. creating an SQL IN list, collecting
 measurement parameters from the documents) warming is not really needed.
 It would only be a problem if large merges happened very often (the index is
 updated every 20 minutes) and the whole field cache had to be reloaded each
 time (which takes 3-5 seconds on our machine). So a large merge that costs
 1-2 seconds for cache reloading is no problem (users see the same delay with
 sorted results). If our index gets bigger, I will add warming to my
 search/cache implementation after reopening; for that it would be nice to
 have the list of reopened segments (I think there was an issue about it, or
 is there already an implementation?).
 In our case, most of the time is spent on the subsequent query in the SQL
 data warehouse, so one additional second for building the SQL query is not
 much.

OK that's great.

 Did you hit any snags/problems/etc. that we should fix before releasing
 2.9?

 So far I have not seen any further problems. What I ran into before has
 already been fixed in Lucene, thanks to our active communication on all
 these issues :-)

Tell me about it... hard to keep them all straight!  Lots of great
improvements in 2.9...

 I am still waiting for the move of trie (and also the new automaton regex
 query) to core, and for the modularization (hopefully before 2.9, so that we
 do not create new APIs that change or get deprecated later).

+1

We need to do something about modularization / move trie to core before 2.9.

Mike




RE: Lucene 2.9 status (to port to Lucene.Net)

2009-04-26 Thread Uwe Schindler
Some status update:

  George, did you mean LUCENE-1516 below?  (LUCENE-1313 is a further
  improvement to near real-time search that's still being iterated on).

  In general I would say 2.9 seems to be in rather active development
  still ;)

  I too would love to hear about production/beta use of 2.9.  George
  maybe you should re-ask on java-user?

 Here! I updated www.pangaea.de to Lucene-trunk today (because of the
 incomplete hashcode in TrieRangeQuery)... Works perfectly, but I do not use
 the realtime parts. I did the same 10 days ago, with no problems :-)

 I am currently rewriting parts of my code to use Collector instead of
 HitCollector (without scoring, so it can be optimized)! reopen() and sorting
 work fine: almost no time is consumed for sorted searches after reopening
 the index every 20 minutes, when there are just a few new, small segments
 with changed documents. No extra warming is needed.

I have now rewritten my collectors to use the new API. Even though the number
of methods to override in the new Collector is 3 instead of 1, the code got
shorter (because the collect methods can now throw IOException, great!!!).
What is also perfect is the way you use a FieldCache: just retrieve the
FieldCache array (e.g. via getInts()) in the setNextReader() method and use
the value array in the collect() method with the docid as index. Now I am
able to retrieve cached values even after an index reopen without warming
(same with sorting). In the past you had to use a cache array for the whole
index. The docBase is not used in my code, as I access the supplied index
readers directly. So users now have both possibilities: use the supplied
reader, or use the docBase as an index offset into the searcher/main reader.
Really cool!

The overhead of score calculation can be left out if it is not needed, also
cool!

One of my collectors is used to retrieve the database ids (integers) of the
collected hits from the field cache, in order to build up a SQL IN (...)
clause. In the past this was very complicated, because FieldCache was slow
after reopening and getting stored fields (the ids) is also very slow (inner
search loop). Now it's just 10 lines of code and no score is involved.
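
A minimal sketch of the pattern described above, written against a small stand-in interface so that it runs without Lucene on the classpath. `SegmentCollector`, `setNextSegment()`, `IdCollector` and the column name `dataset_id` are all hypothetical names chosen for illustration; in the real API the per-segment array would be fetched from the FieldCache inside setNextReader(), and this is not the actual PANGAEA code.

```java
import java.util.ArrayList;
import java.util.List;

class IdCollectorSketch {

    /** Hypothetical stand-in for a per-segment collector callback; not a Lucene class. */
    interface SegmentCollector {
        /** Called once per segment, before any hit of that segment is reported. */
        void setNextSegment(int[] segmentIds, int docBase);

        /** Called for every hit with a segment-relative document number. */
        void collect(int doc);
    }

    /** Gathers the database ids of all hits; no score is ever computed. */
    static final class IdCollector implements SegmentCollector {
        private final List<Integer> ids = new ArrayList<>();
        private int[] current = new int[0];

        @Override
        public void setNextSegment(int[] segmentIds, int docBase) {
            // docBase is ignored here because all lookups are segment-relative.
            current = segmentIds;
        }

        @Override
        public void collect(int doc) {
            // The segment-relative docid is used directly as the array index.
            ids.add(current[doc]);
        }

        /** Renders the ids as "column IN (a, b, c)"; callers should handle the empty case. */
        String toSqlInClause(String column) {
            StringBuilder sb = new StringBuilder(column).append(" IN (");
            for (int i = 0; i < ids.size(); i++) {
                if (i > 0) {
                    sb.append(", ");
                }
                sb.append(ids.get(i));
            }
            return sb.append(')').toString();
        }
    }

    /** Simulates a search that reports hits from two segments. */
    static String demo() {
        IdCollector collector = new IdCollector();
        collector.setNextSegment(new int[] {101, 102, 103}, 0); // old, large segment
        collector.collect(0);
        collector.collect(2);
        collector.setNextSegment(new int[] {204, 205}, 3);      // small segment added by a reopen
        collector.collect(1);
        return collector.toSqlInClause("dataset_id");
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints: dataset_id IN (101, 103, 205)
    }
}
```

Because each value array belongs to one segment, a reopen that only adds small segments leaves the arrays of the unchanged segments valid, which is why no extra warming is needed in this setup.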

The new code is now running in production at PANGAEA.

 Another change still to be done here is replacing Field.Store.COMPRESS with
 manually compressed binary stored fields, but this is only to get rid of
 the deprecation warnings. It cannot be done without a complete reindex,
 though.
 
 Uwe
 
 



Re: Lucene 2.9 status (to port to Lucene.Net)

2009-04-26 Thread Michael McCandless
This is great feedback on the new Collector API, Uwe.  Thanks!

It's awesome that you no longer have to warm your searchers... but be
careful when a large segment merge commits.

Did you hit any snags/problems/etc. that we should fix before releasing 2.9?

Mike




Re: Lucene 2.9 status (to port to Lucene.Net)

2009-04-24 Thread Michael McCandless
George, did you mean LUCENE-1516 below?  (LUCENE-1313 is a further
improvement to near real-time search that's still being iterated on).

In general I would say 2.9 seems to be in rather active development still ;)

I too would love to hear about production/beta use of 2.9.  George
maybe you should re-ask on java-user?

Mike

On Sat, Apr 18, 2009 at 7:12 PM, George Aroush geo...@aroush.net wrote:
 Thanks all for your input on this subject.

 So, if I decide to grab the current code off the trunk, is it:

 1) Usable for production use?
 2) Is LUCENE-1313 (Realtime search), in the current trunk, stable and ready
 for use?

 Put another way, is anyone using the current trunk code in production, or
 even as beta?

 -- George

 



RE: Lucene 2.9 status (to port to Lucene.Net)

2009-04-24 Thread Uwe Schindler
 George, did you mean LUCENE-1516 below?  (LUCENE-1313 is a further
 improvement to near real-time search that's still being iterated on).
 
 In general I would say 2.9 seems to be in rather active development
 still ;)
 
 I too would love to hear about production/beta use of 2.9.  George
 maybe you should re-ask on java-user?

Here! I updated www.pangaea.de to Lucene-trunk today (because of the
incomplete hashcode in TrieRangeQuery)... Works perfectly, but I do not use
the realtime parts. I did the same 10 days ago, with no problems :-)

I am currently rewriting parts of my code to use Collector instead of
HitCollector (without scoring, so it can be optimized)! reopen() and sorting
work fine: almost no time is consumed for sorted searches after reopening
the index every 20 minutes, when there are just a few new, small segments
with changed documents. No extra warming is needed.

Another change still to be done here is replacing Field.Store.COMPRESS with
manually compressed binary stored fields, but this is only to get rid of the
deprecation warnings. It cannot be done without a complete reindex, though.
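
As a rough illustration of what "manually compressed binary stored fields" could look like, here is a sketch that uses only java.util.zip from the JDK. The class and method names are hypothetical, and how the resulting byte[] is stored in and read back from the index is deliberately left out; this is one possible approach under those assumptions, not the code used at PANGAEA.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

class StoredFieldCompression {

    /** Compresses a field value into bytes that can be kept in a binary stored field. */
    static byte[] compress(String text) {
        byte[] input = text.getBytes(StandardCharsets.UTF_8);
        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
        try {
            deflater.setInput(input);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[1024];
            while (!deflater.finished()) {
                int n = deflater.deflate(buffer);
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        } finally {
            deflater.end(); // release native zlib resources
        }
    }

    /** Restores the original field value from the stored bytes. */
    static String decompress(byte[] stored) {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(stored);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[1024];
            while (!inflater.finished()) {
                int n = inflater.inflate(buffer);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    // Guard against looping forever on truncated or foreign data.
                    throw new IllegalArgumentException("incomplete compressed data");
                }
                out.write(buffer, 0, n);
            }
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("not valid compressed data", e);
        } finally {
            inflater.end();
        }
    }

    public static void main(String[] args) {
        String original = "measurement parameters, repeated: " + "temperature ".repeat(50);
        byte[] stored = compress(original);
        System.out.println(original.length() + " chars -> " + stored.length + " bytes");
        System.out.println(decompress(stored).equals(original));
    }
}
```

Since the compressed bytes and the old compressed stored fields are different on-disk representations, existing documents have to be written again, which matches the remark that a complete reindex is unavoidable.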

Uwe





RE: Lucene 2.9 status (to port to Lucene.Net)

2009-04-18 Thread George Aroush
Thanks all for your input on this subject.
 
So, if I decide to grab the current code off the trunk, is it:
 
1) Usable for production use?
2) Is LUCENE-1313 (Realtime search), in the current trunk, stable and ready
for use?
 
Put another way, is anyone using the current trunk code in production, or
even as beta?
 
-- George



  _  

From: Jason Rutherglen [mailto:jason.rutherg...@gmail.com] 
Sent: Thursday, April 16, 2009 5:13 PM
To: java-dev@lucene.apache.org
Subject: Re: Lucene 2.9 status (to port to Lucene.Net)


LUCENE-1313 relies on LUCENE-1516 which is in trunk.  If you have other
questions George, feel free to ask.








Re: Lucene 2.9 status (to port to Lucene.Net)

2009-04-16 Thread Michael McCandless
Hi George,

There's been a sudden burst of activity lately on 2.9 development...

I know there are some biggish remaining features we may want to get
into 2.9:

  * The new field cache (LUCENE-831; still being iterated/mulled),

  * Possible major rework of Field / Document & index-time vs
    search-time Document

  * Applying filters via random-access API when possible & performant
    (LUCENE-1536)

  * Possible further optimizations to how collection works
    (LUCENE-1593)

  * Maybe breaking core + contrib into a more uniform set of modules
    (and figuring out how Trie(Numeric)RangeQuery/Filter fits in here)
    -- the Modularization uber-thread.

  * Further improvements to near-realtime search (using RAMDir for
    small recently flushed segments)

  * Many other small things and probably some big ones that I'm
    forgetting now :)

So things are still in flux, and I'm really not sure on a release date
at this point.  Late last year, I was hoping for early this year, but
it's no longer early this year ;)

Mike

On Wed, Apr 15, 2009 at 9:17 PM, George Aroush geo...@aroush.net wrote:
 Hi Folks,

 This is George Aroush, I'm one of the committers on Lucene.Net - a port of
 Java Lucene to C# Lucene.

 I'm looking at the current trunk code of the yet-to-be-released Lucene 2.9
 and I would like to port it to Lucene.Net.  If I do this now, we get the
 benefit of keeping our code base and release dates much closer to Java
 Lucene.  However, this comes with the cost of carrying over unfinished work
 and known defects, and I have to keep an eye on new code that gets committed
 into Java Lucene, which must be ported over in a timely fashion.

 To help me determine when is a good time to start the port -- keep in mind,
 I will be taking the latest code off SVN -- I would like to hear from the
 Java Lucene committers (and users who are playing with or using Lucene 2.9
 off SVN) about these questions:

 1) how stable the current code in the trunk is,
 2) do you still have feature work to deliver or just bug fixes, and
 3) what's your target date to release Java Lucene 2.9

 #1 is the important one: is anyone using it in production?

 Yes, I did look at the current open issues in JIRA, but that doesn't help me
 answer the above questions.

 Regards,

 -- George





RE: Lucene 2.9 status (to port to Lucene.Net)

2009-04-16 Thread George Aroush
Thanks Mike.

A quick follow-up question.  What's the status of
http://issues.apache.org/jira/browse/LUCENE-1313?  Can this work be applied
to Lucene 2.4.1 and still get its benefit, or are there other dependencies /
issues with it that prevent us from doing so?

If anyone else knows, I welcome your input.

-- George




Re: Lucene 2.9 status (to port to Lucene.Net)

2009-04-16 Thread Mark Miller
I wouldn't be surprised if it depended on a couple of other little
issues - Jason or Mike would probably have to tell you that.


It does count a bit on LUCENE-1483 if you want to use it with 
FieldCaches or cached Filters though. It would still work with 1483, but 
would be much slower in those cases.


- Mark



--
- Mark

http://www.lucidimagination.com




-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org



Re: Lucene 2.9 status (to port to Lucene.Net)

2009-04-16 Thread Mark Miller
Whoops - should read: It should still work *without* 1483 but would be 
much slower in those cases (reloading the filter/fieldcache per reader 
rather than per segment).


Mark Miller wrote:
I wouldn't be surprised if it didn't depend on a couple of other little 
issues - Jason or Mike would probably have to tell you that.


It does count a bit on LUCENE-1483 if you want to use it with 
FieldCaches or cached Filters though. It would still work with 1483, 
but would be much slower in those cases.


- Mark

George Aroush wrote:

Thanks Mike.

A quick follow up question.  What's the status of
http://issues.apache.org/jira/browse/LUCENE-1313?  Can this work be applied
to Lucene 2.4.1 and still get its benefit, or are there other dependencies or
issues with it that prevent us from doing so?

If anyone else knows, I welcome your input.

-- George






--
- Mark

http://www.lucidimagination.com







RE: Lucene 2.9 status (to port to Lucene.Net)

2009-04-16 Thread Uwe Schindler
These issues all depend so much on each other that I would suggest simply trying
Lucene-2.9-dev trunk (e.g. downloaded from Hudson). We have this
running here without any problems. The bigger problem with an unreleased Lucene is
that if you try new features, there may be incompatible changes
before the release, so you must keep track of changes to the components you
try out.
In general: if everything works for you, and you have backups of your
indexes, you can simply try it out. If it works correctly, just use it!
Patching the released version may make it more unstable than using the
development tree, which is tested more by all our committers :)

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de

 -Original Message-
 From: George Aroush [mailto:geo...@aroush.net]
 Sent: Thursday, April 16, 2009 5:05 PM
 To: java-dev@lucene.apache.org
 Subject: RE: Lucene 2.9 status (to port to Lucene.Net)
 
 Thanks Mike.
 
 A quick follow up question.  What's the status of
 http://issues.apache.org/jira/browse/LUCENE-1313?  Can this work be applied
 to Lucene 2.4.1 and still get its benefit, or are there other dependencies or
 issues with it that prevent us from doing so?
 
 If anyone else knows, I welcome your input.
 
 -- George
 
  -Original Message-
  From: Michael McCandless [mailto:luc...@mikemccandless.com]
  Sent: Thursday, April 16, 2009 8:36 AM
  To: java-dev@lucene.apache.org
  Subject: Re: Lucene 2.9 status (to port to Lucene.Net)
 
  Hi George,
 
  There's been a sudden burst of activity lately on 2.9 development...
 
  I know there are some biggish remaining features we may want
  to get into 2.9:
 
* The new field cache (LUCENE-831; still being iterated/mulled),
 
* Possible major rework of Field / Document & index-time vs
  search-time Document
 
* Applying filters via random-access API when possible & performant
  (LUCENE-1536)
 
* Possible further optimizations to how collection works
 (LUCENE-1593)
 
* Maybe breaking core + contrib into a more uniform set of modules
  (and figuring out how Trie(Numeric)RangeQuery/Filter fits in here)
  -- the Modularization uber-thread.
 
* Further improvements to near-realtime search (using RAMDir for
  small recently flushed segments)
 
* Many other small things and probably some big ones that I'm
  forgetting now :)
 
  So things are still in flux, and I'm really not sure on a
  release date at this point.  Late last year, I was hoping for
  early this year, but it's no longer early this year ;)
 
  Mike
 






Re: Lucene 2.9 status (to port to Lucene.Net)

2009-04-16 Thread Jason Rutherglen
LUCENE-1313 relies on LUCENE-1516, which is in trunk.  If you have other
questions, George, feel free to ask.

On Thu, Apr 16, 2009 at 8:04 AM, George Aroush geo...@aroush.net wrote:

 Thanks Mike.

 A quick follow up question.  What's the status of
 http://issues.apache.org/jira/browse/LUCENE-1313?  Can this work be applied
 to Lucene 2.4.1 and still get its benefit, or are there other dependencies or
 issues with it that prevent us from doing so?

 If anyone else knows, I welcome your input.

 -- George





Lucene 2.9 status (to port to Lucene.Net)

2009-04-15 Thread George Aroush
Hi Folks,

This is George Aroush, I'm one of the committers on Lucene.Net - a port of
Java Lucene to C# Lucene.

I'm looking at the current trunk code of the yet-to-be-released Lucene 2.9 and I
would like to port it to Lucene.Net.  If I do this now, we get the benefit
of keeping our code base and release dates much closer to Java Lucene.
However, this comes with the cost of carrying over unfinished work and known
defects, and I have to keep an eye on new code that gets committed into Java
Lucene, which must be ported over in a timely fashion.

To help me determine when is a good time to start the port -- keep in mind,
I will be taking the latest code off SVN -- I'd like to hear from the Java
Lucene committers (and users who are playing with or using Lucene 2.9 off SVN)
about these questions:

1) how stable the current code in the trunk is,
2) do you still have feature work to deliver or just bug fixes, and
3) what's your target date to release Java Lucene 2.9

#1 is important: is anyone using it in production?

Yes, I did look at the current open issues in JIRA, but that doesn't help me
answer the above questions.

Regards,

-- George

