RE: Lucene 2.9 and deprecated IR.open() methods

2009-10-05 Thread Uwe Schindler
 On Sun, Oct 04, 2009 at 05:53:14AM -0400, Michael McCandless wrote:
 
1 Do we prevent config settings from changing after creating an
  IW/IR?
 
 Any settings conveyed via a settings object ought to be final if you want
 pluggable index components.  Otherwise, you need some nightmarish notification
 system to propagate settings down into your subcomponents, which may or may
 not be prepared to handle the value modifications.

+1, this is an argument in my opinion for final members/settings.

By the way, there is a third possibility for passing configuration settings:
pass settings to IR/IW and their flexible indexing components using the same
technique that JAXP uses (please don't hit me!). Pass a Properties or
Map<String,?> to the ctor/open. The keys are predefined constants. Maybe our
previous idea of an IndexConfiguration class could be a subclass of
HashMap<String,?> with all the constants, some easy-to-use setter methods for
frequently used settings (like index dir), and some reasonable defaults.

This allows us to pass these properties to any flex indexing component
without having to modify or extend it to support the additional properties.
The flexible indexing component just defines its own property names (e.g. as
URNs, URLs, using its class name as prefix,...). Property names are always
String, values any type (therefore Map<String,?>). With Java 5, integer
props and the like are no longer a syntax problem because of autoboxing (no
need to pass new Integer() or Integer.valueOf()).

Another good thing is that implementors of e.g. XML config files, as in
Solr, can simply pass all elements of the config to this map.
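
A minimal Java sketch of the map-based idea above, not Lucene API: the
IndexConfiguration class, the FooCodec component, the key names, and the
defaults are all made up for illustration. Java does not allow a wildcard as
a supertype argument, so the sketch extends HashMap<String,Object> rather
than HashMap<String,?>.

```java
import java.util.HashMap;
import java.util.Map;

class MapConfigSketch {

    /** Hypothetical config class: a HashMap with key constants, defaults, and a convenience setter. */
    static class IndexConfiguration extends HashMap<String, Object> {
        // Predefined key constants (names are made up for this sketch).
        static final String INDEX_DIR = "index.dir";
        static final String RAM_BUFFER_MB = "index.ramBufferMB";

        IndexConfiguration() {
            put(RAM_BUFFER_MB, 16.0); // placeholder default, autoboxed to Double
        }

        // Easy-to-use setter for a frequently used setting.
        IndexConfiguration setIndexDir(String path) {
            put(INDEX_DIR, path);
            return this;
        }
    }

    /** Hypothetical flex indexing component that defines its own property name, prefixed by its class name. */
    static class FooCodec {
        static final String BLOCK_SIZE = FooCodec.class.getName() + ".blockSize";
        final int blockSize;

        FooCodec(Map<String, ?> props) {
            Object v = props.get(BLOCK_SIZE);
            // The component supplies its own default and type check.
            this.blockSize = (v instanceof Integer) ? (Integer) v : 128;
        }
    }

    public static void main(String[] args) {
        IndexConfiguration cfg = new IndexConfiguration().setIndexDir("/tmp/index");
        cfg.put(FooCodec.BLOCK_SIZE, 256); // autoboxing: no new Integer(256) needed
        FooCodec codec = new FooCodec(cfg);
        System.out.println(codec.blockSize);
    }
}
```

Note that the component has to handle defaults and type checks itself when
it reads the map.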

Uwe


-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org



RE: Lucene 2.9 and deprecated IR.open() methods

2009-10-05 Thread Uwe Schindler

Another option for extensibility, with the type safety that properties would
not have, would be Attributes: just pass an AttributeSource as the
configuration. The default index properties would be one attribute, and
custom extensions can define their own.

Uwe





Re: Lucene 2.9 and deprecated IR.open() methods

2009-10-05 Thread Marvin Humphrey
On Mon, Oct 05, 2009 at 08:27:20AM +0200, Uwe Schindler wrote:

 Pass a Properties or Map<String,?> to the ctor/open. The keys are predefined
 constants. Maybe our previous idea of an IndexConfiguration class is a
 subclass of HashMap<String,?> with all the constants and some easy-to-use
 setter methods for very often-used settings (like index dir) and some
 reasonable defaults.

Interesting.  The design we worked out for Lucy's Segment class (prototype in
KS devel branch) uses hash/array/string data to store arbitrary metadata on
behalf of segment components, written as JSON to seg_NNN/segmeta.json.  In
that case, though, each component is responsible for generating and consuming
its own data.  That's different from having the user supply data via such a
format.

I still think you're going to want an extensible builder class.

 This allows us to pass these properties to any flex indexing component
 without need to modify/extend it to support the additional properties. The
 flexible indexing component just defines its own property names (e.g. as
 URNs, URLs, using its class name as prefix,...). 

But how do you determine what the flex indexing components *are*?  In theory,
you can pass class names and sufficient arguments to build them up via your
big ball of data, but then you're essentially creating a new language, with
all the headaches that entails. 

In KS, Indexer/IndexReader configuration is divided between three classes.

  * Schema: field definitions.
  * Architecture: Settings that never change for the life of the index.
  * IndexManager: Settings that can change per index/search session.

Schema isn't worth discussing -- Lucy will have it, Lucene won't, end of
story.  Architecture and IndexManager, though, are fairly close to what's
being discussed.

Architecture is responsible for e.g. determining which pluggable components
get registered.  It's the builder class.

IndexManager is where things like merging and locking policies reside.

 Property names are always String, values any type (therefore Map<String,?>).
 With Java 5, integer props and so on are no bad syntax problem because of
 autoboxing (no need to pass new Integer() or Integer.valueOf()).

Argument validation gets to be a headache when you pass around complex data
structures.  It's doable, but messy and hard to grok.  Going through dedicated
methods is cleaner and safer.
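
To make the validation point concrete, here is a small Java sketch under
stated assumptions: the "mergeFactor" key, the MergeSettings class, the
default of 10, and the lower bound of 2 are placeholders for illustration,
not taken from KS or Lucene. It contrasts reading a value out of a generic
map with going through a dedicated method.

```java
import java.util.HashMap;
import java.util.Map;

class ValidationSketch {
    static final String MERGE_FACTOR = "mergeFactor"; // hypothetical key

    // Map-based: every consumer must check presence, type, and range at run time.
    static int mergeFactorFromMap(Map<String, ?> props) {
        Object v = props.get(MERGE_FACTOR);
        if (v == null) {
            return 10; // default repeated wherever the key is read
        }
        if (!(v instanceof Integer)) {
            throw new IllegalArgumentException(
                MERGE_FACTOR + " must be an Integer, got " + v.getClass().getName());
        }
        int n = (Integer) v;
        if (n < 2) {
            throw new IllegalArgumentException(MERGE_FACTOR + " must be >= 2");
        }
        return n;
    }

    // Dedicated method: the compiler checks the type; only the range check remains.
    static class MergeSettings {
        private int mergeFactor = 10;

        void setMergeFactor(int n) {
            if (n < 2) {
                throw new IllegalArgumentException("mergeFactor must be >= 2");
            }
            mergeFactor = n;
        }

        int getMergeFactor() { return mergeFactor; }
    }

    public static void main(String[] args) {
        Map<String, Object> props = new HashMap<String, Object>();
        props.put(MERGE_FACTOR, "20"); // compiles, fails only when the value is read
        try {
            mergeFactorFromMap(props);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected at run time: " + e.getMessage());
        }

        MergeSettings settings = new MergeSettings();
        settings.setMergeFactor(20); // setMergeFactor("20") would not compile
        System.out.println(settings.getMergeFactor());
    }
}
```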

 Another good thing is, that implementors of e.g. XML config files like in
 Solr, can simple pass all elements in config to this map.

I go back and forth on this.  At some point, the volume of data becomes
overwhelming and it becomes easier to swap in the name of a class where most
of the data can reside in nice, reliable, structured code.

Marvin Humphrey





RE: Lucene 2.9 and deprecated IR.open() methods

2009-10-05 Thread Uwe Schindler
Hi Marvin,

  Property names are always String, values any type (therefore
  Map<String,?>). With Java 5, integer props and so on are no bad syntax
  problem because of autoboxing (no need to pass new Integer() or
  Integer.valueOf()).
 
 Argument validation gets to be a headache when you pass around complex data
 structures.  It's doable, but messy and hard to grok.  Going through
 dedicated methods is cleaner and safer.
 
  Another good thing is, that implementors of e.g. XML config files like in
  Solr, can simple pass all elements in config to this map.
 
 I go back and forth on this.  At some point, the volume of data becomes
 overwhelming and it becomes easier to swap in the name of a class where most
 of the data can reside in nice, reliable, structured code.

See my second mail. The recently introduced Attributes and AttributeSource
would solve this. Each component just defines its attribute interface and
impl class and you pass in an AttributeSource as configuration. Then you can
do:

AttributeSource cfg = new AttributeSource();

ComponentAttribute compCfg = cfg.addAttribute(ComponentAttribute.class);
compCfg.setMergeScheduler(FooScheduler.class);

MergeBarAttribute mergeCfg = cfg.addAttribute(MergeBarAttribute.class);
mergeCfg.setWhateverProp(1234);
...
IndexWriter iw = new IndexWriter(dir, cfg);

(This is just brainstorming, not yet thoroughly thought through.)

Uwe






Re: Lucene 2.9 and deprecated IR.open() methods

2009-10-05 Thread Earwin Burrfoot
On Mon, Oct 5, 2009 at 12:01, Uwe Schindler u...@thetaphi.de wrote:
 See my second mail. The recently introduced Attributes and AttributeSource
 would solve this. Each component just defines its attribute interface and
 impl class and you pass in an AttributeSource as configuration. Then you can
 do:

 AttributeSource cfg = new AttributeSource();

 ComponentAttribute compCfg = cfg.addAttribute(ComponentAttribute.class);
 compCfg.setMergeScheduler(FooScheduler.class);

 MergeBarAttribute mergeCfg = cfg.addAttribute(MergeBarAttribute.class);
 mergeCfg.setWhateverProp(1234);
 ...
 IndexWriter iw = new IndexWriter(dir, cfg);

 (this is just brainstorming not yet thoroughly thought about).

This approach suggests that IW creates its components and, while doing so,
provides them with your AS instance.
I personally prefer creating all these components myself, configuring
them (at the moment of creation) and passing them to IW in one way or
another.
This requires way less code: you don't have to invent elaborate
schemes for passing through your custom per-component settings and
selecting which exact component types IW should use, and you don't risk
construct/postConstruct/postpostConstruct-style things.
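
A rough Java sketch of what I mean, with made-up stand-ins rather than real
Lucene classes (MergeScheduler, FooScheduler, Writer, and the maxThreads
setting are all hypothetical here): the component is fully configured when
it is created, and the writer only receives the finished instance.

```java
class InjectionSketch {

    /** Hypothetical stand-in interface, not Lucene's MergeScheduler. */
    interface MergeScheduler {
        String describe();
    }

    /** Component configured once, at the moment of creation. */
    static final class FooScheduler implements MergeScheduler {
        private final int maxThreads;

        FooScheduler(int maxThreads) {
            this.maxThreads = maxThreads;
        }

        public String describe() {
            return "FooScheduler(maxThreads=" + maxThreads + ")";
        }
    }

    /** Hypothetical writer: it receives a ready-made component and never sees per-component settings. */
    static final class Writer {
        private final MergeScheduler scheduler;

        Writer(MergeScheduler scheduler) {
            this.scheduler = scheduler;
        }

        String describe() {
            return "Writer using " + scheduler.describe();
        }
    }

    public static void main(String[] args) {
        Writer writer = new Writer(new FooScheduler(4));
        System.out.println(writer.describe());
    }
}
```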

-- 
Kirill Zakharenko/Кирилл Захаренко (ear...@gmail.com)
Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
ICQ: 104465785




RE: Lucene 2.9 and deprecated IR.open() methods

2009-10-05 Thread Uwe Schindler
 This approach suggests IW creates its components, and while doing so
 provides them your AS instance.
 I personally prefer creating all these components myself, configuring
 them (at the moment of creation) and passing them to IW in one way or
 another.
 This requires way less code, you don't have to invent elaborate
 schemes of passing through your custom per-component settings and
 selecting which exact component types IW should use, you don't risk
 construct/postConstruct/postpostConstruct-style things.


Not really; that was just brainstorming. You can also pass instances
instead of class names through the AttributeSource. AttributeSource only
provides type safety for the various configuration settings (which are
interfaces). You could also create an attribute that holds a reference to
the component, so compCfg.setMergeScheduler(FooScheduler.class); could also
be compCfg.addComponent(new FooScheduler(...));

The AttributeSource approach has one other advantage:
if you want to use the default settings for an attribute, you do not have
to add it to the AS (or you can simply forget it). With the properties
approach, you have to hardcode the parameter defaults and validation
everywhere. Because the consumer of an AttributeSource also gets the
attribute via an addAttribute call (see the current indexing code consuming
TokenStreams), this call would add the missing attribute with the default
settings defined by the implementation class. So in the above example, if
you do not want to provide the whateverProp, leave the whole
MergeBarAttribute out. The consumer (IW) would just call
addAttribute(MergeBarAttribute.class), because it needs the attribute to
configure itself, and AS would add this attribute with default settings.
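
A simplified Java sketch of that get-or-create behaviour follows. It is a
stand-in, not Lucene's AttributeSource: the ConfigSource and
MergeBarSettings classes and the default value 42 are invented for
illustration, and the sketch uses a concrete class with a no-arg constructor
where the real mechanism works with attribute interfaces and separate
implementation classes.

```java
import java.util.HashMap;
import java.util.Map;

class DefaultsSketch {

    /** Simplified stand-in for get-or-create lookup; NOT Lucene's AttributeSource. */
    static class ConfigSource {
        private final Map<Class<?>, Object> attributes = new HashMap<Class<?>, Object>();

        /** Returns the instance registered for the type, creating it with defaults if absent. */
        <T> T addAttribute(Class<T> type) {
            Object attribute = attributes.get(type);
            if (attribute == null) {
                try {
                    attribute = type.getDeclaredConstructor().newInstance();
                } catch (ReflectiveOperationException e) {
                    throw new IllegalArgumentException("cannot create " + type.getName(), e);
                }
                attributes.put(type, attribute);
            }
            return type.cast(attribute);
        }
    }

    /** Hypothetical settings holder; its default is defined in exactly one place. */
    static class MergeBarSettings {
        private int whateverProp = 42; // made-up default

        MergeBarSettings() {}

        void setWhateverProp(int value) { whateverProp = value; }
        int getWhateverProp() { return whateverProp; }
    }

    /** Consumer side: asks for the settings without caring whether the user supplied them. */
    static int readWhateverProp(ConfigSource cfg) {
        return cfg.addAttribute(MergeBarSettings.class).getWhateverProp();
    }

    public static void main(String[] args) {
        ConfigSource untouched = new ConfigSource();
        System.out.println(readWhateverProp(untouched)); // default applies

        ConfigSource customized = new ConfigSource();
        customized.addAttribute(MergeBarSettings.class).setWhateverProp(1234);
        System.out.println(readWhateverProp(customized)); // user value applies
    }
}
```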

Uwe





Re: Lucene 2.9 and deprecated IR.open() methods

2009-10-05 Thread Michael McCandless
I think AS is overkill for conveying configuration of IW/IR?

Suddenly, instead of:

  cfg.setRAMBufferSizeMB(128.0)

I'd have to do something like this?

  
  cfg.addAttribute(IndexWriterRAMBufferSizeAttribute.class).setRAMBufferSize(128.0)

It's too cumbersome, I think, for something that ought to be simple.
I'd prefer a dedicated config class with strongly typed setters
exposed.  Of all the pure syntax options so far I'd still prefer the
traditional config object with setters.

Also, I don't think we should roll this out for all Lucene classes.  I
think most classes do just fine accepting args to their ctor.  EG
TermQuery simply takes Term to its ctor.

I do agree IW should not be in the business of brokering changes to
the settings of its sub-components (eg mergeFactor, maxMergeDocs).
You really should make such changes directly via your merge policy.

Finally, I'm not convinced we should lock down all settings after
classes are created.  (I'm not convinced we shouldn't, either).

A merge policy has no trouble changing its mergeFactor,
maxMergeDocs/Size.  IW has no trouble changing its RAM buffer
size, maxFieldLength, or useCompoundFile.  Sure there are some things
that cannot (or would be very tricky to) change, eg deletion policy.
But then analyzer isn't changeable today, but could be.

But, then, I can also see it'd simplify our code to not have to deal
w/ such changes, reduce chance of subtle bugs, and it seems minor to
go and re-open your IndexWriter if you need to make a settings change?
(Hmm except in an NRT setting, because the reader pool would be reset;
really we need to get the reader pool separated from the IW instance).

Mike




RE: Lucene 2.9 and deprecated IR.open() methods

2009-10-05 Thread Uwe Schindler
Hi Mike,

 I think AS is overkill for conveying configuration of IW/IR?
 
 Suddenly, instead of:
 
   cfg.setRAMBufferSizeMB(128.0)
 
 I'd have to do something like this?
 
 
   cfg.addAttribute(IndexWriterRAMBufferSizeAttribute.class).setRAMBufferSize(128.0)
 
 It's too cumbersome, I think, for something that ought to be simple.
 I'd prefer a dedicated config class with strongly typed setters
 exposed.  Of all the pure syntax options so far I'd still prefer the
 traditional config object with setters.

From this point of view, it's also overkill for TokenStream. But as AS was
also designed for flexible indexing, it would fit very well into this area.

The new query parser is a good example in favour of attributes. One argument
against attributes is the fact that even Michael Busch didn't promote them
from the beginning of this discussion :-) (maybe he also needs one more night
to think about it).

Good points for AS are e.g. type safety, ease of extension, and built-in
defaults (you do not need to check for the existence of attributes, just add
them at the point where you want to use them, like in your example, maybe
with nicer and shorter names). With generics, AS is as simple to use as
plain getters/setters.

 Also, I don't think we should roll this out for all Lucene classes.  I
 think most classes do just fine accepting args to their ctor.  EG
 TermQuery simply takes Term to its ctor.
 
 I do agree IW should not be in the business of brokering changes to
 the settings of its sub-components (eg mergeFactor, maxMergeDocs).
 You really should make such changes directly via your merge policy.

AttributeSource would also help with e.g. allowing later changes to various
attributes. If some of the attributes are fixed after construction of IW/IR,
their setters can just throw IllegalStateException.
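
A minimal Java sketch of that last idea, independent of AttributeSource and
with invented names (WriterSettings, Writer, freeze(), and the policy
strings are all hypothetical): settings that may change stay mutable, while
settings that are fixed once the writer exists reject later changes.

```java
class FreezeSketch {

    /** Hypothetical settings object; names and defaults are made up. */
    static class WriterSettings {
        private double ramBufferSizeMB = 16.0;       // may change at any time
        private String deletionPolicy = "keep-last"; // fixed once a writer exists
        private boolean frozen = false;

        void setRamBufferSizeMB(double mb) { ramBufferSizeMB = mb; }

        void setDeletionPolicy(String policy) {
            if (frozen) {
                throw new IllegalStateException("deletionPolicy is fixed after writer construction");
            }
            deletionPolicy = policy;
        }

        double getRamBufferSizeMB() { return ramBufferSizeMB; }
        String getDeletionPolicy() { return deletionPolicy; }

        void freeze() { frozen = true; }
    }

    /** Hypothetical writer that locks the fixed settings when it is constructed. */
    static class Writer {
        final WriterSettings settings;

        Writer(WriterSettings settings) {
            settings.freeze();
            this.settings = settings;
        }
    }

    public static void main(String[] args) {
        WriterSettings settings = new WriterSettings();
        settings.setDeletionPolicy("keep-all"); // allowed before construction
        Writer writer = new Writer(settings);
        settings.setRamBufferSizeMB(128.0);     // still allowed afterwards
        try {
            settings.setDeletionPolicy("keep-last");
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
        System.out.println(writer.settings.getDeletionPolicy());
    }
}
```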


Re: Lucene 2.9 and deprecated IR.open() methods

2009-10-05 Thread Mark Miller
Michael McCandless wrote:
 I think AS is overkill for conveying configuration of IW/IR?

 Suddenly, instead of:

   cfg.setRAMBufferSizeMB(128.0)

 I'd have to do something like this?

   
   cfg.addAttribute(IndexWriterRAMBufferSizeAttribute.class).setRAMBufferSize(128.0)

 It's too cumbersome, I think, for something that ought to be simple.
 I'd prefer a dedicated config class with strongly typed setters
 exposed.  Of all the pure syntax options so far I'd still prefer the
 traditional config object with setters.
   
+1

 Also, I don't think we should roll this out for all Lucene classes.  I
 think most classes do just fine accepting args to their ctor.  EG
 TermQuery simply takes Term to its ctor.
   
+1
 I do agree IW should not be in the business of brokering changes to
 the settings of its sub-components (eg mergeFactor, maxMergeDocs).
 You really should make such changes directly via your merge policy.
   
Agreed we need to deal with - *but* I personally think it gets tricky.
Users should be able to flip compound on/off easily without dealing with
a mergepolicy IMO. And advanced users that set a mergepolicy shouldn't
have to deal with losing a compound setting they set with IW after
setting a new mergepolicy. Can't I have it both ways :)

Re: Lucene 2.9 and deprecated IR.open() methods

2009-10-05 Thread Earwin Burrfoot
 I think AS is overkill for conveying configuration of IW/IR?
Agree.

 It's too cumbersome, I think, for something that ought to be simple.
 I'd prefer a dedicated config class with strongly typed setters
 exposed.  Of all the pure syntax options so far I'd still prefer the
 traditional config object with setters.
Builders are visually cleaner. But well, it's just my aesthetic preference.

 Also, I don't think we should roll this out for all Lucene classes.  I
 think most classes do just fine accepting args to their ctor.  EG
 TermQuery simply takes Term to its ctor.
It's obvious.

 I do agree IW should not be in the business of brokering changes to
 the settings of its sub-components (eg mergeFactor, maxMergeDocs).
 You really should make such changes directly via your merge policy.
Aaaand, you shouldn't do such changes after construction :)

 But, then, I can also see it'd simplify our code to not have to deal
 w/ such changes, reduce chance of subtle bugs, and it seems minor to
 go and re-open your IndexWriter if you need to make a settings change?
 (Hmm except in an NRT setting, because the reader pool would be reset;
 really we need to get the reader pool separated from the IW instance).
Even if recreating IW is costly, you don't change settings that often, do you?

Mark:
 Agreed we need to deal with - *but* I personally think it gets tricky.
 Users should be able to flip compound on/off easily without dealing with
 a mergepolicy IMO. And advanced users that set a mergepolicy shouldn't
 have to deal with losing a compound setting they set with IW after
 setting a new mergepolicy. Can't I have it both ways :)
I don't understand why on earth the compound setting is a property of
MergePolicy. The question of which segments to merge is really orthogonal
to the way you store these segments on disk.

-- 
Kirill Zakharenko/Кирилл Захаренко (ear...@gmail.com)
Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
ICQ: 104465785




Re: Lucene 2.9 and deprecated IR.open() methods

2009-10-05 Thread Michael Busch

I think we shouldn't discuss too many different things here.
To begin I'd just like to introduce the IndexConfig class, that will
hold the parameters we currently pass to the different IndexWriter
constructors.

If we later need to create different IndexWriter impls we can introduce
a factory.

If we want to change some IW settings to be mandatory on IW
instantiation we can move those parameters from IW to the Config class
then.

If we see in the future the need to pass arguments to the different flex
index consumers, we can add an AttributeSource or Properties hashmap to
the config class, or maybe directly to the IndexingChain class. I don't
really think the IndexWriter needs this flexibility right now and it
seems like Mike hasn't seen the need thus far while working on
LUCENE-1458 either.

Adding the Config class and deprecating all other IW constructors will
not prevent us from doing any of the other things in the future IMO and
is already a great start to simplify things. So let's do that first and
discuss the other points separately when the need arises.
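
Roughly what I have in mind, as a self-contained Java sketch and not a
committed API: the field set, the placeholder defaults, the chained setters,
and the String stand-ins for Directory and Analyzer are assumptions made
only to keep the example runnable without Lucene on the classpath.

```java
class IndexConfigSketch {

    /** Hypothetical config holder; fields stand in for today's constructor parameters. */
    static class IndexConfig {
        private String analyzer = "standard";   // stand-in for an Analyzer instance
        private boolean create = false;
        private int maxFieldLength = 10000;     // placeholder default
        private double ramBufferSizeMB = 16.0;  // placeholder default

        IndexConfig setAnalyzer(String analyzer) { this.analyzer = analyzer; return this; }
        IndexConfig setCreate(boolean create) { this.create = create; return this; }
        IndexConfig setMaxFieldLength(int n) { this.maxFieldLength = n; return this; }
        IndexConfig setRAMBufferSizeMB(double mb) { this.ramBufferSizeMB = mb; return this; }
    }

    /** Hypothetical writer with a single constructor instead of many overloads. */
    static class Writer {
        private final String directory; // stand-in for a Directory instance
        private final IndexConfig config;

        Writer(String directory, IndexConfig config) {
            this.directory = directory;
            this.config = config;
        }

        String describe() {
            return "Writer(dir=" + directory
                + ", analyzer=" + config.analyzer
                + ", create=" + config.create
                + ", maxFieldLength=" + config.maxFieldLength
                + ", ramBufferSizeMB=" + config.ramBufferSizeMB + ")";
        }
    }

    public static void main(String[] args) {
        IndexConfig cfg = new IndexConfig().setCreate(true).setRAMBufferSizeMB(128.0);
        System.out.println(new Writer("/tmp/index", cfg).describe());
    }
}
```

Settings that are not set explicitly keep their defaults, so new parameters
can be added to the config class later without adding constructors.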


 Michael

On 10/5/09 5:40 AM, Uwe Schindler wrote:

Hi Mike,

   

I think AS is overkill for conveying configuration of IW/IR?

Suddenly, instead of:

   cfg.setRAMBufferSizeMB(128.0)

I'd have to do something like this?


cfg.addAttribute(IndexWriterRAMBufferSizeAttribute.class).setRAMBufferSize
(128.0)

It's too cumbersome, I think, for something that ought to be simple.
I'd prefer a dedicated config class with strongly typed setters
exposed.  Of all the pure syntax options so far I'd still prefer the
traditional config object with setters.
 

From this point of view, it's also overkill for TokenStream. But as AS was
also designed for flexible indexing, it would fit very well into this area.

The new query parser is a good example in favor of attributes. One argument
against atts is the fact that Michael Busch didn't promote them from
the beginning of this discussion :-) (maybe he also needs one more night
to think about it).

Good points for AS are, e.g., the type-safety, ease of extension, and
built-in defaults (you do not need to check for existence of attributes,
just add them at the point you want to use them, like in your example -
maybe with nicer and shorter names). With generics, AS is as simple to use
as plain getters/setters.

   

Also, I don't think we should roll this out for all Lucene classes.  I
think most classes do just fine accepting args to their ctor.  EG
TermQuery simply takes Term to its ctor.

I do agree IW should not be in the business of brokering changes to
the settings of its sub-components (eg mergeFactor, maxMergeDocs).
You really should make such changes directly via your merge policy.
 

AttributeSource would also help us with e.g. the possibility for later
changes to various attributes. If some of the attributes are fixed after
construction of IW/IR, just throw IllegalStateExceptions.

   

Finally, I'm not convinced we should lock down all settings after
classes are created.  (I'm not convinced we shouldn't, either).

A merge policy has no trouble changing its mergeFactor,
maxMergeDocs/Size.  IW has no trouble changing its RAM buffer
size, maxFieldLength, or useCompoundFile.  Sure there are some things
that cannot (or would be very tricky to) change, eg deletion policy.
But then analyzer isn't changeable today, but could be.

But, then, I can also see it'd simplify our code to not have to deal
w/ such changes, reduce chance of subtle bugs, and it seems minor to
go and re-open your IndexWriter if you need to make a settings change?
(Hmm except in an NRT setting, because the reader pool would be reset;
really we need to get the reader pool separated from the IW instance).

Mike

On Mon, Oct 5, 2009 at 4:38 AM, Uwe Schindler u...@thetaphi.de wrote:

See my second mail. The recently introduced Attributes and AttributeSource
would solve this. Each component just defines its attribute interface and
impl class and you pass in an AttributeSource as configuration. Then you can
do:

AttributeSource cfg = new AttributeSource();

ComponentAttribute compCfg = cfg.addAttribute(ComponentAttribute.class);
compCfg.setMergeScheduler(FooScheduler.class);

MergeBarAttribute mergeCfg = cfg.addAttribute(MergeBarAttribute.class);
mergeCfg.setWhateverProp(1234);
...
IndexWriter iw = new IndexWriter(dir, cfg);

(this is just brainstorming not yet thoroughly thought about).
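
To make the attribute idea concrete, here is a rough sketch of a type-keyed settings registry. It is a simplified stand-in and not Lucene's AttributeSource: the ConfigSource class, the two attribute classes, the default values, and the Supplier-based addAttribute signature are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Simplified illustration; this is NOT Lucene's AttributeSource.
final class AttributeConfigSketch {

    /** Type-keyed registry: each component looks up its own settings class. */
    static final class ConfigSource {
        private final Map<Class<?>, Object> attributes = new HashMap<>();

        /** Returns the instance registered for the type, creating it on first use. */
        <T> T addAttribute(Class<T> type, Supplier<T> defaults) {
            Object value = attributes.computeIfAbsent(type, key -> defaults.get());
            return type.cast(value);
        }
    }

    /** Hypothetical settings owned by the writer itself. */
    static final class RamBufferAttribute {
        private double ramBufferSizeMB = 16.0; // default used when never set
        double getRamBufferSizeMB() { return ramBufferSizeMB; }
        void setRamBufferSizeMB(double mb) { this.ramBufferSizeMB = mb; }
    }

    /** Hypothetical settings owned by a pluggable merge component. */
    static final class MergeAttribute {
        private int mergeFactor = 10;
        int getMergeFactor() { return mergeFactor; }
        void setMergeFactor(int mergeFactor) { this.mergeFactor = mergeFactor; }
    }

    public static void main(String[] args) {
        ConfigSource cfg = new ConfigSource();
        // The caller sets only what it cares about.
        cfg.addAttribute(RamBufferAttribute.class, RamBufferAttribute::new)
           .setRamBufferSizeMB(128.0);
        // A component asks for its attribute and gets defaults if nobody set it.
        MergeAttribute merge = cfg.addAttribute(MergeAttribute.class, MergeAttribute::new);
        System.out.println(merge.getMergeFactor());
    }
}
```

The point of the shape is that a component can define its own settings type without the config container knowing about it in advance.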
   

This approach suggests IW creates its components, and while doing so
provides them your AS instance.
I personally prefer creating all these components myself, configuring
them (at the moment of creation) and passing them to IW in one way or
another.
This requires way less code, you don't have to invent elaborate
schemes of passing through your custom per-component settings and
selecting which 

Re: Lucene 2.9 and deprecated IR.open() methods

2009-10-04 Thread Michael Busch

On 10/4/09 3:31 AM, Mark Miller wrote:

Ted Dunning wrote:
   

The builder pattern and the config argument to a factory both have the
advantage that you can limit changes after creating an object.  Some
things are just bad to change in mid-stream.  The config argument is
nice in that you can pass it around to different stake holders, but
the builder can be used a bit like that as well.
 

Yeah that argument has been made. But I've *never* seen it as an issue.
Just seems like a solution looking for a problem. I can see how it's
cleaner, not missing that point. But still doesn't make me like it much.

   
Yeah personally this wasn't a problem for me either. I do like the 
cleanliness though. Also, I'd very much prefer a config object over 
multiple constructors (with the need to deprecate/add with every 
change), as I proposed originally in this thread.


I still don't see an advantage of the builder pattern over the config 
object + factory pattern - and I'm not even sure if we really need a 
factory; IMO passing a config object into a single constructor would be 
sufficient for IW.


 Michael




RE: Lucene 2.9 and deprecated IR.open() methods

2009-10-04 Thread Uwe Schindler
  The builder pattern and the config argument to a factory both have the
  advantage that you can limit changes after creating an object.  Some
  things are just bad to change in mid-stream.  The config argument is
  nice in that you can pass it around to different stake holders, but
  the builder can be used a bit like that as well.
 
  Yeah that argument has been made. But I've *never* seen it as an issue.
  Just seems like a solution looking for a problem. I can see how it's
  cleaner, not missing that point. But still doesn't make me like it much.
 
 
 Yeah personally this wasn't a problem for me either. I do like the
 cleanliness though. Also, I'd very much prefer a config object over
 multiple constructors (with the need to deprecate/add with every
 change), as I proposed originally in this thread.
 
 I still don't see an advantage of the builder pattern over the config
 object + factory pattern - and I'm not even sure if we really need a
 factory; IMO passing a config object into a single constructor would be
 sufficient for IW.

For IR the factory would be ok. In my opinion you could also combine both
patterns:

- Each setter in the config object returns itself, so you have the builder
pattern, but you could also use it in the classical setter way (this only works
if each builder method always returns itself, not a new builder object).
- The builder's .build() just delegates to the ctor/static factory in
IR/IW and passes itself to it.

So you have both possibilities:

IndexReader reader = new IndexReader.Config(dir).setReadOnly(true)
.setTermInfosIndexDivisor(foo).build();

is equal to:

IndexReader.Config config = new IndexReader.Config(dir);
config.setReadOnly(true);
config.setTermInfosIndexDivisor(foo);
IndexReader reader = IndexReader.create(config);
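
A self-contained sketch of that combination, assuming hypothetical Config and Reader stand-ins (a String replaces the Directory, and the field names and defaults are made up for illustration). Each setter returns this, and build() simply hands the config to the static factory:

```java
// Hypothetical stand-ins; the real IndexReader has no such Config class here.
final class FluentConfigSketch {

    static final class Reader {
        final String dir;
        final boolean readOnly;
        final int termInfosIndexDivisor;

        private Reader(Config config) {
            this.dir = config.dir;
            this.readOnly = config.readOnly;
            this.termInfosIndexDivisor = config.termInfosIndexDivisor;
        }

        /** Static factory used by the classical setter style. */
        static Reader create(Config config) {
            return new Reader(config);
        }
    }

    static final class Config {
        private final String dir;
        private boolean readOnly = false;
        private int termInfosIndexDivisor = 1;

        Config(String dir) { this.dir = dir; }

        /** Returns this, so calls can be chained or used as plain setters. */
        Config setReadOnly(boolean readOnly) {
            this.readOnly = readOnly;
            return this;
        }

        Config setTermInfosIndexDivisor(int divisor) {
            this.termInfosIndexDivisor = divisor;
            return this;
        }

        /** Builder-style entry point; delegates to the same factory. */
        Reader build() {
            return Reader.create(this);
        }
    }

    public static void main(String[] args) {
        Reader chained = new Config("idx").setReadOnly(true)
                .setTermInfosIndexDivisor(4).build();

        Config config = new Config("idx");
        config.setReadOnly(true);
        config.setTermInfosIndexDivisor(4);
        Reader classic = Reader.create(config);

        System.out.println(chained.readOnly == classic.readOnly);
    }
}
```

Because build() and create() go through the same constructor, the two call styles cannot drift apart in behavior.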

Uwe





Re: Lucene 2.9 and deprecated IR.open() methods

2009-10-04 Thread Michael McCandless
I don't think we should do both.  Suddenly, all code snippets (in
javadocs, tutorials, email we all send, etc.) can be one pattern or
the other, with each of us choosing based on our preference.  Or,
mixed.

I think this just causes confusion.  It'd suddenly become a lot like
differences of opinion on which whitespace style is best.

I'd rather have one clear syntax, and at this point I'd prefer to
stick with the classical setter approach, ie a standalone config
object.

But there are two other (more important than pure syntax!) questions
being debated here:

  1 Do we prevent config settings from changing after creating an
IW/IR?

  2 Do we use factory or ctor to create IW/IR?

On #1, we are technically taking something away.  Are we sure no users
find the freedom to change IW settings mid-stream (ramBufferSizeMB,
mergeFactor) important?  For example, infoStream should remain an IW
setter.  Also, MergePolicy now requires IW instance on construction,
so we'd need to rework that.

On #2, I agree with Michael: until we see a clear reason to hide IW's
concrete impl., we may as well stick with the one impl we have now.
Design for today.

Mike

On Sun, Oct 4, 2009 at 5:33 AM, Uwe Schindler u...@thetaphi.de wrote:
  The builder pattern and the config argument to a factory both have the
  advantage that you can limit changes after creating an object.  Some
  things are just bad to change in mid-stream.  The config argument is
  nice in that you can pass it around to different stake holders, but
  the builder can be used a bit like that as well.
 
  Yeah that argument has been made. But I've *never* seen it as an issue.
  Just seems like a solution looking for a problem. I can see how it's
  cleaner, not missing that point. But still doesn't make me like it much.
 
 
 Yeah personally this wasn't a problem for me either. I do like the
 cleanliness though. Also, I'd very much prefer a config object over
 multiple constructors (with the need to deprecate/add with every
 change), as I proposed originally in this thread.

 I still don't see an advantage of the builder pattern over the config
 object + factory pattern - and I'm not even sure if we really need a
 factory; IMO passing a config object into a single constructor would be
 sufficient for IW.

 For IR the factory would be ok. In my opinion you could also combine both
 patterns:

 - Each setter in the config object returns itself, so you have the builder
 pattern, but you could also use it in classical setter way (this only works
 if the builder pattern always returns itself not a new builder object)
 - The builder factory .build() just delegates to the ctor/static factory in
 IR/IW and passes itself to it).

 So you have both possibilities:

 IndexReader reader = new IndexReader.Config(dir).setReadOnly(true)
 .setTermInfosIndexDivisor(foo).build();

 is equal to:

 IndexReader.Config config = new IndexReader.Config(dir);
 config.setReadOnly(true);
 config.setTermInfosIndexDivisor(foo);
 IndexReader reader = IndexReader.create(config);

 Uwe





Re: Lucene 2.9 and deprecated IR.open() methods

2009-10-04 Thread Yonik Seeley
On Sun, Oct 4, 2009 at 5:53 AM, Michael McCandless
luc...@mikemccandless.com wrote:
  1 Do we prevent config settings from changing after creating an
    IW/IR?

  2 Do we use factory or ctor to create IW/IR?

 On #1, we are technically taking something away.  Are we sure no users
 find the freedom to change IW settings mid-stream (ramBufferSizeMB,
 mergeFactor) important?

Some of these are important I think - esp changing merge factor or the
max segment size.

Seems like everything that should be fixed at construction time
(simple params at least) can be passed in the config object, and
everything else can remain setters on the IndexWriter.  Of course
things like max segment size have been factored out into the merge
policies... but you get the idea.
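
One way to picture that split, as a sketch with hypothetical stand-in classes: values that must not change are copied from the config into final fields, while a setting such as the merge factor stays a live setter. The names, the default of 10, and the lower bound of 2 are illustrative assumptions.

```java
// Hypothetical stand-ins; not the real IndexWriter.
final class SplitSettingsSketch {

    /** Construction-time values only. */
    static final class Config {
        final String analyzerName; // stand-in for an Analyzer
        final boolean create;

        Config(String analyzerName, boolean create) {
            this.analyzerName = analyzerName;
            this.create = create;
        }
    }

    static final class Writer {
        // Fixed for the lifetime of the writer.
        private final String analyzerName;
        private final boolean create;
        // Still adjustable while indexing; volatile so other threads see updates.
        private volatile int mergeFactor = 10;

        Writer(Config config) {
            this.analyzerName = config.analyzerName;
            this.create = config.create;
        }

        String getAnalyzerName() { return analyzerName; }
        boolean isCreate() { return create; }
        int getMergeFactor() { return mergeFactor; }

        void setMergeFactor(int mergeFactor) {
            if (mergeFactor < 2) {
                throw new IllegalArgumentException("mergeFactor cannot be less than 2");
            }
            this.mergeFactor = mergeFactor;
        }
    }

    public static void main(String[] args) {
        Writer writer = new Writer(new Config("standard", true));
        writer.setMergeFactor(30); // allowed mid-stream
        System.out.println(writer.getMergeFactor());
    }
}
```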

-Yonik
http://www.lucidimagination.com




Re: Lucene 2.9 and deprecated IR.open() methods

2009-10-04 Thread Earwin Burrfoot
As I stated in my last email, there's zero difference between
settings+static factory and builder except for syntax. Cannot
understand what Mark, Mike are arguing about.
Right now I offer to do two things, in any possible way - eradicate as
much broken/spaghetti-like runtime state change from IW and friends as
possible, and kill setting methods that delegate to IW components (eg
MergePolicy), as they are redundant and suddenly break if you supply a
non-default component instance.

On Sun, Oct 4, 2009 at 17:55, Yonik Seeley ysee...@gmail.com wrote:
 On Sun, Oct 4, 2009 at 5:53 AM, Michael McCandless
 luc...@mikemccandless.com wrote:
  1 Do we prevent config settings from changing after creating an
    IW/IR?

  2 Do we use factory or ctor to create IW/IR?

 On #1, we are technically taking something away.  Are we sure no users
 find the freedom to change IW settings mid-stream (ramBufferSizeMB,
 mergeFactor) important?

 Some of these are important I think - esp changing merge factor or the
 max segment size.
The question is - whether anybody's going to change
mergefactor/maxsegment size often enough he can't recreate IW without
dire performance penalties?

 Seems like everything that should be fixed at construction time
 (simple params at least) can be passed in the config object, and
 everything else can remain setters on the IndexWriter.  Of course
 things like max segment size have been factored out into the merge
 policies... but you get the idea.

 -Yonik
 http://www.lucidimagination.com






-- 
Kirill Zakharenko/Кирилл Захаренко (ear...@gmail.com)
Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
ICQ: 104465785




Re: Lucene 2.9 and deprecated IR.open() methods

2009-10-04 Thread Mark Miller
Earwin Burrfoot wrote:
 As I stated in my last email, there's zero difference between
 settings+static factory and builder except for syntax. Cannot
 understand what Mark, Mike are arguing about.
   
Sounds like we are arguing that we don't like the syntax then...
 kill setting methods that delegate to IW components (eg
 MergePolicy), as they are redundant and suddenly break if you supply a
 non-default component instance.
   
I do agree that this is something that should be addressed.

 On Sun, Oct 4, 2009 at 17:55, Yonik Seeley ysee...@gmail.com wrote:
   
 On Sun, Oct 4, 2009 at 5:53 AM, Michael McCandless
 luc...@mikemccandless.com wrote:
 
  1 Do we prevent config settings from changing after creating an
IW/IR?

  2 Do we use factory or ctor to create IW/IR?

 On #1, we are technically taking something away.  Are we sure no users
 find the freedom to change IW settings mid-stream (ramBufferSizeMB,
 mergeFactor) important?
   
 Some of these are important I think - esp changing merge factor or the
 max segment size.
 
 The question is - whether anybody's going to change
 mergefactor/maxsegment size often enough he can't recreate IW without
 dire performance penalties?

   
 Seems like everything that should be fixed at construction time
 (simple params at least) can be passed in the config object, and
 everything else can remain setters on the IndexWriter.  Of course
 things like max segment size have been factored out into the merge
 policies... but you get the idea.

 -Yonik
 http://www.lucidimagination.com



 



   


-- 
- Mark

http://www.lucidimagination.com







Re: Lucene 2.9 and deprecated IR.open() methods

2009-10-04 Thread Marvin Humphrey
On Sun, Oct 04, 2009 at 03:04:13PM -0400, Mark Miller wrote:
 Earwin Burrfoot wrote:
  As I stated in my last email, there's zero difference between
  settings+static factory and builder except for syntax. Cannot
  understand what Mark, Mike are arguing about.

 Sounds like we are arguing that we don't like the syntax then...

So, implement the static factory methods as wrappers around the builder
method.

  public static IndexWriter open(Directory dir, Analyzer analyzer) {
    return open(new IndexManager(dir), dir, analyzer);
  }

  public static IndexWriter open(IndexManager manager, Directory dir,
                                 Analyzer analyzer) {
    return open(new Architecture(), manager, dir, analyzer);
  }

  public static IndexWriter open(Architecture arch, IndexManager manager,
                                 Directory dir, Analyzer analyzer) {
    return arch.buildIndexWriter(manager, dir, analyzer);
  }

IMO, it's important not to force first-time users to grok builder classes in
order to perform basic indexing or searching.

Marvin Humphrey





Re: Lucene 2.9 and deprecated IR.open() methods

2009-10-04 Thread Marvin Humphrey
On Sun, Oct 04, 2009 at 05:53:14AM -0400, Michael McCandless wrote:

   1 Do we prevent config settings from changing after creating an
 IW/IR?

Any settings conveyed via a settings object ought to be final if you want
pluggable index components.  Otherwise, you need some nightmarish notification
system to propagate settings down into your subcomponents, which may or may
not be prepared to handle the value modifications.
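
A small sketch of why final settings keep pluggable components simple. The Settings, MergeComponent, and Writer classes are hypothetical stand-ins: the subcomponent reads its values once at construction, so nothing has to be propagated later, and a different setting means a new settings object and a new writer.

```java
// Hypothetical stand-ins; not real Lucene classes.
final class FinalSettingsSketch {

    /** Immutable settings: all fields final, no setters. */
    static final class Settings {
        final double ramBufferSizeMB;
        final int mergeFactor;

        Settings(double ramBufferSizeMB, int mergeFactor) {
            this.ramBufferSizeMB = ramBufferSizeMB;
            this.mergeFactor = mergeFactor;
        }
    }

    /** A pluggable subcomponent that reads its settings exactly once. */
    static final class MergeComponent {
        private final int mergeFactor;

        MergeComponent(Settings settings) {
            this.mergeFactor = settings.mergeFactor;
        }

        int getMergeFactor() { return mergeFactor; }
    }

    static final class Writer {
        private final Settings settings;
        private final MergeComponent merger;

        Writer(Settings settings) {
            this.settings = settings;
            // No listener or notification needed: the values can never change.
            this.merger = new MergeComponent(settings);
        }

        int mergeFactorSeenByMerger() { return merger.getMergeFactor(); }
        double ramBufferSizeMB() { return settings.ramBufferSizeMB; }
    }

    public static void main(String[] args) {
        Writer first = new Writer(new Settings(16.0, 10));
        // Changing a setting means building new settings and a new writer.
        Writer second = new Writer(new Settings(16.0, 20));
        System.out.println(first.mergeFactorSeenByMerger());
        System.out.println(second.mergeFactorSeenByMerger());
    }
}
```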

Marvin Humphrey





Re: Lucene 2.9 and deprecated IR.open() methods

2009-10-03 Thread Michael McCandless
On Fri, Oct 2, 2009 at 10:18 PM, Earwin Burrfoot ear...@gmail.com wrote:

 Builder pattern allows you to switch concrete implementations as you
 please, taking parameters into account or not.

We could also achieve this w/ static factory method.  EG
IndexReader.open(IndexReader.Config) could switch between concrete
impls (it already does today).
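
As an illustration of a static factory choosing the concrete class, here is a sketch with hypothetical stand-ins. The Config flag and the two private implementations are assumptions made for the example and do not mirror the actual IndexReader internals.

```java
// Hypothetical stand-ins; not the real IndexReader hierarchy.
final class FactorySketch {

    static final class Config {
        final boolean readOnly;

        Config(boolean readOnly) { this.readOnly = readOnly; }
    }

    abstract static class Reader {
        abstract boolean canDelete();

        /** Callers only see Reader; the factory picks the implementation. */
        static Reader open(Config config) {
            return config.readOnly ? new ReadOnlyReader() : new ReadWriteReader();
        }
    }

    private static final class ReadOnlyReader extends Reader {
        @Override
        boolean canDelete() { return false; }
    }

    private static final class ReadWriteReader extends Reader {
        @Override
        boolean canDelete() { return true; }
    }

    public static void main(String[] args) {
        System.out.println(Reader.open(new Config(true)).canDelete());
        System.out.println(Reader.open(new Config(false)).canDelete());
    }
}
```

A public constructor could not do this, since new always yields exactly the class that is named at the call site.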

Mike




Re: Lucene 2.9 and deprecated IR.open() methods

2009-10-03 Thread Michael McCandless
On Fri, Oct 2, 2009 at 10:18 PM, Earwin Burrfoot ear...@gmail.com wrote:
 Call me old fashioned, but I like how the non constructor params are set
 now.
 And what happens when you index some docs, change these params, index
 more docs, change params, commit? Let's throw in some threads?
 You either end up writing really hairy state control code, or just
 leave it broken, with a "Don't change parameters after you start pumping
 docs through it!" plea covering your back somewhere in JavaDocs.
 If nothing else, having stuff 'final' keeps JIT really happy.

This is a good point: are you allowed to change config settings after
creating your IndexWriter/Reader?

Today it's ad hoc.

EG IW does not allow you to swap out your deletion policy, because
it'd be a nightmare to implement.  You also can't swap the analyzer.
But it does let you change your RAM buffer size, CFS or not, merge
factor, etc.  We can remove that flexibility (I'm not sure it's
compelling), so we can make things final.  You can't change read-only
after opening your IndexReader.  I think it'd make sense to move away
from changing settings after construction...

But: the "do we disallow changing config settings after construction?"
question is really orthogonal to the "what syntax do we use for
construction?" question (builder vs config vs zillions-of-ctors).

Mike




RE: Lucene 2.9 and deprecated IR.open() methods

2009-10-03 Thread Uwe Schindler
Hi,

The problem is, we have to leave some of the not-yet-deprecated ctors/opens
available for a while (not until 4.0 with our new policy), but a user
removing all deprecated stuff from his 2.9 release should be able to switch
to 3.0 without changing any code (can even plug the jars in). We also have
to keep the getters/setters available. If we wanted to change this, 2.9 was the
best option :-(

Uwe

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de

 -Original Message-
 From: Michael McCandless [mailto:luc...@mikemccandless.com]
 Sent: Saturday, October 03, 2009 11:35 AM
 To: java-dev@lucene.apache.org
 Subject: Re: Lucene 2.9 and deprecated IR.open() methods
 
 On Fri, Oct 2, 2009 at 10:18 PM, Earwin Burrfoot ear...@gmail.com wrote:
  Call me old fashioned, but I like how the non constructor params are set
  now.
  And what happens when you index some docs, change these params, index
  more docs, change params, commit? Let's throw in some threads?
  You either end up writing really hairy state control code, or just
  leave it broken, with Don't change parameters after you start pumping
  docs through it! plea covering your back somewhere in JavaDocs.
  If nothing else, having stuff 'final' keeps JIT really happy.
 
 This is a good point: are you allowed to change config settings after
 creating your IndexWriter/Reader?
 
 Today it's ad hoc.
 
 EG IW does not allow you to swap out your deletion policy, because
 it'd be a nightmare to implement.  You also can't swap the analyzer.
 But it does let you change your RAM buffer size, CFS or not, merge
 factor, etc.  We can remove that flexibility (I'm not sure it's
 compelling), so we can make things final.  You can't change read-only
 after opening your IndexReader.  I think it'd make sense to move away
 from changing settings after construction...
 
 But: the do we disallow changing config settings after construction?
 question is really orthogonal to the what syntax do we use for
 construction? (builder vs config vs zillions-of-ctors).
 
 Mike
 



Re: Lucene 2.9 and deprecated IR.open() methods

2009-10-03 Thread Michael McCandless
Well, let's first get 3.0 out the door ;)  Then we can salivate over
all sorts of juicy changes for 3.1...

These particular changes (switching syntax from multi-ctors to config
or to builder, disallowing config changes after creation, hiding the
concrete impl) may merit an exception to our back-compat
policy.  Obviously users are bothered by the horror of how many ctors
they are confronted with for IW and IR.

Mike

On Sat, Oct 3, 2009 at 5:46 AM, Uwe Schindler u...@thetaphi.de wrote:
 Hi,

 The problem is, we have to leave some of the not-yet-deprecated ctors/opens
 available for a while (not until 4.0 with our new policy), but a user
 removing all deprecated stuff from his 2.9 release should be able to switch
 to 3.0 without changing any code (can even plug the jars in). We also have
 to keep the getters/setter avail. If we wanted to change this, 2.9 was the
 best option :-(

 Uwe

 -
 Uwe Schindler
 H.-H.-Meier-Allee 63, D-28213 Bremen
 http://www.thetaphi.de
 eMail: u...@thetaphi.de

 -Original Message-
 From: Michael McCandless [mailto:luc...@mikemccandless.com]
 Sent: Saturday, October 03, 2009 11:35 AM
 To: java-dev@lucene.apache.org
 Subject: Re: Lucene 2.9 and deprecated IR.open() methods

 On Fri, Oct 2, 2009 at 10:18 PM, Earwin Burrfoot ear...@gmail.com wrote:
  Call me old fashioned, but I like how the non constructor params are set
  now.
  And what happens when you index some docs, change these params, index
  more docs, change params, commit? Let's throw in some threads?
  You either end up writing really hairy state control code, or just
  leave it broken, with Don't change parameters after you start pumping
  docs through it! plea covering your back somewhere in JavaDocs.
  If nothing else, having stuff 'final' keeps JIT really happy.

 This is a good point: are you allowed to change config settings after
 creating your IndexWriter/Reader?

 Today it's ad hoc.

 EG IW does not allow you to swap out your deletion policy, because
 it'd be a nightmare to implement.  You also can't swap the analyzer.
 But it does let you change your RAM buffer size, CFS or not, merge
 factor, etc.  We can remove that flexibility (I'm not sure it's
 compelling), so we can make things final.  You can't change read-only
 after opening your IndexReader.  I think it'd make sense to move away
 from changing settings after construction...

 But: the do we disallow changing config settings after construction?
 question is really orthogonal to the what syntax do we use for
 construction? (builder vs config vs zillions-of-ctors).

 Mike




Re: Lucene 2.9 and deprecated IR.open() methods

2009-10-03 Thread Earwin Burrfoot
 Builder pattern allows you to switch concrete implementations as you
 please, taking parameters into account or not.

 We could also achieve this w/ static factory method.  EG
 IndexReader.open(IndexReader.Config) could switch between concrete
 impls (it already does today).
Yes, the choice of 'IW.create(IWSettings, Directory)' VS
'IWSettings.create(Directory)' is purely syntactical (with latter
being more concise, imo), but I was comparing to 'new IW(Settings,
Directory)'.

 Call me old fashioned, but I like how the non constructor params are set
 now.
 And what happens when you index some docs, change these params, index
 more docs, change params, commit? Let's throw in some threads?
 You either end up writing really hairy state control code, or just
 leave it broken, with Don't change parameters after you start pumping
 docs through it! plea covering your back somewhere in JavaDocs.
 If nothing else, having stuff 'final' keeps JIT really happy.

 This is a good point: are you allowed to change config settings after
 creating your IndexWriter/Reader?

 Today it's ad hoc.

 EG IW does not allow you to swap out your deletion policy, because
 it'd be a nightmare to implement.  You also can't swap the analyzer.
 But it does let you change your RAM buffer size, CFS or not, merge
 factor, etc.  We can remove that flexibility (I'm not sure it's
 compelling), so we can make things final.  You can't change read-only
 after opening your IndexReader.  I think it'd make sense to move away
 from changing settings after construction...
I've just remembered some horrible things:

public void setMergeFactor(int mergeFactor) {
  getLogMergePolicy().setMergeFactor(mergeFactor);
}

Let's remove this flexibility too?
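
To show why such a delegating setter is fragile, here is a sketch with hypothetical stand-in classes (the MergePolicy interface and both policies are simplified for illustration and are not the real Lucene types). The setter works only while the writer holds the default policy type and fails at runtime once a custom one is supplied:

```java
// Hypothetical stand-ins; simplified from the idea in the snippet above.
final class DelegatingSetterSketch {

    interface MergePolicy { }

    static final class LogMergePolicy implements MergePolicy {
        private int mergeFactor = 10;
        int getMergeFactor() { return mergeFactor; }
        void setMergeFactor(int mergeFactor) { this.mergeFactor = mergeFactor; }
    }

    /** A user-supplied policy that has no notion of a merge factor. */
    static final class CustomMergePolicy implements MergePolicy { }

    static final class Writer {
        private final MergePolicy mergePolicy;

        Writer(MergePolicy mergePolicy) { this.mergePolicy = mergePolicy; }

        private LogMergePolicy getLogMergePolicy() {
            if (mergePolicy instanceof LogMergePolicy) {
                return (LogMergePolicy) mergePolicy;
            }
            throw new IllegalArgumentException("merge policy is not a LogMergePolicy");
        }

        /** Convenience setter that only works for one policy type. */
        void setMergeFactor(int mergeFactor) {
            getLogMergePolicy().setMergeFactor(mergeFactor);
        }
    }

    public static void main(String[] args) {
        LogMergePolicy log = new LogMergePolicy();
        new Writer(log).setMergeFactor(20); // fine
        try {
            new Writer(new CustomMergePolicy()).setMergeFactor(20);
        } catch (IllegalArgumentException e) {
            System.out.println("setter broke: " + e.getMessage());
        }
    }
}
```

Configuring the policy object directly, before handing it to the writer, avoids the cast and the runtime failure.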

 But: the do we disallow changing config settings after construction?
 question is really orthogonal to the what syntax do we use for
 construction? (builder vs config vs zillions-of-ctors).
There's better syntax for both mutable and immutable approach, so it's
not like these two questions are completely orthogonal.

-- 
Kirill Zakharenko/Кирилл Захаренко (ear...@gmail.com)
Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
ICQ: 104465785




Re: Lucene 2.9 and deprecated IR.open() methods

2009-10-03 Thread Michael Busch
There's also LUCENE-1698! Maybe we can change the policy. Now that 2.9 
is out we should try to get to a conclusion.


 Michael

On 10/3/09 11:54 AM, Michael McCandless wrote:

Well, let's first get 3.0 out the door ;)  Then we can salivate over
all sorts of juicy changes for 3.1...

These particular changes (switching syntax from multi-ctors to config
or to builder, disallowing config changes after creation, switching to
concrete impl is hidden) may merit an exception to our back-compat
policy.  Obviously users are bothered by the horror of how many ctors
you are confronted with for IW and IR.

Mike

On Sat, Oct 3, 2009 at 5:46 AM, Uwe Schindler u...@thetaphi.de wrote:
   

Hi,

The problem is, we have to leave some of the not-yet-deprecated ctors/opens
available for a while (not until 4.0 with our new policy), but a user
removing all deprecated stuff from his 2.9 release should be able to switch
to 3.0 without changing any code (can even plug the jars in). We also have
to keep the getters/setter avail. If we wanted to change this, 2.9 was the
best option :-(

Uwe

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de

 

-Original Message-
From: Michael McCandless [mailto:luc...@mikemccandless.com]
Sent: Saturday, October 03, 2009 11:35 AM
To: java-dev@lucene.apache.org
Subject: Re: Lucene 2.9 and deprecated IR.open() methods

On Fri, Oct 2, 2009 at 10:18 PM, Earwin Burrfoot ear...@gmail.com wrote:

Call me old fashioned, but I like how the non constructor params are set
now.
   

And what happens when you index some docs, change these params, index
more docs, change params, commit? Let's throw in some threads?
You either end up writing really hairy state control code, or just
leave it broken, with Don't change parameters after you start pumping
docs through it! plea covering your back somewhere in JavaDocs.
If nothing else, having stuff 'final' keeps JIT really happy.
 

This is a good point: are you allowed to change config settings after
creating your IndexWriter/Reader?

Today it's ad hoc.

EG IW does not allow you to swap out your deletion policy, because
it'd be a nightmare to implement.  You also can't swap the analyzer.
But it does let you change your RAM buffer size, CFS or not, merge
factor, etc.  We can remove that flexibility (I'm not sure it's
compelling), so we can make things final.  You can't change read-only
after opening your IndexReader.  I think it'd make sense to move away
from changing settings after construction...

But: the do we disallow changing config settings after construction?
question is really orthogonal to the what syntax do we use for
construction? (builder vs config vs zillions-of-ctors).

Mike




RE: Lucene 2.9 and deprecated IR.open() methods

2009-10-03 Thread Uwe Schindler
But we should not change this for 3.0, because people already have a lot to do
to get their 2.9 code to compile without deprecations. If that work then becomes
obsolete because we change something this fundamental, we will make a lot of
people angry. So I would do this for 3.1.

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de


 -Original Message-
 From: Michael Busch [mailto:busch...@gmail.com]
 Sent: Saturday, October 03, 2009 12:15 PM
 To: java-dev@lucene.apache.org
 Subject: Re: Lucene 2.9 and deprecated IR.open() methods
 
 There's also LUCENE-1698! Maybe we can change the policy. Now that 2.9
 is out we should try to get to a conclusion.
 
   Michael
 
 On 10/3/09 11:54 AM, Michael McCandless wrote:
  Well, let's first get 3.0 out the door ;)  Then we can salivate over
  all sorts of juicy changes for 3.1...
 
  These particular changes (switching syntax from multi-ctors to config
  or to builder, disallowing config changes after creation, switching to
  concrete impl is hidden) may merit an exception to our back-compat
  policy.  Obviously users are bothered by the horror of how many ctors
  you are confronted with for IW and IR.
 
  Mike
 
  On Sat, Oct 3, 2009 at 5:46 AM, Uwe Schindler u...@thetaphi.de wrote:
 
  Hi,
 
  The problem is, we have to leave some of the not-yet-deprecated ctors/opens
  available for a while (not until 4.0 with our new policy), but a user
  removing all deprecated stuff from his 2.9 release should be able to switch
  to 3.0 without changing any code (can even plug the jars in). We also have
  to keep the getters/setter avail. If we wanted to change this, 2.9 was the
  best option :-(
 
  Uwe
 
  -
  Uwe Schindler
  H.-H.-Meier-Allee 63, D-28213 Bremen
  http://www.thetaphi.de
  eMail: u...@thetaphi.de
 
 
  -Original Message-
  From: Michael McCandless [mailto:luc...@mikemccandless.com]
  Sent: Saturday, October 03, 2009 11:35 AM
  To: java-dev@lucene.apache.org
  Subject: Re: Lucene 2.9 and deprecated IR.open() methods
 
  On Fri, Oct 2, 2009 at 10:18 PM, Earwin Burrfoot ear...@gmail.com wrote:

  Call me old fashioned, but I like how the non constructor params are set
  now.

  And what happens when you index some docs, change these params, index
  more docs, change params, commit? Let's throw in some threads?
  You either end up writing really hairy state control code, or just
  leave it broken, with a "Don't change parameters after you start pumping
  docs through it!" plea covering your back somewhere in JavaDocs.
  If nothing else, having stuff 'final' keeps JIT really happy.
 
  This is a good point: are you allowed to change config settings after
  creating your IndexWriter/Reader?
 
  Today it's ad hoc.
 
  EG IW does not allow you to swap out your deletion policy, because
  it'd be a nightmare to implement.  You also can't swap the analyzer.
  But it does let you change your RAM buffer size, CFS or not, merge
  factor, etc.  We can remove that flexibility (I'm not sure it's
  compelling), so we can make things final.  You can't change read-only
  after opening your IndexReader.  I think it'd make sense to move away
  from changing settings after construction...
 
  But: the "do we disallow changing config settings after construction?"
  question is really orthogonal to the "what syntax do we use for
  construction?" question (builder vs config vs zillions-of-ctors).
 
  Mike
 



-
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org



Re: Lucene 2.9 and deprecated IR.open() methods

2009-10-03 Thread Michael McCandless
Right: 3.0 should be a fast turnaround w/ no further deprecations.
(And at your rate of progress Uwe it looks like it really *will* be
fast!).

For 3.1 we can salivate...

Mike




Re: Lucene 2.9 and deprecated IR.open() methods

2009-10-03 Thread Michael Busch
I agree, we announced the 2.9/3.0 release plans a long time ago and
shouldn't change anything. But ideally I'd like to announce any
backwards-compatibility changes together with the 3.0 release, while
mentioning that the changes will take effect from 3.1 on. That's why I'd
like to get to a conclusion soon.


 Michael


RE: Lucene 2.9 and deprecated IR.open() methods

2009-10-03 Thread Uwe Schindler
Now it gets slower. After applying LUCENE-1944, you get 600 errors when
compiling the tests :(

We should have checked in 2.9 that our tests only call deprecated methods
when they test BW compatibility. Now I have to change tons of IR.open() and
IW() calls in the backwards branch and also in the trunk tests. But the
patch is currently the same for both branches - phew.

Completely unhappy :-(

Uwe

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de



Re: Lucene 2.9 and deprecated IR.open() methods

2009-10-03 Thread Michael McCandless
On Sat, Oct 3, 2009 at 6:25 AM, Uwe Schindler u...@thetaphi.de wrote:
 Now it gets slower. After applying LUCENE-1944, you get 600 errors when
 compiling tests :(

 We should have checked our tests in 2.9 that they only call deprecated
 methods for BW compatibility.

Sigh.  Yes, going forward we should probably always fix tests to not
use deprecated APIs anymore, at the same time that we deprecate.

 Now I have to change tons of IR.open(), IW()
 calls in backwards branch and also in trunk tests. But the patch is
 currently the same for both branches - puh.

Maybe we should re-cut the back-compat branch after removal of all
deprecated APIs?

 Completely unhappy :-(

Sorry :(  Take a deep breath.  Go consume some coffee or dark
chocolate (or, maybe, a beer!) :)

Mike




RE: Lucene 2.9 and deprecated IR.open() methods

2009-10-03 Thread Uwe Schindler
 From: Michael McCandless [mailto:luc...@mikemccandless.com]
 Sent: Saturday, October 03, 2009 12:29 PM
 
 Maybe we should re-cut the back-compat branch after removal of all
 deprecated APIs?


No, better to apply the patch to both branches. I changed generics in the
TokenStream API and want to be sure that the non-generified BW branch still
works. Right now it would not even compile, because the BW branch is forced
to Java 1.4. The simplest way is really to create a patch and apply it to
both branches, or to merge it using SVN. That's my smallest problem.




RE: Lucene 2.9 and deprecated IR.open() methods

2009-10-03 Thread Uwe Schindler
I have a plan for how to do the tests:

I will use my BW branch checkout and enable deprecation warnings there. Then
I will fix all deprecated usage and remove all code that is only there to
test BW compatibility (e.g. TestTokenStreamBWCompatibiliy). After that, the
tests should compile without deprecation warnings. When this is done, I will
commit and create a new tag.

After that, I will apply the patch to trunk as well - the tests should
compile :-)

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de





Re: Lucene 2.9 and deprecated IR.open() methods

2009-10-03 Thread Grant Ingersoll


On Oct 2, 2009, at 7:33 PM, Michael McCandless wrote:


Sigh.  The introduction of new but deprecated methods is silly.  Is
there some simple automated way to catch/prevent these?

The proliferation of ctors/factory methods is a nightmare.


Ah, so yet again, we are trying to work around a problem that is due
to the ridiculousness of how we manage releases and deprecations and
not necessarily something that is technically wrong.  It's not like
this is news.  I've been complaining about the # of ctors for a long
time (try training people on this stuff and you'll know what I mean).

I'm not trying to be antagonistic, but let's all face facts: we do
releases so few and far between that I just don't see it as being some
massive hardship to remove some deprecations more often than every
major release.  It's funny, we add things in an agile way and everyone
loves that, but we remove them in such a drawn-out and monolithic
manner that it is mind-boggling.  We induce way more confusion than we
prevent.  Any sane programmer out there has to do more than just drop
in any release, no matter what (in other words, the whole "drop-in
back compat" thing is a myth, so get over it), and as soon as they
start looking at the myriad of options, they are going to be confused.
Far better for us just to remove an inferior method, with some smaller
amount of warning, than to leave them guessing.  Not only that, but as
is evidenced by the new Token stuff, using deprecated and new stuff
together may be even worse than just getting rid of the old stuff.


Simply put, I propose we adopt a model we've all discussed many times
before where we mark deprecated items with the version they will be
removed in, regardless of minor/major number, with the caveat that it
must be at least one more minor version (i.e. announce deprecation in
2.4.0, remove in 2.5.0).  Major versions then are about what we all
expect out of major versions from every other software package in the
land:  major new features or a near complete overhaul of existing
functionality.  With this model, we won't have massive amounts of
deprecation piling up, our users are still given plenty of warning
whereby they can _plan_ for it, and we have more flexibility in how we
develop.
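
As a rough illustration of the proposal above, a deprecation marker that
names its removal version could look like the sketch below. The class
LegacyOpener and its methods are hypothetical and not Lucene API; the
version numbers are the 2.4.0/2.5.0 example from the paragraph above.

```java
public class LegacyOpener {

    /**
     * Opens a resource by path, defaulting to read-only mode.
     *
     * @deprecated since 2.4.0, will be removed in 2.5.0;
     *             use {@link #open(String, boolean)} instead.
     */
    @Deprecated
    public static String open(String path) {
        // The old entry point simply delegates to its replacement.
        return open(path, true);
    }

    /** Replacement method: the caller states the mode explicitly. */
    public static String open(String path, boolean readOnly) {
        return path + (readOnly ? " (read-only)" : " (read-write)");
    }

    public static void main(String[] args) {
        System.out.println(open("index", false));
    }
}
```

The removal version lives only in the Javadoc text here, so nothing enforces
it; it is a convention that a release checklist would have to follow up on.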


-Grant




RE: Lucene 2.9 and deprecated IR.open() methods

2009-10-03 Thread Uwe Schindler
This seems to work. I have created some scripts that do the compilation and
create a deprecation report, and I am starting to fix things in the BW
branch. The easiest first step is to just remove a lot of tests that only
test the BW compatibility API.

I will post something as soon as I have removed most of the deprecation
warnings.

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de





RE: Lucene 2.9 and deprecated IR.open() methods

2009-10-03 Thread Uwe Schindler
Don't be surprised: I will now commit lots of test fixes for IR.open() in
the backwards branch and then merge them to trunk!

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de





Re: Lucene 2.9 and deprecated IR.open() methods

2009-10-03 Thread Michael Busch

On 10/3/09 4:18 AM, Earwin Burrfoot wrote:

Builder pattern allows you to switch concrete implementations as you
please, taking parameters into account or not.
Besides that there's no real difference. I prefer builder, but that's just me :)

   


Why can't you do that with a factory that takes a config object as a
parameter? Seems very similar to me... the only difference is syntax,
isn't it?
And whether you have plain setter methods on the config object or methods
that return this so that you can chain them is just personal preference,
right? Personally I prefer the setter methods for our usecase, simply
because there are so many config options. Maybe you don't want to set
them all in the same place in your app code? E.g. in our app we have a
method like applyIWConfig(IndexWriter) that, as the name says, applies
all settings we have in a customizable config file. However, some IW
settings are not customizable, and are applied somewhere else in our code.
I think with the chaining pattern this would look less intuitive than
with good old setter methods. You'd have to change
applyIWConfig(IndexWriter.Builder) to return IW.Builder and do the
chaining both in the method and in the caller.


But, like Mark said, maybe this is just my personal preference and for 
others not compelling arguments. Or maybe I'm missing some other 
advantage of the builder pattern? I haven't used/implemented it myself 
very much yet...


 Michael
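
A minimal sketch of the two styles discussed above, assuming a mutable settings object whose setters return this. All class, method, and property names here (WriterSettings, AppConfig, applyCustomizable, the property keys) are hypothetical and not Lucene API. Under that assumption, a helper in the spirit of applyIWConfig can stay void, while other code is still free to chain calls:

```java
import java.util.Properties;

// Hypothetical stand-in for a writer builder/config; not part of Lucene.
final class WriterSettings {
    private double ramBufferSizeMB = 16.0;   // illustrative default only
    private boolean useCompoundFile = true;  // illustrative default only
    private long writeLockTimeoutMs = 1000L; // illustrative default only

    // Each setter mutates this object and returns it, so chaining is optional.
    WriterSettings setRamBufferSizeMB(double mb) { ramBufferSizeMB = mb; return this; }
    WriterSettings setUseCompoundFile(boolean v) { useCompoundFile = v; return this; }
    WriterSettings setWriteLockTimeoutMs(long ms) { writeLockTimeoutMs = ms; return this; }

    double getRamBufferSizeMB() { return ramBufferSizeMB; }
    boolean getUseCompoundFile() { return useCompoundFile; }
    long getWriteLockTimeoutMs() { return writeLockTimeoutMs; }
}

final class AppConfig {
    // Applies only the user-customizable settings; the helper can stay void
    // because it mutates the settings object it is handed.
    static void applyCustomizable(WriterSettings s, Properties p) {
        String ram = p.getProperty("ramBufferSizeMB");
        if (ram != null) s.setRamBufferSizeMB(Double.parseDouble(ram));
        String cfs = p.getProperty("useCompoundFile");
        if (cfs != null) s.setUseCompoundFile(Boolean.parseBoolean(cfs));
    }
}

final class SettingsDemo {
    public static void main(String[] args) {
        Properties p = new Properties();
        p.setProperty("ramBufferSizeMB", "128");

        // Non-customizable setting applied in one place, chained style.
        WriterSettings s = new WriterSettings().setWriteLockTimeoutMs(0L);
        // Customizable settings applied somewhere else, plain call.
        AppConfig.applyCustomizable(s, p);

        System.out.println(s.getRamBufferSizeMB() + " MB, timeout " + s.getWriteLockTimeoutMs());
    }
}
```

If the builder were immutable instead (each call returning a new object), the helper would indeed have to return the builder, which is the situation described in the message.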


Thats just me though.

Michael McCandless wrote:
 

OK, I agree, using the builder approach looks compelling!

Though what about required settings?  EG IW's builder must have
Directory, Analyzer.  Would we pass these as up-front args to the
initial builder?

And shouldn't we still specify the version up-front so we can improve
defaults over time without breaking back-compat?  (Else, how can
we change defaults?)

EG:

   IndexWriter.builder(Version.LUCENE_29, dir, analyzer)
     .setRAMBufferSizeMB(128)
     .setUseCompoundFile(false)
     ...
     .create()

?

Mike

On Fri, Oct 2, 2009 at 7:45 PM, Earwin Burrfoot ear...@gmail.com wrote:

On Sat, Oct 3, 2009 at 03:29, Uwe Schindler u...@thetaphi.de wrote:

 

It is also probably a good idea to move various settings methods from
IW to that builder and have IW immutable in regards to configuration.
I'm speaking of the likes of setWriteLockTimeout, setRAMBufferSizeMB,
setMergePolicy, setMergeScheduler, setSimilarity.

IndexWriter.Builder iwb = IndexWriter.builder().
   writeLockTimeout(0).
   RAMBufferSize(config.indexationBufferMB).
   maxBufferedDocs(...).
   similarity(...).
   analyzer(...);

... = iwb.build(dir1);
... = iwb.build(dir2);

 

A happy user of google-collections API :-) These builders are really cool!

   

I feel myself caught in the act.

There is still a couple of things bothering me.
1. Introducing a builder, we'll have a whole heap of deprecated
constructors that will hang there for eternity. And then users will
scream in frustration - This class has 14(!) constructors and all of
them are deprecated! How on earth am I supposed to create this thing?
2. If someone creates IW with some reflectish javabeanish tools - he's
busted. Not that I'm feeling compassionate for such a person.


 

I like Earwin's version more. A builder is very flexible, because you can
concat all your properties (like StringBuilder works with its append method
returning itself) and create the instance at the end.

   

Besides (arguably) cleaner syntax, the lack of which is (arguably) a
curse of many Java libraries,
it also allows us to return a different concrete implementation of IW
without breaking back-compat,
and also to choose this concrete implementation based on settings
provided. If we feel like doing it at some point.




Re: Lucene 2.9 and deprecated IR.open() methods

2009-10-03 Thread Mark Miller
I think my preference is swayed by convention/simplicity. The way things
are done now is just very intuitive for me. When I sit down to write
some code with Lucene, I barely have to think or remember much. It all
sticks. It's mostly all basic Java with few patterns.

Now Google has used some cool patterns to make things like Guice pretty
sweet. And because of what Guice does, they are pretty necessary I
think. But every time I go back to work on that code, I have to relearn
a bunch of stuff/conventions. It's not difficult - but it's a small brain
annoyance.

One of the reasons I fell in love with Lucene is that it's just so
natural and easy to use and yet still so powerful. Not that I'm claiming
the deprecated methods aren't a bit of a pain - but they have never
caused me problems.

Not a fan of static builder methods either. But hey, sometimes they make
sense, so whatever I guess ...

Michael Busch wrote:
 On 10/3/09 4:18 AM, Earwin Burrfoot wrote:
 Builder pattern allows you to switch concrete implementations as you
 please, taking parameters into account or not.
 Besides that there's no real difference. I prefer builder, but that's
 just me :)



 Why can't you do that with a factory that takes a config object as
 parameter? Seems very similar to me... the only difference is syntax,
 isn't it?
 And if you have setter methods on the config object or methods that
 return this that you can concatenate is just personal preference,
 right? Personally I prefer the setter methods for our usecase, simply
 because there are so many config options. Maybe you don't want to set
 them all in the same places in your app code? E.g. in our app we have
 a method like applyIWConfig(IndexWriter) that, as the name says,
 applies all settings we have in a customizable config file. However,
 some IW settings are not customizable, and applied somewhere else in
 our code. I think with the concatenation pattern this would look less
 intuitive than with good old setter methods. You'd have to change
 applyIWConfig(IndexWriter.Builder) to return IW.Builder and do the
 concatenation both in the method and in the caller.

 But, like Mark said, maybe this is just my personal preference and for
 others not compelling arguments. Or maybe I'm missing some other
 advantage of the builder pattern? I haven't used/implemented it myself
 very much yet...

  Michael


Re: Lucene 2.9 and deprecated IR.open() methods

2009-10-03 Thread Ted Dunning
The builder pattern and the config argument to a factory both have the
advantage that you can limit changes after creating an object.  Some things
are just bad to change in mid-stream.  The config argument is nice in that
you can pass it around to different stakeholders, but the builder can be
used a bit like that as well.

One way to look at it is that a builder is just a config object that happens
to have the create method.
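
A small sketch of that view, with hypothetical names (Engine, EngineConfig) that are not Lucene API: the config object collects settings, its create method copies them into final fields, and later changes to the config no longer reach the object that was already created.

```java
// Hypothetical names throughout; this is not Lucene's API.
final class Engine {
    private final int bufferMB;      // fixed at creation time
    private final boolean compound;  // fixed at creation time

    Engine(int bufferMB, boolean compound) {
        this.bufferMB = bufferMB;
        this.compound = compound;
    }

    int bufferMB() { return bufferMB; }
    boolean compound() { return compound; }
}

final class EngineConfig {
    private int bufferMB = 16;       // illustrative default only
    private boolean compound = true; // illustrative default only

    EngineConfig bufferMB(int mb) { this.bufferMB = mb; return this; }
    EngineConfig compound(boolean c) { this.compound = c; return this; }

    // The "create method" that turns a plain config object into a builder:
    // it snapshots the current values into an immutable Engine.
    Engine create() { return new Engine(bufferMB, compound); }
}

final class EngineDemo {
    public static void main(String[] args) {
        EngineConfig cfg = new EngineConfig().bufferMB(64);
        Engine first = cfg.create();
        cfg.bufferMB(8);             // does not affect 'first'
        Engine second = cfg.create();
        System.out.println(first.bufferMB() + " vs " + second.bufferMB());
    }
}
```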

On Sat, Oct 3, 2009 at 5:09 PM, Michael Busch busch...@gmail.com wrote:

 But, like Mark said, maybe this is just my personal preference and for
 others not compelling arguments. Or maybe I'm missing some other advantage
 of the builder pattern? I haven't used/implemented it myself very much
 yet...




-- 
Ted Dunning, CTO
DeepDyve


Re: Lucene 2.9 and deprecated IR.open() methods

2009-10-03 Thread Mark Miller
Ted Dunning wrote:

 The builder pattern and the config argument to a factory both have the
 advantage that you can limit changes after creating an object.  Some
 things are just bad to change in mid-stream.  The config argument is
 nice in that you can pass it around to different stake holders, but
 the builder can be used a bit like that as well.
Yeah that argument has been made. But I've *never* seen it as an issue.
Just seems like a solution looking for a problem. I can see how it's
cleaner, not missing that point. But still doesn't make me like it much.


 One way to look at it is that a builder is just a config object that
 happens to have the create method.

 On Sat, Oct 3, 2009 at 5:09 PM, Michael Busch busch...@gmail.com wrote:

 But, like Mark said, maybe this is just my personal preference and
 for others not compelling arguments. Or maybe I'm missing some
 other advantage of the builder pattern? I haven't used/implemented
 it myself very much yet...




 -- 
 Ted Dunning, CTO
 DeepDyve



-- 
- Mark

http://www.lucidimagination.com







Re: Lucene 2.9 and deprecated IR.open() methods

2009-10-02 Thread Michael Busch
I was thinking lately about the large quantity of IndexWriter 
constructors and IndexReader open methods. I'm not sure if this has been 
proposed before, but what if we introduced new objects, e.g. 
IndexWriterConfig and IndexReaderConfig. They would contain 
getter/setter methods for all the different parameters the various 
constructors and open methods currently have. Then there would only be 
one IW constructor taking an IndexWriterConfig object as parameter and 
one open method in IR likewise. Then going forward we won't have to 
add/deprecate more ctors or open methods, we can then easily extend or 
deprecate getters/setters in the *Config classes.


 Michael
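
To make the proposal concrete, here is a rough sketch with invented names (SketchWriterConfig, SketchWriter) and invented options; it only illustrates the shape, not the real IndexWriter parameters. The writer keeps a single constructor, and any new option would be added as a getter/setter pair on the config class:

```java
// Hypothetical sketch; class and option names are assumptions, not Lucene API.
final class SketchWriterConfig {
    private String directory;            // required, no default
    private boolean create = false;      // illustrative default only
    private int maxFieldLength = 10000;  // illustrative default only

    String getDirectory() { return directory; }
    void setDirectory(String directory) { this.directory = directory; }

    boolean isCreate() { return create; }
    void setCreate(boolean create) { this.create = create; }

    int getMaxFieldLength() { return maxFieldLength; }
    void setMaxFieldLength(int maxFieldLength) { this.maxFieldLength = maxFieldLength; }
}

final class SketchWriter {
    private final String directory;
    private final boolean create;
    private final int maxFieldLength;

    // The only constructor: new options go into the config class, not here.
    SketchWriter(SketchWriterConfig config) {
        if (config.getDirectory() == null) {
            throw new IllegalArgumentException("directory is required");
        }
        this.directory = config.getDirectory();
        this.create = config.isCreate();
        this.maxFieldLength = config.getMaxFieldLength();
    }

    String describe() {
        return directory + " create=" + create + " maxFieldLength=" + maxFieldLength;
    }
}

final class SketchWriterDemo {
    public static void main(String[] args) {
        SketchWriterConfig config = new SketchWriterConfig();
        config.setDirectory("/tmp/index");
        config.setCreate(true);
        System.out.println(new SketchWriter(config).describe());
    }
}
```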

On 10/3/09 12:41 AM, Uwe Schindler wrote:

When looking for press articles about the release of Lucene 2.9, I found the
following one from Bernd Fondermann
@ http://it-republik.de/jaxenter/artikel/Apache-Lucene-2.9-2594.html

Translation with Google Translate:

Deprecated

An index reader is created via the static open() factory method, of which
there were nine in all in 2.4. Five of them are now deprecated. In 2.9 there
are now a total of 14 overloaded open() variants, eight of which are
deprecated. This means that some additions were even marked as deprecated at
the moment they were introduced - confusing.

The constructor deprecation orgy continues with StandardAnalyzer, one of the
key classes for indexing and querying. This class no longer has a no-argument
constructor, which may trip up downstream libraries that instantiate their
analyzer dynamically from a property containing the class name. Instead, a
Version object must be passed to select compatibility with 2.4 or 2.9. But
both the VERSION_24 and the VERSION_29 parameters are themselves deprecated -
very confusing! VERSION_CURRENT is the only safe investment in the future, a
value which we would certainly also have trusted as the default in a
zero-argument constructor.

To write an index we need an index writer instance. Again, the majority of
the 19 possible constructors are about to be retired.


What was going wrong with the open() hell in IR? Very strange, I should have
looked better.

By the way: How to proceed with deprecation removal? Case-by-case (e.g.
start with TS API, then these open() calls, then FSDirectory - to list the
ones I was involved) or some hyper-patch?

By the way, here is my talk @ Hadoop GetTogether in Berlin:

http://blog.isabel-drost.de/index.php/archives/category/events/apache-hadoop-get-together-berlin

Uwe

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de






Re: Lucene 2.9 and deprecated IR.open() methods

2009-10-02 Thread Earwin Burrfoot
It is also probably a good idea to move various settings methods from
IW to that builder and have IW immutable in regards to configuration.
I'm speaking of the likes of setWriteLockTimeout, setRAMBufferSizeMB,
setMergePolicy, setMergeScheduler, setSimilarity.

IndexWriter.Builder iwb = IndexWriter.builder().
  writeLockTimeout(0).
  RAMBufferSize(config.indexationBufferMB).
  maxBufferedDocs(...).
  similarity(...).
  analyzer(...);

... = iwb.build(dir1);
... = iwb.build(dir2);
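
For readers who want something compilable, the snippet above could be fleshed out roughly as follows. This is only a sketch under assumptions: DemoWriter and its Builder are invented names, the directory is reduced to a String, and only two settings are modeled. It shows the two properties described here: the built object has no setters, and one builder can be reused for several directories.

```java
// Hypothetical sketch of a reusable builder; not Lucene's IndexWriter.
final class DemoWriter {
    private final String dir;
    private final long writeLockTimeout;
    private final double ramBufferMB;

    // Private: instances are only obtainable through the builder.
    private DemoWriter(String dir, Builder b) {
        this.dir = dir;
        this.writeLockTimeout = b.writeLockTimeout;
        this.ramBufferMB = b.ramBufferMB;
    }

    static Builder builder() { return new Builder(); }

    String dir() { return dir; }
    long writeLockTimeout() { return writeLockTimeout; }
    double ramBufferMB() { return ramBufferMB; }

    static final class Builder {
        private long writeLockTimeout = 1000L; // illustrative default only
        private double ramBufferMB = 16.0;     // illustrative default only

        Builder writeLockTimeout(long ms) { this.writeLockTimeout = ms; return this; }
        Builder ramBufferSize(double mb) { this.ramBufferMB = mb; return this; }

        // Each call produces an independent, immutable writer.
        DemoWriter build(String dir) { return new DemoWriter(dir, this); }
    }
}

final class DemoWriterMain {
    public static void main(String[] args) {
        DemoWriter.Builder iwb = DemoWriter.builder()
            .writeLockTimeout(0L)
            .ramBufferSize(48.0);

        DemoWriter w1 = iwb.build("dir1");
        DemoWriter w2 = iwb.build("dir2");
        System.out.println(w1.dir() + " and " + w2.dir() + " share " + w1.ramBufferMB() + " MB");
    }
}
```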

On Sat, Oct 3, 2009 at 02:54, Michael Busch busch...@gmail.com wrote:
 I was thinking lately about the large quantity of IndexWriter constructors
 and IndexReader open methods. I'm not sure if this has been proposed before,
 but what if we introduced new objects, e.g. IndexWriterConfig and
 IndexReaderConfig. They would contain getter/setter methods for all the
 different parameters the various constructors and open methods currently
 have. Then there would only be one IW constructor taking an
 IndexWriterConfig object as parameter and one open method in IR likewise.
 Then going forward we won't have to add/deprecate more ctors or open
 methods, we can then easily extend or deprecate getters/setters in the
 *Config classes.

  Michael






-- 
Kirill Zakharenko/Кирилл Захаренко (ear...@gmail.com)
Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
ICQ: 104465785




RE: Lucene 2.9 and deprecated IR.open() methods

2009-10-02 Thread Uwe Schindler
 It is also probably a good idea to move various settings methods from
 IW to that builder and have IW immutable in regards to configuration.
 I'm speaking of the likes of setWriteLockTimeout, setRAMBufferSizeMB,
 setMergePolicy, setMergeScheduler, setSimilarity.
 
 IndexWriter.Builder iwb = IndexWriter.builder().
   writeLockTimeout(0).
   RAMBufferSize(config.indexationBufferMB).
   maxBufferedDocs(...).
   similarity(...).
   analyzer(...);
 
 ... = iwb.build(dir1);
 ... = iwb.build(dir2);

A happy user of google-collections API :-) These builders are really cool!

Uwe






Re: Lucene 2.9 and deprecated IR.open() methods

2009-10-02 Thread Michael McCandless
Sigh.  The introduction of new but deprecated methods is silly.  Is
there some simple automated way to catch/prevent these?

The proliferation of ctors/factory methods is a nightmare.

Part of the story with IndexReader.open is the switch to readOnly
IndexReaders.  After the long back-compat discussion we settled on
adding new ctors as the best way to make the change.

On deprecation of Version.LUCENE_29, that doesn't seem right.  In fact
I don't think LUCENE_24 should be deprecated, either, since these
constants are used by StandardAnalyzer to state compatibility that's
equivalent to index format compatibility (from our last discussion).

I think deprecation by separate area makes sense?

Mike

On Fri, Oct 2, 2009 at 6:41 PM, Uwe Schindler u...@thetaphi.de wrote:
 When looking for press articles about the release of Lucene 2.9, I found the
 following one from Bernd Fondermann
 @ http://it-republik.de/jaxenter/artikel/Apache-Lucene-2.9-2594.html

 Translation with Google Translate:
 
 Deprecated

 An index reader is created via the static open () factory method, of which
 there were 2.4 in all nine. Five of them are now deprecated. In 2.9 there
 are now a total of 14 open-overloaded variants, with eight of them but they
 are deprecated. This means that there are even some additions that have been
 directly identified with introduction as deprecated - confusing.

 The constructor-Deprecation orgy goes for the standard Analyzer, one of the
 key classes during indexing and querying further. This class has now no-less
 constructor arguments over what might, perhaps, some downstream libraries
 bring to stumble to instantiate their analyzer on a property, which contains
 the class name dynamically. Instead, an object version must be given to set
 for compatibility with 2.4 or 2.9. Both the VERSION_24 as well as the
 VERSION_29 parameters are deprecated but themselves - very confusing!
 VERSION_CURRENT is the only safe investment in the future, a value which we
 certainly also as assignment in a zero-argument constructor would have
 trusted.

 To write an index we need an index writer instance. Again, the majority of
 the 19 possible constructors are about to be put to retire to.
 

 What was going wrong with the open() hell in IR? Very strange, I should have
 looked better.

 By the way: How to proceed with deprecation removal? Case-by-case (e.g.
 start with TS API, then these open() calls, then FSDirectory - to list the
 ones I was involved) or some hyper-patch?

 By the way, here is my talk @ Hadoop GetTogether in Berlin:

 http://blog.isabel-drost.de/index.php/archives/category/events/apache-hadoop-get-together-berlin

 Uwe

 -
 Uwe Schindler
 H.-H.-Meier-Allee 63, D-28213 Bremen
 http://www.thetaphi.de
 eMail: u...@thetaphi.de






Re: Lucene 2.9 and deprecated IR.open() methods

2009-10-02 Thread Michael McCandless
I think this would make sense... though, it'd be a shame if the
simple case becomes overbearing.  Maybe we can keep good defaults,
but use Version to allow us to change them.  So eg:

  new IndexWriter(new IndexWriter.Config(dir, analyzer, Version.LUCENE_29));

would be the simple case.

Mike
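
One way the "good defaults, changeable via Version" idea could look, as a sketch only: CompatVersion and VersionedConfig are hypothetical names, and the default values are made up purely to show that a newer version constant may select a different default without affecting callers that pass the older constant.

```java
// Hypothetical sketch: the enum, class, and default values are invented for illustration.
enum CompatVersion { V_2_4, V_2_9 }

final class VersionedConfig {
    private final CompatVersion matchVersion;
    private double ramBufferMB;

    VersionedConfig(CompatVersion matchVersion) {
        this.matchVersion = matchVersion;
        // Made-up numbers: the point is only that a newer version may pick a new default.
        this.ramBufferMB = (matchVersion == CompatVersion.V_2_4) ? 16.0 : 32.0;
    }

    // Explicit settings always win over the version-derived default.
    VersionedConfig setRamBufferMB(double mb) { this.ramBufferMB = mb; return this; }

    double getRamBufferMB() { return ramBufferMB; }
    CompatVersion getMatchVersion() { return matchVersion; }
}

final class VersionedConfigDemo {
    public static void main(String[] args) {
        VersionedConfig oldDefaults = new VersionedConfig(CompatVersion.V_2_4);
        VersionedConfig newDefaults = new VersionedConfig(CompatVersion.V_2_9);
        System.out.println(oldDefaults.getRamBufferMB() + " vs " + newDefaults.getRamBufferMB());
    }
}
```

Existing code that keeps passing the older constant keeps its old behavior, which is the back-compat property asked for above.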

On Fri, Oct 2, 2009 at 6:54 PM, Michael Busch busch...@gmail.com wrote:
 I was thinking lately about the large quantity of IndexWriter constructors
 and IndexReader open methods. I'm not sure if this has been proposed before,
 but what if we introduced new objects, e.g. IndexWriterConfig and
 IndexReaderConfig. They would contain getter/setter methods for all the
 different parameters the various constructors and open methods currently
 have. Then there would only be one IW constructor taking an
 IndexWriterConfig object as parameter and one open method in IR likewise.
 Then going forward we won't have to add/deprecate more ctors or open
 methods, we can then easily extend or deprecate getters/setters in the
 *Config classes.

  Michael




RE: Lucene 2.9 and deprecated IR.open() methods

2009-10-02 Thread Uwe Schindler
I like Earwin's version more. A builder is very flexible, because you can
concat all your properties (like StringBuilder works with its append method
returning itself) and create the instance at the end.

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de

 -Original Message-
 From: Michael McCandless [mailto:luc...@mikemccandless.com]
 Sent: Saturday, October 03, 2009 1:37 AM
 To: java-dev@lucene.apache.org
 Subject: Re: Lucene 2.9 and deprecated IR.open() methods
 
 I think this would make sense... though, it'd be a shame if the
 simple case becomes overbearing.  Maybe we can keep good defaults,
 but use Version to allow us to change them.  So eg:
 
   new IndexWriter(new IndexWriter.Config(dir, analyzer,
 Version.LUCENE_29));
 
 would be the simple case.
 
 Mike
 



RE: Lucene 2.9 and deprecated IR.open() methods

2009-10-02 Thread Uwe Schindler
 I like Earwin's version more. A builder is very flexible, because you can
 concat all your properties (like StringBuilder works with its append method
 returning itself) and create the instance at the end.

This is a really cool example of this builder pattern:

http://google-collections.googlecode.com/svn/trunk/javadoc/com/google/common/collect/MapMaker.html


  -Original Message-
  From: Michael McCandless [mailto:luc...@mikemccandless.com]
  Sent: Saturday, October 03, 2009 1:37 AM
  To: java-dev@lucene.apache.org
  Subject: Re: Lucene 2.9 and deprecated IR.open() methods
 
  I think this would make sense... though, it'd be a shame if the
  simple case becomes overbearing.  Maybe we can keep good defaults,
  but use Version to allow us to change them.  So eg:
 
new IndexWriter(new IndexWriter.Config(dir, analyzer,
  Version.LUCENE_29));
 
  would be the simple case.
 
  Mike
 
  On Fri, Oct 2, 2009 at 6:54 PM, Michael Busch busch...@gmail.com
 wrote:
   I was thinking lately about the large quantity of IndexWriter
  constructors
   and IndexReader open methods. I'm not sure if this has been proposed
  before,
   but what if we introduced new objects, e.g. IndexWriterConfig and
   IndexReaderConfig. They would contain getter/setter methods for all
 the
   different parameters the various constructors and open methods
 currently
   have. Then there would only be one IW constructor taking an
   IndexWriterConfig object as parameter and one open method in IR
  likewise.
   Then going forward we won't have to add/deprecate more ctors or open
   methods, we can then easily extend or deprecate getters/setters in the
   *Config classes.
  
    Michael
  
   On 10/3/09 12:41 AM, Uwe Schindler wrote:
  
   When looking for press articles about the release of Lucene 2.9, I found
   the following one from Bernd Fondermann
   @ http://it-republik.de/jaxenter/artikel/Apache-Lucene-2.9-2594.html

   Translation with Google Translate:

   ------------------------------------------------------------
   Deprecated

   An index reader is created via the static open () factory method, of which
   there were 2.4 in all nine. Five of them are now deprecated. In 2.9 there
   are now a total of 14 open-overloaded variants, with eight of them but they
   are deprecated. This means that there are even some additions that have been
   directly identified with introduction as deprecated - confusing.

   The constructor-Deprecation orgy goes for the standard Analyzer, one of the
   key classes during indexing and querying further. This class has now no-less
   constructor arguments over what might, perhaps, some downstream libraries
   bring to stumble to instantiate their analyzer on a property, which contains
   the class name dynamically. Instead, an object version must be given to set
   for compatibility with 2.4 or 2.9. Both the VERSION_24 as well as the
   VERSION_29 parameters are deprecated but themselves - very confusing!
   VERSION_CURRENT is the only safe investment in the future, a value which we
   certainly also as assignment in a zero-argument constructor would have
   trusted.

   To write an index we need an index writer instance. Again, the majority of
   the 19 possible constructors are about to be put to retire to.
   ------------------------------------------------------------

   What was going wrong with the open() hell in IR? Very strange, I should have
   looked better.

   By the way: How to proceed with deprecation removal? Case-by-case (e.g.
   start with TS API, then these open() calls, then FSDirectory - to list the
   ones I was involved) or some hyper-patch?

   By the way, here is my talk @ Hadoop GetTogether in Berlin:

   http://blog.isabel-drost.de/index.php/archives/category/events/apache-hadoop-get-together-berlin
  
   Uwe
  
   -
   Uwe Schindler
   H.-H.-Meier-Allee 63, D-28213 Bremen
   http://www.thetaphi.de
   eMail: u...@thetaphi.de
  
  
  

Re: Lucene 2.9 and deprecated IR.open() methods

2009-10-02 Thread Earwin Burrfoot
On Sat, Oct 3, 2009 at 03:29, Uwe Schindler u...@thetaphi.de wrote:
 It is also probably a good idea to move various settings methods from
 IW to that builder and have IW immutable in regards to configuration.
 I'm speaking of the likes of setWriteLockTimeout, setRAMBufferSizeMB,
 setMergePolicy, setMergeScheduler, setSimilarity.

 IndexWriter.Builder iwb = IndexWriter.builder().
   writeLockTimeout(0).
   RAMBufferSize(config.indexationBufferMB).
   maxBufferedDocs(...).
   similarity(...).
   analyzer(...);

 ... = iwb.build(dir1);
 ... = iwb.build(dir2);

 A happy user of google-collections API :-) These builders are really cool!

I feel myself caught in the act.

There are still a couple of things bothering me.
1. Introducing a builder, we'll have a whole heap of deprecated
constructors that will hang there for eternity. And then users will
scream in frustration - "This class has 14(!) constructors and all of
them are deprecated! How on earth am I supposed to create this thing?"
2. If someone creates IW with some reflectish javabeanish tools - he's
busted. Not that I'm feeling compassionate for such a person.

 I like Earwin's version more. A builder is very flexible, because you can
 concat all your properties (like StringBuilder works with its append method
 returning itself) and create the instance at the end.
Besides (arguably) cleaner syntax, the lack of which is (arguably) a
curse of many Java libraries,
it also allows us to return a different concrete implementation of IW
without breaking back-compat,
and also to choose this concrete implementation based on settings
provided. If we feel like doing it at some point.
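
For illustration, a minimal sketch of the two properties argued for above: one configured builder producing instances for several directories, and the builder picking the concrete implementation from the settings. SketchWriter, its two implementations, the default value, and the String standing in for a Directory are assumptions for the example, not Lucene API.

```java
import java.util.Objects;

// Hypothetical stand-in for an index writer; not the real Lucene class.
interface SketchWriter {
    String directory();
    int ramBufferMB();
}

// Two concrete implementations the builder can choose between.
final class BufferedSketchWriter implements SketchWriter {
    private final String directory;
    private final int ramBufferMB;

    BufferedSketchWriter(String directory, int ramBufferMB) {
        this.directory = directory;
        this.ramBufferMB = ramBufferMB;
    }

    public String directory() { return directory; }
    public int ramBufferMB() { return ramBufferMB; }
}

final class UnbufferedSketchWriter implements SketchWriter {
    private final String directory;

    UnbufferedSketchWriter(String directory) { this.directory = directory; }

    public String directory() { return directory; }
    public int ramBufferMB() { return 0; }
}

final class SketchWriterBuilder {
    private int ramBufferMB = 16; // assumed default, for illustration only

    SketchWriterBuilder ramBufferMB(int mb) {
        if (mb < 0) throw new IllegalArgumentException("ramBufferMB must be >= 0");
        this.ramBufferMB = mb;
        return this; // returning 'this' is what makes the calls chainable
    }

    // The directory is supplied last, so one configured builder can
    // produce writers for several directories.
    SketchWriter build(String directory) {
        Objects.requireNonNull(directory, "directory");
        // Callers only see the SketchWriter interface, so the concrete
        // class can depend on the settings without breaking them.
        return ramBufferMB == 0
            ? new UnbufferedSketchWriter(directory)
            : new BufferedSketchWriter(directory, ramBufferMB);
    }
}
```

Because the writers only have final fields, nothing about their configuration can change after build() returns.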

-- 
Kirill Zakharenko/Кирилл Захаренко (ear...@gmail.com)
Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
ICQ: 104465785




RE: Lucene 2.9 and deprecated IR.open() methods

2009-10-02 Thread Uwe Schindler
I have already started removing deprecations in o.a.l.store and making FSDir
abstract. This package is finished; now I have to remove all the
open() methods and ctors that use getDirectory().

Will post a patch tomorrow! Good night!

Uwe

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de

 -Original Message-
 From: Michael McCandless [mailto:luc...@mikemccandless.com]
 Sent: Saturday, October 03, 2009 1:33 AM
 To: java-dev@lucene.apache.org
 Subject: Re: Lucene 2.9 and deprecated IR.open() methods
 
 Sigh.  The introduction of new but deprecated methods is silly.  Is
 there some simple automated way to catch/prevent these?
 
 The proliferation of ctors/factory methods is a nightmare.
 
 Part of the story with IndexReader.open is the switch to readOnly
 IndexReaders.  After the long back-compat discussion we settled on
 adding new ctors as the best way to make the change.
 
 On deprecation of Version.LUCENE_29, that doesn't seem right.  In fact
 I don't think LUCENE_24 should be deprecated, either, since these
 constants are used by StandardAnalyzer to state compatibility that's
 equivalent to index format compatibility (from our last discussion).
 
 I think deprecation by separate area makes sense?
 
 Mike
 



Re: Lucene 2.9 and deprecated IR.open() methods

2009-10-02 Thread Michael McCandless
OK, I agree, using the builder approach looks compelling!

Though what about required settings?  EG IW's builder must have
Directory, Analyzer.  Would we pass these as up-front args to the
initial builder?

And shouldn't we still specify the version up-front so we can improve
defaults over time without breaking back-compat?  (Else, how can
we change defaults?)

EG:

  IndexWriter.builder(Version.LUCENE_29, dir, analyzer)
    .setRAMBufferSizeMB(128)
    .setUseCompoundFile(false)
    ...
    .create()

?
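
For illustration, a minimal sketch of how a version passed up-front could select the defaults. SketchVersion, SketchSettings, and the default numbers are made up for the example; they are not Lucene's Version constants or real default values.

```java
// Hypothetical sketch: names and numbers are illustrative, not Lucene's API.
enum SketchVersion { V24, V29 }

final class SketchSettings {
    final SketchVersion version;
    final double ramBufferSizeMB;
    final boolean useCompoundFile;

    private SketchSettings(Builder b) {
        this.version = b.version;
        this.ramBufferSizeMB = b.ramBufferSizeMB;
        this.useCompoundFile = b.useCompoundFile;
    }

    static Builder builder(SketchVersion version) {
        return new Builder(version);
    }

    static final class Builder {
        private final SketchVersion version;
        private double ramBufferSizeMB;
        private boolean useCompoundFile = true;

        private Builder(SketchVersion version) {
            if (version == null) throw new NullPointerException("version");
            this.version = version;
            // Defaults are keyed on the version the caller asked for, so a
            // newer release can pick a different default without changing
            // what existing callers get.
            this.ramBufferSizeMB = (version == SketchVersion.V24) ? 16.0 : 32.0;
        }

        Builder setRAMBufferSizeMB(double mb) { this.ramBufferSizeMB = mb; return this; }
        Builder setUseCompoundFile(boolean v) { this.useCompoundFile = v; return this; }

        SketchSettings create() { return new SketchSettings(this); }
    }
}
```

A caller that stays on SketchVersion.V24 keeps the old default; explicit setters override the default regardless of version.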

Mike




Re: Lucene 2.9 and deprecated IR.open() methods

2009-10-02 Thread Mark Miller
Call me old fashioned, but I like how the non-constructor params are set
now.

And for some reason I like a config object over a builder pattern for
the required constructor params.

That's just me though.



-- 
- Mark

http://www.lucidimagination.com







Re: Lucene 2.9 and deprecated IR.open() methods

2009-10-02 Thread Earwin Burrfoot
 Though what about required settings?  EG IW's builder must have
 Directory, Analyzer.  Would we pass these as up-front args to the
 initial builder?
I'd try to keep required settings to a minimum. The only one absolutely
required, imho, is a Directory, and it's best to specify it in the
create() method, so you could set all your IW parameters and then
build several instances, for different Directories for example.

If you decide to add more required settings, we're back to square one
- after a couple of years we're looking at 14 builder() methods.
Okay, there is a way. Take a look at how Guice handles binding
declarations in Modules - different builder methods may return
different interfaces implemented by 'this'.

class IndexWriter {
  public static NoAnalyzerYetBuilder builder() { return new HiddenTrueBuilder(); }

  interface NoAnalyzerYetBuilder {
    NoAnalyzerYetBuilder setRAMBuffer(...);
    NoAnalyzerYetBuilder setUseCompound(...);

    Builder setAnalyzer(Analyzer);
  }

  interface Builder extends NoAnalyzerYetBuilder {
    Builder setRAMBuffer(...);
    Builder setUseCompound(...);

    IndexWriter create(Directory);
  }

  private static class HiddenTrueBuilder implements Builder {
  }
}

This approach looks nice from client-side, but is a mess to implement.
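
For illustration, a compilable sketch of the staged-builder idea above, reduced to one optional setting. StagedWriter and its nested types are hypothetical names, Strings stand in for Analyzer and Directory, and the default buffer size is made up.

```java
import java.util.Objects;

// Hypothetical sketch of the staged builder; none of these types are Lucene's.
final class StagedWriter {
    final String analyzer;   // stand-in for an Analyzer
    final String directory;  // stand-in for a Directory
    final int ramBufferMB;

    private StagedWriter(String analyzer, String directory, int ramBufferMB) {
        this.analyzer = analyzer;
        this.directory = directory;
        this.ramBufferMB = ramBufferMB;
    }

    // The entry point exposes only the first stage, which has no create().
    static NoAnalyzerYet builder() { return new HiddenBuilder(); }

    interface NoAnalyzerYet {
        NoAnalyzerYet setRAMBuffer(int mb);
        Ready setAnalyzer(String analyzer); // moves the chain to the next stage
    }

    interface Ready extends NoAnalyzerYet {
        // Covariant return type keeps the chain in the Ready stage.
        @Override Ready setRAMBuffer(int mb);
        StagedWriter create(String directory);
    }

    // One hidden class implements every stage and always returns 'this'.
    private static final class HiddenBuilder implements Ready {
        private String analyzer;
        private int ramBufferMB = 16; // assumed default

        @Override public Ready setRAMBuffer(int mb) {
            this.ramBufferMB = mb;
            return this;
        }

        @Override public Ready setAnalyzer(String analyzer) {
            this.analyzer = Objects.requireNonNull(analyzer, "analyzer");
            return this;
        }

        @Override public StagedWriter create(String directory) {
            Objects.requireNonNull(directory, "directory");
            return new StagedWriter(analyzer, directory, ramBufferMB);
        }
    }
}
```

With this shape, StagedWriter.builder().setRAMBuffer(64).create("dir") does not compile, because the static type is still NoAnalyzerYet until setAnalyzer() has been called. The cost is the one noted above: every optional setter has to be redeclared in each stage interface.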


 And shouldn't we still specify the version up-front so we can improve
 defaults over time without breaking back-compat?  (Else, how can
 we change defaults?)

 EG:

  IndexWriter.builder(Version.LUCENE_29, dir, analyzer)
    .setRAMBufferSizeMB(128)
    .setUseCompoundFile(false)
    ...
    .create()

 ?

It's probably okay to specify version upfront. But also, nothing bad
happens if we do it like:
IndexWriter.builder().
  defaultsFor(Version.LUCENE_29).
  setRam...

 Mike





-- 
Kirill Zakharenko/Кирилл Захаренко (ear...@gmail.com)
Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
ICQ: 104465785




Re: Lucene 2.9 and deprecated IR.open() methods

2009-10-02 Thread Earwin Burrfoot
 Call me old fashioned, but I like how the non constructor params are set
 now.
And what happens when you index some docs, change these params, index
more docs, change params, commit? Let's throw in some threads?
You either end up writing really hairy state control code, or just
leave it broken, with a "Don't change parameters after you start pumping
docs through it!" plea covering your back somewhere in the JavaDocs.
If nothing else, having stuff 'final' keeps JIT really happy.

 And for some reason I like a config object over a builder pattern for
 the required constructor params.
Builder pattern allows you to switch concrete implementations as you
please, taking parameters into account or not.
Besides that there's no real difference. I prefer builder, but that's just me :)






-- 
Kirill Zakharenko/Кирилл Захаренко (ear...@gmail.com)
Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
ICQ: 104465785




Re: Lucene 2.9 and deprecated IR.open() methods

2009-10-02 Thread Mark Miller





On Oct 2, 2009, at 10:18 PM, Earwin Burrfoot ear...@gmail.com wrote:

  Call me old fashioned, but I like how the non constructor params are set
  now.
 And what happens when you index some docs, change these params, index
 more docs, change params, commit? Let's throw in some threads?
 You either end up writing really hairy state control code, or just
 leave it broken, with Don't change parameters after you start pumping
 docs through it! plea covering your back somewhere in JavaDocs.
 If nothing else, having stuff 'final' keeps JIT really happy.

  And for some reason I like a config object over a builder pattern for
  the required constructor params.
 Builder pattern allows you to switch concrete implementations as you
 please, taking parameters into account or not.
 Besides that there's no real difference. I prefer builder, but that's just me :)

Nope. So far it's you and a couple others ;)







Re: Lucene 2.9 and deprecated IR.open() methods

2009-10-02 Thread Mark Miller
Again - random opinion from left field - I've used Guice and I like it
a lot. Really cool stuff and I actually prefer it to Spring for
injection. But still for some reason I'd hate to see Lucene start
resembling anything in Guice.


I'm not even taking the time to make arguments, so I don't expect  
these comments to have much weight (they don't by definition) - but  
just putting my opinion out there.


- Mark

http://www.lucidimagination.com (mobile)


-
To unsubscribe, e-mail: