RE: Lucene 2.9 and deprecated IR.open() methods
> On Sun, Oct 04, 2009 at 05:53:14AM -0400, Michael McCandless wrote:
>> 1 Do we prevent config settings from changing after creating an IW/IR?
>
> Any settings conveyed via a settings object ought to be final if you want pluggable index components. Otherwise, you need some nightmarish notification system to propagate settings down into your subcomponents, which may or may not be prepared to handle the value modifications.

+1, in my opinion this is an argument for final members/settings.

By the way, there is a third possibility for passing configuration settings. The idea is to pass settings to IR/IW and its flexible indexing components using the same technique as JAXP (please don't hit me!): pass a Properties or Map<String,?> to the ctor/open. The keys are predefined constants. Maybe our previous idea of an IndexConfiguration class becomes a subclass of HashMap<String,?> with all the constants, some easy-to-use setter methods for very often-used settings (like index dir), and some reasonable defaults.

This allows us to pass these properties to any flex indexing component without needing to modify/extend it to support the additional properties. The flexible indexing component just defines its own property names (e.g. as URNs, URLs, using its class name as prefix, ...). Property names are always String, values any type (therefore Map<String,?>). With Java 5, integer props and so on are no syntax problem because of autoboxing (no need to pass new Integer() or Integer.valueOf()).

Another good thing is that implementors of e.g. XML config files, as in Solr, can simply pass all elements in the config to this map.

Uwe

- To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org
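A minimal sketch of the map-based proposal, to make the trade-off concrete. Everything here is hypothetical: the IndexConfiguration class, the key names, the default values and FooMergeComponent are assumptions for illustration, not existing Lucene API. The sketch extends HashMap<String, Object>, since Java does not allow a wildcard type as a direct supertype.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: none of these classes exist in Lucene.
class IndexConfiguration extends HashMap<String, Object> {
    // Predefined key constants (names and defaults are assumptions).
    static final String RAM_BUFFER_SIZE_MB = "index.ramBufferSizeMB";
    static final String MAX_FIELD_LENGTH = "index.maxFieldLength";

    IndexConfiguration() {
        // Reasonable defaults, so callers only set what they need.
        put(RAM_BUFFER_SIZE_MB, 16.0);
        put(MAX_FIELD_LENGTH, 10000);
    }

    // Convenience setter for an often-used setting.
    IndexConfiguration setRAMBufferSizeMB(double mb) {
        put(RAM_BUFFER_SIZE_MB, mb); // autoboxed to Double
        return this;
    }
}

// A pluggable component defines its own keys, prefixed with its class name.
class FooMergeComponent {
    static final String MAX_MERGE_DOCS =
            FooMergeComponent.class.getName() + ".maxMergeDocs";

    final int maxMergeDocs;

    FooMergeComponent(Map<String, ?> props) {
        Object v = props.get(MAX_MERGE_DOCS);
        // With an untyped map, validation and defaulting fall to the component.
        if (v != null && !(v instanceof Integer)) {
            throw new IllegalArgumentException(MAX_MERGE_DOCS + " must be an Integer");
        }
        this.maxMergeDocs = (v == null) ? Integer.MAX_VALUE : (Integer) v;
    }
}
```

Note that the component has to check the value type itself; the map gives extensibility but no compile-time type safety.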
RE: Lucene 2.9 and deprecated IR.open() methods
> Another good thing is that implementors of e.g. XML config files, as in Solr, can simply pass all elements in the config to this map.

Another option for extensibility, with the type safety that properties would not have, would be Attributes. Just pass an AttributeSource as configuration. The default index properties would be one attribute, and custom extensions can define their own.
Uwe
Re: Lucene 2.9 and deprecated IR.open() methods
On Mon, Oct 05, 2009 at 08:27:20AM +0200, Uwe Schindler wrote:
> Pass a Properties or Map<String,?> to the ctor/open. The keys are predefined constants. Maybe our previous idea of an IndexConfiguration class is a subclass of HashMap<String,?> with all the constants and some easy-to-use setter methods for very often-used settings (like index dir) and some reasonable defaults.

Interesting. The design we worked out for Lucy's Segment class (prototype in KS devel branch) uses hash/array/string data to store arbitrary metadata on behalf of segment components, written as JSON to seg_NNN/segmeta.json. In that case, though, each component is responsible for generating and consuming its own data. That's different from having the user supply data via such a format.

I still think you're going to want an extensible builder class.

> This allows us to pass these properties to any flex indexing component without need to modify/extend it to support the additional properties. The flexible indexing component just defines its own property names (e.g. as URNs, URLs, using its class name as prefix,...).

But how do you determine what the flex indexing components *are*? In theory, you can pass class names and sufficient arguments to build them up via your big ball of data, but then you're essentially creating a new language, with all the headaches that entails.

In KS, Indexer/IndexReader configuration is divided between three classes.

* Schema: field definitions.
* Architecture: Settings that never change for the life of the index.
* IndexManager: Settings that can change per index/search session.

Schema isn't worth discussing -- Lucy will have it, Lucene won't, end of story. Architecture and IndexManager, though, are fairly close to what's being discussed. Architecture is responsible for e.g. determining which pluggable components get registered. It's the builder class. IndexManager is where things like merging and locking policies reside.

> Property names are always String, values any type (therefore Map<String,?>). With Java 5, integer props and so on are no bad syntax problem because of autoboxing (no need to pass new Integer() or Integer.valueOf()).

Argument validation gets to be a headache when you pass around complex data structures. It's doable, but messy and hard to grok. Going through dedicated methods is cleaner and safer.

> Another good thing is, that implementors of e.g. XML config files like in Solr, can simple pass all elements in config to this map.

I go back and forth on this. At some point, the volume of data becomes overwhelming and it becomes easier to swap in the name of a class where most of the data can reside in nice, reliable, structured code.

Marvin Humphrey
RE: Lucene 2.9 and deprecated IR.open() methods
Hi Marvin,

> > Property names are always String, values any type (therefore Map<String,?>). With Java 5, integer props and so on are no bad syntax problem because of autoboxing (no need to pass new Integer() or Integer.valueOf()).
>
> Argument validation gets to be a headache when you pass around complex data structures. It's doable, but messy and hard to grok. Going through dedicated methods is cleaner and safer.
>
> > Another good thing is, that implementors of e.g. XML config files like in Solr, can simple pass all elements in config to this map.
>
> I go back and forth on this. At some point, the volume of data becomes overwhelming and it becomes easier to swap in the name of a class where most of the data can reside in nice, reliable, structured code.

See my second mail. The recently introduced Attributes and AttributeSource would solve this. Each component just defines its attribute interface and impl class, and you pass in an AttributeSource as configuration. Then you can do:

    AttributeSource cfg = new AttributeSource();
    ComponentAttribute compCfg = cfg.addAttribute(ComponentAttribute.class);
    compCfg.setMergeScheduler(FooScheduler.class);
    MergeBarAttribute mergeCfg = cfg.addAttribute(MergeBarAttribute.class);
    mergeCfg.setWhateverProp(1234);
    ...
    IndexWriter iw = new IndexWriter(dir, cfg);

(this is just brainstorming, not yet thoroughly thought through).

Uwe
Re: Lucene 2.9 and deprecated IR.open() methods
On Mon, Oct 5, 2009 at 12:01, Uwe Schindler <u...@thetaphi.de> wrote:
> See my second mail. The recently introduced Attributes and AttributeSource would solve this. Each component just defines its attribute interface and impl class, and you pass in an AttributeSource as configuration. Then you can do:
>
>     AttributeSource cfg = new AttributeSource();
>     ComponentAttribute compCfg = cfg.addAttribute(ComponentAttribute.class);
>     compCfg.setMergeScheduler(FooScheduler.class);
>     MergeBarAttribute mergeCfg = cfg.addAttribute(MergeBarAttribute.class);
>     mergeCfg.setWhateverProp(1234);
>     ...
>     IndexWriter iw = new IndexWriter(dir, cfg);
>
> (this is just brainstorming, not yet thoroughly thought through).

This approach suggests IW creates its components and, while doing so, provides them with your AS instance. I personally prefer creating all these components myself, configuring them (at the moment of creation) and passing them to IW in one way or another. This requires way less code: you don't have to invent elaborate schemes for passing through your custom per-component settings and selecting which exact component types IW should use, and you don't risk construct/postConstruct/postpostConstruct-style things.

--
Kirill Zakharenko/Кирилл Захаренко (ear...@gmail.com)
Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
ICQ: 104465785
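The alternative described in the message above (build and configure each component yourself, then hand the finished instance to the writer) might look roughly like this. The types are stand-ins that only borrow names from the thread; SketchWriter and the maxThreads setting are assumptions, not Lucene classes or options.

```java
// Hypothetical sketch: these types only mirror names used in the thread.
interface MergeScheduler {
    String describe();
}

final class FooScheduler implements MergeScheduler {
    private final int maxThreads;

    // The component validates and fixes its own settings at creation time.
    FooScheduler(int maxThreads) {
        if (maxThreads < 1) {
            throw new IllegalArgumentException("maxThreads must be >= 1");
        }
        this.maxThreads = maxThreads;
    }

    @Override
    public String describe() {
        return "FooScheduler(maxThreads=" + maxThreads + ")";
    }
}

final class SketchWriter {
    private final MergeScheduler scheduler;

    // The writer receives a ready-to-use component and never builds one itself,
    // so it needs no scheme for forwarding per-component settings.
    SketchWriter(MergeScheduler scheduler) {
        if (scheduler == null) {
            throw new NullPointerException("scheduler");
        }
        this.scheduler = scheduler;
    }

    String schedulerInfo() {
        return scheduler.describe();
    }
}
```

Usage would be a single expression such as new SketchWriter(new FooScheduler(4)); there is no second initialization phase after construction.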
RE: Lucene 2.9 and deprecated IR.open() methods
> > See my second mail. The recently introduced Attributes and AttributeSource would solve this. Each component just defines its attribute interface and impl class, and you pass in an AttributeSource as configuration. Then you can do:
> >
> >     AttributeSource cfg = new AttributeSource();
> >     ComponentAttribute compCfg = cfg.addAttribute(ComponentAttribute.class);
> >     compCfg.setMergeScheduler(FooScheduler.class);
> >     MergeBarAttribute mergeCfg = cfg.addAttribute(MergeBarAttribute.class);
> >     mergeCfg.setWhateverProp(1234);
> >     ...
> >     IndexWriter iw = new IndexWriter(dir, cfg);
> >
> > (this is just brainstorming, not yet thoroughly thought through).
>
> This approach suggests IW creates its components, and while doing so provides them your AS instance. I personally prefer creating all these components myself, configuring them (at the moment of creation) and passing them to IW in one way or another. This requires way less code, you don't have to invent elaborate schemes of passing through your custom per-component settings and selecting which exact component types IW should use, you don't risk construct/postConstruct/postpostConstruct-style things.

Not really. That was just brainstorming. You can also pass instances instead of class names through an AttributeSource. AttributeSource only provides type safety for the various configuration settings (which are interfaces). But you could also create an attribute that holds a reference to the component. So

    compCfg.setMergeScheduler(FooScheduler.class);

could also be

    compCfg.addComponent(new FooScheduler(...));

The AttributeSource approach has one other good thing: if you want to use the default settings for one attribute, you do not have to add it to the AS (or you can forget it). With the properties approach, you have to hardcode the parameter defaults and validation everywhere. Because the consumer of an AttributeSource also gets the attribute through an addAttribute call (see the current indexing code consuming TokenStreams), this call would add the missing attribute with the default settings defined by its implementation class.

So in the above example, if you do not want to provide whateverProp, leave the whole MergeBarAttribute out. The consumer (IW) would just call addAttribute(MergeBarAttribute.class), because it needs the attribute to configure itself. AS would add this attribute with default settings.

Uwe
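The add-or-get behaviour described above can be sketched with a small stand-in registry. This is deliberately not Lucene's AttributeSource: ConfigSource, its register method, and the default value 10 are assumptions made so the example is self-contained. MergeBarAttribute is the hypothetical attribute from the brainstorming example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Stand-in for the add-or-get behaviour described above; NOT Lucene's AttributeSource.
final class ConfigSource {
    private final Map<Class<?>, Object> attributes = new HashMap<>();
    private final Map<Class<?>, Supplier<?>> factories = new HashMap<>();

    // Registers how to build the default implementation of an attribute interface.
    <T> void register(Class<T> type, Supplier<? extends T> factory) {
        factories.put(type, factory);
    }

    // Returns the existing attribute, or creates it with its defaults on first use.
    <T> T addAttribute(Class<T> type) {
        Object existing = attributes.get(type);
        if (existing == null) {
            Supplier<?> factory = factories.get(type);
            if (factory == null) {
                throw new IllegalArgumentException("No implementation for " + type.getName());
            }
            existing = factory.get();
            attributes.put(type, existing);
        }
        return type.cast(existing);
    }
}

// Hypothetical attribute taken from the example in the thread.
interface MergeBarAttribute {
    int getWhateverProp();
    void setWhateverProp(int value);
}

final class MergeBarAttributeImpl implements MergeBarAttribute {
    private int whateverProp = 10; // the default lives in the implementation class

    @Override public int getWhateverProp() { return whateverProp; }
    @Override public void setWhateverProp(int value) { whateverProp = value; }
}
```

A caller who never touches MergeBarAttribute still leaves the consumer with a usable object: the consumer's own addAttribute call creates it with the implementation's defaults, while a caller who did set a value gets the same instance back.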
Re: Lucene 2.9 and deprecated IR.open() methods
I think AS is overkill for conveying configuration of IW/IR? Suddenly, instead of:

    cfg.setRAMBufferSizeMB(128.0)

I'd have to do something like this?

    cfg.addAttribute(IndexWriterRAMBufferSizeAttribute.class).setRAMBufferSize(128.0)

It's too cumbersome, I think, for something that ought to be simple. I'd prefer a dedicated config class with strongly typed setters exposed. Of all the pure syntax options so far I'd still prefer the traditional config object with setters.

Also, I don't think we should roll this out for all Lucene classes. I think most classes do just fine accepting args to their ctor. EG TermQuery simply takes Term to its ctor.

I do agree IW should not be in the business of brokering changes to the settings of its sub-components (eg mergeFactor, maxMergeDocs). You really should make such changes directly via your merge policy.

Finally, I'm not convinced we should lock down all settings after classes are created. (I'm not convinced we shouldn't, either). A merge policy has no trouble changing its mergeFactor, maxMergeDocs/Size. IW has no trouble changing its RAM buffer size, maxFieldLength, or useCompoundFile. Sure there are some things that cannot (or would be very tricky to) change, eg deletion policy. But then analyzer isn't changeable today, but could be.

But, then, I can also see it'd simplify our code to not have to deal w/ such changes, reduce chance of subtle bugs, and it seems minor to go and re-open your IndexWriter if you need to make a settings change? (Hmm except in an NRT setting, because the reader pool would be reset; really we need to get the reader pool separated from the IW instance).

Mike
RE: Lucene 2.9 and deprecated IR.open() methods
Hi Mike,

> I think AS is overkill for conveying configuration of IW/IR? Suddenly, instead of:
>
>     cfg.setRAMBufferSizeMB(128.0)
>
> I'd have to do something like this?
>
>     cfg.addAttribute(IndexWriterRAMBufferSizeAttribute.class).setRAMBufferSize(128.0)
>
> It's too cumbersome, I think, for something that ought to be simple. I'd prefer a dedicated config class with strongly typed setters exposed. Of all the pure syntax options so far I'd still prefer the traditional config object with setters.

From this point of view, it's also overkill for TokenStream. But as AS was also designed for flexible indexing, it would fit very well into this area. The new query parser is a good example in favor of attributes. One argument against attributes is the fact that Michael Busch didn't promote them from the beginning of this discussion either :-) (maybe he also needs one more night to think about it).

Good points for AS are e.g. the type safety, the ease of extension, and the built-in defaults (you do not need to check for the existence of attributes, just add them at the point you want to use them, like in your example, maybe with nicer and shorter names). With generics, AS is as simple to use as plain getters/setters.

> Also, I don't think we should roll this out for all Lucene classes. I think most classes do just fine accepting args to their ctor. EG TermQuery simply takes Term to its ctor.
>
> I do agree IW should not be in the business of brokering changes to the settings of its sub-components (eg mergeFactor, maxMergeDocs). You really should make such changes directly via your merge policy.

AttributeSource would also help us with e.g. the possibility of later changes to various attributes. If some of the attributes are fixed after construction of IW/IR, just throw IllegalStateExceptions.
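To illustrate the idea of rejecting late changes with an IllegalStateException, here is a minimal sketch. It uses a plain settings object rather than attributes, and every name in it (WriterSettings, FreezingWriter, the string-valued deletion policy, the defaults) is an assumption for illustration only.

```java
// Hypothetical sketch; class names, settings and defaults are assumptions.
final class WriterSettings {
    private double ramBufferSizeMB = 16.0;        // may change at any time
    private String deletionPolicy = "keep-last";  // fixed once a writer uses it
    private boolean frozen;

    void setRAMBufferSizeMB(double mb) {
        ramBufferSizeMB = mb;
    }

    void setDeletionPolicy(String policy) {
        if (frozen) {
            throw new IllegalStateException(
                    "deletionPolicy cannot change after the writer is created");
        }
        deletionPolicy = policy;
    }

    double getRAMBufferSizeMB() { return ramBufferSizeMB; }
    String getDeletionPolicy() { return deletionPolicy; }

    // Called by the writer's constructor; locks the settings that must stay fixed.
    void freeze() { frozen = true; }
}

final class FreezingWriter {
    final WriterSettings settings;

    FreezingWriter(WriterSettings settings) {
        this.settings = settings;
        settings.freeze();
    }
}
```

This keeps the "changeable" and "fixed" settings in one object and makes the distinction visible at runtime rather than at compile time, which is the trade-off being debated in the thread.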
Re: Lucene 2.9 and deprecated IR.open() methods
Michael McCandless wrote:
> I think AS is overkill for conveying configuration of IW/IR? Suddenly, instead of:
>
>     cfg.setRAMBufferSizeMB(128.0)
>
> I'd have to do something like this?
>
>     cfg.addAttribute(IndexWriterRAMBufferSizeAttribute.class).setRAMBufferSize(128.0)
>
> It's too cumbersome, I think, for something that ought to be simple. I'd prefer a dedicated config class with strongly typed setters exposed. Of all the pure syntax options so far I'd still prefer the traditional config object with setters.

+1

> Also, I don't think we should roll this out for all Lucene classes. I think most classes do just fine accepting args to their ctor. EG TermQuery simply takes Term to its ctor.

+1

> I do agree IW should not be in the business of brokering changes to the settings of its sub-components (eg mergeFactor, maxMergeDocs). You really should make such changes directly via your merge policy.

Agreed we need to deal with - *but* I personally think it gets tricky. Users should be able to flip compound on/off easily without dealing with a mergepolicy IMO. And advanced users that set a mergepolicy shouldn't have to deal with losing a compound setting they set with IW after setting a new mergepolicy. Can't I have it both ways :)
Re: Lucene 2.9 and deprecated IR.open() methods
> I think AS is overkill for conveying configuration of IW/IR?

Agree.

> It's too cumbersome, I think, for something that ought to be simple. I'd prefer a dedicated config class with strongly typed setters exposed. Of all the pure syntax options so far I'd still prefer the traditional config object with setters.

Builders are visually cleaner. But well, that's just my aesthetic preference.

> Also, I don't think we should roll this out for all Lucene classes. I think most classes do just fine accepting args to their ctor. EG TermQuery simply takes Term to its ctor.

It's obvious.

> I do agree IW should not be in the business of brokering changes to the settings of its sub-components (eg mergeFactor, maxMergeDocs). You really should make such changes directly via your merge policy.

Aaaand, you shouldn't make such changes after construction :)

> But, then, I can also see it'd simplify our code to not have to deal w/ such changes, reduce chance of subtle bugs, and it seems minor to go and re-open your IndexWriter if you need to make a settings change? (Hmm except in an NRT setting, because the reader pool would be reset; really we need to get the reader pool separated from the IW instance).

Even if recreating IW is costly, you don't change settings that often, do you?

Mark:
> Agreed we need to deal with - *but* I personally think it gets tricky. Users should be able to flip compound on/off easily without dealing with a mergepolicy IMO. And advanced users that set a mergepolicy shouldn't have to deal with losing a compound setting they set with IW after setting a new mergepolicy. Can't I have it both ways :)

I don't understand why on earth the compound setting is a property of MergePolicy. The question of which segments to merge is really orthogonal to the way you store these segments on disk.

--
Kirill Zakharenko/Кирилл Захаренко (ear...@gmail.com)
Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
ICQ: 104465785
Re: Lucene 2.9 and deprecated IR.open() methods
I think we shouldn't discuss too many different things here. To begin I'd just like to introduce the IndexConfig class, that will hold the parameters we currently pass to the different IndexWriter constructors. If we later need to create different IndexWriter impls we can introduce a factory. If we want to change some IW settings to be mandatory on IW instantiation we can move those parameters from IW to the Config class then. If we see in the future the need to pass arguments to the different flex index consumers, we can add an AttributeSource or Properties hashmap to the config class, or maybe directly to the IndexingChain class. I don't really think the IndexWriter needs this flexibility right now and it seems like Mike hasn't seen the need thus far while working on LUCENE-1458 either. Adding the Config class and deprecating all other IW constructors will not prevent us from doing any of the other things in the future IMO and is already a great start to simplify things. So let's do that first and discuss the other points separately when the need arises. Michael On 10/5/09 5:40 AM, Uwe Schindler wrote: Hi Mike, I think AS is overkill for conveying configuration of IW/IR? Suddenly, instead of: cfg.setRAMBufferSizeMB(128.0) I'd have to do something like this? cfg.addAttribute(IndexWriterRAMBufferSizeAttribute.class).setRAMBufferSize (128.0) It's too cumbersome, I think, for something that ought to be simple. I'd prefer a dedicated config class with strongly typed setters exposed. Of all the pure syntax options so far I'd still prefer the traditional config object with setters. From this point-of-view, it's also overkill for TokenStream. But as AS was also designed for flexible indexing it would fit very well into this area. The new query parser is a good example pro attributes. What is an argument against atts is the fact, that also Michael Bush didn't promote them from the beginning of this discussion :-) (maybe he needs also one night longer to think about it). 
Good points for AS, are e.g. the type-safety, simplicity to enhance, built-in defaults (you do not need to check for existence of attributes, just add them at the point you want to use them, like in your example - maybe with nicer and shorter names). With generics, AS is as simple to use like simple get/setters. Also, I don't think we should roll this out for all Lucene classes. I think most classes do just fine accepting args to their ctor. EG TermQuery simply takes Term to its ctor. I do agree IW should not be in the business of brokering changes to the settings of its sub-components (eg mergeFactor, maxMergeDocs). You really should make such changes directly via your merge policy. AttributeSource would also help us with e.g. the possibility for later changes to various attributes. If some of the attributes are fixed after construction of IW/IR, just throw IllegalStateExceptions. Finally, I'm not convinced we should lock down all settings after classes are created. (I'm not convinced we shouldn't, either). A merge policy has no trouble changing its mergeFactor, maxMergeDocs/Size. IW has no trouble changing the its RAM buffer size, maxFieldLength, or useCompoundFile. Sure there are some things that cannot (or would be very tricky to) change, eg deletion policy. But then analyzer isn't changeable today, but could be. But, then, I can also see it'd simplify our code to not have to deal w/ such changes, reduce chance of subtle bugs, and it seems minor to go and re-open your IndexWriter if you need to make a settings change? (Hmm except in an NRT setting, because the reader pool would be reset; really we need to get the reader pool separated from the IW instance). Mike On Mon, Oct 5, 2009 at 4:38 AM, Uwe Schindleru...@thetaphi.de wrote: See my second mail. The recently introduced Attributes and AttributeSource would solve this. Each component just defines its attribute interface and impl class and you pass in an AttributeSource as configuration. 
Then you can do:

    AttributeSource cfg = new AttributeSource();
    ComponentAttribute compCfg = cfg.addAttribute(ComponentAttribute.class);
    compCfg.setMergeScheduler(FooScheduler.class);
    MergeBarAttribute mergeCfg = cfg.addAttribute(MergeBarAttribute.class);
    mergeCfg.setWhateverProp(1234);
    ...
    IndexWriter iw = new IndexWriter(dir, cfg);

(this is just brainstorming, not yet thoroughly thought through).

This approach suggests IW creates its components, and while doing so provides them your AS instance. I personally prefer creating all these components myself, configuring them (at the moment of creation) and passing them to IW in one way or another. This requires way less code, you don't have to invent elaborate schemes of passing through your custom per-component settings and selecting which
Re: Lucene 2.9 and deprecated IR.open() methods
On 10/4/09 3:31 AM, Mark Miller wrote: Ted Dunning wrote: The builder pattern and the config argument to a factory both have the advantage that you can limit changes after creating an object. Some things are just bad to change in mid-stream. The config argument is nice in that you can pass it around to different stake holders, but the builder can be used a bit like that as well. Yeah that argument has been made. But I've *never* seen it as an issue. Just seems like a solution looking for a problem. I can see how it's cleaner, not missing that point. But still doesn't make me like it much. Yeah personally this wasn't a problem for me either. I do like the cleanliness though. Also, I'd very much prefer a config object over multiple constructors (with the need to deprecate/add with every change), as I proposed originally in this thread. I still don't see an advantage of the builder pattern over the config object + factory pattern - and I'm not even sure if we really need a factory; IMO passing a config object into a single constructor would be sufficient for IW. Michael - To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org For additional commands, e-mail: java-dev-h...@lucene.apache.org
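To make the "config object into a single constructor" idea concrete, here is a minimal sketch. The names WriterConfig and SketchWriter, the defaults, and the two settings shown are assumptions chosen for illustration; none of this is Lucene's actual API.

```java
// Hypothetical sketch: WriterConfig and SketchWriter are illustrative names,
// not classes from Lucene.
final class WriterConfig {
    private double ramBufferSizeMB = 16.0;  // assumed default, for illustration only
    private boolean useCompoundFile = true; // assumed default, for illustration only

    void setRAMBufferSizeMB(double mb) {
        if (mb <= 0.0) {
            throw new IllegalArgumentException("ramBufferSizeMB must be positive");
        }
        this.ramBufferSizeMB = mb;
    }

    void setUseCompoundFile(boolean useCompoundFile) {
        this.useCompoundFile = useCompoundFile;
    }

    double getRAMBufferSizeMB() { return ramBufferSizeMB; }
    boolean getUseCompoundFile() { return useCompoundFile; }
}

final class SketchWriter {
    private final String dir;
    private final double ramBufferSizeMB;
    private final boolean useCompoundFile;

    // The single constructor. A new option becomes a new setter on WriterConfig,
    // so no constructor overload has to be added or deprecated.
    SketchWriter(String dir, WriterConfig cfg) {
        this.dir = dir;
        // Values are copied, so later changes to cfg do not affect this writer.
        this.ramBufferSizeMB = cfg.getRAMBufferSizeMB();
        this.useCompoundFile = cfg.getUseCompoundFile();
    }

    String getDir() { return dir; }
    double getRAMBufferSizeMB() { return ramBufferSizeMB; }
    boolean getUseCompoundFile() { return useCompoundFile; }
}

final class SingleCtorDemo {
    public static void main(String[] args) {
        WriterConfig cfg = new WriterConfig();
        cfg.setRAMBufferSizeMB(128.0);
        SketchWriter writer = new SketchWriter("/tmp/index", cfg);
        System.out.println(writer.getDir() + " ramBufferSizeMB=" + writer.getRAMBufferSizeMB());
    }
}
```

Copying the values in the constructor is one possible design choice; it keeps the writer's view of its settings stable even if the caller keeps mutating the config object afterwards.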
RE: Lucene 2.9 and deprecated IR.open() methods
The builder pattern and the config argument to a factory both have the advantage that you can limit changes after creating an object. Some things are just bad to change in mid-stream. The config argument is nice in that you can pass it around to different stake holders, but the builder can be used a bit like that as well. Yeah that argument has been made. But I've *never* seen it as an issue. Just seems like a solution looking for a problem. I can see how it's cleaner, not missing that point. But still doesn't make me like it much. Yeah personally this wasn't a problem for me either. I do like the cleanliness though. Also, I'd very much prefer a config object over multiple constructors (with the need to deprecate/add with every change), as I proposed originally in this thread. I still don't see an advantage of the builder pattern over the config object + factory pattern - and I'm not even sure if we really need a factory; IMO passing a config object into a single constructor would be sufficient for IW.

For IR the factory would be ok. In my opinion you could also combine both patterns:

- Each setter in the config object returns itself, so you have the builder pattern, but you could also use it in the classical setter way (this only works if the builder always returns itself, not a new builder object).
- The builder's factory method .build() just delegates to the ctor/static factory in IR/IW and passes itself to it.

So you have both possibilities:

    IndexReader reader = new IndexReader.Config(dir).setReadOnly(true)
        .setTermInfosIndexDivisor(foo).build();

is equal to:

    IndexReader.Config config = new IndexReader.Config(dir);
    config.setReadOnly(true);
    config.setTermInfosIndexDivisor(foo);
    IndexReader reader = IndexReader.create(config);

Uwe
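The combination described in this message (setters that return the config object itself, plus a build() that only delegates to a static factory) could look roughly like the sketch below. ReaderConfig and SketchReader are hypothetical stand-ins, not Lucene classes, and the directory is modeled as a plain String to keep the example self-contained.

```java
// Hypothetical sketch: ReaderConfig and SketchReader are illustrative names only.
final class ReaderConfig {
    private final String dir;
    private boolean readOnly = false;
    private int termInfosIndexDivisor = 1; // assumed default, for illustration only

    ReaderConfig(String dir) { this.dir = dir; }

    // Each setter returns this same object (never a new one), so calls can be
    // chained builder-style or written one per statement.
    ReaderConfig setReadOnly(boolean readOnly) {
        this.readOnly = readOnly;
        return this;
    }

    ReaderConfig setTermInfosIndexDivisor(int divisor) {
        this.termInfosIndexDivisor = divisor;
        return this;
    }

    // build() does no work of its own; it passes this config to the static factory.
    SketchReader build() { return SketchReader.create(this); }

    String getDir() { return dir; }
    boolean isReadOnly() { return readOnly; }
    int getTermInfosIndexDivisor() { return termInfosIndexDivisor; }
}

final class SketchReader {
    private final String description;

    private SketchReader(String description) { this.description = description; }

    static SketchReader create(ReaderConfig cfg) {
        return new SketchReader(cfg.getDir()
            + " readOnly=" + cfg.isReadOnly()
            + " divisor=" + cfg.getTermInfosIndexDivisor());
    }

    String describe() { return description; }
}

final class ChainableConfigDemo {
    public static void main(String[] args) {
        // Builder style.
        SketchReader a = new ReaderConfig("idx").setReadOnly(true)
            .setTermInfosIndexDivisor(4).build();

        // Classical setter style, same config class.
        ReaderConfig config = new ReaderConfig("idx");
        config.setReadOnly(true);
        config.setTermInfosIndexDivisor(4);
        SketchReader b = SketchReader.create(config);

        System.out.println(a.describe().equals(b.describe()));
    }
}
```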
Re: Lucene 2.9 and deprecated IR.open() methods
I don't think we should do both. Suddenly, all code snippets (in javadocs, tutorials, email we all send, etc.) can be one pattern or the other, with each of us choosing based on our preference. Or, mixed. I think this just causes confusion. It'd suddenly become a lot like differences of opinion on which whitespace style is best. I'd rather have one clear syntax, and at this point I'd prefer to stick with the classical setter approach, ie a standalone config object.

But there are two other (more important than pure syntax!) questions being debated here:

1. Do we prevent config settings from changing after creating an IW/IR?
2. Do we use factory or ctor to create IW/IR?

On #1, we are technically taking something away. Are we sure no users find the freedom to change IW settings mid-stream (ramBufferSizeMB, mergeFactor) important? For example, infoStream should remain an IW setter. Also, MergePolicy now requires IW instance on construction, so we'd need to rework that.

On #2, I agree with Michael: until we see a clear reason to hide IW's concrete impl., we may as well stick with the one impl we have now. Design for today.

Mike

On Sun, Oct 4, 2009 at 5:33 AM, Uwe Schindler u...@thetaphi.de wrote:

For IR the factory would be ok. In my opinion you could also combine both patterns:

- Each setter in the config object returns itself, so you have the builder pattern, but you could also use it in the classical setter way (this only works if the builder always returns itself, not a new builder object).
- The builder's factory method .build() just delegates to the ctor/static factory in IR/IW and passes itself to it.

So you have both possibilities:

    IndexReader reader = new IndexReader.Config(dir).setReadOnly(true)
        .setTermInfosIndexDivisor(foo).build();

is equal to:

    IndexReader.Config config = new IndexReader.Config(dir);
    config.setReadOnly(true);
    config.setTermInfosIndexDivisor(foo);
    IndexReader reader = IndexReader.create(config);

Uwe
Re: Lucene 2.9 and deprecated IR.open() methods
On Sun, Oct 4, 2009 at 5:53 AM, Michael McCandless luc...@mikemccandless.com wrote: 1 Do we prevent config settings from changing after creating an IW/IR? 2 Do we use factory or ctor to create IW/IR? On #1, we are technically taking something away. Are we sure no users find the freedom to change IW settings mid-stream (ramBufferSizeMB, mergeFactor) important? Some of these are important I think - esp changing merge factor or the max segment size. Seems like everything that should be fixed at construction time (simple params at least) can be passed in the config object, and everything else can remain setters on the IndexWriter. Of course things like max segment size have been factored out into the merge policies... but you get the idea. -Yonik http://www.lucidimagination.com
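The split suggested here (construction-time values in the config object, everything else as setters on the writer) might be sketched as follows. Which settings go on which side is an assumption taken from examples mentioned elsewhere in the thread (analyzer and deletion policy as fixed, RAM buffer size and compound-file use as changeable); the class names are hypothetical and components are modeled as plain Strings.

```java
// Hypothetical sketch: FixedWriterConfig and SplitWriter are illustrative names only.
final class FixedWriterConfig {
    private final String analyzerName;       // fixed once the writer exists
    private final String deletionPolicyName; // fixed once the writer exists

    FixedWriterConfig(String analyzerName, String deletionPolicyName) {
        this.analyzerName = analyzerName;
        this.deletionPolicyName = deletionPolicyName;
    }

    String getAnalyzerName() { return analyzerName; }
    String getDeletionPolicyName() { return deletionPolicyName; }
}

final class SplitWriter {
    // Construction-time settings: final fields, no setters.
    private final String analyzerName;
    private final String deletionPolicyName;

    // Settings that stay changeable mid-stream: volatile so that a change made
    // by one thread is visible to others.
    private volatile double ramBufferSizeMB = 16.0; // assumed default
    private volatile boolean useCompoundFile = true; // assumed default

    SplitWriter(FixedWriterConfig cfg) {
        this.analyzerName = cfg.getAnalyzerName();
        this.deletionPolicyName = cfg.getDeletionPolicyName();
    }

    void setRAMBufferSizeMB(double mb) { this.ramBufferSizeMB = mb; }
    void setUseCompoundFile(boolean useCompoundFile) { this.useCompoundFile = useCompoundFile; }

    String getAnalyzerName() { return analyzerName; }
    String getDeletionPolicyName() { return deletionPolicyName; }
    double getRAMBufferSizeMB() { return ramBufferSizeMB; }
    boolean getUseCompoundFile() { return useCompoundFile; }
}

final class SplitWriterDemo {
    public static void main(String[] args) {
        SplitWriter writer = new SplitWriter(new FixedWriterConfig("standard", "keep-last"));
        writer.setRAMBufferSizeMB(64.0); // still allowed after construction
        System.out.println(writer.getAnalyzerName() + " " + writer.getRAMBufferSizeMB());
    }
}
```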
Re: Lucene 2.9 and deprecated IR.open() methods
As I stated in my last email, there's zero difference between settings+static factory and builder except for syntax. Cannot understand what Mark, Mike are arguing about.

Right now I offer to do two things, in any possible way - eradicate as much broken/spaghetti-like runtime state change from IW and friends as possible, and kill setting methods that delegate to IW components (eg MergePolicy), as they are redundant and suddenly break if you supply a non-default component instance.

On Sun, Oct 4, 2009 at 17:55, Yonik Seeley ysee...@gmail.com wrote: On Sun, Oct 4, 2009 at 5:53 AM, Michael McCandless luc...@mikemccandless.com wrote: 1 Do we prevent config settings from changing after creating an IW/IR? 2 Do we use factory or ctor to create IW/IR? On #1, we are technically taking something away. Are we sure no users find the freedom to change IW settings mid-stream (ramBufferSizeMB, mergeFactor) important? Some of these are important I think - esp changing merge factor or the max segment size.

The question is - whether anybody's going to change mergefactor/maxsegment size often enough he can't recreate IW without dire performance penalties?

Seems like everything that should be fixed at construction time (simple params at least) can be passed in the config object, and everything else can remain setters on the IndexWriter. Of course things like max segment size have been factored out into the merge policies... but you get the idea. -Yonik http://www.lucidimagination.com

-- Kirill Zakharenko/Кирилл Захаренко (ear...@gmail.com) Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423 ICQ: 104465785
Re: Lucene 2.9 and deprecated IR.open() methods
Earwin Burrfoot wrote: As I stated in my last email, there's zero difference between settings+static factory and builder except for syntax. Cannot understand what Mark, Mike are arguing about.

Sounds like we are arguing that we don't like the syntax then...

kill setting methods that delegate to IW components (eg MergePolicy), as they are redundant and suddenly break if you supply a non-default component instance.

I do agree that this is something that should be addressed.

-- - Mark http://www.lucidimagination.com
Re: Lucene 2.9 and deprecated IR.open() methods
On Sun, Oct 04, 2009 at 03:04:13PM -0400, Mark Miller wrote: Earwin Burrfoot wrote: As I stated in my last email, there's zero difference between settings+static factory and builder except for syntax. Cannot understand what Mark, Mike are arguing about. Sounds like we are arguing that we don't like the syntax then...

So, implement the static factory methods as wrappers around the builder method.

    public static IndexWriter open(Directory dir, Analyzer analyzer) {
      return open(new IndexManager(dir), dir, analyzer);
    }

    public static IndexWriter open(IndexManager manager, Directory dir, Analyzer analyzer) {
      return open(new Architecture(), manager, dir, analyzer);
    }

    public static IndexWriter open(Architecture arch, IndexManager manager, Directory dir, Analyzer analyzer) {
      return arch.buildIndexWriter(manager, dir, analyzer);
    }

IMO, it's important not to force first-time users to grok builder classes in order to perform basic indexing or searching.

Marvin Humphrey
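A self-contained version of this layering, as a rough sketch: the short static factories fill in defaults and forward to the fullest overload, which hands off to the builder-like object. All classes below are hypothetical stubs (Directory and Analyzer are modeled as Strings), so this only illustrates the delegation chain and says nothing about how a real Architecture or IndexManager would behave.

```java
// Hypothetical stubs for illustration; not Lucene or Lucy classes.
class SketchArchitecture {
    // The "builder method": the one place that actually constructs a writer.
    SketchIndexWriter buildIndexWriter(SketchIndexManager manager, String dir, String analyzer) {
        return new SketchIndexWriter(this, manager, dir, analyzer);
    }
}

final class SketchIndexManager {
    final String dir;
    SketchIndexManager(String dir) { this.dir = dir; }
}

final class SketchIndexWriter {
    final SketchArchitecture arch;
    final SketchIndexManager manager;
    final String dir;
    final String analyzer;

    SketchIndexWriter(SketchArchitecture arch, SketchIndexManager manager,
                      String dir, String analyzer) {
        this.arch = arch;
        this.manager = manager;
        this.dir = dir;
        this.analyzer = analyzer;
    }

    // Simplest entry point for first-time users: no builder classes in sight.
    static SketchIndexWriter open(String dir, String analyzer) {
        return open(new SketchIndexManager(dir), dir, analyzer);
    }

    // Supplies a default architecture and forwards to the full overload.
    static SketchIndexWriter open(SketchIndexManager manager, String dir, String analyzer) {
        return open(new SketchArchitecture(), manager, dir, analyzer);
    }

    // Full overload: delegates to the builder method on the architecture.
    static SketchIndexWriter open(SketchArchitecture arch, SketchIndexManager manager,
                                  String dir, String analyzer) {
        return arch.buildIndexWriter(manager, dir, analyzer);
    }
}

final class FactoryWrapperDemo {
    public static void main(String[] args) {
        SketchIndexWriter writer = SketchIndexWriter.open("idx", "standard");
        System.out.println(writer.dir + " " + writer.manager.dir + " " + writer.analyzer);
    }
}
```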
Re: Lucene 2.9 and deprecated IR.open() methods
On Sun, Oct 04, 2009 at 05:53:14AM -0400, Michael McCandless wrote: 1 Do we prevent config settings from changing after creating an IW/IR? Any settings conveyed via a settings object ought to be final if you want pluggable index components. Otherwise, you need some nightmarish notification system to propagate settings down into your subcomponents, which may or may not be prepared to handle the value modifications. Marvin Humphrey
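A small sketch of why final settings make pluggable components simpler: each subcomponent can read what it needs once, at construction, and no change notification ever has to reach it. All names below are hypothetical, and the two subcomponents are placeholders rather than real Lucene classes.

```java
// Hypothetical sketch; illustrative names only.
final class FinalSettings {
    private final int mergeFactor;
    private final double ramBufferSizeMB;

    FinalSettings(int mergeFactor, double ramBufferSizeMB) {
        this.mergeFactor = mergeFactor;
        this.ramBufferSizeMB = ramBufferSizeMB;
    }

    // Getters only: there is no way to change a value after construction.
    int getMergeFactor() { return mergeFactor; }
    double getRAMBufferSizeMB() { return ramBufferSizeMB; }
}

final class SketchMergeComponent {
    private final int mergeFactor;

    // Reads its setting once; no listener or callback is needed later.
    SketchMergeComponent(FinalSettings settings) {
        this.mergeFactor = settings.getMergeFactor();
    }

    int getMergeFactor() { return mergeFactor; }
}

final class SketchFlushComponent {
    private final double ramBufferSizeMB;

    SketchFlushComponent(FinalSettings settings) {
        this.ramBufferSizeMB = settings.getRAMBufferSizeMB();
    }

    double getRAMBufferSizeMB() { return ramBufferSizeMB; }
}

final class PluggableWriter {
    private final SketchMergeComponent merger;
    private final SketchFlushComponent flusher;

    // The same immutable settings object is handed to every subcomponent.
    PluggableWriter(FinalSettings settings) {
        this.merger = new SketchMergeComponent(settings);
        this.flusher = new SketchFlushComponent(settings);
    }

    SketchMergeComponent getMerger() { return merger; }
    SketchFlushComponent getFlusher() { return flusher; }
}

final class FinalSettingsDemo {
    public static void main(String[] args) {
        PluggableWriter writer = new PluggableWriter(new FinalSettings(10, 32.0));
        System.out.println(writer.getMerger().getMergeFactor()
            + " " + writer.getFlusher().getRAMBufferSizeMB());
    }
}
```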
Re: Lucene 2.9 and deprecated IR.open() methods
On Fri, Oct 2, 2009 at 10:18 PM, Earwin Burrfoot ear...@gmail.com wrote: Builder pattern allows you to switch concrete implementations as you please, taking parameters into account or not. We could also achieve this w/ static factory method. EG IndexReader.open(IndexReader.Config) could switch between concrete impls (it already does today). Mike
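As an illustration of a static factory choosing the concrete implementation from the config, here is a minimal sketch. The class names and the single readOnly switch are assumptions made for the example; the real IndexReader.open logic is not reproduced here.

```java
// Hypothetical sketch; illustrative names only.
final class OpenConfig {
    private final String dir;
    private boolean readOnly = false;

    OpenConfig(String dir) { this.dir = dir; }

    void setReadOnly(boolean readOnly) { this.readOnly = readOnly; }
    boolean isReadOnly() { return readOnly; }
    String getDir() { return dir; }
}

abstract class SwitchingReader {
    // Callers only see SwitchingReader; the factory picks the subclass.
    static SwitchingReader open(OpenConfig cfg) {
        return cfg.isReadOnly()
            ? new ReadOnlyReader(cfg.getDir())
            : new WritableReader(cfg.getDir());
    }

    abstract boolean canDelete();
}

final class ReadOnlyReader extends SwitchingReader {
    private final String dir;
    ReadOnlyReader(String dir) { this.dir = dir; }
    @Override boolean canDelete() { return false; }
}

final class WritableReader extends SwitchingReader {
    private final String dir;
    WritableReader(String dir) { this.dir = dir; }
    @Override boolean canDelete() { return true; }
}

final class SwitchingFactoryDemo {
    public static void main(String[] args) {
        OpenConfig cfg = new OpenConfig("idx");
        cfg.setReadOnly(true);
        SwitchingReader reader = SwitchingReader.open(cfg);
        System.out.println(reader.getClass().getSimpleName() + " canDelete=" + reader.canDelete());
    }
}
```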
Re: Lucene 2.9 and deprecated IR.open() methods
On Fri, Oct 2, 2009 at 10:18 PM, Earwin Burrfoot ear...@gmail.com wrote:

Call me old fashioned, but I like how the non constructor params are set now.

And what happens when you index some docs, change these params, index more docs, change params, commit? Let's throw in some threads? You either end up writing really hairy state control code, or just leave it broken, with a "Don't change parameters after you start pumping docs through it!" plea covering your back somewhere in JavaDocs. If nothing else, having stuff 'final' keeps JIT really happy.

This is a good point: are you allowed to change config settings after creating your IndexWriter/Reader? Today it's ad hoc. EG IW does not allow you to swap out your deletion policy, because it'd be a nightmare to implement. You also can't swap the analyzer. But it does let you change your RAM buffer size, CFS or not, merge factor, etc. We can remove that flexibility (I'm not sure it's compelling), so we can make things final. You can't change read-only after opening your IndexReader. I think it'd make sense to move away from changing settings after construction...

But: the "do we disallow changing config settings after construction?" question is really orthogonal to the "what syntax do we use for construction?" (builder vs config vs zillions-of-ctors).

Mike
RE: Lucene 2.9 and deprecated IR.open() methods
Hi,

The problem is, we have to leave some of the not-yet-deprecated ctors/opens available for a while (not until 4.0 with our new policy), but a user removing all deprecated stuff from his 2.9 release should be able to switch to 3.0 without changing any code (can even plug the jars in). We also have to keep the getters/setters available. If we wanted to change this, 2.9 was the best option :-(

Uwe

- Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de

-----Original Message----- From: Michael McCandless [mailto:luc...@mikemccandless.com] Sent: Saturday, October 03, 2009 11:35 AM To: java-dev@lucene.apache.org Subject: Re: Lucene 2.9 and deprecated IR.open() methods

This is a good point: are you allowed to change config settings after creating your IndexWriter/Reader? Today it's ad hoc. EG IW does not allow you to swap out your deletion policy, because it'd be a nightmare to implement. You also can't swap the analyzer. But it does let you change your RAM buffer size, CFS or not, merge factor, etc. We can remove that flexibility (I'm not sure it's compelling), so we can make things final. You can't change read-only after opening your IndexReader. I think it'd make sense to move away from changing settings after construction...

But: the "do we disallow changing config settings after construction?" question is really orthogonal to the "what syntax do we use for construction?" (builder vs config vs zillions-of-ctors).

Mike
Re: Lucene 2.9 and deprecated IR.open() methods
Well, let's first get 3.0 out the door ;) Then we can salivate over all sorts of juicy changes for 3.1...

These particular changes (switching syntax from multi-ctors to config or to builder, disallowing config changes after creation, switching to concrete impl is hidden) may merit an exception to our back-compat policy. Obviously users are bothered by the horror of how many ctors you are confronted with for IW and IR.

Mike

On Sat, Oct 3, 2009 at 5:46 AM, Uwe Schindler u...@thetaphi.de wrote:

Hi, The problem is, we have to leave some of the not-yet-deprecated ctors/opens available for a while (not until 4.0 with our new policy), but a user removing all deprecated stuff from his 2.9 release should be able to switch to 3.0 without changing any code (can even plug the jars in). We also have to keep the getters/setters available. If we wanted to change this, 2.9 was the best option :-( Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de
Re: Lucene 2.9 and deprecated IR.open() methods
Builder pattern allows you to switch concrete implementations as you please, taking parameters into account or not.

We could also achieve this w/ static factory method. EG IndexReader.open(IndexReader.Config) could switch between concrete impls (it already does today).

Yes, the choice of 'IW.create(IWSettings, Directory)' VS 'IWSettings.create(Directory)' is purely syntactical (with latter being more concise, imo), but I was comparing to 'new IW(Settings, Directory)'.

Call me old fashioned, but I like how the non constructor params are set now.

And what happens when you index some docs, change these params, index more docs, change params, commit? Let's throw in some threads? You either end up writing really hairy state control code, or just leave it broken, with a "Don't change parameters after you start pumping docs through it!" plea covering your back somewhere in JavaDocs. If nothing else, having stuff 'final' keeps JIT really happy.

This is a good point: are you allowed to change config settings after creating your IndexWriter/Reader? Today it's ad hoc. EG IW does not allow you to swap out your deletion policy, because it'd be a nightmare to implement. You also can't swap the analyzer. But it does let you change your RAM buffer size, CFS or not, merge factor, etc. We can remove that flexibility (I'm not sure it's compelling), so we can make things final. You can't change read-only after opening your IndexReader. I think it'd make sense to move away from changing settings after construction...

I've just remembered some horrible things:

    public void setMergeFactor(int mergeFactor) {
      getLogMergePolicy().setMergeFactor(mergeFactor);
    }

Let's remove this flexibility too?

But: the "do we disallow changing config settings after construction?" question is really orthogonal to the "what syntax do we use for construction?" (builder vs config vs zillions-of-ctors).

There's better syntax for both mutable and immutable approach, so it's not like these two questions are completely orthogonal.

-- Kirill Zakharenko/Кирилл Захаренко (ear...@gmail.com) Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423 ICQ: 104465785
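The problem with a writer-level setter that delegates to one particular component type can be reproduced in a few lines. This is a hypothetical model, not Lucene's code: the policy classes are stubs, and signalling the failure with IllegalArgumentException is an assumption of the sketch.

```java
// Hypothetical sketch; illustrative names only.
interface SketchPolicy { }

final class LogLikePolicy implements SketchPolicy {
    private int mergeFactor = 10; // assumed default, for illustration only
    void setMergeFactor(int mergeFactor) { this.mergeFactor = mergeFactor; }
    int getMergeFactor() { return mergeFactor; }
}

// A user-supplied policy that has no notion of a merge factor.
final class CustomPolicy implements SketchPolicy { }

final class DelegatingWriter {
    private final SketchPolicy policy;

    DelegatingWriter(SketchPolicy policy) { this.policy = policy; }

    // Convenience setter on the writer that really belongs to one policy type.
    // It works only while the default-style policy is plugged in.
    void setMergeFactor(int mergeFactor) {
        if (!(policy instanceof LogLikePolicy)) {
            throw new IllegalArgumentException(
                "setMergeFactor only works with LogLikePolicy, got "
                    + policy.getClass().getSimpleName());
        }
        ((LogLikePolicy) policy).setMergeFactor(mergeFactor);
    }
}

final class DelegatingSetterDemo {
    public static void main(String[] args) {
        LogLikePolicy logPolicy = new LogLikePolicy();
        new DelegatingWriter(logPolicy).setMergeFactor(20); // fine
        System.out.println(logPolicy.getMergeFactor());

        try {
            new DelegatingWriter(new CustomPolicy()).setMergeFactor(20); // breaks
        } catch (IllegalArgumentException e) {
            System.out.println("failed: " + e.getMessage());
        }
    }
}
```

Configuring the policy object directly (logPolicy.setMergeFactor(20) before handing it to the writer) avoids the cast entirely, which is the alternative argued for in this thread.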
Re: Lucene 2.9 and deprecated IR.open() methods
There's also LUCENE-1698! Maybe we can change the policy. Now that 2.9 is out we should try to get to a conclusion.

Michael

On 10/3/09 11:54 AM, Michael McCandless wrote:

Well, let's first get 3.0 out the door ;) Then we can salivate over all sorts of juicy changes for 3.1... These particular changes (switching syntax from multi-ctors to config or to builder, disallowing config changes after creation, switching to concrete impl is hidden) may merit an exception to our back-compat policy. Obviously users are bothered by the horror of how many ctors you are confronted with for IW and IR. Mike
RE: Lucene 2.9 and deprecated IR.open() methods
But we should not change this for 3.0, because people already have a lot to do to get their 2.9 code to compile without deprecations. If that work then becomes obsolete because we change something this fundamental, we will make a lot of people angry. So I would do this for 3.1.

- Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de

-----Original Message----- From: Michael Busch [mailto:busch...@gmail.com] Sent: Saturday, October 03, 2009 12:15 PM To: java-dev@lucene.apache.org Subject: Re: Lucene 2.9 and deprecated IR.open() methods

There's also LUCENE-1698! Maybe we can change the policy. Now that 2.9 is out we should try to get to a conclusion. Michael
Re: Lucene 2.9 and deprecated IR.open() methods
Right: 3.0 should be a fast turnaround w/ no further deprecations. (And at your rate of progress Uwe it looks like it really *will* be fast!). For 3.1 we can salivate... Mike On Sat, Oct 3, 2009 at 6:18 AM, Uwe Schindler u...@thetaphi.de wrote: But we should not change for 3.0, because people have already much to do to get their 2.9 compile without deprec. If the work is then obsolete, because we change this fundamental, we will make a lot of people angry. So I would do this for 3.1. - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de -Original Message- From: Michael Busch [mailto:busch...@gmail.com] Sent: Saturday, October 03, 2009 12:15 PM To: java-dev@lucene.apache.org Subject: Re: Lucene 2.9 and deprecated IR.open() methods There's also LUCENE-1698! Maybe we can change the policy. Now that 2.9 is out we should try to get to a conclusion. Michael On 10/3/09 11:54 AM, Michael McCandless wrote: Well, let's first get 3.0 out the door ;) Then we can salivate over all sorts of juicy changes for 3.1... These particular changes (switching syntax from multi-ctors to config or to builder, disallowing config changes after creation, switching to concrete impl is hidden) may merit an exception to our back-compat policy. Obviously users are bothered by the horror of how many ctors you are confronted with for IW and IR. Mike On Sat, Oct 3, 2009 at 5:46 AM, Uwe Schindleru...@thetaphi.de wrote: Hi, The problem is, we have to leave some of the not-yet-deprecated ctors/opens available for a while (not until 4.0 with our ne policy), but a user removing all deprecated stuff from his 2.9 release should be able to switch to 3.0 without changing any code (can even plug the jars in). We also have to keep the getters/setter avail. 
If we wanted to change this, 2.9 was the best option :-( Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de -Original Message- From: Michael McCandless [mailto:luc...@mikemccandless.com] Sent: Saturday, October 03, 2009 11:35 AM To: java-dev@lucene.apache.org Subject: Re: Lucene 2.9 and deprecated IR.open() methods On Fri, Oct 2, 2009 at 10:18 PM, Earwin Burrfootear...@gmail.com wrote: Call me old fashioned, but I like how the non constructor params are set now. And what happens when you index some docs, change these params, index more docs, change params, commit? Let's throw in some threads? You either end up writing really hairy state control code, or just leave it broken, with Don't change parameters after you start pumping docs through it! plea covering your back somewhere in JavaDocs. If nothing else, having stuff 'final' keeps JIT really happy. This is a good point: are you allowed to change config settings after creating your IndexWriter/Reader? Today it's ad hoc. EG IW does not allow you to swap out your deletion policy, because it'd be a nightmare to implement. You also can't swap the analyzer. But it does let you change your RAM buffer size, CFS or not, merge factor, etc. We can remove that flexibility (I'm not sure it's compelling), so we can make things final. You can't change read-only after opening your IndexReader. I think it'd make sense to move away from changing settings after construction... But: the do we disallow changing config settings after construction? question is really orthogonal to the what syntax do we use for construction? (builder vs config vs zillions-of-ctors). 
Mike
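The "settings are final after construction" idea discussed above can be sketched in a few lines of Java. This is only an illustration under stated assumptions: `SketchWriterConfig` and `SketchWriter` are hypothetical names, not Lucene classes, and the default values are made up. The writer copies the values it needs into final fields when it is constructed, so later changes to the config object cannot reach it.

```java
// Hypothetical names throughout; none of these are Lucene classes.

/** Mutable settings holder that the caller fills in before construction. */
final class SketchWriterConfig {
    private double ramBufferSizeMB = 16.0;   // made-up default, for illustration only
    private boolean useCompoundFile = true;  // made-up default, for illustration only

    SketchWriterConfig setRAMBufferSizeMB(double mb) {
        if (mb <= 0) {
            throw new IllegalArgumentException("ramBufferSizeMB must be > 0");
        }
        this.ramBufferSizeMB = mb;
        return this;
    }

    SketchWriterConfig setUseCompoundFile(boolean useCompoundFile) {
        this.useCompoundFile = useCompoundFile;
        return this;
    }

    double getRAMBufferSizeMB() { return ramBufferSizeMB; }
    boolean getUseCompoundFile() { return useCompoundFile; }
}

/** Writer whose settings are copied once and are final from then on. */
final class SketchWriter {
    private final double ramBufferSizeMB;
    private final boolean useCompoundFile;

    SketchWriter(SketchWriterConfig config) {
        // Snapshot the values; the writer never looks at the config object again.
        this.ramBufferSizeMB = config.getRAMBufferSizeMB();
        this.useCompoundFile = config.getUseCompoundFile();
    }

    double getRAMBufferSizeMB() { return ramBufferSizeMB; }
    boolean getUseCompoundFile() { return useCompoundFile; }
}

class FinalSettingsDemo {
    public static void main(String[] args) {
        SketchWriterConfig config = new SketchWriterConfig().setRAMBufferSizeMB(128);
        SketchWriter writer = new SketchWriter(config);

        // Changing the config afterwards has no effect on the existing writer.
        config.setRAMBufferSizeMB(256);
        System.out.println(writer.getRAMBufferSizeMB()); // prints 128.0
    }
}
```

With this shape there is no state-control code to write for "change params, index more docs, change params again", because the writer offers no setters at all.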
Re: Lucene 2.9 and deprecated IR.open() methods
I agree, we have announced the 2.9/3.0 release plans a long time ago already and shouldn't change anything. But ideally I'd like to announce any backwards-compatibility changes together with the 3.0 release, while mentioning that the changes will take effect from 3.1 on. That's why I'd like to get to a conclusion soon. Michael On 10/3/09 12:18 PM, Uwe Schindler wrote: But we should not change for 3.0, because people have already much to do to get their 2.9 compile without deprec. If the work is then obsolete, because we change this fundamental, we will make a lot of people angry. So I would do this for 3.1. - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de -Original Message- From: Michael Busch [mailto:busch...@gmail.com] Sent: Saturday, October 03, 2009 12:15 PM To: java-dev@lucene.apache.org Subject: Re: Lucene 2.9 and deprecated IR.open() methods There's also LUCENE-1698! Maybe we can change the policy. Now that 2.9 is out we should try to get to a conclusion. Michael On 10/3/09 11:54 AM, Michael McCandless wrote: Well, let's first get 3.0 out the door ;) Then we can salivate over all sorts of juicy changes for 3.1... These particular changes (switching syntax from multi-ctors to config or to builder, disallowing config changes after creation, switching to concrete impl is hidden) may merit an exception to our back-compat policy. Obviously users are bothered by the horror of how many ctors you are confronted with for IW and IR. Mike On Sat, Oct 3, 2009 at 5:46 AM, Uwe Schindler u...@thetaphi.de wrote: Hi, The problem is, we have to leave some of the not-yet-deprecated ctors/opens available for a while (not until 4.0 with our new policy), but a user removing all deprecated stuff from his 2.9 release should be able to switch to 3.0 without changing any code (can even plug the jars in). We also have to keep the getters/setters available.
If we wanted to change this, 2.9 was the best option :-( Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de -Original Message- From: Michael McCandless [mailto:luc...@mikemccandless.com] Sent: Saturday, October 03, 2009 11:35 AM To: java-dev@lucene.apache.org Subject: Re: Lucene 2.9 and deprecated IR.open() methods On Fri, Oct 2, 2009 at 10:18 PM, Earwin Burrfootear...@gmail.com wrote: Call me old fashioned, but I like how the non constructor params are set now. And what happens when you index some docs, change these params, index more docs, change params, commit? Let's throw in some threads? You either end up writing really hairy state control code, or just leave it broken, with Don't change parameters after you start pumping docs through it! plea covering your back somewhere in JavaDocs. If nothing else, having stuff 'final' keeps JIT really happy. This is a good point: are you allowed to change config settings after creating your IndexWriter/Reader? Today it's ad hoc. EG IW does not allow you to swap out your deletion policy, because it'd be a nightmare to implement. You also can't swap the analyzer. But it does let you change your RAM buffer size, CFS or not, merge factor, etc. We can remove that flexibility (I'm not sure it's compelling), so we can make things final. You can't change read-only after opening your IndexReader. I think it'd make sense to move away from changing settings after construction... But: the do we disallow changing config settings after construction? question is really orthogonal to the what syntax do we use for construction? (builder vs config vs zillions-of-ctors). 
Mike
RE: Lucene 2.9 and deprecated IR.open() methods
Now it gets slower. After applying LUCENE-1944, you get 600 errors when compiling tests :( We should have checked our tests in 2.9 that they only call deprecated methods for BW compatibility. Now I have to change tons of IR.open(), IW() calls in backwards branch and also in trunk tests. But the patch is currently the same for both branches - phew. Completely unhappy :-( Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de -Original Message- From: Michael McCandless [mailto:luc...@mikemccandless.com] Sent: Saturday, October 03, 2009 12:21 PM To: java-dev@lucene.apache.org Subject: Re: Lucene 2.9 and deprecated IR.open() methods Right: 3.0 should be a fast turnaround w/ no further deprecations. (And at your rate of progress Uwe it looks like it really *will* be fast!). For 3.1 we can salivate... Mike On Sat, Oct 3, 2009 at 6:18 AM, Uwe Schindler u...@thetaphi.de wrote: But we should not change for 3.0, because people have already much to do to get their 2.9 compile without deprec. If the work is then obsolete, because we change this fundamental, we will make a lot of people angry. So I would do this for 3.1. - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de -Original Message- From: Michael Busch [mailto:busch...@gmail.com] Sent: Saturday, October 03, 2009 12:15 PM To: java-dev@lucene.apache.org Subject: Re: Lucene 2.9 and deprecated IR.open() methods There's also LUCENE-1698! Maybe we can change the policy. Now that 2.9 is out we should try to get to a conclusion. Michael On 10/3/09 11:54 AM, Michael McCandless wrote: Well, let's first get 3.0 out the door ;) Then we can salivate over all sorts of juicy changes for 3.1... These particular changes (switching syntax from multi-ctors to config or to builder, disallowing config changes after creation, switching to concrete impl is hidden) may merit an exception to our back-compat policy.
Obviously users are bothered by the horror of how many ctors you are confronted with for IW and IR. Mike On Sat, Oct 3, 2009 at 5:46 AM, Uwe Schindleru...@thetaphi.de wrote: Hi, The problem is, we have to leave some of the not-yet-deprecated ctors/opens available for a while (not until 4.0 with our ne policy), but a user removing all deprecated stuff from his 2.9 release should be able to switch to 3.0 without changing any code (can even plug the jars in). We also have to keep the getters/setter avail. If we wanted to change this, 2.9 was the best option :-( Uwe - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de -Original Message- From: Michael McCandless [mailto:luc...@mikemccandless.com] Sent: Saturday, October 03, 2009 11:35 AM To: java-dev@lucene.apache.org Subject: Re: Lucene 2.9 and deprecated IR.open() methods On Fri, Oct 2, 2009 at 10:18 PM, Earwin Burrfootear...@gmail.com wrote: Call me old fashioned, but I like how the non constructor params are set now. And what happens when you index some docs, change these params, index more docs, change params, commit? Let's throw in some threads? You either end up writing really hairy state control code, or just leave it broken, with Don't change parameters after you start pumping docs through it! plea covering your back somewhere in JavaDocs. If nothing else, having stuff 'final' keeps JIT really happy. This is a good point: are you allowed to change config settings after creating your IndexWriter/Reader? Today it's ad hoc. EG IW does not allow you to swap out your deletion policy, because it'd be a nightmare to implement. You also can't swap the analyzer. But it does let you change your RAM buffer size, CFS or not, merge factor, etc. We can remove that flexibility (I'm not sure it's compelling), so we can make things final. You can't change read- only after opening your IndexReader. I think it'd make sense to move away from changing settings after construction... 
But: the "do we disallow changing config settings after construction?" question is really orthogonal to the "what syntax do we use for construction?" question (builder vs config vs zillions-of-ctors). Mike
Re: Lucene 2.9 and deprecated IR.open() methods
On Sat, Oct 3, 2009 at 6:25 AM, Uwe Schindler u...@thetaphi.de wrote: Now it gets slower. After applying LUCENE-1944, you get 600 errors when compiling tests :( We should have checked our tests in 2.9 that they only call deprecated methods for BW compatibility. Sigh. Yes, going forward we should probably always fix tests to not use deprecated APIs anymore, at the same time that we deprecate. Now I have to change tons of IR.open(), IW() calls in backwards branch and also in trunk tests. But the patch is currently the same for both branches - phew. Maybe we should re-cut the back-compat branch after removal of all deprecated APIs? Completely unhappy :-( Sorry :( Take a deep breath. Go consume some coffee or dark chocolate (or, maybe, a beer!) :) Mike
RE: Lucene 2.9 and deprecated IR.open() methods
From: Michael McCandless [mailto:luc...@mikemccandless.com] Sent: Saturday, October 03, 2009 12:29 PM To: java-dev@lucene.apache.org Subject: Re: Lucene 2.9 and deprecated IR.open() methods On Sat, Oct 3, 2009 at 6:25 AM, Uwe Schindler u...@thetaphi.de wrote: Now it gets slower. After applying LUCENE-1944, you get 600 errors when compiling tests :( We should have checked our tests in 2.9 that they only call deprecated methods for BW compatibility. Sigh. Yes, going forward we should probably always fix tests to not use deprecated APIs anymore, at the same time that we deprecate. Now I have to change tons of IR.open(), IW() calls in backwards branch and also in trunk tests. But the patch is currently the same for both branches - phew. Maybe we should re-cut the back-compat branch after removal of all deprecated APIs? No, better apply the patch on both branches. Because I changed generics in TokenStream API and want to be sure that the non-generified BW branch works. Now it would not even compile anymore, because BW branch is forced to Java 1.4. The simplest is really to create a patch and apply it to both branches or merge it using SVN. That's my smallest problem. Completely unhappy :-( Sorry :( Take a deep breath. Go consume some coffee or dark chocolate (or, maybe, a beer!) :) Mike
RE: Lucene 2.9 and deprecated IR.open() methods
I have a plan how to do the tests: I use my BW branch checkout and enable deprecation warnings there. I then start to fix all deprec usage and remove all code parts that are only there to test bw compatibility (e.g. TestTokenStreamBWCompatibiliy). After that the test should compile without deprec warnings. When this is done commit and create new TAG. After that apply the patch also to trunk - tests should compile :-) - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de -Original Message- From: Uwe Schindler [mailto:u...@thetaphi.de] Sent: Saturday, October 03, 2009 12:33 PM To: java-dev@lucene.apache.org Subject: RE: Lucene 2.9 and deprecated IR.open() methods From: Michael McCandless [mailto:luc...@mikemccandless.com] Sent: Saturday, October 03, 2009 12:29 PM To: java-dev@lucene.apache.org Subject: Re: Lucene 2.9 and deprecated IR.open() methods On Sat, Oct 3, 2009 at 6:25 AM, Uwe Schindler u...@thetaphi.de wrote: Now it gets slower. After applying LUCENE-1944, you get 600 errors when compiling tests :( We should have checked our tests in 2.9 that they only call deprecated methods for BW compatibility. Sigh. Yes, going forward we should probably always fix tests to not use deprecated APIs anymore, at the same time that we deprecate. Now I have to change tons of IR.open(), IW() calls in backwards branch and also in trunk tests. But the patch is currently the same for both branches - puh. Maybe we should re-cut the back-compat branch after removal of all deprecated APIs? No, better apply the patch on both branches. Because I changed generics in TokenStream API and want to be sure, that not generified BW branch works. Now it would not even compile anymore, because BW branch is forced to Java 1.4. The simpliest is really to create a patch and apply it to both branches or merge it using SVN. That's my smallest problem. Completely unhappy :-( Sorry :( Take a deep breath. 
Go consume some coffee or dark chocolate (or, maybe, a beer!) :) Mike
Re: Lucene 2.9 and deprecated IR.open() methods
On Oct 2, 2009, at 7:33 PM, Michael McCandless wrote: Sigh. The introduction of new but deprecated methods is silly. Is there some simple automated way to catch/prevent these? The proliferation of ctors/factory methods is a nightmare. Ah, so yet again, we are trying to work around a problem that is due to the ridiculousness of how we manage releases and deprecations and not necessarily something that is technically wrong. It's not like this is news. I've been complaining about the # of ctors for a long time (try training people on this stuff and you'll know what I mean). I'm not trying to be antagonistic, but if we would all just face facts that we do releases so few and far between that I just don't see it as being some massive hardship to remove some deprecations more often than every major release. It's funny, we add things in an agile way and everyone loves that, but we remove them in such a drawn out and monolithic manner that it is mind-boggling. We induce way more confusion than we prevent. Any sane programmer out there has to do more than just drop in any release, no matter what, (in other words, the whole drop in back compat thing is a myth, so get over it) and as soon as they start looking at the myriad of options, they are going to be confused. Far better for us just to remove an inferior method, with some smaller amount of warning, than to leave them guessing. Not only that, but as is evidenced by the new Token stuff, using deprecated and new stuff together may be even worse than just getting rid of the old stuff. Simply put, I propose we adopt a model we've all discussed many times before where we mark deprecated items with the version they will be removed in, regardless of minor/major number, with the caveat that it must be at least one more minor version (i.e. announce deprecation in 2.4.0, remove in 2.5.0). 
Major versions then are about what we all expect out of major versions from every other software package in the land: major new features or near complete overhaul of existing functionality. With this model, we won't have massive amounts of deprecation piling up, our users are still given plenty of warning whereby they can _plan_ for it, and we have more flexibility in how we develop. -Grant
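Grant's proposal (mark each deprecated item with the version it will be removed in) maps naturally onto the Javadoc `@deprecated` tag. The following is a minimal sketch, not Lucene code: `SketchReader` is a hypothetical class, the 2.4.0/2.5.0 numbers are taken from the example in the message above, and the implicit default of the old overload is an assumption of the sketch.

```java
// Hypothetical class; only the deprecation convention is the point here.
final class SketchReader {
    private final String path;
    private final boolean readOnly;

    private SketchReader(String path, boolean readOnly) {
        this.path = path;
        this.readOnly = readOnly;
    }

    /**
     * Opens a reader with an implicit read/write default.
     *
     * @deprecated Deprecated in 2.4.0, scheduled for removal in 2.5.0.
     *             Use {@link #open(String, boolean)} and state readOnly explicitly.
     */
    @Deprecated
    static SketchReader open(String path) {
        // Assumed old behaviour for this sketch: not read-only.
        return open(path, false);
    }

    /** Replacement method: the caller must say whether the reader is read-only. */
    static SketchReader open(String path, boolean readOnly) {
        return new SketchReader(path, readOnly);
    }

    String getPath() { return path; }
    boolean isReadOnly() { return readOnly; }
}
```

A user compiling against this class with deprecation warnings enabled sees both the warning and, in the Javadoc, the exact release by which the call has to be migrated.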
RE: Lucene 2.9 and deprecated IR.open() methods
This seems to work, I have created some scripts that do the compilations and create a deprecation report and I start to fix in BW branch. The easiest is first to just remove a lot of tests that only test the BW compatibility API. I will post something as soon as I have removed most deprec warnings. - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de -Original Message- From: Uwe Schindler [mailto:u...@thetaphi.de] Sent: Saturday, October 03, 2009 12:39 PM To: java-dev@lucene.apache.org Subject: RE: Lucene 2.9 and deprecated IR.open() methods I have a plan how to do the tests: I use my BW branch checkout and enable deprecation warnings there. I then start to fix all deprec usage and remove all code parts that are only there to test bw compatibility (e.g. TestTokenStreamBWCompatibiliy). After that the test should compile without deprec warnings. When this is done commit and create new TAG. After that apply the patch also to trunk - tests should compile :-) - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de -Original Message- From: Uwe Schindler [mailto:u...@thetaphi.de] Sent: Saturday, October 03, 2009 12:33 PM To: java-dev@lucene.apache.org Subject: RE: Lucene 2.9 and deprecated IR.open() methods From: Michael McCandless [mailto:luc...@mikemccandless.com] Sent: Saturday, October 03, 2009 12:29 PM To: java-dev@lucene.apache.org Subject: Re: Lucene 2.9 and deprecated IR.open() methods On Sat, Oct 3, 2009 at 6:25 AM, Uwe Schindler u...@thetaphi.de wrote: Now it gets slower. After applying LUCENE-1944, you get 600 errors when compiling tests :( We should have checked our tests in 2.9 that they only call deprecated methods for BW compatibility. Sigh. Yes, going forward we should probably always fix tests to not use deprecated APIs anymore, at the same time that we deprecate. Now I have to change tons of IR.open(), IW() calls in backwards branch and also in trunk tests.
But the patch is currently the same for both branches - phew. Maybe we should re-cut the back-compat branch after removal of all deprecated APIs? No, better apply the patch on both branches. Because I changed generics in TokenStream API and want to be sure that the non-generified BW branch works. Now it would not even compile anymore, because BW branch is forced to Java 1.4. The simplest is really to create a patch and apply it to both branches or merge it using SVN. That's my smallest problem. Completely unhappy :-( Sorry :( Take a deep breath. Go consume some coffee or dark chocolate (or, maybe, a beer!) :) Mike
RE: Lucene 2.9 and deprecated IR.open() methods
Don't be surprised: I will now commit lots of test fixes for IR.open() in backwards branch and then merge to trunk! - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de -Original Message- From: Uwe Schindler [mailto:u...@thetaphi.de] Sent: Saturday, October 03, 2009 2:00 PM To: java-dev@lucene.apache.org Subject: RE: Lucene 2.9 and deprecated IR.open() methods This seems to work, I have created some scripts that do the compilations and create a deprecation report and I start to fix in BW branch. The easiest is first to just remove a lot of tests that only test the BW compatibility API. I will post something as soon as I have removed most deprec warnings. - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de -Original Message- From: Uwe Schindler [mailto:u...@thetaphi.de] Sent: Saturday, October 03, 2009 12:39 PM To: java-dev@lucene.apache.org Subject: RE: Lucene 2.9 and deprecated IR.open() methods I have a plan how to do the tests: I use my BW branch checkout and enable deprecation warnings there. I then start to fix all deprec usage and remove all code parts that are only there to test bw compatibility (e.g. TestTokenStreamBWCompatibiliy). After that the test should compile without deprec warnings. When this is done commit and create new TAG. After that apply the patch also to trunk - tests should compile :-) - Uwe Schindler H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de eMail: u...@thetaphi.de -Original Message- From: Uwe Schindler [mailto:u...@thetaphi.de] Sent: Saturday, October 03, 2009 12:33 PM To: java-dev@lucene.apache.org Subject: RE: Lucene 2.9 and deprecated IR.open() methods From: Michael McCandless [mailto:luc...@mikemccandless.com] Sent: Saturday, October 03, 2009 12:29 PM To: java-dev@lucene.apache.org Subject: Re: Lucene 2.9 and deprecated IR.open() methods On Sat, Oct 3, 2009 at 6:25 AM, Uwe Schindler u...@thetaphi.de wrote: Now it gets slower.
After applying LUCENE-1944, you get 600 errors when compiling tests :( We should have checked our tests in 2.9 that they only call deprecated methods for BW compatibility. Sigh. Yes, going forward we should probably always fix tests to not use deprecated APIs anymore, at the same time that we deprecate. Now I have to change tons of IR.open(), IW() calls in backwards branch and also in trunk tests. But the patch is currently the same for both branches - phew. Maybe we should re-cut the back-compat branch after removal of all deprecated APIs? No, better apply the patch on both branches. Because I changed generics in TokenStream API and want to be sure that the non-generified BW branch works. Now it would not even compile anymore, because BW branch is forced to Java 1.4. The simplest is really to create a patch and apply it to both branches or merge it using SVN. That's my smallest problem. Completely unhappy :-( Sorry :( Take a deep breath. Go consume some coffee or dark chocolate (or, maybe, a beer!) :) Mike
Re: Lucene 2.9 and deprecated IR.open() methods
On 10/3/09 4:18 AM, Earwin Burrfoot wrote: Builder pattern allows you to switch concrete implementations as you please, taking parameters into account or not. Besides that there's no real difference. I prefer builder, but that's just me :) Why can't you do that with a factory that takes a config object as parameter? Seems very similar to me... the only difference is syntax, isn't it? And if you have setter methods on the config object or methods that return this that you can concatenate is just personal preference, right? Personally I prefer the setter methods for our usecase, simply because there are so many config options. Maybe you don't want to set them all in the same places in your app code? E.g. in our app we have a method like applyIWConfig(IndexWriter) that, as the name says, applies all settings we have in a customizable config file. However, some IW settings are not customizable, and applied somewhere else in our code. I think with the concatenation pattern this would look less intuitive than with good old setter methods. You'd have to change applyIWConfig(IndexWriter.Builder) to return IW.Builder and do the concatenation both in the method and in the caller. But, like Mark said, maybe this is just my personal preference and for others not compelling arguments. Or maybe I'm missing some other advantage of the builder pattern? I haven't used/implemented it myself very much yet... Michael Thats just me though. Michael McCandless wrote: OK, I agree, using the builder approach looks compelling! Though what about required settings? EG IW's builder must have Directory, Analyzer. Would we pass these as up-front args to the initial builder? And shouldn't we still specify the version up-front so we can improve defaults over time without breaking back-compat? (Else, how can we change defaults?) EG: IndexWriter.builder(Version.29, dir, analyzer) .setRAMBufferSizeMB(128) .setUseCompoundFile(false) ... .create() ? 
Mike On Fri, Oct 2, 2009 at 7:45 PM, Earwin Burrfootear...@gmail.com wrote: On Sat, Oct 3, 2009 at 03:29, Uwe Schindleru...@thetaphi.de wrote: It is also probably a good idea to move various settings methods from IW to that builder and have IW immutable in regards to configuration. I'm speaking of the likes of setWriteLockTimeout, setRAMBufferSizeMB, setMergePolicy, setMergeScheduler, setSimilarity. IndexWriter.Builder iwb = IndexWriter.builder(). writeLockTimeout(0). RAMBufferSize(config.indexationBufferMB). maxBufferedDocs(...). similarity(...). analyzer(...); ... = iwb.build(dir1); ... = iwb.build(dir2); A happy user of google-collections API :-) These builders are really cool! I feel myself caught in the act. There is still a couple of things bothering me. 1. Introducing a builder, we'll have a whole heap of deprecated constructors that will hang there for eternity. And then users will scream in frustration - This class has 14(!) constructors and all of them are deprecated! How on earth am I supposed to create this thing? 2. If someone creates IW with some reflectish javabeanish tools - he's busted. Not that I'm feeling compassionate for such a person. I like Earwin's version more. A builder is very flexible, because you can concat all your properties (like StringBuilder works with its append method returning itself) and create the instance at the end. Besides (arguably) cleaner syntax, the lack of which is (arguably) a curse of many Java libraries, it also allows us to return a different concrete implementation of IW without breaking back-compat, and also to choose this concrete implementation based on settings provided. If we feel like doing it at some point. 
-- Kirill Zakharenko/Кирилл Захаренко (ear...@gmail.com) Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423 ICQ: 104465785 -- - Mark http://www.lucidimagination.com
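The builder shape sketched in the messages above (required settings and version passed up front, optional settings chained, then `create()`) could look roughly like this in compilable Java. This is a sketch under stated assumptions, not Lucene's API: `SketchIndexWriter`, `SketchVersion` and the default values are hypothetical, and plain strings stand in for Directory and Analyzer. Note that `Version.29` from the e-mail is not a legal Java identifier, so the sketch uses `V_29`.

```java
// Hypothetical stand-ins; none of these types exist in Lucene.
enum SketchVersion { V_29, V_30 }

final class SketchIndexWriter {
    private final String directory;        // stand-in for a Directory
    private final String analyzer;         // stand-in for an Analyzer
    private final double ramBufferSizeMB;
    private final boolean useCompoundFile;

    private SketchIndexWriter(Builder b) {
        this.directory = b.directory;
        this.analyzer = b.analyzer;
        this.ramBufferSizeMB = b.ramBufferSizeMB;
        this.useCompoundFile = b.useCompoundFile;
    }

    /** Required settings and the version go in up front. */
    static Builder builder(SketchVersion version, String directory, String analyzer) {
        return new Builder(version, directory, analyzer);
    }

    String getDirectory() { return directory; }
    String getAnalyzer() { return analyzer; }
    double getRAMBufferSizeMB() { return ramBufferSizeMB; }
    boolean getUseCompoundFile() { return useCompoundFile; }

    static final class Builder {
        private final String directory;
        private final String analyzer;
        private double ramBufferSizeMB;
        private boolean useCompoundFile = true; // made-up default

        private Builder(SketchVersion version, String directory, String analyzer) {
            if (version == null || directory == null || analyzer == null) {
                throw new NullPointerException("version, directory and analyzer are required");
            }
            this.directory = directory;
            this.analyzer = analyzer;
            // Defaults can differ per version without breaking old callers.
            // The numbers are invented for this sketch.
            this.ramBufferSizeMB = (version == SketchVersion.V_29) ? 16.0 : 32.0;
        }

        Builder setRAMBufferSizeMB(double mb) {
            this.ramBufferSizeMB = mb;
            return this; // returning this is what allows chaining
        }

        Builder setUseCompoundFile(boolean useCompoundFile) {
            this.useCompoundFile = useCompoundFile;
            return this;
        }

        /** The only way to obtain a writer; the result has no setters. */
        SketchIndexWriter create() {
            return new SketchIndexWriter(this);
        }
    }
}

class BuilderSketchDemo {
    public static void main(String[] args) {
        SketchIndexWriter writer = SketchIndexWriter
            .builder(SketchVersion.V_29, "dir1", "standard")
            .setRAMBufferSizeMB(128)
            .setUseCompoundFile(false)
            .create();
        System.out.println(writer.getRAMBufferSizeMB()); // prints 128.0
    }
}
```

Because `create()` is the single construction point, it is also where a different concrete implementation could be returned based on the settings, which is the back-compat advantage Earwin mentions.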
Re: Lucene 2.9 and deprecated IR.open() methods
I think my preference is swayed by convention/simplicity. The way things are done now is just very intuitive for me. When I sit down to write some code with Lucene, I barely have to think or remember much. It all sticks. It's mostly all basic Java with few patterns. Now Google has used some cool patterns to make things like Guice pretty sweet. And because of what Guice does, they are pretty necessary I think. But every time I go back to work on that code, I have to relearn a bunch of stuff/conventions. It's not difficult - but it's a small brain annoyance. One of the reasons I fell in love with Lucene is that it's just so natural and easy to use and yet still so powerful. Not that I'm claiming the deprecated methods aren't a bit of a pain - but they have never caused me problems. Not a fan of static builder methods either. But hey, sometimes they make sense, so whatever I guess ... Michael Busch wrote: On 10/3/09 4:18 AM, Earwin Burrfoot wrote: Builder pattern allows you to switch concrete implementations as you please, taking parameters into account or not. Besides that there's no real difference. I prefer builder, but that's just me :) Why can't you do that with a factory that takes a config object as parameter? Seems very similar to me... the only difference is syntax, isn't it? And if you have setter methods on the config object or methods that return this that you can concatenate is just personal preference, right? Personally I prefer the setter methods for our usecase, simply because there are so many config options. Maybe you don't want to set them all in the same places in your app code? E.g. in our app we have a method like applyIWConfig(IndexWriter) that, as the name says, applies all settings we have in a customizable config file. However, some IW settings are not customizable, and applied somewhere else in our code. I think with the concatenation pattern this would look less intuitive than with good old setter methods.
You'd have to change applyIWConfig(IndexWriter.Builder) to return IW.Builder and do the concatenation both in the method and in the caller. But, like Mark said, maybe this is just my personal preference and for others not compelling arguments. Or maybe I'm missing some other advantage of the builder pattern? I haven't used/implemented it myself very much yet... Michael Thats just me though. Michael McCandless wrote: OK, I agree, using the builder approach looks compelling! Though what about required settings? EG IW's builder must have Directory, Analyzer. Would we pass these as up-front args to the initial builder? And shouldn't we still specify the version up-front so we can improve defaults over time without breaking back-compat? (Else, how can we change defaults?) EG: IndexWriter.builder(Version.29, dir, analyzer) .setRAMBufferSizeMB(128) .setUseCompoundFile(false) ... .create() ? Mike On Fri, Oct 2, 2009 at 7:45 PM, Earwin Burrfootear...@gmail.com wrote: On Sat, Oct 3, 2009 at 03:29, Uwe Schindleru...@thetaphi.de wrote: It is also probably a good idea to move various settings methods from IW to that builder and have IW immutable in regards to configuration. I'm speaking of the likes of setWriteLockTimeout, setRAMBufferSizeMB, setMergePolicy, setMergeScheduler, setSimilarity. IndexWriter.Builder iwb = IndexWriter.builder(). writeLockTimeout(0). RAMBufferSize(config.indexationBufferMB). maxBufferedDocs(...). similarity(...). analyzer(...); ... = iwb.build(dir1); ... = iwb.build(dir2); A happy user of google-collections API :-) These builders are really cool! I feel myself caught in the act. There is still a couple of things bothering me. 1. Introducing a builder, we'll have a whole heap of deprecated constructors that will hang there for eternity. And then users will scream in frustration - This class has 14(!) constructors and all of them are deprecated! How on earth am I supposed to create this thing? 2. 
If someone creates IW with some reflectish javabeanish tools - he's busted. Not that I'm feeling compassionate for such a person. I like Earwin's version more. A builder is very flexible, because you can concat all your properties (like StringBuilder works with its append method returning itself) and create the instance at the end. Besides (arguably) cleaner syntax, the lack of which is (arguably) a curse of many Java libraries, it also allows us to return a different concrete implementation of IW without breaking back-compat, and also to choose this concrete implementation based on settings provided. If we feel like doing it at some point. -- Kirill Zakharenko/Кирилл Захаренко (ear...@gmail.com) Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423 ICQ: 104465785 - To
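To make the applyIWConfig point quoted above concrete, here is a minimal sketch. It assumes two hypothetical stand-in classes (MutableWriterSettings and ImmutableWriterBuilder, neither of which is a Lucene class) and invented default values. It suggests that the helper only has to return the builder when each chained call produces a new instance; a builder whose methods mutate it and return this could still be passed to a void helper.

```java
import java.util.Collections;
import java.util.Map;

// Hypothetical mutable settings object with plain setters (not a Lucene class).
final class MutableWriterSettings {
    private double ramBufferMB = 16.0;       // arbitrary default for this sketch
    private boolean useCompoundFile = true;  // arbitrary default for this sketch

    void setRamBufferMB(double v) { ramBufferMB = v; }
    void setUseCompoundFile(boolean v) { useCompoundFile = v; }
    double getRamBufferMB() { return ramBufferMB; }
    boolean getUseCompoundFile() { return useCompoundFile; }
}

// Hypothetical immutable builder: every "with" call returns a new instance.
final class ImmutableWriterBuilder {
    final double ramBufferMB;
    final boolean useCompoundFile;

    ImmutableWriterBuilder(double ramBufferMB, boolean useCompoundFile) {
        this.ramBufferMB = ramBufferMB;
        this.useCompoundFile = useCompoundFile;
    }

    ImmutableWriterBuilder withRamBufferMB(double v) {
        return new ImmutableWriterBuilder(v, useCompoundFile);
    }

    ImmutableWriterBuilder withUseCompoundFile(boolean v) {
        return new ImmutableWriterBuilder(ramBufferMB, v);
    }
}

final class ApplyConfigSketch {
    // Setter style: the helper mutates its argument and returns nothing.
    static void applyIWConfig(MutableWriterSettings s, Map<String, String> file) {
        String ram = file.get("ramBufferMB");
        if (ram != null) s.setRamBufferMB(Double.parseDouble(ram));
    }

    // Chaining style on an immutable builder: the result must be returned,
    // and the caller has to continue the chain on the returned value.
    static ImmutableWriterBuilder applyIWConfig(ImmutableWriterBuilder b,
                                                Map<String, String> file) {
        String ram = file.get("ramBufferMB");
        return ram == null ? b : b.withRamBufferMB(Double.parseDouble(ram));
    }

    public static void main(String[] args) {
        Map<String, String> file = Collections.singletonMap("ramBufferMB", "64");

        MutableWriterSettings s = new MutableWriterSettings();
        applyIWConfig(s, file);        // customizable settings from the config file
        s.setUseCompoundFile(false);   // fixed setting, applied somewhere else

        ImmutableWriterBuilder b =
            applyIWConfig(new ImmutableWriterBuilder(16.0, true), file)
                .withUseCompoundFile(false);

        System.out.println(s.getRamBufferMB() + " " + b.ramBufferMB);
    }
}
```

Which of the two shapes reads better is the matter of taste discussed in the thread; the sketch only shows where the extra return value comes from.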
Re: Lucene 2.9 and deprecated IR.open() methods
The builder pattern and the config argument to a factory both have the advantage that you can limit changes after creating an object. Some things are just bad to change in mid-stream. The config argument is nice in that you can pass it around to different stakeholders, but the builder can be used a bit like that as well.

One way to look at it is that a builder is just a config object that happens to have the create method.

On Sat, Oct 3, 2009 at 5:09 PM, Michael Busch busch...@gmail.com wrote:
> But, like Mark said, maybe this is just my personal preference and for others not compelling arguments. Or maybe I'm missing some other advantage of the builder pattern? I haven't used/implemented it myself very much yet...

-- Ted Dunning, CTO DeepDyve
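The "config object that happens to have the create method" view can be sketched as follows. FrozenWriterSketch and its Builder are hypothetical names, and the defaults are invented; the point is only that settings stay changeable on the builder but are frozen once create() has been called.

```java
// Hypothetical stand-in for the object being configured; not a Lucene class.
final class FrozenWriterSketch {
    private final int writeLockTimeoutMs;
    private final double ramBufferMB;

    private FrozenWriterSketch(int writeLockTimeoutMs, double ramBufferMB) {
        this.writeLockTimeoutMs = writeLockTimeoutMs;
        this.ramBufferMB = ramBufferMB;
    }

    // Only getters: configuration cannot change after creation.
    int getWriteLockTimeoutMs() { return writeLockTimeoutMs; }
    double getRamBufferMB() { return ramBufferMB; }

    // The builder is a plain config object with chained setters plus create().
    static final class Builder {
        private int writeLockTimeoutMs = 1000;  // arbitrary default for this sketch
        private double ramBufferMB = 16.0;      // arbitrary default for this sketch

        Builder writeLockTimeoutMs(int v) { this.writeLockTimeoutMs = v; return this; }
        Builder ramBufferMB(double v) { this.ramBufferMB = v; return this; }

        // Copies the current values into a new, unchangeable writer.
        FrozenWriterSketch create() {
            return new FrozenWriterSketch(writeLockTimeoutMs, ramBufferMB);
        }
    }
}

final class FrozenWriterDemo {
    public static void main(String[] args) {
        FrozenWriterSketch.Builder b = new FrozenWriterSketch.Builder().ramBufferMB(128.0);
        FrozenWriterSketch w = b.create();
        b.ramBufferMB(1.0);  // later changes affect only writers created afterwards
        System.out.println(w.getRamBufferMB());
    }
}
```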
Re: Lucene 2.9 and deprecated IR.open() methods
Ted Dunning wrote:
> The builder pattern and the config argument to a factory both have the advantage that you can limit changes after creating an object. Some things are just bad to change in mid-stream. The config argument is nice in that you can pass it around to different stakeholders, but the builder can be used a bit like that as well.

Yeah, that argument has been made. But I've *never* seen it as an issue. It just seems like a solution looking for a problem. I can see how it's cleaner, I'm not missing that point. But it still doesn't make me like it much.

> One way to look at it is that a builder is just a config object that happens to have the create method.
>
> On Sat, Oct 3, 2009 at 5:09 PM, Michael Busch busch...@gmail.com wrote:
>> But, like Mark said, maybe this is just my personal preference and for others not compelling arguments. Or maybe I'm missing some other advantage of the builder pattern? I haven't used/implemented it myself very much yet...
>
> -- Ted Dunning, CTO DeepDyve

--
- Mark
http://www.lucidimagination.com
Re: Lucene 2.9 and deprecated IR.open() methods
I was thinking lately about the large quantity of IndexWriter constructors and IndexReader open methods. I'm not sure if this has been proposed before, but what if we introduced new objects, e.g. IndexWriterConfig and IndexReaderConfig? They would contain getter/setter methods for all the different parameters the various constructors and open methods currently have. Then there would be only one IW constructor taking an IndexWriterConfig object as parameter, and likewise one open method in IR. Going forward we wouldn't have to add/deprecate more ctors or open methods; we could easily extend or deprecate getters/setters in the *Config classes.

Michael

On 10/3/09 12:41 AM, Uwe Schindler wrote:
> When looking for press articles about the release of Lucene 2.9, I found the following one from Bernd Fondermann @ http://it-republik.de/jaxenter/artikel/Apache-Lucene-2.9-2594.html
>
> Translation with Google Translate:
>
> Deprecated
>
> An index reader is created via the static open() factory method, of which there were nine in total in 2.4. Five of them are now deprecated. In 2.9 there are now a total of 14 overloaded open variants, eight of which are deprecated. This means that there are even some additions that were marked as deprecated right at their introduction: confusing. The constructor deprecation orgy continues with StandardAnalyzer, one of the key classes during indexing and querying. This class no longer has a no-argument constructor, which might trip up some downstream libraries that instantiate their analyzer dynamically from a property containing the class name. Instead, a Version object must be passed to select compatibility with 2.4 or 2.9. Both the VERSION_24 and the VERSION_29 parameters are themselves deprecated, though: very confusing! VERSION_CURRENT is the only safe investment in the future, a value which we would certainly have trusted a zero-argument constructor to assign as well. To write an index we need an index writer instance. Again, the majority of the 19 possible constructors are about to be retired.
>
> What was going wrong with the open() hell in IR? Very strange, I should have looked better.
>
> By the way: How to proceed with deprecation removal? Case-by-case (e.g. start with TS API, then these open() calls, then FSDirectory - to list the ones I was involved) or some hyper-patch?
>
> By the way, here is my talk @ Hadoop GetTogether in Berlin: http://blog.isabel-drost.de/index.php/archives/category/events/apache-hadoop-get-together-berlin
>
> Uwe
>
> -
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: u...@thetaphi.de
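A minimal sketch of the config-object idea proposed above, under stated assumptions: WriterConfigSketch and ConfiguredWriterSketch are hypothetical stand-ins (the proposal itself only names IndexWriterConfig and IndexReaderConfig, without any signatures), the three settings and their defaults are invented, and copying the values in the constructor is a choice of this sketch rather than part of the proposal.

```java
// Hypothetical config object in the spirit of the proposed IndexWriterConfig.
final class WriterConfigSketch {
    private String indexPath;               // required, no default
    private boolean create = false;         // arbitrary default for this sketch
    private double ramBufferSizeMB = 16.0;  // arbitrary default for this sketch

    String getIndexPath() { return indexPath; }
    void setIndexPath(String indexPath) { this.indexPath = indexPath; }
    boolean isCreate() { return create; }
    void setCreate(boolean create) { this.create = create; }
    double getRamBufferSizeMB() { return ramBufferSizeMB; }
    void setRamBufferSizeMB(double mb) { this.ramBufferSizeMB = mb; }
}

// Hypothetical writer with exactly one constructor.
final class ConfiguredWriterSketch {
    private final String indexPath;
    private final boolean create;
    private final double ramBufferSizeMB;

    ConfiguredWriterSketch(WriterConfigSketch config) {
        if (config.getIndexPath() == null) {
            throw new IllegalArgumentException("indexPath is required");
        }
        // Copy the values so later changes to the config do not affect this writer.
        this.indexPath = config.getIndexPath();
        this.create = config.isCreate();
        this.ramBufferSizeMB = config.getRamBufferSizeMB();
    }

    String getIndexPath() { return indexPath; }
    boolean isCreate() { return create; }
    double getRamBufferSizeMB() { return ramBufferSizeMB; }
}

final class ConfigObjectDemo {
    public static void main(String[] args) {
        WriterConfigSketch cfg = new WriterConfigSketch();
        cfg.setIndexPath("/tmp/index");
        cfg.setRamBufferSizeMB(128.0);
        ConfiguredWriterSketch writer = new ConfiguredWriterSketch(cfg);
        System.out.println(writer.getIndexPath() + " " + writer.getRamBufferSizeMB());
    }
}
```

Adding a new option then means adding one getter/setter pair to the config class, and retiring an option means deprecating that pair, while the writer keeps its single constructor.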
Re: Lucene 2.9 and deprecated IR.open() methods
It is also probably a good idea to move various settings methods from IW to that builder and have IW immutable in regards to configuration. I'm speaking of the likes of setWriteLockTimeout, setRAMBufferSizeMB, setMergePolicy, setMergeScheduler, setSimilarity.

    IndexWriter.Builder iwb = IndexWriter.builder().
        writeLockTimeout(0).
        RAMBufferSize(config.indexationBufferMB).
        maxBufferedDocs(...).
        similarity(...).
        analyzer(...);

    ... = iwb.build(dir1);
    ... = iwb.build(dir2);

--
Kirill Zakharenko/Кирилл Захаренко (ear...@gmail.com)
Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
ICQ: 104465785
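The builder usage in Earwin's snippet above can be sketched in compilable form. Everything here is an assumption made for illustration: ImmutableWriterSketch is a hypothetical class, directories are plain strings, only two of the settings are modeled, and the defaults are invented. The sketch shows one builder acting as a template for two writers whose configuration cannot change after build().

```java
// Hypothetical writer that is immutable with regard to its configuration.
final class ImmutableWriterSketch {
    private final String directory;
    private final long writeLockTimeout;
    private final double ramBufferMB;

    private ImmutableWriterSketch(String directory, long writeLockTimeout, double ramBufferMB) {
        this.directory = directory;
        this.writeLockTimeout = writeLockTimeout;
        this.ramBufferMB = ramBufferMB;
    }

    String getDirectory() { return directory; }
    long getWriteLockTimeout() { return writeLockTimeout; }
    double getRamBufferMB() { return ramBufferMB; }

    static Builder builder() { return new Builder(); }

    static final class Builder {
        private long writeLockTimeout = 1000L;  // arbitrary default for this sketch
        private double ramBufferMB = 16.0;      // arbitrary default for this sketch

        Builder writeLockTimeout(long v) { this.writeLockTimeout = v; return this; }
        Builder ramBufferSize(double v) { this.ramBufferMB = v; return this; }

        // The directory is supplied last, so one builder can serve several indexes.
        ImmutableWriterSketch build(String directory) {
            return new ImmutableWriterSketch(directory, writeLockTimeout, ramBufferMB);
        }
    }
}

final class BuilderReuseDemo {
    public static void main(String[] args) {
        ImmutableWriterSketch.Builder iwb = ImmutableWriterSketch.builder()
            .writeLockTimeout(0)
            .ramBufferSize(64.0);

        ImmutableWriterSketch w1 = iwb.build("dir1");
        ImmutableWriterSketch w2 = iwb.build("dir2");
        System.out.println(w1.getDirectory() + " " + w2.getDirectory());
    }
}
```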
RE: Lucene 2.9 and deprecated IR.open() methods
> It is also probably a good idea to move various settings methods from IW to that builder and have IW immutable in regards to configuration. I'm speaking of the likes of setWriteLockTimeout, setRAMBufferSizeMB, setMergePolicy, setMergeScheduler, setSimilarity.
>
>     IndexWriter.Builder iwb = IndexWriter.builder().
>         writeLockTimeout(0).
>         RAMBufferSize(config.indexationBufferMB).
>         maxBufferedDocs(...).
>         similarity(...).
>         analyzer(...);
>
>     ... = iwb.build(dir1);
>     ... = iwb.build(dir2);

A happy user of google-collections API :-) These builders are really cool!

Uwe
Re: Lucene 2.9 and deprecated IR.open() methods
Sigh. The introduction of new but deprecated methods is silly. Is there some simple automated way to catch/prevent these?

The proliferation of ctors/factory methods is a nightmare. Part of the story with IndexReader.open is the switch to readOnly IndexReaders. After the long back-compat discussion we settled on adding new ctors as the best way to make the change.

On deprecation of Version.LUCENE_29, that doesn't seem right. In fact I don't think LUCENE_24 should be deprecated, either, since these constants are used by StandardAnalyzer to state compatibility that's equivalent to index format compatibility (from our last discussion).

I think deprecation by separate area makes sense?

Mike
Re: Lucene 2.9 and deprecated IR.open() methods
I think this would make sense... though, it'd be a shame if the simple case becomes overbearing. Maybe we can keep good defaults, but use Version to allow us to change them. So e.g.:

    new IndexWriter(new IndexWriter.Config(dir, analyzer, Version.LUCENE_29));

would be the simple case.

Mike
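The idea of keeping good defaults but letting a Version argument select them might look roughly like this. SketchVersion and VersionedConfigSketch are hypothetical, and the concrete default values per version are invented purely to show the mechanism; nothing here reflects actual Lucene defaults.

```java
// Hypothetical version constants; declaration order doubles as release order.
enum SketchVersion {
    V_24, V_29;

    boolean onOrAfter(SketchVersion other) { return compareTo(other) >= 0; }
}

// Hypothetical config whose defaults depend on the version the caller asks for.
final class VersionedConfigSketch {
    private final String directory;
    private final SketchVersion matchVersion;
    private double ramBufferMB;
    private boolean readOnlyReaders;

    VersionedConfigSketch(String directory, SketchVersion matchVersion) {
        this.directory = directory;
        this.matchVersion = matchVersion;
        // Invented defaults: callers pinned to an old version keep old behavior,
        // callers naming a newer version get the improved defaults.
        this.ramBufferMB = matchVersion.onOrAfter(SketchVersion.V_29) ? 32.0 : 16.0;
        this.readOnlyReaders = matchVersion.onOrAfter(SketchVersion.V_29);
    }

    String getDirectory() { return directory; }
    SketchVersion getMatchVersion() { return matchVersion; }
    double getRamBufferMB() { return ramBufferMB; }
    boolean isReadOnlyReaders() { return readOnlyReaders; }

    // Explicit settings still override whatever the version picked.
    void setRamBufferMB(double v) { this.ramBufferMB = v; }
    void setReadOnlyReaders(boolean v) { this.readOnlyReaders = v; }
}

final class VersionedDefaultsDemo {
    public static void main(String[] args) {
        VersionedConfigSketch oldStyle = new VersionedConfigSketch("dir", SketchVersion.V_24);
        VersionedConfigSketch newStyle = new VersionedConfigSketch("dir", SketchVersion.V_29);
        System.out.println(oldStyle.getRamBufferMB() + " " + newStyle.getRamBufferMB());
    }
}
```

With this shape the simple case stays a one-liner, and a later release could change defaults for a new constant without touching the behavior seen by callers that pass an older one.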
RE: Lucene 2.9 and deprecated IR.open() methods
I like Earwin's version more. A builder is very flexible, because you can concat all your properties (like StringBuilder works with its append method returning itself) and create the instance at the end.

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de
RE: Lucene 2.9 and deprecated IR.open() methods
I like Earwin's version more. A builder is very flexible, because you can concat all your properties (like StringBuilder works with its append method returning itself) and create the instance at the end.

This is a really cool example of this builder pattern: http://google-collections.googlecode.com/svn/trunk/javadoc/com/google/common/collect/MapMaker.html
Re: Lucene 2.9 and deprecated IR.open() methods
On Sat, Oct 3, 2009 at 03:29, Uwe Schindler u...@thetaphi.de wrote:
>> It is also probably a good idea to move various settings methods from IW to that builder and have IW immutable in regards to configuration. I'm speaking of the likes of setWriteLockTimeout, setRAMBufferSizeMB, setMergePolicy, setMergeScheduler, setSimilarity.
>>
>>     IndexWriter.Builder iwb = IndexWriter.builder().
>>         writeLockTimeout(0).
>>         RAMBufferSize(config.indexationBufferMB).
>>         maxBufferedDocs(...).
>>         similarity(...).
>>         analyzer(...);
>>
>>     ... = iwb.build(dir1);
>>     ... = iwb.build(dir2);
>
> A happy user of google-collections API :-) These builders are really cool!

I feel myself caught in the act.

There are still a couple of things bothering me.

1. Introducing a builder, we'll have a whole heap of deprecated constructors that will hang there for eternity. And then users will scream in frustration: "This class has 14(!) constructors and all of them are deprecated! How on earth am I supposed to create this thing?"
2. If someone creates IW with some reflectish javabeanish tools, he's busted. Not that I'm feeling compassionate for such a person.

> I like Earwin's version more. A builder is very flexible, because you can concat all your properties (like StringBuilder works with its append method returning itself) and create the instance at the end.

Besides (arguably) cleaner syntax, the lack of which is (arguably) a curse of many Java libraries, it also allows us to return a different concrete implementation of IW without breaking back-compat, and also to choose this concrete implementation based on settings provided. If we feel like doing it at some point.

--
Kirill Zakharenko/Кирилл Захаренко (ear...@gmail.com)
Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
ICQ: 104465785
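The last point, a builder returning a different concrete implementation depending on the settings, can be illustrated with a small sketch. The interface, both implementations, and the keepInMemory setting are all hypothetical names invented for this example; the thread does not say which settings might drive such a choice.

```java
// Hypothetical writer abstraction; callers only ever see this interface.
interface WriterSketch {
    String describe();
}

final class InMemoryWriterSketch implements WriterSketch {
    public String describe() { return "in-memory writer"; }
}

final class OnDiskWriterSketch implements WriterSketch {
    public String describe() { return "on-disk writer"; }
}

final class WriterSketchBuilder {
    private boolean keepInMemory = false;  // invented setting for this sketch

    WriterSketchBuilder keepInMemory(boolean v) { this.keepInMemory = v; return this; }

    // The return type is the interface, so the concrete class can be picked
    // from the settings (or swapped in a later release) without breaking callers.
    WriterSketch build() {
        return keepInMemory ? new InMemoryWriterSketch() : new OnDiskWriterSketch();
    }
}

final class ImplementationChoiceDemo {
    public static void main(String[] args) {
        WriterSketch a = new WriterSketchBuilder().build();
        WriterSketch b = new WriterSketchBuilder().keepInMemory(true).build();
        System.out.println(a.describe() + " / " + b.describe());
    }
}
```

A public constructor cannot do this, because `new` always yields exactly the class that was named.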
RE: Lucene 2.9 and deprecated IR.open() methods
I have already started removing deprecations in o.a.l.store and making FSDir abstract. This package is finished; now I have to remove all these open()/ctors using getDirectory(). Will post a patch tomorrow! Good night!

Uwe

-
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de
Re: Lucene 2.9 and deprecated IR.open() methods
OK, I agree, using the builder approach looks compelling!

Though what about required settings? E.g. IW's builder must have a Directory and an Analyzer. Would we pass these as up-front args to the initial builder?

And shouldn't we still specify the version up-front so we can improve defaults over time without breaking back-compat? (Else, how can we change defaults?)

E.g.:

    IndexWriter.builder(Version.29, dir, analyzer)
        .setRAMBufferSizeMB(128)
        .setUseCompoundFile(false)
        ...
        .create()

?

Mike

On Fri, Oct 2, 2009 at 7:45 PM, Earwin Burrfoot ear...@gmail.com wrote:
> On Sat, Oct 3, 2009 at 03:29, Uwe Schindler u...@thetaphi.de wrote:
>
> It is also probably a good idea to move various settings methods from IW to that builder and have IW immutable with regard to configuration. I'm speaking of the likes of setWriteLockTimeout, setRAMBufferSizeMB, setMergePolicy, setMergeScheduler, setSimilarity.
>
>     IndexWriter.Builder iwb = IndexWriter.builder().
>         writeLockTimeout(0).
>         RAMBufferSize(config.indexationBufferMB).
>         maxBufferedDocs(...).
>         similarity(...).
>         analyzer(...);
>
>     ... = iwb.build(dir1);
>     ... = iwb.build(dir2);
>
> A happy user of google-collections API :-) These builders are really cool!
>
> I feel myself caught in the act. There are still a couple of things bothering me.
> 1. Introducing a builder, we'll have a whole heap of deprecated constructors that will hang there for eternity. And then users will scream in frustration: "This class has 14(!) constructors and all of them are deprecated! How on earth am I supposed to create this thing?"
> 2. If someone creates IW with some reflectish javabeanish tools, he's busted. Not that I'm feeling compassionate for such a person.
>
> I like Earwin's version more. A builder is very flexible, because you can chain all your properties (like StringBuilder works with its append method returning itself) and create the instance at the end.
>
> Besides (arguably) cleaner syntax, the lack of which is (arguably) a curse of many Java libraries, it also allows us to return a different concrete implementation of IW without breaking back-compat, and also to choose this concrete implementation based on the settings provided. If we feel like doing it at some point.
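The question above, how a builder can take the version up-front so that defaults may change between versions, can be sketched roughly as follows. This is an assumption-laden illustration, not Lucene code: CompatVersion, WriterSettings and the default values (16 and 32 MB) are all made up for the example, and the builder produces a plain immutable settings object rather than a real IndexWriter.

```java
// Hypothetical stand-ins; none of these types or defaults come from Lucene.
enum CompatVersion { V24, V29 }

final class WriterSettings {
    final CompatVersion version;
    final double ramBufferSizeMB;
    final boolean useCompoundFile;

    private WriterSettings(Builder b) {
        this.version = b.version;
        this.ramBufferSizeMB = b.ramBufferSizeMB;
        this.useCompoundFile = b.useCompoundFile;
    }

    // The version is the only up-front argument; it selects the defaults.
    static Builder builder(CompatVersion version) {
        return new Builder(version);
    }

    static final class Builder {
        private final CompatVersion version;
        private double ramBufferSizeMB;
        private boolean useCompoundFile = true;

        private Builder(CompatVersion version) {
            this.version = version;
            // Illustrative values only: a newer version may choose a different default.
            this.ramBufferSizeMB = (version == CompatVersion.V24) ? 16.0 : 32.0;
        }

        // Each setter returns the builder so calls can be chained.
        Builder setRAMBufferSizeMB(double mb) { this.ramBufferSizeMB = mb; return this; }
        Builder setUseCompoundFile(boolean v) { this.useCompoundFile = v; return this; }

        // One builder can produce any number of immutable settings objects.
        WriterSettings create() { return new WriterSettings(this); }
    }
}

class BuilderDemo {
    public static void main(String[] args) {
        WriterSettings s = WriterSettings.builder(CompatVersion.V29)
                .setRAMBufferSizeMB(128)
                .setUseCompoundFile(false)
                .create();
        System.out.println(s.version + " " + s.ramBufferSizeMB + " " + s.useCompoundFile);
    }
}
```

Because the defaults are looked up from the version inside the builder, a later release could add a new constant with different defaults without changing what existing callers get.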
Re: Lucene 2.9 and deprecated IR.open() methods
Call me old fashioned, but I like how the non-constructor params are set now. And for some reason I like a config object over a builder pattern for the required constructor params. That's just me though.

--
- Mark
http://www.lucidimagination.com
Re: Lucene 2.9 and deprecated IR.open() methods
> Though what about required settings? EG IW's builder must have Directory, Analyzer. Would we pass these as up-front args to the initial builder?

I'd try to keep required settings to a minimum. The only one absolutely required, imho, is a Directory, and it's best to specify it in the create() method, so you could set all your IW parameters and then build several instances, for different Directories for example.

If you decide to add more required settings, we're back to square one: after a couple of years we're looking at 14 builder() methods.

Okay, there is a way. Take a look at how Guice handles binding declarations in Modules: different builder methods may return different interfaces implemented by 'this'.

    class IndexWriter {
      public static NoAnalyzerYetBuilder builder() { return new HiddenTrueBuilder(); }

      interface NoAnalyzerYetBuilder {
        NoAnalyzerYetBuilder setRAMBuffer(...);
        NoAnalyzerYetBuilder setUseCompound(...);
        Builder setAnalyzer(Analyzer);
      }

      interface Builder extends NoAnalyzerYetBuilder {
        Builder setRAMBuffer(...);
        Builder setUseCompound(...);
        IndexWriter create(Directory);
      }

      private static class HiddenTrueBuilder implements Builder { }
    }

This approach looks nice from the client side, but is a mess to implement.

> And shouldn't we still specify the version up-front so we can improve defaults over time without breaking back-compat? (Else, how can we change defaults?) EG: IndexWriter.builder(Version.29, dir, analyzer) .setRAMBufferSizeMB(128) .setUseCompoundFile(false) ... .create() ?

It's probably okay to specify the version upfront. But also, nothing bad happens if we do it like:

    IndexWriter.builder().
        defaultsFor(Version.29).
        setRam...

--
Kirill Zakharenko/Кирилл Захаренко (ear...@gmail.com)
Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
ICQ: 104465785
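The staged-builder idea described above (different builder methods returning different interfaces, all implemented by one hidden class) can be filled in as a compilable sketch. The interface names follow the message; FakeAnalyzer, FakeDirectory and FakeWriter are hypothetical stand-ins for the real Lucene types, and the 16 MB default is an arbitrary illustrative value.

```java
// Hypothetical stand-ins for Lucene's Analyzer, Directory and IndexWriter.
final class FakeAnalyzer { }

final class FakeDirectory {
    final String name;
    FakeDirectory(String name) { this.name = name; }
}

final class FakeWriter {
    final FakeAnalyzer analyzer;
    final FakeDirectory directory;
    final double ramBufferMB;

    private FakeWriter(FakeAnalyzer analyzer, FakeDirectory directory, double ramBufferMB) {
        this.analyzer = analyzer;
        this.directory = directory;
        this.ramBufferMB = ramBufferMB;
    }

    // Entry point: create() is not reachable until setAnalyzer() has been called.
    static NoAnalyzerYetBuilder builder() { return new HiddenTrueBuilder(); }

    interface NoAnalyzerYetBuilder {
        NoAnalyzerYetBuilder setRAMBuffer(double mb);
        Builder setAnalyzer(FakeAnalyzer analyzer);
    }

    interface Builder extends NoAnalyzerYetBuilder {
        @Override Builder setRAMBuffer(double mb); // covariant return keeps the "ready" type
        FakeWriter create(FakeDirectory dir);
    }

    // One class implements both stages; only the static return types differ.
    private static final class HiddenTrueBuilder implements Builder {
        private FakeAnalyzer analyzer;
        private double ramBufferMB = 16.0; // arbitrary illustrative default

        @Override public Builder setRAMBuffer(double mb) { ramBufferMB = mb; return this; }
        @Override public Builder setAnalyzer(FakeAnalyzer a) { analyzer = a; return this; }
        @Override public FakeWriter create(FakeDirectory dir) {
            return new FakeWriter(analyzer, dir, ramBufferMB);
        }
    }
}

class StagedBuilderDemo {
    public static void main(String[] args) {
        FakeWriter.Builder ready = FakeWriter.builder()
                .setRAMBuffer(64)
                .setAnalyzer(new FakeAnalyzer());
        // The same configured builder can create writers for several directories.
        FakeWriter w1 = ready.create(new FakeDirectory("dir1"));
        FakeWriter w2 = ready.create(new FakeDirectory("dir2"));
        System.out.println(w1.directory.name + " " + w2.directory.name);
        // FakeWriter.builder().create(...) would not compile: no analyzer yet.
    }
}
```

The compiler, rather than a runtime check, enforces that the required analyzer is set. The cost is the one noted in the message: every additional required setting multiplies the number of intermediate interfaces.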
Re: Lucene 2.9 and deprecated IR.open() methods
> Call me old fashioned, but I like how the non constructor params are set now.

And what happens when you index some docs, change these params, index more docs, change params, commit? Let's throw in some threads? You either end up writing really hairy state control code, or just leave it broken, with a "Don't change parameters after you start pumping docs through it!" plea covering your back somewhere in the JavaDocs. If nothing else, having stuff 'final' keeps the JIT really happy.

> And for some reason I like a config object over a builder pattern for the required constructor params.

The builder pattern allows you to switch concrete implementations as you please, taking parameters into account or not. Besides that there's no real difference. I prefer the builder, but that's just me :)

> Thats just me though.

--
Kirill Zakharenko/Кирилл Захаренко (ear...@gmail.com)
Home / Mobile: +7 (495) 683-567-4 / +7 (903) 5-888-423
ICQ: 104465785
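The argument above for 'final' settings can be illustrated with a minimal sketch, assuming made-up names (FrozenSettings, SketchWriter) that have nothing to do with Lucene's actual classes. The point is only that a writer holding an immutable snapshot cannot have its parameters changed underneath it while documents are being indexed; a "change" means building a new settings object for a new writer.

```java
// Hypothetical illustration; not Lucene's API.
final class FrozenSettings {
    final double ramBufferMB;
    final int maxBufferedDocs;

    FrozenSettings(double ramBufferMB, int maxBufferedDocs) {
        this.ramBufferMB = ramBufferMB;
        this.maxBufferedDocs = maxBufferedDocs;
    }

    // "Changing" a value yields a new object; existing holders are unaffected.
    FrozenSettings withRamBufferMB(double mb) {
        return new FrozenSettings(mb, maxBufferedDocs);
    }
}

final class SketchWriter {
    // Fixed for the writer's lifetime, so no locking or change notification is needed.
    private final FrozenSettings settings;

    SketchWriter(FrozenSettings settings) { this.settings = settings; }

    double ramBufferMB() { return settings.ramBufferMB; }
}

class FrozenDemo {
    public static void main(String[] args) {
        FrozenSettings base = new FrozenSettings(16.0, 1000);
        SketchWriter writer = new SketchWriter(base);

        // Deriving new settings does not touch the writer that is already running.
        FrozenSettings bigger = base.withRamBufferMB(128.0);
        System.out.println(writer.ramBufferMB() + " " + bigger.ramBufferMB);
    }
}
```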
Re: Lucene 2.9 and deprecated IR.open() methods
On Oct 2, 2009, at 10:18 PM, Earwin Burrfoot ear...@gmail.com wrote:
> Builder pattern allows you to switch concrete implementations as you please, taking parameters into account or not. Besides that there's no real difference. I prefer builder, but that's just me :)

Nope. So far it's you and a couple others ;)
Re: Lucene 2.9 and deprecated IR.open() methods
Again, a random opinion from left field: I've used Guice and I like it a lot. Really cool stuff, and I actually prefer it to Spring for injection. But still, for some reason I'd hate to see Lucene start resembling anything in Guice.

I'm not even taking the time to make arguments, so I don't expect these comments to have much weight (they don't by definition), but I'm just putting my opinion out there.

- Mark
http://www.lucidimagination.com (mobile)

On Oct 2, 2009, at 10:10 PM, Earwin Burrfoot ear...@gmail.com wrote:
> Okay, there is a way. Take a look at how Guice handles binding declarations in Modules - different builder methods may return different interfaces implemented by 'this'.