Thanks Hoss for the detailed reply!

About <useCompoundFile>: I understand the simplification this brings to regular users, but I also think we should protect such users from making silly mistakes ONLY because they don't have a deep understanding of the underlying stuff. Packing 20GB segments in a compound file will most likely buy you nothing at search time, and it costs a lot of wasted I/O during indexing.
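A rough sketch of the 20GB concern, in plain Java with no Lucene dependency. The class CfsDecision, the method useCompoundFile and its parameters are hypothetical names for illustration only; they are loosely modeled on the merge policy's noCFSRatio and max CFS segment size settings and are not the actual MergePolicy code.

```java
// Hypothetical sketch, NOT Lucene's implementation: decide whether a newly
// merged segment should be packed into a compound file (CFS).
public class CfsDecision {

    // mergedBytes:        size of the newly merged segment
    // totalIndexBytes:    size of the whole index
    // noCfsRatio:         0.0 = never use CFS, 1.0 = always (subject to the cap)
    // maxCfsSegmentBytes: segments larger than this are never packed
    static boolean useCompoundFile(long mergedBytes, long totalIndexBytes,
                                   double noCfsRatio, long maxCfsSegmentBytes) {
        if (noCfsRatio <= 0.0) {
            return false;               // CFS disabled for merged segments
        }
        if (mergedBytes > maxCfsSegmentBytes) {
            return false;               // too large to be worth the extra I/O
        }
        if (noCfsRatio >= 1.0) {
            return true;                // "always CFS" behaviour
        }
        // Otherwise pack only segments that are a small fraction of the index.
        return mergedBytes <= noCfsRatio * totalIndexBytes;
    }

    public static void main(String[] args) {
        long gb = 1L << 30;
        // Ratio forced to 1.0 (assumed here to model "everything is compound"):
        // the 20GB merged segment gets packed.
        System.out.println(useCompoundFile(20 * gb, 100 * gb, 1.0, Long.MAX_VALUE)); // true
        // A fractional ratio: the same segment is left unpacked.
        System.out.println(useCompoundFile(20 * gb, 100 * gb, 0.1, Long.MAX_VALUE)); // false
    }
}
```

With the ratio forced to 1.0 the large merged segment is rewritten into a CFS; with a fractional ratio or a size cap it is skipped, which is the protection argued for above.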
Maybe we can have a new parameter <alwaysUseCompoundFile> that affects both newly flushed and newly merged segments (it would be translated into the respective IWC and MP settings), and make <useCompoundFile> affect IWC only? I know this is a slight change to back-compat, but it's not a serious one: user indexes will still work as they did, only huge merged segments won't be packed in a CFS. So if anyone asks, we just tell them to migrate to the new API.

As for keeping code in for as long as we can ... I have a problem with that. It's like we try to "educate" users about the best use of Solr through WARN messages, but then never make them do the actual cut unless it gets in the developers' way. I'd prefer that we treat the XML files like any other API -- we deprecate and remove in the next major release. Users have all the time through 4.x to get used to the new API. When 5.0 is out, they have to make the cut, and since we also document that in the migration guide, it should be enough, no?

Shai

On Thu, Aug 7, 2014 at 7:49 PM, Chris Hostetter <hossman_luc...@fucit.org> wrote:

>
> : I understand that this might seem as a simplification to users, where they
> : set this value once and it controls both places, but I think it's bad.
> : First, because if you set <useCompoundFile>, you basically *always* end up
> : w/ CFS, even if you intend that to apply to only newly flushed segments. In
> : order to use default settings for merged segments, you have to explicitly
> : include the default settings in the <mergePolicy> element. This is trappy I
> : think and looks odd.
>
> I'm pretty sure this was intentional because it kept things consistent
> from a backcompat standpoint, and (for solr users with a high level
> understanding, not folks like you who are intimately familiar with the
> underlying code) it's very easy to understand: if you just set
> <useCompoundFile/> -- w/o customizing a <mergePolicy/> -- all of your
> files are compound.
> Novice solr users will never be confused why there
> are some files that aren't compound.
>
> Having said that: times change.
>
> If you think it's trappy behavior for the common case, i won't argue with
> you (somebody else might though). There's nothing to stop us from
> changing it, as long as we have a note in the Upgrading section making it
> clear what folks need to add to their solrconfig.xml file to maintain
> existing behavior if they so choose. (and as you said: beefing up the ref
> guide with details about why there are multiple CFS related settings, and
> how they impact perf in different scenarios, etc...)
>
> : Beyond that, SolrIndexConfig in trunk contains deprecated code around this
> : parameter and somewhat hacks around older schemas that defined useCFS
> : inside the MP element -- are we still required to support that back-compat
> : in trunk as well?
>
> that's more of a judgement call.
>
> usually what we've done in situations like this is make the backcompat
> support log a WARN that the syntax they are using is deprecated and should
> be changed, but then we also tend to leave the support in for as long as
> feasible -- following the principle of "don't break shit for existing
> users unless absolutely necessary". but "feasible" and "absolutely
> necessary" can vary by situation: if the backcompat hoops we have to jump
> through are making the code impossible to maintain, or impossible to add
> some new feature, or causing performance problems for the common case (but
> that's rare with config still backcompat like we're talking about here) -
> then go ahead and rip it out in trunk; but when doing that we also usually
> update the "active" (ie: 4x) branch to switch those WARN logs to hard
> fails so they don't get overlooked by obtuse users who keep upgrading.
>
> So, for example: some user has a config file they've been upgrading since
> Solr 1.2 that contains a <foo/> tag.
> in 3.6 the syntax changed, <bar foo=""/> is the new right way to do things
> and we added backcompat kludge for the old syntax, with a WARN log advising
> them to change -- but the user never notices it. ~ 4.5 we decided the
> backcompat logic is getting to be a bitch to maintain, so on trunk we rip
> it out, but on 4x we add a special check that throws a hard startup error
> if the <foo/> tag was found in the config -- so as long as the guy upgrades
> to 4.6 at some point, he'll know beyond a doubt that he needs to change his
> config. but if he manages to upgrade from 4.5 directly to 5.x, then his
> antique syntax will just be silently ignored.
>
> -Hoss
> http://www.lucidworks.com/