Right, I agree that my example was way too extreme. Good for a slide/hackathon, but I recognize the need for most of the other items.
But the important question here is whether other people are already working on this. I would much prefer a group effort around this to going at it alone.

Regards,
   Alex.

P.S. I meant 30 lines as compared to 560 :-) I am not sweating the difference between 30 and 50 lines too much.

Personal: http://www.outerthoughts.com/ and @arafalov
Solr resources and newsletter: http://www.solr-start.com/ and @solrstart
Solr popularizers community: https://www.linkedin.com/groups?gid=6713853

On 22 November 2014 at 17:06, Shawn Heisey <[email protected]> wrote:
> On 11/22/2014 10:41 AM, Alexandre Rafalovitch wrote:
>> I can't find a relevant Jira/discussion space if this exists. I
>> strongly feel that the "basic" example is still far from basic and
>> there needs to be a subgroup of people discussing of what can be
>> cut-off to demonstrate a true minimal configuration.
>>
>> I am happy to take a lead on that if nobody is doing it, but I would
>> like to do it as part of a group that can focus on _deleting_
>> explanations, defaults, and near-identical definitions.
>>
>> My hope would be to have a solrconfig and schema that are under 30
>> lines each not counting license. As a too-extreme example, I can offer
>> my earlier attempts to under-15-lines configuration:
>> https://github.com/arafalov/simplest-solr-config/tree/master/simplest-solr/collection1/conf
>>
>> I think such an example schema would go hand-in-hand with writing a
>> tutorial and would assist in telling an interesting story in the
>> "simplest terms" possible.
>
> The general goal you've outlined sounds really good to me. The only
> criticism I have (and I hope it's constructive) is that your small
> schema/solrconfig files are basically hiding EVERYTHING. I agree that
> the examples we currently have count as information overload, but
> stripping it too far might represent another problem.
>
> In particular, the lack of an analyzed textField type makes simple
> keyword search impossible with that example -- and I believe that
> keyword search is one of the primary reasons that a new user will look
> into Solr. The text_general type in the full example seems reasonable
> ... there's some complexity, but it's not SUPER complicated. One
> numeric type/field might be a good thing to add as well, and perhaps
> _version_ too.
>
> The solrconfig should have the transaction log turned on. The default
> directory factory is the NRT version, which in some circumstances can
> hold onto index data only in RAM, which the transaction log protects
> against. Having the transaction log turned on means that autoCommit
> with openSearcher=false should be configured as well. While these may
> not be strictly required for a proof of concept or demo system, these
> are part of what I believe are best practices, which we should encourage
> in ALL our examples.
>
> A very simple and short config snippet:
>
> <!-- the default high-performance update handler -->
> <updateHandler class="solr.DirectUpdateHandler2">
>   <autoCommit>
>     <maxDocs>25000</maxDocs>
>     <maxTime>300000</maxTime>
>     <openSearcher>false</openSearcher>
>   </autoCommit>
>   <updateLog />
> </updateHandler>
>
> With both of these ideas added, the size of the schema would still be in
> the ballpark of 30 lines, and the solrconfig would be a lot less. There
> may be other best practices that need to be considered, which might push
> things beyond the 30 line goal you have mentioned.
>
> Thanks,
> Shawn
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
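
For illustration only, here is a rough sketch of what a schema along the lines suggested in the quoted message might look like: one analyzed text type, one numeric type, and _version_ (which, as far as I know, the update log needs in order to work). The field names "id" and "text" and the trimmed-down analyzer chain are assumptions made for this sketch, not something taken from the thread or from the shipped example. It also assumes a Solr 4.x release recent enough to accept fieldType and field elements without the types/fields wrappers.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Illustrative sketch; "id" and "text" are hypothetical field names. -->
<schema name="minimal" version="1.5">
  <fieldType name="string" class="solr.StrField"/>
  <fieldType name="long" class="solr.TrieLongField" precisionStep="0"/>

  <!-- Cut-down variant of text_general: tokenize and lowercase only,
       with no stopword or synonym filters -->
  <fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
    <analyzer>
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldType>

  <field name="id" type="string" indexed="true" stored="true" required="true"/>
  <!-- Paired with the updateLog element in solrconfig -->
  <field name="_version_" type="long" indexed="true" stored="true"/>
  <field name="text" type="text_general" indexed="true" stored="true"/>

  <uniqueKey>id</uniqueKey>
</schema>
```

That comes to roughly 20 lines, so it stays under the 30-line target while still allowing simple keyword search against the "text" field.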
