Re: Solr 5: Is anybody working on better basic example/tutorial?
Right, I agree that my example was way too extreme. Good for a
slide/hackathon, but I recognize the need for most of the other items.

But the important part here is whether other people are already working
on this. I would much prefer a group effort around this to going at it
alone.

Regards,
   Alex.

P.S. I meant 30 lines as compared to 560 :-) I am not sweating the
difference between 30 and 50 lines too much.

Personal: http://www.outerthoughts.com/ and @arafalov
Solr resources and newsletter: http://www.solr-start.com/ and @solrstart
Solr popularizers community: https://www.linkedin.com/groups?gid=6713853

On 22 November 2014 at 17:06, Shawn Heisey wrote:
> On 11/22/2014 10:41 AM, Alexandre Rafalovitch wrote:
>> I can't find a relevant Jira/discussion space if this exists. I
>> strongly feel that the "basic" example is still far from basic and
>> there needs to be a subgroup of people discussing what can be cut
>> to demonstrate a true minimal configuration.
>>
>> I am happy to take the lead on that if nobody is doing it, but I would
>> like to do it as part of a group that can focus on _deleting_
>> explanations, defaults, and near-identical definitions.
>>
>> My hope would be to have a solrconfig and schema that are under 30
>> lines each, not counting the license. As a too-extreme example, I can
>> offer my earlier attempt at an under-15-line configuration:
>> https://github.com/arafalov/simplest-solr-config/tree/master/simplest-solr/collection1/conf
>>
>> I think such an example schema would go hand-in-hand with writing a
>> tutorial and would assist in telling an interesting story in the
>> "simplest terms" possible.
>
> The general goal you've outlined sounds really good to me. The only
> criticism I have (and I hope it's constructive) is that your small
> schema/solrconfig files are basically hiding EVERYTHING. I agree that
> the examples we currently have count as information overload, but
> stripping it too far might represent another problem.
>
> In particular, the lack of an analyzed TextField type makes simple
> keyword search impossible with that example -- and I believe that
> keyword search is one of the primary reasons that a new user will look
> into Solr. The text_general type in the full example seems reasonable
> ... there's some complexity, but it's not SUPER complicated. One
> numeric type/field might be a good thing to add as well, and perhaps
> _version_ too.
>
> The solrconfig should have the transaction log turned on. The default
> directory factory is the NRT version, which in some circumstances can
> hold onto index data only in RAM, which the transaction log protects
> against. Having the transaction log turned on means that autoCommit
> with openSearcher=false should be configured as well. While these may
> not be strictly required for a proof of concept or demo system, these
> are part of what I believe are best practices, which we should encourage
> in ALL our examples.
>
> A very simple and short config snippet:
>
>     25000
>     30
>     false
>
> With both of these ideas added, the size of the schema would still be in
> the ballpark of 30 lines, and the solrconfig would be a lot less. There
> may be other best practices that need to be considered, which might push
> things beyond the 30-line goal you have mentioned.
>
> Thanks,
> Shawn

-
To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: dev-h...@lucene.apache.org
Re: Solr 5: Is anybody working on better basic example/tutorial?
On 11/22/2014 10:41 AM, Alexandre Rafalovitch wrote:
> I can't find a relevant Jira/discussion space if this exists. I
> strongly feel that the "basic" example is still far from basic and
> there needs to be a subgroup of people discussing what can be cut
> to demonstrate a true minimal configuration.
>
> I am happy to take the lead on that if nobody is doing it, but I would
> like to do it as part of a group that can focus on _deleting_
> explanations, defaults, and near-identical definitions.
>
> My hope would be to have a solrconfig and schema that are under 30
> lines each, not counting the license. As a too-extreme example, I can
> offer my earlier attempt at an under-15-line configuration:
> https://github.com/arafalov/simplest-solr-config/tree/master/simplest-solr/collection1/conf
>
> I think such an example schema would go hand-in-hand with writing a
> tutorial and would assist in telling an interesting story in the
> "simplest terms" possible.

The general goal you've outlined sounds really good to me. The only
criticism I have (and I hope it's constructive) is that your small
schema/solrconfig files are basically hiding EVERYTHING. I agree that
the examples we currently have count as information overload, but
stripping it too far might represent another problem.

In particular, the lack of an analyzed TextField type makes simple
keyword search impossible with that example -- and I believe that
keyword search is one of the primary reasons that a new user will look
into Solr. The text_general type in the full example seems reasonable
... there's some complexity, but it's not SUPER complicated. One
numeric type/field might be a good thing to add as well, and perhaps
_version_ too.

The solrconfig should have the transaction log turned on. The default
directory factory is the NRT version, which in some circumstances can
hold onto index data only in RAM, which the transaction log protects
against. Having the transaction log turned on means that autoCommit
with openSearcher=false should be configured as well. While these may
not be strictly required for a proof of concept or demo system, these
are part of what I believe are best practices, which we should encourage
in ALL our examples.

A very simple and short config snippet:

    25000
    30
    false

With both of these ideas added, the size of the schema would still be in
the ballpark of 30 lines, and the solrconfig would be a lot less. There
may be other best practices that need to be considered, which might push
things beyond the 30-line goal you have mentioned.

Thanks,
Shawn
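
The config snippet in the message above lost its XML markup in the archive; only the values 25000, 30 and false survive. As a rough sketch of what such a snippet could look like, assuming the element names used in Solr's stock solrconfig.xml (updateHandler, updateLog, autoCommit, maxDocs, maxTime, openSearcher), and assuming the three values map to maxDocs, maxTime and openSearcher in that order:

```xml
<!-- Hypothetical reconstruction, not the original snippet: the element
     names and the value-to-element mapping are assumptions. -->
<updateHandler class="solr.DirectUpdateHandler2">
  <!-- Enables the transaction log -->
  <updateLog/>
  <!-- Hard commits flush to disk without opening a new searcher -->
  <autoCommit>
    <maxDocs>25000</maxDocs>
    <!-- maxTime is in milliseconds; 30 is the value as it survives here -->
    <maxTime>30</maxTime>
    <openSearcher>false</openSearcher>
  </autoCommit>
</updateHandler>
```

Since maxTime is expressed in milliseconds, a value of 30 would trigger hard commits very frequently; whether the original message intended a different value or unit cannot be determined from the surviving text.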
Solr 5: Is anybody working on better basic example/tutorial?
Hello,

I can't find a relevant Jira/discussion space if this exists. I
strongly feel that the "basic" example is still far from basic and
there needs to be a subgroup of people discussing what can be cut
to demonstrate a true minimal configuration.

I am happy to take the lead on that if nobody is doing it, but I would
like to do it as part of a group that can focus on _deleting_
explanations, defaults, and near-identical definitions.

My hope would be to have a solrconfig and schema that are under 30
lines each, not counting the license. As a too-extreme example, I can
offer my earlier attempt at an under-15-line configuration:
https://github.com/arafalov/simplest-solr-config/tree/master/simplest-solr/collection1/conf

I think such an example schema would go hand-in-hand with writing a
tutorial and would assist in telling an interesting story in the
"simplest terms" possible.

Regards,
   Alex.

Personal: http://www.outerthoughts.com/ and @arafalov
Solr resources and newsletter: http://www.solr-start.com/ and @solrstart
Solr popularizers community: https://www.linkedin.com/groups?gid=6713853