Right,

I agree that my example was way too extreme. Good for a
slide/hackathon, but I recognize the need for most of the other items.

But the important question here is whether other people are already
working on this. I would much prefer a group effort around
this to going at it alone.

Regards,
   Alex.
P.S. I meant 30 lines as compared to 560 :-) I am not sweating the
difference between 30 and 50 lines too much.
Personal: http://www.outerthoughts.com/ and @arafalov
Solr resources and newsletter: http://www.solr-start.com/ and @solrstart
Solr popularizers community: https://www.linkedin.com/groups?gid=6713853


On 22 November 2014 at 17:06, Shawn Heisey <[email protected]> wrote:
> On 11/22/2014 10:41 AM, Alexandre Rafalovitch wrote:
>> I can't find a relevant Jira/discussion space if this exists. I
>> strongly feel that the "basic" example is still far from basic and
>> there needs to be a subgroup of people discussing what can be
>> cut to demonstrate a true minimal configuration.
>>
>> I am happy to take a lead on that if nobody is doing it, but I would
>> like to do it as part of a group that can focus on _deleting_
>> explanations, defaults, and near-identical definitions.
>>
>> My hope would be to have a solrconfig and schema that are under 30
>> lines each not counting license. As a too-extreme example, I can offer
>> my earlier attempt at an under-15-lines configuration:
>> https://github.com/arafalov/simplest-solr-config/tree/master/simplest-solr/collection1/conf
>>
>> I think such an example schema would go hand-in-hand with writing a
>> tutorial and would assist in telling an interesting story in the
>> "simplest terms" possible.
>
> The general goal you've outlined sounds really good to me.  The only
> criticism I have (and I hope it's constructive) is that your small
> schema/solrconfig files are basically hiding EVERYTHING.  I agree that
> the examples we currently have count as information overload, but
> stripping it too far might represent another problem.
>
> In particular, the lack of an analyzed textField type makes simple
> keyword search impossible with that example -- and I believe that
> keyword search is one of the primary reasons that a new user will look
> into Solr.  The text_general type in the full example seems reasonable
> ... there's some complexity, but it's not SUPER complicated.  One
> numeric type/field might be a good thing to add as well, and perhaps
> _version_ too.
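>
> As a rough sketch of what those schema additions might look like (this is
> a trimmed-down text_general, not the full example's definition, and the
> "text" field name is only a placeholder):
>
> <!-- assumption: simplified analyzer chain, no stopwords or synonyms -->
> <fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
>   <analyzer>
>     <tokenizer class="solr.StandardTokenizerFactory"/>
>     <filter class="solr.LowerCaseFilterFactory"/>
>   </analyzer>
> </fieldType>
> <!-- one numeric type, also used by _version_ -->
> <fieldType name="long" class="solr.TrieLongField" precisionStep="0"/>
> <!-- "text" is a hypothetical field name for keyword search -->
> <field name="text" type="text_general" indexed="true" stored="true"/>
> <field name="_version_" type="long" indexed="true" stored="true"/>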
>
> The solrconfig should have the transaction log turned on.  The default
> directory factory is the NRT version, which in some circumstances can
> hold onto index data only in RAM, which the transaction log protects
> against.  Having the transaction log turned on means that autoCommit
> with openSearcher=false should be configured as well.  While these may
> not be strictly required for a proof of concept or demo system, these
> are part of what I believe are best practices, which we should encourage
> in ALL our examples.
>
> A very simple and short config snippet:
>
> <!-- the default high-performance update handler -->
> <updateHandler class="solr.DirectUpdateHandler2">
>   <autoCommit>
>     <maxDocs>25000</maxDocs>
>     <maxTime>300000</maxTime>
>     <openSearcher>false</openSearcher>
>   </autoCommit>
>   <updateLog />
> </updateHandler>
>
> With both of these ideas added, the size of the schema would still be in
> the ballpark of 30 lines, and the solrconfig would be a lot less.  There
> may be other best practices that need to be considered, which might push
> things beyond the 30 line goal you have mentioned.
>
> Thanks,
> Shawn
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
>
