Re: Solr 5: Is anybody working on better basic example/tutorial?

2014-11-23 Thread Alexandre Rafalovitch
Right,

I agree that my example was way too extreme. Good for a
slide/hackathon, but I recognize the need for most of the other items.

But the important question here is whether other people are already
working on this. I would much prefer a group effort around this to
going at it alone.

Regards,
   Alex.
P.S. I meant 30 lines as compared to 560 :-) I am not sweating the
difference between 30 and 50 lines too much.
Personal: http://www.outerthoughts.com/ and @arafalov
Solr resources and newsletter: http://www.solr-start.com/ and @solrstart
Solr popularizers community: https://www.linkedin.com/groups?gid=6713853


On 22 November 2014 at 17:06, Shawn Heisey wrote:
> On 11/22/2014 10:41 AM, Alexandre Rafalovitch wrote:
>> I can't find a relevant Jira/discussion space if this exists. I
>> strongly feel that the "basic" example is still far from basic and
> >> there needs to be a subgroup of people discussing what can be
> >> cut to demonstrate a truly minimal configuration.
>>
>> I am happy to take a lead on that if nobody is doing it, but I would
>> like to do it as part of a group that can focus on _deleting_
>> explanations, defaults, and near-identical definitions.
>>
>> My hope would be to have a solrconfig and schema that are under 30
>> lines each not counting license. As a too-extreme example, I can offer
> >> my earlier attempt at an under-15-lines configuration:
>> https://github.com/arafalov/simplest-solr-config/tree/master/simplest-solr/collection1/conf
>>
>> I think such an example schema would go hand-in-hand with writing a
>> tutorial and would assist in telling an interesting story in the
>> "simplest terms" possible.
>
> The general goal you've outlined sounds really good to me.  The only
> criticism I have (and I hope it's constructive) is that your small
> schema/solrconfig files are basically hiding EVERYTHING.  I agree that
> the examples we currently have count as information overload, but
> stripping it too far might represent another problem.
>
> In particular, the lack of an analyzed textField type makes simple
> keyword search impossible with that example -- and I believe that
> keyword search is one of the primary reasons that a new user will look
> into Solr.  The text_general type in the full example seems reasonable
> ... there's some complexity, but it's not SUPER complicated.  One
> numeric type/field might be a good thing to add as well, and perhaps
> _version_ too.
>
> The solrconfig should have the transaction log turned on.  The default
> directory factory is the NRT version, which in some circumstances can
> hold onto index data only in RAM, which the transaction log protects
> against.  Having the transaction log turned on means that autoCommit
> with openSearcher=false should be configured as well.  While these may
> not be strictly required for a proof of concept or demo system, these
> are part of what I believe are best practices, which we should encourage
> in ALL our examples.
>
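> A very simple and short config snippet:
>
> <updateHandler class="solr.DirectUpdateHandler2">
>   <autoCommit>
>     <maxDocs>25000</maxDocs>
>     <maxTime>30</maxTime>
>     <openSearcher>false</openSearcher>
>   </autoCommit>
>   <updateLog />
> </updateHandler>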
>
> With both of these ideas added, the size of the schema would still be in
> the ballpark of 30 lines, and the solrconfig would be a lot less.  There
> may be other best practices that need to be considered, which might push
> things beyond the 30 line goal you have mentioned.
>
> Thanks,
> Shawn
>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
> For additional commands, e-mail: dev-h...@lucene.apache.org
>




Re: Solr 5: Is anybody working on better basic example/tutorial?

2014-11-22 Thread Shawn Heisey
On 11/22/2014 10:41 AM, Alexandre Rafalovitch wrote:
> I can't find a relevant Jira/discussion space if this exists. I
> strongly feel that the "basic" example is still far from basic and
> there needs to be a subgroup of people discussing what can be
> cut to demonstrate a truly minimal configuration.
> 
> I am happy to take a lead on that if nobody is doing it, but I would
> like to do it as part of a group that can focus on _deleting_
> explanations, defaults, and near-identical definitions.
> 
> My hope would be to have a solrconfig and schema that are under 30
> lines each not counting license. As a too-extreme example, I can offer
> my earlier attempt at an under-15-lines configuration:
> https://github.com/arafalov/simplest-solr-config/tree/master/simplest-solr/collection1/conf
> 
> I think such an example schema would go hand-in-hand with writing a
> tutorial and would assist in telling an interesting story in the
> "simplest terms" possible.

The general goal you've outlined sounds really good to me.  The only
criticism I have (and I hope it's constructive) is that your small
schema/solrconfig files are basically hiding EVERYTHING.  I agree that
the examples we currently have count as information overload, but
stripping it too far might represent another problem.

In particular, the lack of an analyzed textField type makes simple
keyword search impossible with that example -- and I believe that
keyword search is one of the primary reasons that a new user will look
into Solr.  The text_general type in the full example seems reasonable
... there's some complexity, but it's not SUPER complicated.  One
numeric type/field might be a good thing to add as well, and perhaps
_version_ too.
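
For illustration only, a schema along those lines might look roughly
like the sketch below. The field names (id, title, popularity) and the
schema name are made up for the example, and the analyzer chain is a
trimmed-down assumption rather than a copy of the stock text_general
type, so treat it as a starting point and not a tested configuration.

```xml
<?xml version="1.0" encoding="UTF-8" ?>
<!-- Hypothetical minimal schema sketch; names are illustrative only -->
<schema name="minimal-sketch" version="1.5">
  <!-- Unique key plus the _version_ field used by the transaction log -->
  <field name="id" type="string" indexed="true" stored="true" required="true"/>
  <field name="_version_" type="long" indexed="true" stored="true"/>

  <!-- One analyzed text field, so that simple keyword search works -->
  <field name="title" type="text_general" indexed="true" stored="true"/>

  <!-- One numeric field, reusing the same long type as _version_ -->
  <field name="popularity" type="long" indexed="true" stored="true"/>

  <uniqueKey>id</uniqueKey>

  <fieldType name="string" class="solr.StrField" sortMissingLast="true"/>
  <fieldType name="long" class="solr.TrieLongField" precisionStep="0"
             positionIncrementGap="0"/>

  <!-- Reduced analysis chain: tokenize on word boundaries, then lowercase -->
  <fieldType name="text_general" class="solr.TextField"
             positionIncrementGap="100">
    <analyzer>
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldType>
</schema>
```

Even with the text type and _version_ included, something like this
stays well within the ballpark of 30 lines.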

The solrconfig should have the transaction log turned on.  The default
directory factory is the NRT version, which in some circumstances can
hold onto index data only in RAM, which the transaction log protects
against.  Having the transaction log turned on means that autoCommit
with openSearcher=false should be configured as well.  While these may
not be strictly required for a proof of concept or demo system, these
are part of what I believe are best practices, which we should encourage
in ALL our examples.

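A very simple and short config snippet:

<updateHandler class="solr.DirectUpdateHandler2">
  <autoCommit>
    <maxDocs>25000</maxDocs>
    <maxTime>30</maxTime>
    <openSearcher>false</openSearcher>
  </autoCommit>
  <updateLog />
</updateHandler>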

With both of these ideas added, the size of the schema would still be in
the ballpark of 30 lines, and the solrconfig would be a lot less.  There
may be other best practices that need to be considered, which might push
things beyond the 30 line goal you have mentioned.

Thanks,
Shawn





Solr 5: Is anybody working on better basic example/tutorial?

2014-11-22 Thread Alexandre Rafalovitch
Hello,

I can't find a relevant Jira/discussion space if this exists. I
strongly feel that the "basic" example is still far from basic and
there needs to be a subgroup of people discussing what can be
cut to demonstrate a truly minimal configuration.

I am happy to take a lead on that if nobody is doing it, but I would
like to do it as part of a group that can focus on _deleting_
explanations, defaults, and near-identical definitions.

My hope would be to have a solrconfig and schema that are under 30
lines each not counting license. As a too-extreme example, I can offer
my earlier attempt at an under-15-lines configuration:
https://github.com/arafalov/simplest-solr-config/tree/master/simplest-solr/collection1/conf

I think such an example schema would go hand-in-hand with writing a
tutorial and would assist in telling an interesting story in the
"simplest terms" possible.

Regards,
   Alex.
Personal: http://www.outerthoughts.com/ and @arafalov
Solr resources and newsletter: http://www.solr-start.com/ and @solrstart
Solr popularizers community: https://www.linkedin.com/groups?gid=6713853
