I did not say it was trivial, but I also did not mention the previous
research:

https://github.com/arafalov/solr-refguide-indexing/blob/master/src/com/solrstart/refguide/Indexer.java

It uses the official AsciidoctorJ library directly. I am not sure if that's
just the JRuby version of the Asciidoctor we currently use to build. But this
should only affect the development process, not the final built package.
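
To make the indexing idea a bit more concrete, here is a minimal sketch of
the kind of pre-processing such a tool might do: split one AsciiDoc page into
heading-delimited sections, each with an absolute URL back to the online
guide. This is NOT taken from the linked Indexer.java and does not use
AsciidoctorJ; the class name RefGuideSections, the anchor scheme (lowercase,
dash-separated) and the example.org URL are all assumptions for illustration.
It also naively treats any line starting with "= " as a heading, so a real
version would need to skip code blocks.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: splits an AsciiDoc page into per-heading sections.
public class RefGuideSections {

    /** One searchable unit: a heading, a link to the online guide, and the text under the heading. */
    public static final class Section {
        public final String title;
        public final String url;
        public final String body;

        Section(String title, String url, String body) {
            this.title = title;
            this.url = url;
            this.body = body;
        }
    }

    // AsciiDoc section titles: one or more '=' characters, whitespace, then the title text.
    private static final Pattern HEADING = Pattern.compile("^(=+)\\s+(.+?)\\s*$");

    /** Assumed anchor scheme; the real guide's id prefix/separator settings may differ. */
    static String anchor(String title) {
        return title.toLowerCase(Locale.ROOT)
                .replaceAll("[^a-z0-9]+", "-")
                .replaceAll("^-+|-+$", "");
    }

    /** Returns one Section per heading; text before the first heading is ignored. */
    public static List<Section> split(String asciidoc, String pageUrl) {
        List<Section> sections = new ArrayList<>();
        String title = null;
        int level = 0;
        StringBuilder body = new StringBuilder();
        for (String line : asciidoc.split("\\R")) {
            Matcher m = HEADING.matcher(line);
            if (m.matches()) {
                if (title != null) {
                    sections.add(make(title, level, body, pageUrl));
                }
                level = m.group(1).length();
                title = m.group(2);
                body.setLength(0);
            } else if (title != null) {
                body.append(line).append('\n');
            }
        }
        if (title != null) {
            sections.add(make(title, level, body, pageUrl));
        }
        return sections;
    }

    private static Section make(String title, int level, StringBuilder body, String pageUrl) {
        // The level-1 heading is the page title, so it links to the page itself;
        // deeper headings link to a fragment on that page.
        String url = (level == 1) ? pageUrl : pageUrl + "#" + anchor(title);
        return new Section(title, url, body.toString().trim());
    }

    public static void main(String[] args) {
        String page = "= Sample Page\nIntro text.\n\n== First Topic\nDetails here.\n";
        for (Section s : split(page, "https://example.org/guide/sample-page.html")) {
            System.out.println(s.url + " | " + s.title + " | " + s.body);
        }
    }
}
```

Each Section could then become one Solr document, or one entry in the list
that feeds the JavaScript search, with the url field pointing at the remote
guide.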

I am mostly trying to figure out what people think about shipping an
actual core with the distribution. That is something I haven't seen
done before, and it may have issues I did not think of.

Regards,
    Alex

On Mon., Aug. 31, 2020, 10:11 p.m. Gus Heck, <gus.h...@gmail.com> wrote:

> Some background to consider before committing to that... it might not be
> as trivial as you think. (I've often thought it ironic that we don't have
> real search for our ref guide... )
>
> https://www.youtube.com/watch?v=DixlnxAk08s
>
> -Gus
>
> On Mon, Aug 31, 2020 at 2:06 PM Ishan Chattopadhyaya <
> ichattopadhy...@gmail.com> wrote:
>
>> I love the idea of making the ref guide itself an example dataset.
>> That way, we won't need to ship anything separately. Python's Beautiful
>> Soup can extract text from the HTML pages. I'm sure there may be such things
>> in Java too (can Tika do this?).
>>
>> On Mon, 31 Aug, 2020, 11:18 pm Alexandre Rafalovitch, <arafa...@gmail.com>
>> wrote:
>>
>>> Hi,
>>> I need a sanity check.
>>>
>>> I am in the planning stages for the new example datasets to ship with
>>> Solr 9. The one I am looking at is great for structured information,
>>> but is quite light on full-text content. So, I am thinking of how
>>> important that is and what other sources could be used.
>>>
>>> One - only slightly - crazy idea is to use the Solr Reference Guide itself
>>> as a document source. I am not saying we need to include the guide
>>> with the Solr distribution, but:
>>> 1) I could include a couple of sample pages
>>> 2) I could index the whole guide (with custom Java code) during the
>>> final build and we could ship the full index (with stored=false) with
>>> Solr, which then basically becomes a local search for the remote guide
>>> (with absolute URLs).
>>>
>>> Either way would allow us to also explore what a good search
>>> configuration could look like for the Ref Guide for when we are
>>> actually ready to move beyond its current "headings-only" JavaScript
>>> search. Actually, done right, the same or a similar tool could also feed
>>> subheadings into the JavaScript search.
>>>
>>> Like I said, sanity check?
>>>
>>> Regards,
>>>    Alex.
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: dev-unsubscr...@lucene.apache.org
>>> For additional commands, e-mail: dev-h...@lucene.apache.org
>>>
>>>
>
> --
> http://www.needhamsoftware.com (work)
> http://www.the111shift.com (play)
>
