Beginners should experience as little black magic as possible. The post tool
is black magic. Schemaless is black magic. I feel we should remove both.

On Thu, 29 Apr, 2021, 2:56 am Alexandre Rafalovitch wrote:

> "Good enough/Recommended" for what? Serious question.
>
> Because it may be - more than - good enough to "send files to the
> server", but the post tool is also doing a lot of Solr business logic
> that beginner users may not have understood yet. Like automatic
> commit. Like choosing endpoint and content type based on the file
> extension. Like actually saying what it is doing. Beginners may not
> have the bandwidth to understand all those elements in order to index
> their second document (the first being the one copy/pasted from the
> tutorial).
>
> Removing the post tool because curl is good enough - in my personal view
> - is abandoning beginners. Unless that "for what" is clear and the
> gap between curl and the post tool is filled in some other way, through
> better documentation or an improved API or whatever.
>
> On the original question, I think the post tool is like DIH and like
> the default schema: people stick to them and push their boundaries
> because our beginner->production story is full of gaps. What to do
> about it though, I am not sure. The suggested warning seems like a
> reasonable, non-harmful step, though.
>
> Regards,
>    Alex.
>
> On Wed, 28 Apr 2021 at 17:04, Ishan Chattopadhyaya wrote:
> >
> > We should remove the post tool altogether. Curl is good enough and
> > recommended.
> >
> > On Thu, 29 Apr, 2021, 2:15 am Gus Heck wrote:
> >>
> >> I've generally been of the impression/opinion that the post tool is
> >> really just a convenience for folks testing out Solr to see what it
> >> can do, and not really meant as a production ingestion solution.
> >>
> >> A little while back I had a client with a third-party tool that
> >> "integrated with Solr" by invoking post.jar from a script that looped
> >> through all the files in a directory and posted them (the third-party
> >> software's own example of how to integrate, not the client's idea at
> >> all). Needless to say, this caused difficulties with the gigabytes of
> >> data the third-party tool had stored in many directories. Of course I
> >> don't know, but I'd guess that someone with little experience was
> >> tasked with the Solr integration at the third-party software company
> >> and followed some examples... then turned them into an "integration",
> >> blissfully unaware of the limitations of what they had done.
> >>
> >> I just re-read the ref guide page on the post tool, and there's
> >> nothing there to indicate to the reader that this might not be a good
> >> production-level solution. I also notice a couple of recent Jira
> >> issues about handling corner cases of strange (broken) behavior or
> >> content in a web site's response, giving the impression that the user
> >> (who reported both issues) might be treading a path that will stretch
> >> the bounds of what the post tool can/should be relied upon for.
> >>
> >> https://issues.apache.org/jira/browse/SOLR-15381
> >> https://issues.apache.org/jira/browse/SOLR-15370
> >>
> >> How do folks feel about adding a warning or info box at the top of
> >> the post tool docs indicating that it is not meant as a production
> >> solution, only as a quick way to test out documents? We might also
> >> say something more concrete like "virtually any use for a corpus
> >> containing over a few thousand documents is a bad idea"? ... or
> >> something like that, suggestions welcome...
> >>
> >> If folks agree then it seems that these two issues are likely to be
> >> WONTFIX.
> >>
> >> -Gus
> >>
> >> --
> >> http://www.needhamsoftware.com (work)
> >> http://www.the111shift.com (play)
