Yea, let's add some warnings and keep the post tool for demo purposes.
Perhaps in the tutorial https://solr.apache.org/guide/8_8/solr-tutorial.html we 
could add cURL examples for indexing the data alongside the post.jar ones (using 
tabs, like we do for the v1/v2 API)?
We can also do a better job suggesting where to look for proper filesystem / 
web crawlers for those who need that.
And since SimplePostTool is not a good example of how to integrate with Solr 
in Java either, we could really use a Solr SDK with code examples of integration 
best practices and "ready-to-use" snippets, using SolrJ.
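
As a rough sketch of what an HTTP-based indexing example next to post.jar could look like, the snippet below builds a POST to Solr's JSON update handler using only the Python standard library. The host, port, `techproducts` collection name, and the sample documents are assumptions for illustration, and `build_update_request` / `post_documents` are hypothetical helper names, not part of any Solr client.

```python
import json
import urllib.request


def build_update_request(base_url, collection, docs, commit=True):
    """Build a POST request for Solr's JSON update handler without sending it."""
    url = f"{base_url.rstrip('/')}/solr/{collection}/update"
    if commit:
        # Committing on every request is fine for a demo, not for bulk loads.
        url += "?commit=true"
    body = json.dumps(docs).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_documents(base_url, collection, docs):
    """Send the documents; needs a running Solr, so it is not called here."""
    request = build_update_request(base_url, collection, docs)
    with urllib.request.urlopen(request) as response:
        return json.load(response)


# Hypothetical sample documents; the field names are illustrative only.
docs = [
    {"id": "doc-1", "title": "First example"},
    {"id": "doc-2", "title": "Second example"},
]
request = build_update_request("http://localhost:8983", "techproducts", docs)
print(request.get_method(), request.full_url)
```

Calling `post_documents("http://localhost:8983", "techproducts", docs)` against a running instance would send the batch; for larger corpora, batching documents and relying on autoCommit or `commitWithin` rather than a commit per request is the usual approach.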

Jan

> On 28 Apr 2021, at 22:45, Gus Heck <[email protected]> wrote:
> 
> I've generally been of the impression/opinion that the Post Tool is really 
> just a convenience for folks testing out solr to see what it can do, and not 
> really meant as a production ingestion solution. 
> 
> A little while back I had a client that had a third party tool that 
> "integrated with solr" by invoking post.jar on documents with a script to 
> loop through all the files in a directory and post them (the third party 
> software's direct example of how to integrate, not the client's idea at all). 
> Needless to say this caused difficulties with the gigabytes of data the third 
> party tool had stored in many directories. Of course I don't know, but I'd 
> guess that someone with little experience was tasked with the integration 
> with solr at the third party software company and they followed some 
> examples... then turned them into an "integration" blissfully unaware of the 
> limitations of what they had done.
> 
> I just re-read the ref guide page on post tool 
> <https://solr.apache.org/guide/8_8/post-tool.html>, and there's nothing there 
> to indicate to the reader that this might not be a good production level 
> solution. Also I notice a couple of recent Jira issues regarding handling of 
> corner cases of strange (broken) behavior or content in a web site's 
> response, giving the impression that that user (who reported both issues) 
> might be treading a path that will stretch the bounds of what the post tool 
> can/should be relied upon for. 
> 
> https://issues.apache.org/jira/browse/SOLR-15381
> https://issues.apache.org/jira/browse/SOLR-15370
> 
> How do folks feel about adding a warning or info box at the top of the post 
> tool docs indicating that it is not meant as a production solution, only as a 
> quick way to test out documents? We might also say something more concrete 
> like "virtually any use for a corpus containing over a few thousand documents 
> is a bad idea"? ... or something like that, suggestions welcome... 
> 
> If folks agree then it seems that these two issues are likely to be WONTFIX.
> 
> -Gus
> 
> -- 
> http://www.needhamsoftware.com (work)
> http://www.the111shift.com (play)
