Yes, let's add some warnings and keep the post tool for demo purposes. Perhaps in the tutorial https://solr.apache.org/guide/8_8/solr-tutorial.html we could add cURL examples for indexing the data alongside the post.jar ones (using tabs, like we do for the v1/v2 API)? We could also do a better job of pointing people to proper filesystem and web crawlers when they need one. And since SimplePostTool is not a good example of how to integrate with Solr from Java either, we could really use a Solr SDK with code examples of integration best practices and "ready-to-use" snippets built on SolrJ.
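
As a rough illustration of the kind of snippet meant here, the sketch below sends documents to Solr's JSON update handler in batches instead of invoking post.jar once per file. It is written in Python with only the standard library to stay dependency-free; it is not the proposed cURL or SolrJ example. The base URL, the `techproducts` collection name, the batch size, and all function names are assumptions for illustration only.

```python
import json
import urllib.request


def build_update_request(docs, base_url="http://localhost:8983/solr",
                         collection="techproducts", commit=False):
    """Build (but do not send) a POST to the collection's /update handler.

    base_url and collection are assumed values; adjust them for a real setup.
    """
    url = f"{base_url}/{collection}/update"
    if commit:
        url += "?commit=true"
    body = json.dumps(docs).encode("utf-8")  # JSON array of documents
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def batches(docs, size):
    """Yield successive lists of at most `size` documents."""
    for start in range(0, len(docs), size):
        yield docs[start:start + size]


def index_in_batches(docs, size=500):
    """Send docs in batches and commit only with the last batch.

    Requires a running Solr; nothing in this sketch calls it automatically.
    """
    chunks = list(batches(docs, size))
    for i, chunk in enumerate(chunks):
        req = build_update_request(chunk, commit=(i == len(chunks) - 1))
        with urllib.request.urlopen(req) as resp:
            resp.read()
```

Against a running Solr, `index_in_batches([{"id": "doc-1", "title": "Example"}])` would send a single batch and commit it. Batching and committing once at the end is a design choice for this sketch, not a documented recommendation from this thread.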

Jan

> On 28 Apr 2021, at 22:45, Gus Heck <[email protected]> wrote:
>
> I've generally been of the impression/opinion that the Post Tool is really
> just a convenience for folks testing out solr to see what it can do, and not
> really meant as a production ingestion solution.
>
> A little while back I had a client that had a third party tool that
> "integrated with solr" by invoking post.jar on documents with a script to
> loop through all the files in a directory and post them (the third party
> software's direct example of how to integrate, not the client's idea at all).
> Needless to say this caused difficulties with the gigabytes of data the third
> party tool had stored in many directories. Of course I don't know, but I'd
> guess that someone with little experience was tasked with the integration
> with solr at the third party software company and they followed some
> examples... then turned them into an "integration" blissfully unaware of the
> limitations of what they had done.
>
> I just re-read the ref guide page on post tool
> <https://solr.apache.org/guide/8_8/post-tool.html>, and there's nothing there
> to indicate to the reader that this might not be a good production level
> solution. Also I notice a couple of recent Jira issues regarding handling of
> corner cases of strange (broken) behavior or content in a web site's
> response, giving the impression that that user (who reported both issues)
> might be treading a path that will stretch the bounds of what the post tool
> can/should be relied upon for.
>
> https://issues.apache.org/jira/browse/SOLR-15381
> https://issues.apache.org/jira/browse/SOLR-15370
>
> How do folks feel about adding a warning or info box at the top of post tool
> docs indicating that it is not meant as a production solution, only as a
> quick way to test out documents? We might also say something more concrete
> like "virtually any use for a corpus containing over a few thousand documents
> is a bad idea"? ... or something like that, suggestions welcome...
>
> If folks agree then it seems that these two issues are likely to be WONTFIX.
>
> -Gus
>
> --
> http://www.needhamsoftware.com (work)
> http://www.the111shift.com (play)
