[ https://issues.apache.org/jira/browse/SOLR-3691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13435851#comment-13435851 ]
Jan Høydahl commented on SOLR-3691:
-----------------------------------

Here's the new help screen including "web" mode, "depth" and "delay" support:

{noformat}
SimplePostTool version 1.5
Usage: java [SystemProperties] -jar post.jar [-h|-] [<file|folder|url|arg> [<file|folder|url|arg>...]]

Supported System Properties and their defaults:
  -Ddata=files|web|args|stdin (default=files)
  -Dtype=<content-type> (default=application/xml)
  -Durl=<solr-update-url> (default=http://localhost:8983/solr/update)
  -Dauto=yes|no (default=no)
  -Drecursive=yes|no|<depth> (default=0)
  -Ddelay=<seconds> (default=0 for files, 10 for web)
  -Dfiletypes=<type>[,<type>,...] (default=xml,json,csv,pdf,doc,docx,ppt,pptx,xls,xlsx,odt,odp,ods,ott,otp,ots,rtf,htm,html,txt,log)
  -Dparams="<key>=<value>[&<key>=<value>...]" (values must be URL-encoded)
  -Dcommit=yes|no (default=yes)
  -Doptimize=yes|no (default=no)
  -Dout=yes|no (default=no)

This is a simple command line tool for POSTing raw data to a Solr
port. Data can be read from files specified as commandline args,
URLs specified as args, as raw commandline arg strings or via STDIN.

Examples:
  java -jar post.jar *.xml
  java -Ddata=args -jar post.jar '<delete><id>42</id></delete>'
  java -Ddata=stdin -jar post.jar < hd.xml
  java -Ddata=web -jar post.jar http://example.com/
  java -Dtype=text/csv -jar post.jar *.csv
  java -Dtype=application/json -jar post.jar *.json
  java -Durl=http://localhost:8983/solr/update/extract -Dparams=literal.id=a -Dtype=application/pdf -jar post.jar a.pdf
  java -Dauto -jar post.jar *
  java -Dauto -Drecursive -jar post.jar afolder
  java -Dauto -Dfiletypes=ppt,html -jar post.jar afolder

The options controlled by System Properties include the Solr
URL to POST to, the Content-Type of the data, whether a commit
or optimize should be executed, and whether the response should
be written to STDOUT. If auto=yes the tool will try to set type
and url automatically from file name.
When posting rich documents the file name will be propagated as
"resource.name" and also used as "literal.id". You may override
these or any other request parameter through the -Dparams property.
To do a commit only, use "-" as argument.
The web mode is a simple crawler following links within domain,
default delay=10s.
{noformat}

> SimplePostTool: Mode for indexing a web page
> --------------------------------------------
>
>          Key: SOLR-3691
>          URL: https://issues.apache.org/jira/browse/SOLR-3691
>      Project: Solr
>   Issue Type: Bug
>   Components: scripts and tools
>     Reporter: Jan Høydahl
>     Assignee: Jan Høydahl
>      Fix For: 4.0
>
>  Attachments: SOLR-3691.patch, SOLR-3691.patch, SOLR-3691.patch, SOLR-3691.patch
>
>
> The simple post.jar tool should both show some sample code as well as aid
> users in testing Solr from the command line. Missing is an easy way to index
> a web page.
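
For readers who want to picture what "a simple crawler following links within domain" with a depth limit and a delay might look like, here is a minimal sketch. It is not the code from the attached patches. The class name CrawlSketch, the crawl() method and the in-memory "links" map (which stands in for fetching a page and extracting its links) are all hypothetical, chosen only so the example runs without network access. It assumes that depth 0 means "only the start page" and that "within domain" means "same host as the start URL".

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class CrawlSketch {

    /**
     * Breadth-first walk from start, staying on the start URL's host and
     * following at most maxDepth link hops. Returns pages in visit order.
     * The links map is a hypothetical stand-in for "fetch page, parse links".
     */
    static List<String> crawl(String start, int maxDepth, long delayMillis,
                              Map<String, List<String>> links) {
        String host = URI.create(start).getHost();
        Set<String> seen = new LinkedHashSet<>();   // keeps visit order, blocks revisits
        seen.add(start);
        List<String> level = List.of(start);
        for (int depth = 0; depth <= maxDepth && !level.isEmpty(); depth++) {
            List<String> next = new ArrayList<>();
            for (String page : level) {
                pause(delayMillis);                  // politeness delay before each simulated fetch
                for (String link : links.getOrDefault(page, List.of())) {
                    boolean sameHost = host.equals(URI.create(link).getHost());
                    // Only queue links that are on the same host, still within
                    // the depth budget, and not seen before.
                    if (sameHost && depth < maxDepth && seen.add(link)) {
                        next.add(link);
                    }
                }
            }
            level = next;
        }
        return new ArrayList<>(seen);
    }

    private static void pause(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();      // preserve the interrupt for the caller
        }
    }

    public static void main(String[] args) {
        Map<String, List<String>> links = Map.of(
            "http://example.com/", List.of("http://example.com/a", "http://other.org/x"),
            "http://example.com/a", List.of("http://example.com/b"));
        // One link hop from the start page, no delay between simulated fetches.
        System.out.println(crawl("http://example.com/", 1, 0, links));
    }
}
```

In this sketch the delay is applied before every page, and off-host links such as http://other.org/x are dropped rather than queued. A real crawler would also need URL normalization and error handling, which are left out here.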