So you are suggesting that I iterate over the file system, collect the FS tree
entities (directory names, file names, file sizes, etc.), and then post them to
Solr?
I need to index the FS tree, not the file contents.
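
A minimal sketch of that approach, to make the question concrete. It walks a
folder and builds one metadata document per directory and file without reading
any file contents. The field names (id, name, parent, type, size) and the
helper names walk_tree and post_to_solr are hypothetical and would have to
match whatever the Solr schema defines. The update URL is the default shown in
the post.jar help, and sending a JSON array of documents to it is an
assumption about the Solr setup, not something confirmed in this thread.

```python
import json
import os
import urllib.request


def walk_tree(root):
    """Yield one metadata dict per directory and file under root.

    Only names and sizes are collected; file contents are never opened.
    Field names are hypothetical and must exist in the Solr schema.
    """
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames:
            path = os.path.join(dirpath, name)
            yield {"id": path, "name": name, "parent": dirpath, "type": "dir"}
        for name in filenames:
            path = os.path.join(dirpath, name)
            # lstat does not follow symlinks, so a broken link does not raise.
            size = os.lstat(path).st_size
            yield {"id": path, "name": name, "parent": dirpath,
                   "type": "file", "size": size}


def post_to_solr(docs, url="http://localhost:8983/solr/update?commit=true"):
    """POST the documents as a JSON array (assumes the handler accepts JSON)."""
    body = json.dumps(list(docs)).encode("utf-8")
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return resp.status
```

Usage would be something like post_to_solr(walk_tree("/some/folder")). For a
large tree the documents would probably need to be sent in batches rather than
in one request.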

On Tue, Mar 5, 2013 at 5:54 PM, Erik Hatcher <erik.hatc...@gmail.com> wrote:

> Would Solr's post.jar work for you?   It has a directory recurse option.
>  The usage/help output is pasted below.
>
> Here's what should work for you:
> "java -Dauto -Drecursive -jar post.jar /some/folder"
>
>         Erik
>
>
>
> exampledocs  java -jar post.jar --help
> SimplePostTool version 1.5
> Usage: java [SystemProperties] -jar post.jar [-h|-] [<file|folder|url|arg> [<file|folder|url|arg>...]]
>
> Supported System Properties and their defaults:
>   -Ddata=files|web|args|stdin (default=files)
>   -Dtype=<content-type> (default=application/xml)
>   -Durl=<solr-update-url> (default=http://localhost:8983/solr/update)
>   -Dauto=yes|no (default=no)
>   -Drecursive=yes|no|<depth> (default=0)
>   -Ddelay=<seconds> (default=0 for files, 10 for web)
>   -Dfiletypes=<type>[,<type>,...] (default=xml,json,csv,pdf,doc,docx,ppt,pptx,xls,xlsx,odt,odp,ods,ott,otp,ots,rtf,htm,html,txt,log)
>   -Dparams="<key>=<value>[&<key>=<value>...]" (values must be URL-encoded)
>   -Dcommit=yes|no (default=yes)
>   -Doptimize=yes|no (default=no)
>   -Dout=yes|no (default=no)
>
> This is a simple command line tool for POSTing raw data to a Solr
> port.  Data can be read from files specified as commandline args,
> URLs specified as args, as raw commandline arg strings or via STDIN.
> Examples:
>   java -jar post.jar *.xml
>   java -Ddata=args  -jar post.jar '<delete><id>42</id></delete>'
>   java -Ddata=stdin -jar post.jar < hd.xml
>   java -Ddata=web -jar post.jar http://example.com/
>   java -Dtype=text/csv -jar post.jar *.csv
>   java -Dtype=application/json -jar post.jar *.json
>   java -Durl=http://localhost:8983/solr/update/extract -Dparams=literal.id=a -Dtype=application/pdf -jar post.jar a.pdf
>   java -Dauto -jar post.jar *
>   java -Dauto -Drecursive -jar post.jar afolder
>   java -Dauto -Dfiletypes=ppt,html -jar post.jar afolder
> The options controlled by System Properties include the Solr
> URL to POST to, the Content-Type of the data, whether a commit
> or optimize should be executed, and whether the response should
> be written to STDOUT. If auto=yes the tool will try to set type
> and url automatically from file name. When posting rich documents
> the file name will be propagated as "resource.name" and also used
> as "literal.id". You may override these or any other request parameter
> through the -Dparams property. To do a commit only, use "-" as argument.
> The web mode is a simple crawler following links within domain, default
> delay=10s.
>
>
> On Mar 5, 2013, at 04:38 , Syao Work wrote:
>
> > Hello,
> >
> > I am trying to index some FS folder tree.
> > I spent 2 days trying to find out what the problem could be and got
> > nothing :) There are not many examples of indexing a file system.
> > In the logs I can't find any exceptions explaining why it does not
> > process the info.
> > The data import configuration and debug response are attached.
> >
> >
> > Using:
> > 1. solr web admin tool,
> > 2. Java version "1.7.0_09-icedtea"
> >    OpenJDK Runtime Environment (fedora-2.3.7.0.fc17-x86_64)
> >    OpenJDK 64-Bit Server VM (build 23.7-b01, mixed mode)
> >
> > Thank you for your time,
> > Ro
> >
> > P.S. Excuse my bad English, I am not a native English speaker.
> > <data-config.xml><import-debug-response.json>
>
>
