I'd be happy to! My username is SebastianGreenholtz.

On Mon, Aug 1, 2016, 1:04 PM Mattmann, Chris A (3980) <chris.a.mattm...@jpl.nasa.gov> wrote:
> Great work Sebastian, thank you for this. Would you be willing to
> update the wiki with this info? Please let me know your username
> and I will grant you permissions.
>
> Cheers,
> Chris
>
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Chris Mattmann, Ph.D.
> Chief Architect
> Instrument Software and Science Data Systems Section (398)
> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> Office: 168-519, Mailstop: 168-527
> Email: chris.a.mattm...@nasa.gov
> WWW: http://sunset.usc.edu/~mattmann/
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> Director, Information Retrieval and Data Science Group (IRDS)
> Adjunct Associate Professor, Computer Science Department
> University of Southern California, Los Angeles, CA 90089 USA
> WWW: http://irds.usc.edu/
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>
> On 8/1/16, 11:01 AM, "Sebastian Greenholtz" <smgreenho...@gmail.com> wrote:
>
> >I struggled with the same thing recently. Nutch 1.12 does work with Solr
> >6.1.0, but you have to do two things differently.
> >
> >1. The schema file that comes with Solr is originally named managed_schema
> >and it's stored in
> >${SOLR_HOME}/server/solr/configsets/managed_schema
> >
> >This file should be renamed to schema.xml.
> >
> >2. To index with Solr, first start up Solr from the command line:
> >
> >${SOLR_HOME}/bin/solr start -e cloud -noprompt
> >
> >Solr should start up at localhost:8983/solr
> >
> >To run the indexing:
> >
> >${NUTCH_HOME}/bin/crawl -i -D solr.server.url=http://localhost:8983/solr/gettingstarted urls/ segments/ 2
> >
> >Some of these parameters can be changed. They are explained here:
> >https://wiki.apache.org/nutch/bin/crawl
> >
> >The thing that isn't explained anywhere is that your solr.server.url value
> >is the base URL for Solr admin with the core name after the forward slash.
> >For the example project, the core is called gettingstarted.
> >
> >Hope that helps!
> >
> >Sebastian
> >
> >On Mon, Aug 1, 2016, 11:39 AM Ondřej Sojka <ondrej.so...@gmail.com> wrote:
> >
> >> For the last three days, I've been struggling to make Nutch index one
> >> website into Solr. The tutorial on your wiki is extremely outdated and
> >> the command line tool doesn't work as expected. Now I think I may have
> >> managed to crawl the site, but not index it into Solr. I'm trying to run
> >> bin/nutch solrindex crawl (crawl being the crawldb I previously passed
> >> to bin/crawl), but it just returns the help for solrindex. The help
> >> output makes me think the crawldb is the only mandatory parameter.
> >>
> >> I think there must be another source of documentation besides the wiki
> >> for recent versions of Nutch, or is the wiki the only source of
> >> documentation? With what versions of Solr is Nutch 1.12 compatible?
> >>
> >> Ondrej Sojka
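
The indexing step discussed in this thread can be sketched as a small shell script. This is only an illustration under stated assumptions, not a tested recipe: the helper names solr_core_url and crawl_command are hypothetical, NUTCH_HOME is assumed to point at the Nutch 1.12 install directory, and -i is assumed to be the index flag accepted by bin/crawl in Nutch 1.x. The script only prints the command line, so it is safe to run without Solr or Nutch installed.

```shell
#!/bin/sh
# Sketch only: prints the crawl-and-index command; it does not run Nutch.
# solr_core_url and crawl_command are hypothetical helper names.

# Join the Solr base URL and a core name into the solr.server.url value.
solr_core_url() {
    base=${1%/}                      # drop one trailing slash, if present
    printf '%s/%s\n' "$base" "$2"
}

# Print the bin/crawl command line for the given Solr base URL, core name,
# seed directory, crawl directory and number of rounds.
# Assumption: NUTCH_HOME is the Nutch install directory (defaults to ".").
crawl_command() {
    printf '%s/bin/crawl -i -D solr.server.url=%s %s %s %s\n' \
        "${NUTCH_HOME:-.}" "$(solr_core_url "$1" "$2")" "$3" "$4" "$5"
}

# Values taken from the thread: the example core is "gettingstarted".
crawl_command http://localhost:8983/solr gettingstarted urls/ segments/ 2
```

Keeping the base URL and the core name as separate arguments makes it harder to forget the core name, which is the detail the thread points out as undocumented. To index into a different core, change only the second argument.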