Hello again, I managed to do it. Getting the entire thing to work was tricky. I had to resort to a hack.
I will post how I managed to do it here soon, for people that might be interested in the future.

Thanks again.

Best,

Emre

On Fri, Jun 8, 2012 at 12:33 AM, Emre Çelikten <e...@celikten.name> wrote:

> Hello Markus,
>
> Thanks very much for your help.
>
> I have looked at Nutch source. I think I need to make a different version
> of indexSolr method in SolrIndexer.java, yes? The current version is:
>
> public void indexSolr(String solrUrl, Path crawlDb, Path linkDb,
>     List<Path> segments, boolean noCommit, boolean deleteGone,
>     String solrParams)
>
> I will try to change "String solrUrl" part to "SolrServer server" in the
> new method and use my own SolrServer that was created in the application.
> Do you think this is a correct approach?
>
> Best,
>
> Emre
>
>
> On Thu, Jun 7, 2012 at 11:27 PM, Markus Jelsma <markus.jel...@openindex.io>
> wrote:
>
>> Hello!
>>
>> Sounds very interesting. Anyway, Solr can run embedded in a Java
>> application called EmbeddedSolrServer. You do need to make some changes to
>> the SolrIndexer tools in Nutch.
>>
>> Cheers
>>
>> -----Original message-----
>> > From: Emre Çelikten <e...@celikten.name>
>> > Sent: Thu 07-Jun-2012 22:24
>> > To: user@nutch.apache.org
>> > Subject: Building Lucene index with Nutch 1.4
>> >
>> > Hello everybody,
>> >
>> > As part of a project, I am working on a FOSS tool that will build language
>> > models using data obtained from the web which will then be used for speech
>> > recognition. I plan to make this tool quite compact by encapsulating as
>> > much as I can in a single Java application and not requiring the user to
>> > install/configure tons of stuff.
>> >
>> > I have managed to set up Nutch and am able to crawl a website inside a Java
>> > application. The next thing I need to do is to search for certain keywords
>> > in the obtained data. I have read that the ability to build Lucene indexes
>> > has been removed from Nutch and we now need to use Solr instead. The way
>> > Solr works (servlets, HTTP) is not really appropriate for a tool that only
>> > needs search functionality that is invisible to the user.
>> >
>> > What would you recommend me to do in this case? Is there absolutely no way
>> > of building Lucene indexes? I could not find anything other than
>> > recommendations to use Solr instead. Should I try to use an older version
>> > of Nutch?
>> >
>> > Thanks in advance,
>> >
>> > Emre
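
For anyone reading this thread later: the idea discussed above (adding a variant of the indexing method that takes a server object created by the application, instead of a "String solrUrl") might look roughly like the sketch below. This is not the Nutch or SolrJ code and not necessarily what I ended up doing. DocumentSink, InMemorySink and index() are hypothetical stand-ins I made up so the example compiles without any dependencies; in a real setup the role of DocumentSink would be played by SolrJ's SolrServer (for example an EmbeddedSolrServer).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class IndexerSketch {

    /** Hypothetical stand-in for a Solr server handle; NOT the real SolrJ class. */
    interface DocumentSink {
        void add(String document);
        void commit();
    }

    /** Hypothetical in-memory sink, playing the role of a server owned by the application. */
    static final class InMemorySink implements DocumentSink {
        final List<String> pending = new ArrayList<>();
        final List<String> committed = new ArrayList<>();

        @Override
        public void add(String document) {
            pending.add(document);
        }

        @Override
        public void commit() {
            // Move everything added so far into the committed list.
            committed.addAll(pending);
            pending.clear();
        }
    }

    /**
     * Variant of an indexing method that receives the server object directly,
     * so the caller decides whether it is remote or embedded.
     * Returns the number of documents sent to the server.
     */
    static int index(DocumentSink server, List<String> documents, boolean noCommit) {
        for (String doc : documents) {
            server.add(doc);
        }
        if (!noCommit) {
            server.commit();
        }
        return documents.size();
    }

    public static void main(String[] args) {
        InMemorySink sink = new InMemorySink();
        int n = index(sink, Arrays.asList("page one", "page two"), false);
        System.out.println(n + " indexed, " + sink.committed.size() + " committed");
    }
}
```

The point of the extra parameter is only that the application keeps control of the server's lifecycle; the existing URL-based method could stay as it is and build its own server before delegating to the new one.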