Hello, I have done as you asked. I hope I have done it correctly as this was my first patch. Here's the issue: https://issues.apache.org/jira/browse/NUTCH-1382
Here's a tutorial for people that might be interested:
http://cmusphinx.sourceforge.net/2012/06/building-a-java-application-with-apache-nutch-and-solr/

It might have slight changes soon since I will proof-read it this evening.

Hope that helps.

Best,

Emre

On Fri, Jun 8, 2012 at 2:00 PM, Lewis John Mcgibbney <lewis.mcgibb...@gmail.com> wrote:

> Hi Emre,
>
> Even if you were to open a Jira issue for this and submit a patch of
> your hack it would be excellent to have the code available to the
> community.
>
> All the best, oh and glad you got your application working.
> Lewis
>
> On Fri, Jun 8, 2012 at 4:22 AM, Emre Çelikten <e...@celikten.name> wrote:
> > Hello again,
> >
> > I managed to do it. Getting the entire thing to work was tricky. I had to
> > resort to a hack.
> >
> > I will post how I managed to do it here soon, for people that might be
> > interested in the future.
> >
> > Thanks again.
> >
> > Best,
> >
> > Emre
> >
> > On Fri, Jun 8, 2012 at 12:33 AM, Emre Çelikten <e...@celikten.name> wrote:
> >
> >> Hello Markus,
> >>
> >> Thanks very much for your help.
> >>
> >> I have looked at Nutch source. I think I need to make a different version
> >> of indexSolr method in SolrIndexer.java, yes? The current version is:
> >>
> >> public void indexSolr(String solrUrl, Path crawlDb, Path linkDb,
> >>     List<Path> segments, boolean noCommit, boolean deleteGone,
> >>     String solrParams)
> >>
> >> I will try to change "String solrUrl" part to "SolrServer server" in the
> >> new method and use my own SolrServer that was created in the application.
> >> Do you think this is a correct approach?
> >>
> >> Best,
> >>
> >> Emre
> >>
> >>
> >> On Thu, Jun 7, 2012 at 11:27 PM, Markus Jelsma <markus.jel...@openindex.io> wrote:
> >>
> >>> Hello!
> >>>
> >>> Sounds very interesting. Anyway, Solr can run embedded in a Java
> >>> application called EmbeddedSolrServer. You do need to make some changes to
> >>> the SolrIndexer tools in Nutch.
> >>>
> >>> Cheers
> >>>
> >>> -----Original message-----
> >>> > From: Emre Çelikten <e...@celikten.name>
> >>> > Sent: Thu 07-Jun-2012 22:24
> >>> > To: user@nutch.apache.org
> >>> > Subject: Building Lucene index with Nutch 1.4
> >>> >
> >>> > Hello everybody,
> >>> >
> >>> > As part of a project, I am working on a FOSS tool that will build language
> >>> > models using data obtained from the web which will then be used for speech
> >>> > recognition. I plan to make this tool quite compact by encapsulating as
> >>> > much as I can in a single Java application and not requiring the user to
> >>> > install/configure tons of stuff.
> >>> >
> >>> > I have managed to set up Nutch and am able to crawl a website inside a Java
> >>> > application. The next thing I need to do is to search for certain keywords
> >>> > in the obtained data. I have read that the ability to build Lucene indexes
> >>> > has been removed from Nutch and we now need to use Solr instead. The way
> >>> > Solr works (servlets, HTTP) is not really appropriate for a tool that only
> >>> > needs search functionality that is invisible to the user.
> >>> >
> >>> > What would you recommend me to do in this case? Is there absolutely no way
> >>> > of building Lucene indexes? I could not find anything other than
> >>> > recommendations to use Solr instead. Should I try to use an older version
> >>> > of Nutch?
> >>> >
> >>> > Thanks in advance,
> >>> >
> >>> > Emre
> >>> >
> >>>
> >>
> >
>
> --
> Lewis
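
For anyone skimming the thread, the approach discussed above (adding an indexSolr variant that takes a server object instead of a URL string, so the application can hand in its own embedded server) can be sketched roughly as follows. This is only an illustration of the overload pattern, not the actual patch: SearchServer, InMemoryServer, connect and IndexerSketch are hypothetical stand-ins invented here so the example runs without Nutch or Solr on the classpath, and the parameter list is shortened compared to the real indexSolr signature quoted above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: none of these types are real Nutch or SolrJ classes.
class IndexerSketch {

    // Stand-in for the server abstraction the indexer writes to.
    interface SearchServer {
        void add(String doc);
        void commit();
    }

    // In-memory stand-in playing the role an embedded server would play.
    static class InMemoryServer implements SearchServer {
        final List<String> docs = new ArrayList<String>();
        int commits = 0;

        public void add(String doc) { docs.add(doc); }
        public void commit() { commits++; }
    }

    // Hypothetical factory standing in for "connect to the server at this URL".
    static SearchServer connect(String solrUrl) {
        return new InMemoryServer();
    }

    // URL-based entry point: builds a server from the URL, then delegates.
    static void indexSolr(String solrUrl, List<String> segments, boolean noCommit) {
        indexSolr(connect(solrUrl), segments, noCommit);
    }

    // Server-based overload: the caller supplies the server it already created.
    static void indexSolr(SearchServer server, List<String> segments, boolean noCommit) {
        for (String segment : segments) {
            server.add("doc-from-" + segment); // placeholder for real document building
        }
        if (!noCommit) {
            server.commit();
        }
    }

    public static void main(String[] args) {
        InMemoryServer server = new InMemoryServer();
        indexSolr(server, Arrays.asList("segment-1", "segment-2"), false);
        System.out.println(server.docs.size() + " docs, " + server.commits + " commit(s)");
    }
}
```

Keeping the URL-based method as a thin wrapper around the server-based one means existing callers are unaffected, while an application that runs Solr in-process can pass its own server instance directly.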