Hello, I am trying to install the Xml Parser plugin, but when I run ant in steps 7 and 8 it shows me this message:
BUILD FAILED
C:\nutch-0.9\build.xml:61: Specify at least one source--a file or resource collection.

Why?

Rida Benjelloun wrote:
>
> Hi,
> Here are the steps to install the Xml Parser plugin:
> 1- Copy parse-xml in the src/plugin directory
> 2- Copy xmlparser-conf.xml in the conf directory
> 3- Add to nutch-site.xml (conf directory) the following property
> <property>
>   <name>plugin.includes</name>
>   <value>protocol-http|urlfilter-regex|parse-(text|xml|html|js)|index-basic|query-(basic|site|url)|summary-basic|scoring-opic</value>
>   <description>Regular expression naming plugin directory names to
>   include. Any plugin not matching this expression is excluded.
>   In any case you need at least include the nutch-extensionpoints plugin.
>   By default Nutch includes crawling just HTML and plain text via HTTP,
>   and basic indexing and search plugins.
>   </description>
> </property>
>
> 4- Modify parse-plugins.xml (conf directory)
> <mimeType name="text/xml">
>   <plugin id="parse-xml" />
>   <plugin id="parse-text" />
>   <plugin id="parse-html" />
>   <plugin id="parse-rss" />
> </mimeType>
>
> 5- Modify build.xml in the root directory: add parse-xml
> 6- Modify src\plugin build.xml: add parse-xml
> 7- Execute ant in src/plugin directory
> 8- Execute ant in the root directory
> 9- Copy parse-xml directory located in nutch-0.8.1/build/plugins to
> nutch-0.8.1/plugins
>
> Best regards
>
> Rida Benjelloun
>
>
> On 11/7/06, Jim Wilson <[EMAIL PROTECTED]> wrote:
>>
>> I think you should stop sending *bump* emails.
>>
>> -- Jim
>>
>> On 11/7/06, Jayant Kumar Gandhi <[EMAIL PROTECTED]> wrote:
>> >
>> > *bump*
>> >
>> > Any thoughts, anyone?
>> >
>> > Thanks,
>> > Jayant
>> >
>> > On 11/6/06, Jayant Kumar Gandhi <[EMAIL PROTECTED]> wrote:
>> > > Hello,
>> > >
>> > > I have been working on it since then. I have found one problem. It
>> > > seems the parse-xml plugin is not loading.
>> > >
>> > > One thing I did was put the plugin in the parse-plugins.xml to enable
>> > > nutch-0.8.1 to detect that parse-xml is the plugin to be used for xml
>> > > content. This is not given in the instructions for the plugin though.
>> > >
>> > > Because of it I started to get the following error in hadoop.log:
>> > >
>> > > 2006-11-06 15:12:33,156 WARN parse.ParserFactory - ParserFactory: Plugin: parse-xml mapped to contentType text/xml via parse-plugins.xml, but not enabled via plugin.includes in nutch-default.xml
>> > >
>> > > The issue is that I have the plugin enabled in the nutch-site.xml. I
>> > > also tried to enable the plugin in nutch-default.xml but I still get
>> > > the same error.
>> > >
>> > > Any thoughts/pointers on how to make the plugin work?
>> > >
>> > > Thanks and Best Regards,
>> > > Jayant Gandhi
>> > >
>> > >
>> > > On 11/5/06, Jayant Kumar Gandhi <[EMAIL PROTECTED]> wrote:
>> > > > I am using the default xmlparser-conf.xml, just copied it into
>> > > > nutch/conf dir. To test it I used the xml file given in the sample
>> > > > directory, xmltest.xml, which is uploaded at
>> > > > http://www.jkg.in/xmltest.xml .
>> > > >
>> > > > I do not get any errors while indexing or parsing. The crawl log is
>> > > > attached. I am able to get the xml file in the results when I search
>> > > > for 'XPath' but when I click the explain link, it doesn't show me the
>> > > > field dctitle in the index, which it should.
>> > > >
>> > > > I just noticed that hadoop.log has some error for handling xml files
>> > > > and I cannot see parse-xml loaded, but I have it enabled in my
>> > > > nutch-site.conf. I am new to nutch-0.8 and hadoop so I have no idea
>> > > > whether this is expected behaviour/how to fix it.
>> > > >
>> > > > Thanks and Best Regards,
>> > > > Jayant
>> > > >
>> > > > On 11/5/06, Nutch Newbie <[EMAIL PROTECTED]> wrote:
>> > > > > Can you post your "xmlparser-conf.xml" from the nutch/conf dir?
>> > > > > Also what kind of error message do you get when you index?
>> > > > > You can use Luke to see the index...
>> > > > >
>> > > > > Regards,
>> > > > >
>> > > > > On 11/4/06, Jayant Kumar Gandhi <[EMAIL PROTECTED]> wrote:
>> > > > > > Hello Everyone,
>> > > > > >
>> > > > > > I just installed nutch-0.8.1 on my dev machine. I installed a new
>> > > > > > plugin called XML Parser available at
>> > > > > > http://issues.apache.org/jira/browse/NUTCH-185
>> > > > > > The issue is that I am unable to get it to work.
>> > > > > > I copied the parse-xml folder to src/plugin folder. I made the
>> > > > > > corresponding deploy/clean entries in the build xml file.
>> > > > > >
>> > > > > > Also, I have edited the nutch conf to enable xml plugin.
>> > > > > > The plugin is still not working. After compiling using ant, I started
>> > > > > > indexing. After the indexing was finished and query done, I couldn't
>> > > > > > see the indexed fields on the explain page.
>> > > > > >
>> > > > > > Any inputs?
>> > > > > >
>> > > > > > Thanks,
>> > > > > > Jayant
>> > > > > >
>> > > > >
>> > > >
>> > > > --
>> > > > www.jkg.in | http://www.jkg.in/contact-me/
>> > > > Jayant Kr. Gandhi
>> > >
>> > > --
>> > > www.jkg.in | http://www.jkg.in/contact-me/
>> > > Jayant Kr. Gandhi
>> > >
>> >
>> >
>> > --
>> > www.jkg.in | http://www.jkg.in/contact-me/
>> > Jayant Kr. Gandhi
>> > M.Tech. Computer Tech. Class of 2007,
>> > IIT Delhi
>> >
>>
>

--
View this message in context: http://www.nabble.com/XMLParser-for-Nutch-tf2575183.html#a13471028
Sent from the Nutch - User mailing list archive at Nabble.com.
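
For anyone comparing their own config against the "mapped to contentType text/xml via parse-plugins.xml, but not enabled via plugin.includes" warning quoted in the thread, here is a minimal Python sketch of that cross-check. It is not Nutch code. It assumes that a plugin counts as enabled only when its whole id matches the plugin.includes expression, and the function names (`mapped_plugin_ids`, `is_enabled`, `not_enabled`) are made up for illustration. The two config values are copied from steps 3 and 4 above. It says nothing about the BUILD FAILED message from ant.

```python
import re
import xml.etree.ElementTree as ET

# plugin.includes value from step 3 (joined onto one line, no whitespace inside).
PLUGIN_INCLUDES = (
    "protocol-http|urlfilter-regex|parse-(text|xml|html|js)|index-basic|"
    "query-(basic|site|url)|summary-basic|scoring-opic"
)

# mimeType mapping from step 4 (parse-plugins.xml).
PARSE_PLUGINS_FRAGMENT = """
<mimeType name="text/xml">
  <plugin id="parse-xml" />
  <plugin id="parse-text" />
  <plugin id="parse-html" />
  <plugin id="parse-rss" />
</mimeType>
"""


def mapped_plugin_ids(fragment):
    """Return the plugin ids listed under the mimeType element, in order."""
    root = ET.fromstring(fragment)
    return [plugin.get("id") for plugin in root.findall("plugin")]


def is_enabled(plugin_id, includes=PLUGIN_INCLUDES):
    """Assumption: the whole plugin id must match the expression."""
    return re.fullmatch(includes, plugin_id) is not None


def not_enabled(fragment, includes=PLUGIN_INCLUDES):
    """Plugin ids that are mapped to the mime type but not matched by includes."""
    return [pid for pid in mapped_plugin_ids(fragment)
            if not is_enabled(pid, includes)]


if __name__ == "__main__":
    for pid in mapped_plugin_ids(PARSE_PLUGINS_FRAGMENT):
        print(pid, "enabled" if is_enabled(pid) else "NOT enabled")
```

With the values from the thread, parse-xml, parse-text and parse-html match the expression and parse-rss does not, so under the same assumption parse-rss is mapped for text/xml without being enabled.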
