*bump* Any thoughts, anyone?
Thanks,
Jayant

On 11/6/06, Jayant Kumar Gandhi <[EMAIL PROTECTED]> wrote:
> Hello,
>
> I have been working on it since then. I have found one problem: it
> seems the parse-xml plugin is not loading.
>
> One thing I did was add the plugin to parse-plugins.xml so that
> nutch-0.8.1 detects parse-xml as the plugin to be used for xml
> content. This is not given in the instructions for the plugin, though.
>
> Because of it I started to get the following error in hadoop.log:
>
> 2006-11-06 15:12:33,156 WARN parse.ParserFactory - ParserFactory:
> Plugin: parse-xml mapped to contentType text/xml via
> parse-plugins.xml, but not enabled via plugin.includes in
> nutch-default.xml
>
> The issue is that I have the plugin enabled in nutch-site.xml. I
> also tried to enable the plugin in nutch-default.xml but I still get
> the same error.
>
> Any thoughts/pointers on how to make the plugin work?
>
> Thanks and Best Regards,
> Jayant Gandhi
>
>
> On 11/5/06, Jayant Kumar Gandhi <[EMAIL PROTECTED]> wrote:
> > I am using the default xmlparser-conf.xml, just copied into the
> > nutch/conf dir. To test it I used the xml file given in the sample
> > directory, xmltest.xml, which is uploaded at http://www.jkg.in/xmltest.xml
> > .
> >
> > I do not get any errors while indexing or parsing. The crawl log is
> > attached. I am able to get the xml file in the results when I search
> > for 'XPath', but when I click the explain link, it doesn't show me the
> > field dctitle in the index, which it should.
> >
> > I just noticed that hadoop.log has some error for handling xml files
> > and I cannot see parse-xml loaded, but I have it enabled in my
> > nutch-site.conf. I am new to nutch-0.8 and hadoop so I have no idea
> > whether this is expected behaviour or how to fix it.
> >
> > Thanks and Best Regards,
> > Jayant
> >
> > On 11/5/06, Nutch Newbie <[EMAIL PROTECTED]> wrote:
> > > Can you post your "xmlparser-conf.xml" from the nutch/conf dir?
> > > Also what kind of error message do you get when you index?
> > > You can use Luke to see the index...
> > >
> > > Regards,
> > >
> > > On 11/4/06, Jayant Kumar Gandhi <[EMAIL PROTECTED]> wrote:
> > > > Hello Everyone,
> > > >
> > > > I have just installed nutch-0.8.1 on my dev machine. I installed a new
> > > > plugin called XML Parser, available at
> > > > http://issues.apache.org/jira/browse/NUTCH-185
> > > > The issue is that I am unable to get it to work.
> > > > I copied the parse-xml folder to the src/plugin folder. I made the
> > > > corresponding deploy/clean entries in the build xml file.
> > > >
> > > > Also, I have edited the nutch conf to enable the xml plugin.
> > > > The plugin is still not working. After compiling using ant, I started
> > > > indexing. After the indexing was finished and a query done, I couldn't
> > > > see the indexed fields on the explain page.
> > > >
> > > > Any inputs?
> > > >
> > > > Thanks,
> > > > Jayant
> > > >
> > >
> >
> > --
> > www.jkg.in | http://www.jkg.in/contact-me/
> > Jayant Kr. Gandhi
>
> --
> www.jkg.in | http://www.jkg.in/contact-me/
> Jayant Kr. Gandhi

--
www.jkg.in | http://www.jkg.in/contact-me/
Jayant Kr. Gandhi
M.Tech. Computer Tech. Class of 2007, IIT Delhi

_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
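
The ParserFactory warning quoted in the thread says parse-xml is mapped in parse-plugins.xml but is not enabled via plugin.includes. One way to sanity-check that setting outside Nutch is sketched below in Python. Assumptions, none of them taken from the thread: the sample configuration string and its plugin list are made up for illustration, the helper names get_property and plugin_enabled are hypothetical, and the check assumes Nutch treats the plugin.includes value as a regular expression that must match the whole plugin id.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical nutch-site.xml content; the plugin list is illustrative only
# and is not the poster's actual configuration.
SAMPLE_SITE_XML = """<configuration>
  <property>
    <name>plugin.includes</name>
    <value>protocol-http|urlfilter-regex|parse-(text|html|xml)|index-basic|query-(basic|site|url)</value>
  </property>
</configuration>
"""


def get_property(conf_xml, name):
    """Return the value text of the named <property>, or None if it is absent."""
    root = ET.fromstring(conf_xml)
    for prop in root.iter("property"):
        if (prop.findtext("name") or "").strip() == name:
            return (prop.findtext("value") or "").strip()
    return None


def plugin_enabled(conf_xml, plugin_id):
    """Check whether plugin_id is matched by the plugin.includes pattern.

    Assumption: the pattern must match the entire plugin id (full match).
    """
    pattern = get_property(conf_xml, "plugin.includes")
    if pattern is None:
        return False
    return re.fullmatch(pattern, plugin_id) is not None


if __name__ == "__main__":
    for plugin in ("parse-xml", "parse-pdf"):
        print(plugin, plugin_enabled(SAMPLE_SITE_XML, plugin))
```

Running plugin_enabled against the contents of the nutch-site.xml used for the crawl would show whether parse-xml is covered by the pattern. If it is covered and the warning still appears, the file being edited may not be the one Nutch actually loads, though that is only a guess.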
