Great...maybe this is a bug in the Tika codebase!

On Thu, Nov 20, 2014 at 10:02 AM, MengYing Wang <mengyingwa...@gmail.com>
wrote:

> Dear Lewis,
>
> The problem was solved by reverting rome-1.0.jar to rome-0.9.jar in
> parse-tika.
> It is the same idea as for the feed parser in
> https://issues.apache.org/jira/browse/NUTCH-1494. Thanks.
>
> Best,
> Mengying (Angela) Wang
>
> On Wed, Nov 19, 2014 at 9:08 PM, Lewis John Mcgibbney <
> lewis.mcgibb...@gmail.com> wrote:
>
>> Try removing rome-0.9 from that directory (copy it elsewhere), then
>> attempt to re-parse the directory.
>> Thanks
>>
>> On Wed, Nov 19, 2014 at 8:36 PM, MengYing Wang <mengyingwa...@gmail.com>
>> wrote:
>>
>>> Dear Lewis,
>>>
>>> The feed plugin uses rome-0.9 (
>>> http://svn.apache.org/repos/asf/nutch/trunk/src/plugin/feed/ivy.xml),
>>> while parse-tika uses rome-1.0 (
>>> http://svn.apache.org/repos/asf/nutch/trunk/src/plugin/parse-tika/plugin.xml).
>>> I have enabled both feed and parse-tika in nutch-site.xml. Thanks.
>>>
>>> Best,
>>> Mengying (Angela) Wang
>>>
>>>
>>>
>>> On Wed, Nov 19, 2014 at 8:42 AM, Lewis John Mcgibbney <
>>> lewis.mcgibb...@gmail.com> wrote:
>>>
>>>> Which version of the Rome feed parser is on your class path?
>>>> It may be pulled in via the Nutch 'feed' plugin, or it may also come in
>>>> via the Nutch 'parse-tika' plugin.
>>>> Please determine which version(s) are on the class path and which are
>>>> being used.
>>>>
>>>> On Wednesday, November 19, 2014, MengYing Wang <mengyingwa...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi Everyone,
>>>>>
>>>>> In the Nutch parse step, I received the following error. Does anyone
>>>>> know how to solve the problem? I appreciate your help!
>>>>>
>>>>> $ /cygdrive/d/nutch_trunk/runtime/local/bin/nutch parse \
>>>>>     -D mapred.reduce.tasks=2 \
>>>>>     -D mapred.child.java.opts=-Xmx1000m \
>>>>>     -D mapred.reduce.tasks.speculative.execution=false \
>>>>>     -D mapred.map.tasks.speculative.execution=false \
>>>>>     -D mapred.compress.map.output=true \
>>>>>     -D mapred.skip.attempts.to.start.skipping=2 \
>>>>>     -D mapred.skip.map.max.skip.records=1 \
>>>>>     crawlId/segments/20141118235323
>>>>>
>>>>> java.lang.ExceptionInInitializerError
>>>>> at com.sun.syndication.io.SyndFeedInput.build(SyndFeedInput.java:136)
>>>>> at org.apache.tika.parser.feed.FeedParser.parse(FeedParser.java:70)
>>>>> at org.apache.nutch.parse.tika.TikaParser.getParse(TikaParser.java:103)
>>>>> at org.apache.nutch.parse.ParseUtil.parse(ParseUtil.java:95)
>>>>> at org.apache.nutch.parse.ParseSegment.map(ParseSegment.java:101)
>>>>> at org.apache.nutch.parse.ParseSegment.map(ParseSegment.java:44)
>>>>> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
>>>>> at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
>>>>> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
>>>>> at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
>>>>> Caused by: java.lang.NullPointerException
>>>>> at java.util.Properties$LineReader.readLine(Properties.java:434)
>>>>> at java.util.Properties.load0(Properties.java:353)
>>>>> at java.util.Properties.load(Properties.java:341)
>>>>> at com.sun.syndication.io.impl.PropertiesLoader.<init>(PropertiesLoader.java:74)
>>>>> at com.sun.syndication.io.impl.PropertiesLoader.getPropertiesLoader(PropertiesLoader.java:46)
>>>>> at com.sun.syndication.io.impl.PluginManager.<init>(PluginManager.java:54)
>>>>> at com.sun.syndication.io.impl.PluginManager.<init>(PluginManager.java:46)
>>>>> at com.sun.syndication.feed.synd.impl.Converters.<init>(Converters.java:40)
>>>>> at com.sun.syndication.feed.synd.SyndFeedImpl.<clinit>(SyndFeedImpl.java:59)
>>>>> ... 10 more
>>>>>
>>>>> --
>>>>> Best,
>>>>> Mengying (Angela) Wang
>>>>>
>>>>
>>>>
>>>> --
>>>> *Lewis*
>>>>
>>>>
>>>
>>>
>>> --
>>> Best,
>>> Mengying (Angela) Wang
>>>
>>
>>
>>
>> --
>> *Lewis*
>>
>
>
>
> --
> Best,
> Mengying (Angela) Wang
>



-- 
*Lewis*
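
For anyone hitting the same trace: one way to answer the "which version is on the class path" question above is to ask the JVM where it loaded a Rome class from. The following is only a minimal sketch, not part of Nutch or Tika. The class name WhichJar and its method locationOf are hypothetical names chosen for illustration; the only name taken from the thread is com.sun.syndication.io.SyndFeedInput, which appears in the stack trace. It assumes it is run with the same jars on the class path as the failing job. Nutch loads plugins through their own class loaders, so a standalone run may not see exactly what the plugin sees.

```java
import java.net.URL;
import java.security.CodeSource;

public class WhichJar {

    /** Returns the jar or directory a class was loaded from, as a string. */
    public static String locationOf(Class<?> cls) {
        CodeSource src = cls.getProtectionDomain().getCodeSource();
        if (src == null) {
            // Classes loaded by the bootstrap loader have no code source.
            return "(bootstrap or unknown)";
        }
        URL url = src.getLocation();
        return url == null ? "(bootstrap or unknown)" : url.toString();
    }

    /** Looks the class up by name without running its static initializers. */
    public static String locationOf(String className) {
        try {
            // initialize=false matters here: the failure in the trace happens
            // inside a static initializer, which we do not want to trigger.
            Class<?> cls = Class.forName(className, false,
                    WhichJar.class.getClassLoader());
            return locationOf(cls);
        } catch (ClassNotFoundException e) {
            return "(not on class path)";
        }
    }

    public static void main(String[] args) {
        // Default to the Rome class named in the stack trace.
        String name = args.length > 0
                ? args[0]
                : "com.sun.syndication.io.SyndFeedInput";
        System.out.println(name + " -> " + locationOf(name));
    }
}
```

If the printed location ends in a rome jar, the file name shows which version won; if both the 0.9 and 1.0 jars are visible to the same class loader, only the first one found is reported.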
