[jira] [Commented] (NUTCH-2357) Index metadata throw Exception because writable object cannot be cast to Text
[ https://issues.apache.org/jira/browse/NUTCH-2357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15925276#comment-15925276 ]

Hudson commented on NUTCH-2357:
---

SUCCESS: Integrated in Jenkins build Nutch-trunk #3412 (See [https://builds.apache.org/job/Nutch-trunk/3412/])
NUTCH-2357 Index metadata throw Exception because writable object cannot (snagel: [https://github.com/apache/nutch/commit/439f1153991ec104acdb73420ddc816cd9c665e8])
* (edit) src/plugin/index-metadata/src/java/org/apache/nutch/indexer/metadata/MetadataIndexer.java

> Index metadata throw Exception because writable object cannot be cast to Text
> -
>
>                 Key: NUTCH-2357
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2357
>             Project: Nutch
>          Issue Type: Bug
>          Components: indexer
>    Affects Versions: 1.12
>         Environment: It was detected using Linux mint 18.
>            Reporter: Eyeris Rodriguez Rueda
>            Assignee: Chris A. Mattmann
>            Priority: Minor
>             Fix For: 1.13
>
> Index Metadata plugin use this property(see below), to take keys from Datum and index it.
>
> index.db.md
> ...
>
> Using any value from this property one Exception is thrown.
> The problem occurs because Writable object can not be cast to Text see this line.
> https://github.com/apache/nutch/blob/master/src/plugin/index-metadata/src/java/org/apache/nutch/indexer/metadata/MetadataIndexer.java#L58
> A little change will fix it.
> This is the Exception:
> **
> 2017-02-06 18:18:29,969 INFO solr.SolrMappingReader - source: digest dest: digest
> 2017-02-06 18:18:29,969 INFO solr.SolrMappingReader - source: tstamp dest: tstamp
> 2017-02-06 18:18:29,969 INFO solr.SolrMappingReader - source: metatag.description dest: description
> 2017-02-06 18:18:29,969 INFO solr.SolrMappingReader - source: metatag.keywords dest: keywords
> 2017-02-06 18:18:30,134 WARN mapred.LocalJobRunner - job_local1516_0001
> java.lang.Exception: java.lang.ClassCastException: org.apache.hadoop.io.IntWritable cannot be cast to org.apache.hadoop.io.Text
>   at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
>   at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:529)
> Caused by: java.lang.ClassCastException: org.apache.hadoop.io.IntWritable cannot be cast to org.apache.hadoop.io.Text
>   at org.apache.nutch.indexer.metadata.MetadataIndexer.filter(MetadataIndexer.java:58)
>   at org.apache.nutch.indexer.IndexingFilters.filter(IndexingFilters.java:51)
>   at org.apache.nutch.indexer.IndexerMapReduce.reduce(IndexerMapReduce.java:330)
>   at org.apache.nutch.indexer.IndexerMapReduce.reduce(IndexerMapReduce.java:56)
>   at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:444)
>   at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392)
>   at org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:319)
>   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> 2017-02-06 18:18:30,777 ERROR indexer.IndexingJob - Indexer: java.io.IOException: Job failed!
>   at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:836)
>   at org.apache.nutch.indexer.IndexingJob.index(IndexingJob.java:145)
>   at org.apache.nutch.indexer.IndexingJob.run(IndexingJob.java:228)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>   at org.apache.nutch.indexer.IndexingJob.main(IndexingJob.java:237)
> **

--
This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (NUTCH-2357) Index metadata throw Exception because writable object cannot be cast to Text
[ https://issues.apache.org/jira/browse/NUTCH-2357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15925236#comment-15925236 ]

Chris A. Mattmann commented on NUTCH-2357:
--

Thanks [~eyeris] and [~wastl-nagel]!
[jira] [Commented] (NUTCH-2357) Index metadata throw Exception because writable object cannot be cast to Text
[ https://issues.apache.org/jira/browse/NUTCH-2357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15925232#comment-15925232 ]

ASF GitHub Bot commented on NUTCH-2357:
---

Github user chrismattmann closed the pull request at:

    https://github.com/apache/nutch/pull/177
[jira] [Commented] (NUTCH-2357) Index metadata throw Exception because writable object cannot be cast to Text
[ https://issues.apache.org/jira/browse/NUTCH-2357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15859275#comment-15859275 ]

ASF GitHub Bot commented on NUTCH-2357:
---

GitHub user sebastian-nagel opened a pull request:

    https://github.com/apache/nutch/pull/177

    NUTCH-2357 Index metadata throw Exception because writable object can…

    …not be cast to Text
    - do not cast CrawlDatum metadata values from o.a.hadoop.io.Writable to o.a.hadoop.io.Text
    (contributed by Eyeris Rodriguez Rueda)

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/sebastian-nagel/nutch NUTCH-2357

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/nutch/pull/177.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #177

commit 439f1153991ec104acdb73420ddc816cd9c665e8
Author: Sebastian Nagel
Date:   2017-02-09T09:55:40Z

    NUTCH-2357 Index metadata throw Exception because writable object cannot be cast to Text
    - do not cast CrawlDatum metadata values from o.a.hadoop.io.Writable to o.a.hadoop.io.Text
    (contributed by Eyeris Rodriguez Rueda)
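The change described in the pull request above (do not cast CrawlDatum metadata values from Writable to Text) can be illustrated with a minimal sketch. This is not the actual MetadataIndexer code. Writable, Text and IntWritable below are hypothetical stand-ins for the org.apache.hadoop.io classes so that the example compiles without Hadoop on the classpath. The map uses plain String keys for brevity, and the class name, the method names and the value 42 are invented for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class MetadataCastSketch {

    // Hypothetical stand-ins for org.apache.hadoop.io.Writable, Text and
    // IntWritable; only the parts needed for the illustration are modelled.
    interface Writable {}

    static final class Text implements Writable {
        private final String value;
        Text(String value) { this.value = value; }
        @Override public String toString() { return value; }
    }

    static final class IntWritable implements Writable {
        private final int value;
        IntWritable(int value) { this.value = value; }
        @Override public String toString() { return Integer.toString(value); }
    }

    // Failing pattern: assumes every metadata value is a Text.
    static String valueWithCast(Map<String, Writable> metadata, String key) {
        Text value = (Text) metadata.get(key); // ClassCastException if the value is an IntWritable
        return value == null ? null : value.toString();
    }

    // Pattern described in the pull request: keep the value typed as
    // Writable and rely on toString() instead of casting.
    static String valueWithoutCast(Map<String, Writable> metadata, String key) {
        Writable value = metadata.get(key);
        return value == null ? null : value.toString();
    }

    public static void main(String[] args) {
        Map<String, Writable> metadata = new LinkedHashMap<>();
        // "_rs_" is the key used in the reproduction command in this thread;
        // the value 42 is made up.
        metadata.put("_rs_", new IntWritable(42));

        try {
            valueWithCast(metadata, "_rs_");
        } catch (ClassCastException e) {
            System.out.println("cast failed for _rs_");
        }
        System.out.println("without cast: " + valueWithoutCast(metadata, "_rs_"));
    }
}
```

The second method works for any value type whose toString() yields a sensible field value, which is why dropping the cast is enough for the IntWritable case reported here.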
[jira] [Commented] (NUTCH-2357) Index metadata throw Exception because writable object cannot be cast to Text
[ https://issues.apache.org/jira/browse/NUTCH-2357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15859272#comment-15859272 ]

Sebastian Nagel commented on NUTCH-2357:

Thanks! See also [this discussion on the user mailing list|https://lists.apache.org/thread.html/33996cecba325401837f461eb4c687676a0158665e9a5513cebe7b71@%3Cuser.nutch.apache.org%3E]

Confirmed, minimalistic command to reproduce the problem:
{noformat}
% bin/nutch indexchecker -Dplugin.includes='protocol-http|parse-html|index-metadata' \
     -Dindex.db.md=_rs_ -Dhttp.store.responsetime=true http://localhost/
...
Exception in thread "main" java.lang.ClassCastException: org.apache.hadoop.io.IntWritable cannot be cast to org.apache.hadoop.io.Text
        at org.apache.nutch.indexer.metadata.MetadataIndexer.filter(MetadataIndexer.java:58)
{noformat}