[jira] [Commented] (NUTCH-2357) Index metadata throw Exception because writable object cannot be cast to Text

2017-03-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/NUTCH-2357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15925276#comment-15925276
 ] 

Hudson commented on NUTCH-2357:
---

SUCCESS: Integrated in Jenkins build Nutch-trunk #3412 (See 
[https://builds.apache.org/job/Nutch-trunk/3412/])
NUTCH-2357 Index metadata throw Exception because writable object cannot 
(snagel: 
[https://github.com/apache/nutch/commit/439f1153991ec104acdb73420ddc816cd9c665e8])
* (edit) 
src/plugin/index-metadata/src/java/org/apache/nutch/indexer/metadata/MetadataIndexer.java


> Index metadata throw Exception because writable object cannot be cast to Text
> -
>
>                 Key: NUTCH-2357
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2357
>             Project: Nutch
>          Issue Type: Bug
>          Components: indexer
>    Affects Versions: 1.12
>         Environment: It was detected using Linux mint 18.
>            Reporter: Eyeris Rodriguez Rueda
>            Assignee: Chris A. Mattmann
>            Priority: Minor
>             Fix For: 1.13
>
>
> The index-metadata plugin uses the following property to select keys from the
> CrawlDatum metadata and index them.
> 
>   index.db.md
>   
>   
> ...
>   
> 
> Setting any value for this property causes an exception to be thrown.
> The problem occurs because a Writable object cannot be cast to Text; see this line:
> https://github.com/apache/nutch/blob/master/src/plugin/index-metadata/src/java/org/apache/nutch/indexer/metadata/MetadataIndexer.java#L58
> A small change will fix it.
> This is the exception:
> **
> 2017-02-06 18:18:29,969 INFO  solr.SolrMappingReader - source: digest dest: digest
> 2017-02-06 18:18:29,969 INFO  solr.SolrMappingReader - source: tstamp dest: tstamp
> 2017-02-06 18:18:29,969 INFO  solr.SolrMappingReader - source: metatag.description dest: description
> 2017-02-06 18:18:29,969 INFO  solr.SolrMappingReader - source: metatag.keywords dest: keywords
> 2017-02-06 18:18:30,134 WARN  mapred.LocalJobRunner - job_local1516_0001
> java.lang.Exception: java.lang.ClassCastException: org.apache.hadoop.io.IntWritable cannot be cast to org.apache.hadoop.io.Text
>   at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
>   at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:529)
> Caused by: java.lang.ClassCastException: org.apache.hadoop.io.IntWritable cannot be cast to org.apache.hadoop.io.Text
>   at org.apache.nutch.indexer.metadata.MetadataIndexer.filter(MetadataIndexer.java:58)
>   at org.apache.nutch.indexer.IndexingFilters.filter(IndexingFilters.java:51)
>   at org.apache.nutch.indexer.IndexerMapReduce.reduce(IndexerMapReduce.java:330)
>   at org.apache.nutch.indexer.IndexerMapReduce.reduce(IndexerMapReduce.java:56)
>   at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:444)
>   at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:392)
>   at org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:319)
>   at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> 2017-02-06 18:18:30,777 ERROR indexer.IndexingJob - Indexer: java.io.IOException: Job failed!
>   at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:836)
>   at org.apache.nutch.indexer.IndexingJob.index(IndexingJob.java:145)
>   at org.apache.nutch.indexer.IndexingJob.run(IndexingJob.java:228)
>   at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>   at org.apache.nutch.indexer.IndexingJob.main(IndexingJob.java:237)
> **



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Commented] (NUTCH-2357) Index metadata throw Exception because writable object cannot be cast to Text

2017-03-14 Thread Chris A. Mattmann (JIRA)

[ 
https://issues.apache.org/jira/browse/NUTCH-2357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15925236#comment-15925236
 ] 

Chris A. Mattmann commented on NUTCH-2357:
--

Thanks [~eyeris] and [~wastl-nagel]!



[jira] [Commented] (NUTCH-2357) Index metadata throw Exception because writable object cannot be cast to Text

2017-03-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NUTCH-2357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15925232#comment-15925232
 ] 

ASF GitHub Bot commented on NUTCH-2357:
---

Github user chrismattmann closed the pull request at:

https://github.com/apache/nutch/pull/177




[jira] [Commented] (NUTCH-2357) Index metadata throw Exception because writable object cannot be cast to Text

2017-02-09 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NUTCH-2357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15859275#comment-15859275
 ] 

ASF GitHub Bot commented on NUTCH-2357:
---

GitHub user sebastian-nagel opened a pull request:

https://github.com/apache/nutch/pull/177

NUTCH-2357 Index metadata throw Exception because writable object can…

…not be cast to Text

- do not cast CrawlDatum metadata values from o.a.hadoop.io.Writable to 
o.a.hadoop.io.Text
(contributed by Eyeris Rodriguez Rueda)

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sebastian-nagel/nutch NUTCH-2357

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/nutch/pull/177.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #177


commit 439f1153991ec104acdb73420ddc816cd9c665e8
Author: Sebastian Nagel 
Date:   2017-02-09T09:55:40Z

NUTCH-2357 Index metadata throw Exception because writable object cannot be 
cast to Text
- do not cast CrawlDatum metadata values from o.a.hadoop.io.Writable to 
o.a.hadoop.io.Text
(contributed by Eyeris Rodriguez Rueda)
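The change described in this commit message can be illustrated with a minimal, self-contained sketch. It is not the actual MetadataIndexer code: `Writable`, `Text` and `IntWritable` below are simplified, hypothetical stand-ins for the Hadoop classes of the same names (so the example runs without Hadoop on the classpath), the map uses plain `String` keys, and the method names and the `lang` key are invented for illustration. Under those assumptions it shows why a blind cast to `Text` fails for an `IntWritable` value, and how keeping the value as a `Writable` and calling `toString()` avoids the cast.

```java
import java.util.HashMap;
import java.util.Map;

public class MetadataCastSketch {

  // Hypothetical stand-ins for org.apache.hadoop.io.Writable, Text and
  // IntWritable, reduced to what this example needs.
  interface Writable {}

  static final class Text implements Writable {
    private final String value;
    Text(String value) { this.value = value; }
    @Override public String toString() { return value; }
  }

  static final class IntWritable implements Writable {
    private final int value;
    IntWritable(int value) { this.value = value; }
    @Override public String toString() { return Integer.toString(value); }
  }

  // Pattern reported in the issue: assume every metadata value is a Text.
  // Throws ClassCastException when the stored value is an IntWritable.
  static String valueWithCast(Map<String, Writable> metadata, String key) {
    Text value = (Text) metadata.get(key);
    return value == null ? null : value.toString();
  }

  // Pattern described by the fix: keep the value as a Writable and rely on
  // toString(), which works for any value type.
  static String valueWithoutCast(Map<String, Writable> metadata, String key) {
    Writable value = metadata.get(key);
    return value == null ? null : value.toString();
  }

  public static void main(String[] args) {
    Map<String, Writable> metadata = new HashMap<>();
    metadata.put("_rs_", new IntWritable(42)); // integer-valued entry, as in the report
    metadata.put("lang", new Text("en"));      // hypothetical Text-valued entry

    System.out.println(valueWithoutCast(metadata, "_rs_"));
    System.out.println(valueWithoutCast(metadata, "lang"));
    try {
      valueWithCast(metadata, "_rs_");
    } catch (ClassCastException e) {
      System.out.println("cast failed: " + e.getClass().getSimpleName());
    }
  }
}
```

Text-valued entries behave the same with either method; only values of other Writable types, such as the integer stored under `_rs_` in the reproduction, trigger the exception in the cast variant.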






[jira] [Commented] (NUTCH-2357) Index metadata throw Exception because writable object cannot be cast to Text

2017-02-09 Thread Sebastian Nagel (JIRA)

[ 
https://issues.apache.org/jira/browse/NUTCH-2357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15859272#comment-15859272
 ] 

Sebastian Nagel commented on NUTCH-2357:


Thanks! See also [this discussion on the user mailing 
list|https://lists.apache.org/thread.html/33996cecba325401837f461eb4c687676a0158665e9a5513cebe7b71@%3Cuser.nutch.apache.org%3E]

Confirmed. A minimal command to reproduce the problem:
{noformat}
% bin/nutch indexchecker -Dplugin.includes='protocol-http|parse-html|index-metadata' \
   -Dindex.db.md=_rs_ -Dhttp.store.responsetime=true  http://localhost/
...
Exception in thread "main" java.lang.ClassCastException: org.apache.hadoop.io.IntWritable cannot be cast to org.apache.hadoop.io.Text
at org.apache.nutch.indexer.metadata.MetadataIndexer.filter(MetadataIndexer.java:58)
{noformat}

