[jira] [Commented] (NUTCH-937) When nutch is run on hadoop > 0.20.2 (or cdh) it will not find plugins because MapReduce will not unpack plugin/ directory from the job's pack (due to MAPREDUCE-967)

2011-08-29 Thread Ferdy (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13092741#comment-13092741 ] Ferdy commented on NUTCH-937: @Julien: I double checked and it seems you're right, "mapreduce

[jira] [Commented] (NUTCH-981) Add tests for solr* tasks

2011-08-29 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-981?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13092782#comment-13092782 ] Lewis John McGibbney commented on NUTCH-981: Hi Markus, as you mention in NUTCH

[jira] [Updated] (NUTCH-1097) application/xhtml+xml should be enabled in plugin.xml of parse-html

2011-08-29 Thread Ferdy (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ferdy updated NUTCH-1097: Attachment: NUTCH-1097-v2.patch Patch v1 results in a warning. This patch allows html-parse to accept all mimety

[jira] [Created] (NUTCH-1099) Add HBase and Cassandra storage properties to nutch-default.xml

2011-08-29 Thread Lewis John McGibbney (JIRA)
Add HBase and Cassandra storage properties to nutch-default.xml. Key: NUTCH-1099; URL: https://issues.apache.org/jira/browse/NUTCH-1099; Project: Nutch; Issue Type: Improvement

[jira] [Updated] (NUTCH-1099) Add HBase and Cassandra storage properties to nutch-default.xml

2011-08-29 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1099?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-1099: Attachment: NUTCH-1099-20110829.patch patch attachment for trunk. I thought this

[jira] [Updated] (NUTCH-1099) Add HBase and Cassandra storage properties to nutch-default.xml

2011-08-29 Thread Lewis John McGibbney (JIRA)
ts: storage > Affects Versions: 2.0 > Environment: Ubuntu 11.04 natty > Reporter: Lewis John McGibbney > Assignee: Lewis John McGibbney > Priority: Trivial > Fix For: 2.0 > > Attachments: NUTCH-1099-20110829.patc

Re: Gora CassandraStore is not thread safe?

2011-08-29 Thread lewis john mcgibbney
Hi Tom, Apologies for cross posting, this would not usually be the case but I'm hoping that if any results come from the thread then both communities can benefit. I'm in the process of getting Cassandra 0.8.4 working with Nutch 2.0 and Gora 0.2 myself and seem to be having some nasty problems. S

RE: Gora CassandraStore is not thread safe?

2011-08-29 Thread Tom Davidson
Hi Lewis, I was running Nutch deployed with a dedicated Cassandra cluster. Frankly, I have given up on using Nutch 2 at this time as it seems highly unstable and not really in active development. Your effort to address this is encouraging. Because Nutch uses multithreading in the fetchers, I wa

InvocationTargetException with Nutch 2.0 Gora 0.2 and Cassandra 0.8.4

2011-08-29 Thread lewis john mcgibbney
Hi, I believe the following error can be attributed to the java compiler finding (or not finding) more than one version of me.prettyprint.hector.api.Serializer. Has anyone experienced this whilst getting the above (or similar) setup configured and running? lewis@lewis-01:~/ASF/trunk/runtime/local

RE: InvocationTargetException with Nutch 2.0 Gora 0.2 and Cassandra 0.8.4

2011-08-29 Thread Tom Davidson
I had similar classpath issues. Are there any versions of Hector in your classpath (in your Hadoop lib folder?) that are not the same as in your nutch deployment jar? From: lewis john mcgibbney [mailto:lewis.mcgibb...@gmail.com] Sent: Monday, August 29, 2011 1:57 PM To: dev@nutch.apache.org Subj

Build failed in Jenkins: Nutch-trunk #1589

2011-08-29 Thread Apache Jenkins Server
See [...truncated 986 lines...] A src/plugin/subcollection/src/java/org/apache/nutch/collection/CollectionManager.java A src/plugin/subcollection/src/java/org/apache/nutch/collection/pack