[jira] [Updated] (NUTCH-956) solrindex issues

2011-07-12 Thread Alexis (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexis updated NUTCH-956: Attachment: solr.patch2 - NPE related to content-type field - tld field in Solr schema - string comparison in Java

[jira] [Commented] (NUTCH-956) solrindex issues

2011-07-12 Thread Alexis (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13064148#comment-13064148 ] Alexis commented on NUTCH-956: I do get the NPE when indexing this URL: http://www.truveo.com

[jira] Updated: (NUTCH-965) Skip parsing for truncated documents

2011-02-10 Thread Alexis (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexis updated NUTCH-965: Summary: Skip parsing for truncated documents (was: Parsing takes up 100% CPU) > Skip parsing for truncated docu

[jira] Updated: (NUTCH-965) Parsing takes up 100% CPU

2011-02-08 Thread Alexis (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexis updated NUTCH-965: Attachment: parserJob.patch. In the parser mapper, compare the Content-Length header to the size of the content buffer
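
The check described in this message can be illustrated with a minimal Java sketch. This is not the contents of parserJob.patch: the class name `TruncationCheck` and the method `isTruncated` are hypothetical, and the sketch assumes the header value is available as a string and the fetched content as a byte array.

```java
/** Hypothetical helper for illustration only; not taken from parserJob.patch. */
final class TruncationCheck {

    private TruncationCheck() {}

    /**
     * Returns true when the Content-Length header declares more bytes
     * than the fetched content buffer actually holds.
     * Assumption: the body was stored as received (no decompression),
     * otherwise the two sizes are not directly comparable.
     */
    static boolean isTruncated(String contentLengthHeader, byte[] content) {
        if (contentLengthHeader == null || content == null) {
            return false; // nothing to compare, so do not flag the document
        }
        long declared;
        try {
            declared = Long.parseLong(contentLengthHeader.trim());
        } catch (NumberFormatException e) {
            return false; // malformed header: treat the length as unknown
        }
        return declared > content.length;
    }
}
```

A mapper could call such a check before handing a document to the parser and skip the document when it returns true.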

[jira] Created: (NUTCH-965) Parsing takes up 100% CPU

2011-02-08 Thread Alexis (JIRA)
Parsing takes up 100% CPU. Key: NUTCH-965; URL: https://issues.apache.org/jira/browse/NUTCH-965; Project: Nutch; Issue Type: Improvement; Components: parser; Reporter: Alexis. The issue you're likel

[jira] Commented: (NUTCH-955) Ivy configuration

2011-01-18 Thread Alexis (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12983125#action_12983125 ] Alexis commented on NUTCH-955: Sorry, please disregard the nutch.root first bullet in the previ

[jira] Updated: (NUTCH-956) solrindex issues

2011-01-13 Thread Alexis (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexis updated NUTCH-956: Summary: solrindex issues (was: soldindex issues) > solrindex issues > Key:

[jira] Updated: (NUTCH-956) soldindex issues

2011-01-13 Thread Alexis (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexis updated NUTCH-956: Attachment: solr.patch. Here are the changes: - Avoid multiple values for id field. (NUTCH-819) - Allow multiple v

[jira] Created: (NUTCH-956) soldindex issues

2011-01-13 Thread Alexis (JIRA)
soldindex issues. Key: NUTCH-956; URL: https://issues.apache.org/jira/browse/NUTCH-956; Project: Nutch; Issue Type: Bug; Components: indexer; Affects Versions: 2.0; Reporter: Alexis. I ran into a few cave

[jira] Resolved: (NUTCH-950) Content-Length limit, URL filter and few minor issues

2011-01-10 Thread Alexis (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexis resolved NUTCH-950. Resolution: Fixed; Fix Version/s: 2.0. Sorry, I missed the Ivy configuration file in the plugin directory.

[jira] Issue Comment Edited: (NUTCH-955) Ivy configuration

2011-01-10 Thread Alexis (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12979525#action_12979525 ] Alexis edited comment on NUTCH-955 at 1/10/11 5:27 AM: In the patch,

[jira] Updated: (NUTCH-955) Ivy configuration

2011-01-10 Thread Alexis (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexis updated NUTCH-955: Attachment: ivy.patch. In the patch, the required dependencies for MySQL and HBase are included in the Ivy config,

[jira] Created: (NUTCH-955) Ivy configuration

2011-01-10 Thread Alexis (JIRA)
Ivy configuration. Key: NUTCH-955; URL: https://issues.apache.org/jira/browse/NUTCH-955; Project: Nutch; Issue Type: Improvement; Components: build; Affects Versions: 2.0; Reporter: Alexis. As mentioned

[jira] Updated: (NUTCH-950) Content-Length limit, URL filter and few minor issues

2011-01-01 Thread Alexis (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexis updated NUTCH-950: Attachment: nutch3.patch, nutch2.patch, nutch1.patch > Content-Length limit, URL fil

[jira] Updated: (NUTCH-950) Content-Length limit, URL filter and few minor issues

2011-01-01 Thread Alexis (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexis updated NUTCH-950: Attachment: nutch4.patch > Content-Length limit, URL filter and few minor issues

[jira] Created: (NUTCH-950) Content-Length limit, URL filter and few minor issues

2011-01-01 Thread Alexis (JIRA)
Content-Length limit, URL filter and few minor issues. Key: NUTCH-950; URL: https://issues.apache.org/jira/browse/NUTCH-950; Project: Nutch; Issue Type: Bug; Affects Versions: 2.0

[jira] Updated: (NUTCH-899) java.sql.BatchUpdateException: Data truncation: Data too long for column 'content' at row 1

2010-12-18 Thread Alexis (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alexis updated NUTCH-899: Attachment: httpContentLimit.patch. We stick with the default Gora schema for the MySQL backend, which says "byte

[jira] Commented: (NUTCH-899) java.sql.BatchUpdateException: Data truncation: Data too long for column 'content' at row 1

2010-12-10 Thread Alexis (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12970336#action_12970336 ] Alexis commented on NUTCH-899: I ran into the exact same issue, with MySQL. The blob column ty

[jira] Commented: (NUTCH-880) REST API for Nutch

2010-11-05 Thread Alexis (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12928896#action_12928896 ] Alexis commented on NUTCH-880: This revision introduced a bug in the nutch inject command. It

[jira] Issue Comment Edited: (NUTCH-873) Ivy configuration settings don't include Gora

2010-11-05 Thread Alexis (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12928788#action_12928788 ] Alexis edited comment on NUTCH-873 at 11/5/10 3:52 PM: It did not wor

[jira] Issue Comment Edited: (NUTCH-873) Ivy configuration settings don't include Gora

2010-11-05 Thread Alexis (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12928788#action_12928788 ] Alexis edited comment on NUTCH-873 at 11/5/10 3:51 PM: It did not wor

[jira] Issue Comment Edited: (NUTCH-873) Ivy configuration settings don't include Gora

2010-11-05 Thread Alexis (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12928788#action_12928788 ] Alexis edited comment on NUTCH-873 at 11/5/10 3:48 PM: It did not wor

[jira] Commented: (NUTCH-873) Ivy configuration settings don't include Gora

2010-11-05 Thread Alexis (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12928788#action_12928788 ] Alexis commented on NUTCH-873: It did not work as seamlessly for me. The Gora build created a ~