[
https://issues.apache.org/jira/browse/NUTCH-956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Alexis updated NUTCH-956:
-
Attachment: solr.patch2
- NPE related to content-type field
- tld field in Solr schema
- string comparison in Java
[
https://issues.apache.org/jira/browse/NUTCH-956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13064148#comment-13064148
]
Alexis commented on NUTCH-956:
--
I do get the NPE when indexing this URL:
http://www.truveo.com
[
https://issues.apache.org/jira/browse/NUTCH-965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Alexis updated NUTCH-965:
-
Summary: Skip parsing for truncated documents (was: Parsing takes up 100% CPU)
> Skip parsing for truncated docu
[
https://issues.apache.org/jira/browse/NUTCH-965?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Alexis updated NUTCH-965:
-
Attachment: parserJob.patch
In the parser mapper, compare Content-Length header to the size of the content
buffer
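A minimal sketch of the check described above, assuming the Content-Length header value is available as a string and the fetched content as a byte array. The class and method names are hypothetical and are not taken from parserJob.patch or from the Nutch API; the real mapper may read these values differently.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Nutch: illustrates comparing the declared
// Content-Length with the number of bytes actually held in the content buffer.
public class TruncationCheck {

    /**
     * Returns true when the declared Content-Length is larger than the buffer,
     * which suggests the document was cut off during fetching.
     * A missing or unparsable header is treated as "not known to be truncated".
     */
    public static boolean isTruncated(String contentLengthHeader, byte[] content) {
        if (contentLengthHeader == null || content == null) {
            return false;
        }
        long declared;
        try {
            declared = Long.parseLong(contentLengthHeader.trim());
        } catch (NumberFormatException e) {
            return false; // malformed header: cannot decide, so do not skip
        }
        return declared > content.length;
    }

    public static void main(String[] args) {
        byte[] body = "hello".getBytes(StandardCharsets.UTF_8); // 5 bytes
        System.out.println(isTruncated("5", body));    // false: lengths match
        System.out.println(isTruncated("4096", body)); // true: buffer is shorter
    }
}
```

A caller could skip parsing whenever isTruncated returns true, which matches the revised summary "Skip parsing for truncated documents".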
Parsing takes up 100% CPU
-
Key: NUTCH-965
URL: https://issues.apache.org/jira/browse/NUTCH-965
Project: Nutch
Issue Type: Improvement
Components: parser
Reporter: Alexis
The issue you're likel
[
https://issues.apache.org/jira/browse/NUTCH-955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12983125#action_12983125
]
Alexis commented on NUTCH-955:
--
Sorry, please disregard the nutch.root first bullet in the previ
[
https://issues.apache.org/jira/browse/NUTCH-956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Alexis updated NUTCH-956:
-
Summary: solrindex issues (was: soldindex issues)
> solrindex issues
>
>
> Key:
[
https://issues.apache.org/jira/browse/NUTCH-956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Alexis updated NUTCH-956:
-
Attachment: solr.patch
Here are the changes:
- Avoid multiple values for id field. (NUTCH-819)
- Allow multiple v
soldindex issues
Key: NUTCH-956
URL: https://issues.apache.org/jira/browse/NUTCH-956
Project: Nutch
Issue Type: Bug
Components: indexer
Affects Versions: 2.0
Reporter: Alexis
I ran into a few cave
[
https://issues.apache.org/jira/browse/NUTCH-950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Alexis resolved NUTCH-950.
--
Resolution: Fixed
Fix Version/s: 2.0
Sorry, I missed the Ivy configuration file in the plugin directory.
[
https://issues.apache.org/jira/browse/NUTCH-955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12979525#action_12979525
]
Alexis edited comment on NUTCH-955 at 1/10/11 5:27 AM:
---
In the patch,
[
https://issues.apache.org/jira/browse/NUTCH-955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Alexis updated NUTCH-955:
-
Attachment: ivy.patch
In the patch, the required dependencies for MySQL and HBase are included in the
Ivy config,
Ivy configuration
-
Key: NUTCH-955
URL: https://issues.apache.org/jira/browse/NUTCH-955
Project: Nutch
Issue Type: Improvement
Components: build
Affects Versions: 2.0
Reporter: Alexis
As mentioned
[
https://issues.apache.org/jira/browse/NUTCH-950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Alexis updated NUTCH-950:
-
Attachment: nutch3.patch
nutch2.patch
nutch1.patch
> Content-Length limit, URL fil
[
https://issues.apache.org/jira/browse/NUTCH-950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Alexis updated NUTCH-950:
-
Attachment: nutch4.patch
> Content-Length limit, URL filter and few minor issues
> ---
Content-Length limit, URL filter and few minor issues
-
Key: NUTCH-950
URL: https://issues.apache.org/jira/browse/NUTCH-950
Project: Nutch
Issue Type: Bug
Affects Versions: 2.0
[
https://issues.apache.org/jira/browse/NUTCH-899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Alexis updated NUTCH-899:
-
Attachment: httpContentLimit.patch
We stick with the default gora schema for the MySQL backend, which says
"byte
[
https://issues.apache.org/jira/browse/NUTCH-899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12970336#action_12970336
]
Alexis commented on NUTCH-899:
--
I ran into the exact same issue, with MySQL. The blob column ty
[
https://issues.apache.org/jira/browse/NUTCH-880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12928896#action_12928896
]
Alexis commented on NUTCH-880:
--
This revision introduced a bug in the nutch inject command. It
[
https://issues.apache.org/jira/browse/NUTCH-873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12928788#action_12928788
]
Alexis edited comment on NUTCH-873 at 11/5/10 3:52 PM:
---
It did not wor
[
https://issues.apache.org/jira/browse/NUTCH-873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12928788#action_12928788
]
Alexis edited comment on NUTCH-873 at 11/5/10 3:51 PM:
---
It did not wor
[
https://issues.apache.org/jira/browse/NUTCH-873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12928788#action_12928788
]
Alexis edited comment on NUTCH-873 at 11/5/10 3:48 PM:
---
It did not wor
[
https://issues.apache.org/jira/browse/NUTCH-873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12928788#action_12928788
]
Alexis commented on NUTCH-873:
--
It did not work as seamlessly for me. The gora build created a