[jira] [Commented] (NUTCH-961) Expose Tika's boilerpipe support

2011-06-10 Thread Gabriele Kahlout (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13047130#comment-13047130 ] Gabriele Kahlout commented on NUTCH-961: {quote}it needs to use a different

[jira] [Commented] (NUTCH-995) Generate POM file using the Ivy makepom task

2011-06-05 Thread Gabriele Kahlout (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13044456#comment-13044456 ] Gabriele Kahlout commented on NUTCH-995: Sorry, to stick in the gullet but this

[jira] [Issue Comment Edited] (NUTCH-995) Generate POM file using the Ivy makepom task

2011-06-05 Thread Gabriele Kahlout (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13044456#comment-13044456 ] Gabriele Kahlout edited comment on NUTCH-995 at 6/5/11 7:07 AM:

[jira] [Issue Comment Edited] (NUTCH-995) Generate POM file using the Ivy makepom task

2011-06-05 Thread Gabriele Kahlout (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13044456#comment-13044456 ] Gabriele Kahlout edited comment on NUTCH-995 at 6/5/11 7:11 AM:

[jira] [Commented] (NUTCH-995) Generate POM file using the Ivy makepom task

2011-06-05 Thread Gabriele Kahlout (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13044554#comment-13044554 ] Gabriele Kahlout commented on NUTCH-995: I'm ! sure the excluded dependencies you

[jira] [Issue Comment Edited] (NUTCH-995) Generate POM file using the Ivy makepom task

2011-06-05 Thread Gabriele Kahlout (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13044554#comment-13044554 ] Gabriele Kahlout edited comment on NUTCH-995 at 6/5/11 3:52 PM:

[jira] [Issue Comment Edited] (NUTCH-995) Generate POM file using the Ivy makepom task

2011-06-05 Thread Gabriele Kahlout (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13044554#comment-13044554 ] Gabriele Kahlout edited comment on NUTCH-995 at 6/5/11 3:51 PM:

[jira] [Commented] (NUTCH-995) Generate POM file using the Ivy makepom task

2011-06-05 Thread Gabriele Kahlout (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13044574#comment-13044574 ] Gabriele Kahlout commented on NUTCH-995: BTW as Julien remarked earlier adding a

[jira] [Commented] (NUTCH-995) Generate POM file using the Ivy makepom task

2011-06-05 Thread Gabriele Kahlout (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13044580#comment-13044580 ] Gabriele Kahlout commented on NUTCH-995: {quote}Nutch is not Solr. It doesn't have

[jira] [Commented] (NUTCH-995) Generate POM file using the Ivy makepom task

2011-06-05 Thread Gabriele Kahlout (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13044596#comment-13044596 ] Gabriele Kahlout commented on NUTCH-995: {quote} What's stopping you from doing

[jira] [Commented] (NUTCH-995) Generate POM file using the Ivy makepom task

2011-06-04 Thread Gabriele Kahlout (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13044349#comment-13044349 ] Gabriele Kahlout commented on NUTCH-995: oops..I've issues getting proper diffs, I

[jira] [Commented] (NUTCH-995) Generate POM file using the Ivy makepom task

2011-06-04 Thread Gabriele Kahlout (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13044360#comment-13044360 ] Gabriele Kahlout commented on NUTCH-995: I'm actually still trying to build it w/o

[jira] [Updated] (NUTCH-961) Expose Tika's boilerpipe support

2011-06-02 Thread Gabriele Kahlout (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabriele Kahlout updated NUTCH-961: --- Attachment: (was: NUTCH-961-1.3-tikaparser1.patch) Expose Tika's boilerpipe support

[jira] [Updated] (NUTCH-961) Expose Tika's boilerpipe support

2011-06-02 Thread Gabriele Kahlout (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabriele Kahlout updated NUTCH-961: --- Attachment: NUTCH-961v2.patch Tested the patch against a checkout of 1.3 branch at revision

[jira] [Updated] (NUTCH-1001) bin/nutch fetch/parse handle crawl/segments directory

2011-06-02 Thread Gabriele Kahlout (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabriele Kahlout updated NUTCH-1001: Attachment: (was: multipleSegs-fetch-parse.patch) bin/nutch fetch/parse handle

[jira] [Updated] (NUTCH-1001) bin/nutch fetch/parse handle crawl/segments directory

2011-06-02 Thread Gabriele Kahlout (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabriele Kahlout updated NUTCH-1001: Attachment: NUTCH-1001.patch I'm having formatting svn-diff netbeans issues. This patch

[jira] [Commented] (NUTCH-995) Generate POM file using the Ivy makepom task

2011-06-01 Thread Gabriele Kahlout (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13042063#comment-13042063 ] Gabriele Kahlout commented on NUTCH-995: @Julien: for the second patch: {quote} $

[jira] [Issue Comment Edited] (NUTCH-995) Generate POM file using the Ivy makepom task

2011-06-01 Thread Gabriele Kahlout (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13042063#comment-13042063 ] Gabriele Kahlout edited comment on NUTCH-995 at 6/1/11 9:47 AM:

[jira] [Updated] (NUTCH-1001) bin/nutch fetch/parse handle crawl/segments directory

2011-06-01 Thread Gabriele Kahlout (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabriele Kahlout updated NUTCH-1001: Attachment: multipleSegs-fetch-parse.patch This patch modifies Fetcher.java and

[jira] [Issue Comment Edited] (NUTCH-1001) bin/nutch fetch/parse handle crawl/segments directory

2011-06-01 Thread Gabriele Kahlout (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13042395#comment-13042395 ] Gabriele Kahlout edited comment on NUTCH-1001 at 6/1/11 8:10 PM:

[jira] [Commented] (NUTCH-995) Generate POM file using the Ivy makepom task

2011-05-24 Thread Gabriele Kahlout (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038463#comment-13038463 ] Gabriele Kahlout commented on NUTCH-995: {code} BUILD FAILED

[jira] [Issue Comment Edited] (NUTCH-995) Generate POM file using the Ivy makepom task

2011-05-24 Thread Gabriele Kahlout (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038463#comment-13038463 ] Gabriele Kahlout edited comment on NUTCH-995 at 5/24/11 9:04 AM:

[jira] [Commented] (NUTCH-995) Generate POM file using the Ivy makepom task

2011-05-24 Thread Gabriele Kahlout (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038488#comment-13038488 ] Gabriele Kahlout commented on NUTCH-995: the first patch worked for me. Generate

[jira] [Updated] (NUTCH-961) Expose Tika's boilerpipe support

2011-05-11 Thread Gabriele Kahlout (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabriele Kahlout updated NUTCH-961: --- Attachment: NUTCH-961-1.3-tikaparser1.patch Same as NUTCH-961-1.3-tikaparser.patch by Markus

[jira] [Updated] (NUTCH-961) Expose Tika's boilerpipe support

2011-05-11 Thread Gabriele Kahlout (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabriele Kahlout updated NUTCH-961: --- Attachment: NUTCH-961-1.3-tikaparser1.patch Modified to include necessary changes to

[jira] [Commented] (NUTCH-990) protocol-httpclient fails with short pages

2011-04-29 Thread Gabriele Kahlout (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13027051#comment-13027051 ] Gabriele Kahlout commented on NUTCH-990: @Julien - can we mark this related to the

[jira] [Commented] (NUTCH-990) protocol-httpclient fails with short pages

2011-04-29 Thread Gabriele Kahlout (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13027293#comment-13027293 ] Gabriele Kahlout commented on NUTCH-990: @Julien - my bad with the pdfs.

[jira] [Updated] (NUTCH-990) protocol-httpclient fails with short pages

2011-04-27 Thread Gabriele Kahlout (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabriele Kahlout updated NUTCH-990: --- Description: Using protocol-http with a few words html pages works fine. But with

[jira] [Issue Comment Edited] (NUTCH-990) protocol-httpclient fails with short pages

2011-04-27 Thread Gabriele Kahlout (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13025946#comment-13025946 ] Gabriele Kahlout edited comment on NUTCH-990 at 4/27/11 6:46 PM:

[jira] [Issue Comment Edited] (NUTCH-990) protocol-httpclient fails with short pages

2011-04-27 Thread Gabriele Kahlout (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13025946#comment-13025946 ] Gabriele Kahlout edited comment on NUTCH-990 at 4/27/11 6:51 PM:

[jira] [Updated] (NUTCH-990) protocol-httpclient fails with short pages

2011-04-27 Thread Gabriele Kahlout (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabriele Kahlout updated NUTCH-990: --- Description: Using protocol-http with a few words html pages works fine. But with

[jira] [Commented] (NUTCH-990) protocol-httpclient fails with short pages

2011-04-27 Thread Gabriele Kahlout (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13025977#comment-13025977 ] Gabriele Kahlout commented on NUTCH-990: the logs look full of INFO noise indeed. I

[jira] [Commented] (NUTCH-961) Expose Tika's boilerpipe support

2011-04-26 Thread Gabriele Kahlout (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13025286#comment-13025286 ] Gabriele Kahlout commented on NUTCH-961: @Markus - Thank you. Watch out for [1] in

[jira] [Commented] (NUTCH-967) Upgrade to Tika 0.9

2011-04-06 Thread Gabriele Kahlout (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-967?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13016465#comment-13016465 ] Gabriele Kahlout commented on NUTCH-967: Julien, why doesn't your patch modify

[jira] [Created] (NUTCH-972) Mergedb doesn't merge with empty directory, as is the case with merge (for indexes)

2011-03-27 Thread Gabriele Kahlout (JIRA)
Mergedb doesn't merge with empty directory, as is the case with merge (for indexes) --- Key: NUTCH-972 URL: https://issues.apache.org/jira/browse/NUTCH-972 Project:

[jira] [Updated] (NUTCH-972) Mergedb doesn't merge with empty directory, as is the case with merge (for indexes)

2011-03-27 Thread Gabriele Kahlout (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-972?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabriele Kahlout updated NUTCH-972: --- Attachment: check_empty.diff Mergedb doesn't merge with empty directory, as is the case with

[jira] [Created] (NUTCH-971) IndexMerger produces indexes itself cannot merge anymore

2011-03-26 Thread Gabriele Kahlout (JIRA)
IndexMerger produces indexes itself cannot merge anymore Key: NUTCH-971 URL: https://issues.apache.org/jira/browse/NUTCH-971 Project: Nutch Issue Type: Bug Components:

[jira] [Updated] (NUTCH-971) IndexMerger produces indexes itself cannot merge anymore

2011-03-26 Thread Gabriele Kahlout (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabriele Kahlout updated NUTCH-971: --- Attachment: IndexMerger-part.diff Checks if the output index path ends with a part directory

[jira] [Updated] (NUTCH-971) IndexMerger produces indexes itself cannot merge anymore

2011-03-26 Thread Gabriele Kahlout (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabriele Kahlout updated NUTCH-971: --- Attachment: (was: IndexMerger-part.diff) IndexMerger produces indexes itself cannot

[jira] [Updated] (NUTCH-971) IndexMerger produces indexes itself cannot merge anymore

2011-03-26 Thread Gabriele Kahlout (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabriele Kahlout updated NUTCH-971: --- Comment: was deleted (was: Checks if the output index path ends with a part directory and if

[jira] [Updated] (NUTCH-971) IndexMerger produces indexes itself cannot merge anymore

2011-03-26 Thread Gabriele Kahlout (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gabriele Kahlout updated NUTCH-971: --- Attachment: IndexMerger-part.diff Checks if output path ends in a part dir and if not adds

[jira] [Commented] (NUTCH-971) IndexMerger produces indexes itself cannot merge anymore

2011-03-26 Thread Gabriele Kahlout (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13011627#comment-13011627 ] Gabriele Kahlout commented on NUTCH-971: I expect that installing solr and then