[jira] [Assigned] (NUTCH-1117) JUnit test for index-anchor

2011-10-11 Thread Lewis John McGibbney (Assigned) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney reassigned NUTCH-1117: --- Assignee: Lewis John McGibbney > JUnit test for index-anchor > --

[jira] [Commented] (NUTCH-628) Host database to keep track of host-level information

2011-10-11 Thread Lewis John McGibbney (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13125443#comment-13125443 ] Lewis John McGibbney commented on NUTCH-628: Hi Markus, can you confirm if this

[jira] [Resolved] (NUTCH-629) Detect slow and timeout servers and drop their URLs

2011-10-11 Thread Lewis John McGibbney (Resolved) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney resolved NUTCH-629. Resolution: Won't Fix As Otis is no longer with us, and as per Markus' comments I thi

[jira] [Commented] (NUTCH-1005) Index headings plugin

2011-10-11 Thread Lewis John McGibbney (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13125436#comment-13125436 ] Lewis John McGibbney commented on NUTCH-1005: - Hi Markus & Julien, I really li

[jira] [Commented] (NUTCH-1098) better url-normalizer basic

2011-10-11 Thread Lewis John McGibbney (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13125429#comment-13125429 ] Lewis John McGibbney commented on NUTCH-1098: - Hi Radim are you happy with thi

[jira] [Resolved] (NUTCH-623) Change plugin source directory "languageidentifier" to "language-identifier"

2011-10-11 Thread Lewis John McGibbney (Resolved) (JIRA)
> > Attachments: NUTCH-623-branch-1.4-20110810.patch, > NUTCH-623-branch-1.4-20110810.patch, NUTCH-623-branch-1.4-20110910-v2.patch, > NUTCH-623-nutchgora-20111011.patch, NUTCH-623-trunk-1.4-20110924.patch, > NUTCH-623-trunk-2.0-20110810.patch > > > When trying

[jira] [Closed] (NUTCH-623) Change plugin source directory "languageidentifier" to "language-identifier"

2011-10-11 Thread Lewis John McGibbney (Closed) (JIRA)
>Reporter: Ignacio J. Ortega >Assignee: Lewis John McGibbney >Priority: Trivial > Fix For: 1.4, nutchgora > > Attachments: NUTCH-623-branch-1.4-20110810.patch, > NUTCH-623-branch-1.4-20110810.patch, NUTCH-623-branch-1.4-20110910-v2.pat

[jira] [Updated] (NUTCH-623) Change plugin source directory "languageidentifier" to "language-identifier"

2011-10-11 Thread Lewis John McGibbney (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-623: --- Attachment: NUTCH-623-nutchgora-20111011.patch patch attachment for nutchgora branch

[jira] [Commented] (NUTCH-1097) application/xhtml+xml should be enabled in plugin.xml of parse-html; allow multiple mimetypes for plugin.xml

2011-10-11 Thread Andrzej Bialecki (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13125414#comment-13125414 ] Andrzej Bialecki commented on NUTCH-1097: -- +1 the idea makes sense. Patch looks

[jira] [Commented] (NUTCH-1097) application/xhtml+xml should be enabled in plugin.xml of parse-html; allow multiple mimetypes for plugin.xml

2011-10-11 Thread Lewis John McGibbney (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13125367#comment-13125367 ] Lewis John McGibbney commented on NUTCH-1097: - Does anyone else have input for

[jira] [Reopened] (NUTCH-623) Change plugin source directory "languageidentifier" to "language-identifier"

2011-10-11 Thread Lewis John McGibbney (Reopened) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-623?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney reopened NUTCH-623: reopening and applying to nutchgora branch as this is a fairly trivial mapping

[jira] [Created] (NUTCH-1170) Write JUnit tests for urlfilter-validator

2011-10-11 Thread Lewis John McGibbney (Created) (JIRA)
Write JUnit tests for urlfilter-validator - Key: NUTCH-1170 URL: https://issues.apache.org/jira/browse/NUTCH-1170 Project: Nutch Issue Type: Sub-task Affects Versions: nutchgora Reporte

[jira] [Created] (NUTCH-1169) Write JUnit tests for urlfilter-prefix

2011-10-11 Thread Lewis John McGibbney (Created) (JIRA)
Write JUnit tests for urlfilter-prefix -- Key: NUTCH-1169 URL: https://issues.apache.org/jira/browse/NUTCH-1169 Project: Nutch Issue Type: Sub-task Affects Versions: nutchgora Reporter: Lew

unsubscribe

2011-10-11 Thread Dr. Klaus Mapara
On 11.10.2011 at 22:11, Lewis John McGibbney (Resolved) (JIRA) wrote: > > [ > https://issues.apache.org/jira/browse/NUTCH-1132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel > ] > > Lewis John McGibbney resolved NUTCH-1132. > - >

[jira] [Created] (NUTCH-1168) Write JUnit tests for tld

2011-10-11 Thread Lewis John McGibbney (Created) (JIRA)
Write JUnit tests for tld - Key: NUTCH-1168 URL: https://issues.apache.org/jira/browse/NUTCH-1168 Project: Nutch Issue Type: Sub-task Affects Versions: nutchgora Reporter: Lewis John McGibbney

[jira] [Created] (NUTCH-1166) Write JUnit tests for scoring-link

2011-10-11 Thread Lewis John McGibbney (Created) (JIRA)
Write JUnit tests for scoring-link -- Key: NUTCH-1166 URL: https://issues.apache.org/jira/browse/NUTCH-1166 Project: Nutch Issue Type: Sub-task Components: linkdb Affects Versions: nutchgora

[jira] [Created] (NUTCH-1167) Write JUnit tests for scoring-opic

2011-10-11 Thread Lewis John McGibbney (Created) (JIRA)
Write JUnit tests for scoring-opic -- Key: NUTCH-1167 URL: https://issues.apache.org/jira/browse/NUTCH-1167 Project: Nutch Issue Type: Sub-task Affects Versions: nutchgora Reporter: Lewis John

[jira] [Created] (NUTCH-1165) Write JUnit tests for protocol-sftp

2011-10-11 Thread Lewis John McGibbney (Created) (JIRA)
Write JUnit tests for protocol-sftp --- Key: NUTCH-1165 URL: https://issues.apache.org/jira/browse/NUTCH-1165 Project: Nutch Issue Type: Sub-task Affects Versions: nutchgora Reporter: Lewis Joh

[jira] [Created] (NUTCH-1164) Write JUnit tests for protocol-http

2011-10-11 Thread Lewis John McGibbney (Created) (JIRA)
Write JUnit tests for protocol-http --- Key: NUTCH-1164 URL: https://issues.apache.org/jira/browse/NUTCH-1164 Project: Nutch Issue Type: Sub-task Affects Versions: nutchgora Reporter: Lewis Joh

[jira] [Created] (NUTCH-1163) Write JUnit tests for protocol-ftp

2011-10-11 Thread Lewis John McGibbney (Created) (JIRA)
Write JUnit tests for protocol-ftp -- Key: NUTCH-1163 URL: https://issues.apache.org/jira/browse/NUTCH-1163 Project: Nutch Issue Type: Sub-task Affects Versions: nutchgora Reporter: Lewis John

[jira] [Created] (NUTCH-1162) Write JUnit tests for parse-js

2011-10-11 Thread Lewis John McGibbney (Created) (JIRA)
Write JUnit tests for parse-js -- Key: NUTCH-1162 URL: https://issues.apache.org/jira/browse/NUTCH-1162 Project: Nutch Issue Type: Sub-task Components: parser Affects Versions: nutchgora

[jira] [Created] (NUTCH-1161) Write JUnit tests for microformats-reltag plugin

2011-10-11 Thread Lewis John McGibbney (Created) (JIRA)
Write JUnit tests for microformats-reltag plugin Key: NUTCH-1161 URL: https://issues.apache.org/jira/browse/NUTCH-1161 Project: Nutch Issue Type: Sub-task Affects Versions: nutchgora

[jira] [Created] (NUTCH-1159) Write JUnit tests for index-anchor

2011-10-11 Thread Lewis John McGibbney (Created) (JIRA)
Write JUnit tests for index-anchor -- Key: NUTCH-1159 URL: https://issues.apache.org/jira/browse/NUTCH-1159 Project: Nutch Issue Type: Sub-task Components: indexer Affects Versions: nutchgora

[jira] [Created] (NUTCH-1160) Write JUnit tests for index-basic

2011-10-11 Thread Lewis John McGibbney (Created) (JIRA)
Write JUnit tests for index-basic - Key: NUTCH-1160 URL: https://issues.apache.org/jira/browse/NUTCH-1160 Project: Nutch Issue Type: Sub-task Components: indexer Affects Versions: nutchgora

[jira] [Created] (NUTCH-1158) Write JUnit tests for all nutchgora plugins

2011-10-11 Thread Lewis John McGibbney (Created) (JIRA)
Write JUnit tests for all nutchgora plugins --- Key: NUTCH-1158 URL: https://issues.apache.org/jira/browse/NUTCH-1158 Project: Nutch Issue Type: Improvement Affects Versions: nutchgora

[jira] [Resolved] (NUTCH-1134) Fix TestFetcher for Nutchgora

2011-10-11 Thread Lewis John McGibbney (Resolved) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney resolved NUTCH-1134. - Resolution: Fixed Committed @ revision 1182060 in nutchgora branch

[jira] [Resolved] (NUTCH-1132) Fix TestGenerator for Nutchgora

2011-10-11 Thread Lewis John McGibbney (Resolved) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney resolved NUTCH-1132. - Resolution: Fixed Committed @ revision 1182060 in nutchgora branch

[jira] [Resolved] (NUTCH-1133) Fix TestInjector for Nutchgora

2011-10-11 Thread Lewis John McGibbney (Resolved) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney resolved NUTCH-1133. - Resolution: Fixed Committed @ revision 1182060 in nutchgora branch

[jira] [Commented] (NUTCH-1135) Fix TestGoraStorage for Nutchgora

2011-10-11 Thread Ferdy (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13125306#comment-13125306 ] Ferdy commented on NUTCH-1135: -- No problem I'll work out a patch that fixes the test (at leas

[jira] [Updated] (NUTCH-797) parse-tika is not properly constructing URLs when the target begins with a "?"

2011-10-11 Thread Andrzej Bialecki (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrzej Bialecki updated NUTCH-797: Attachment: NUTCH-797.patch Tentative patch, which changes the meaning of "fixEmbeddedParams

[jira] [Commented] (NUTCH-1081) ant tests fail

2011-10-11 Thread Lewis John McGibbney (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13125151#comment-13125151 ] Lewis John McGibbney commented on NUTCH-1081: - Thanks Ferdy. It was also my in

[jira] [Commented] (NUTCH-1135) Fix TestGoraStorage for Nutchgora

2011-10-11 Thread Lewis John McGibbney (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13125147#comment-13125147 ] Lewis John McGibbney commented on NUTCH-1135: - Hi Ferdy, firstly thanks for lo

[jira] [Commented] (NUTCH-1081) ant tests fail

2011-10-11 Thread Ferdy (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13125146#comment-13125146 ] Ferdy commented on NUTCH-1081: -- It seems like your patch is fine, at least as a temporary sol

[jira] [Commented] (NUTCH-1135) Fix TestGoraStorage for Nutchgora

2011-10-11 Thread Ferdy (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13125145#comment-13125145 ] Ferdy commented on NUTCH-1135: -- It seems like Gora simply tries to connect to a non-existing

[jira] [Commented] (NUTCH-797) parse-tika is not properly constructing URLs when the target begins with a "?"

2011-10-11 Thread Markus Jelsma (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13125142#comment-13125142 ] Markus Jelsma commented on NUTCH-797: - Mmm, I think you are correct. It's a bit confusing

[jira] [Commented] (NUTCH-797) parse-tika is not properly constructing URLs when the target begins with a "?"

2011-10-11 Thread Andrzej Bialecki (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13125129#comment-13125129 ] Andrzej Bialecki commented on NUTCH-797: - Well, I would expect http://www.funkybab

[jira] [Commented] (NUTCH-797) parse-tika is not properly constructing URLs when the target begins with a "?"

2011-10-11 Thread Markus Jelsma (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13125093#comment-13125093 ] Markus Jelsma commented on NUTCH-797: - I would expect http://www.funkybabes.nl/forumreg

[jira] [Commented] (NUTCH-797) parse-tika is not properly constructing URLs when the target begins with a "?"

2011-10-11 Thread Andrzej Bialecki (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13125077#comment-13125077 ] Andrzej Bialecki commented on NUTCH-797: - I'm puzzled by the algorithm in fixEmbed

[jira] [Commented] (NUTCH-797) parse-tika is not properly constructing URLs when the target begins with a "?"

2011-10-11 Thread Andrzej Bialecki (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13125016#comment-13125016 ] Andrzej Bialecki commented on NUTCH-797: - Uhh, sorry - I'll fix this in a moment.

[jira] [Commented] (NUTCH-797) parse-tika is not properly constructing URLs when the target begins with a "?"

2011-10-11 Thread Markus Jelsma (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13125002#comment-13125002 ] Markus Jelsma commented on NUTCH-797: - Andrzej, it looks like the fix for NUTCH-1115 is

[jira] [Updated] (NUTCH-797) parse-tika is not properly constructing URLs when the target begins with a "?"

2011-10-11 Thread Andrzej Bialecki (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrzej Bialecki updated NUTCH-797: Fix Version/s: (was: 1.4) Committed in rev. 1181747 to trunk. Nutchgora needs more work,

[jira] [Updated] (NUTCH-1156) building errors with gora-hbase as a backend; update ivy.xml to use correct dependancies

2011-10-11 Thread Ferdy (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ferdy updated NUTCH-1156: - Attachment: NUTCH-1156-v1.patch > building errors with gora-hbase as a backend; update ivy.xml to use correct

[jira] [Closed] (NUTCH-1157) building errors with gora-hbase as a backend; update ivy.xml to use correct dependancies

2011-10-11 Thread Ferdy (Closed) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ferdy closed NUTCH-1157. Resolution: Duplicate My bad, I clicked twice. See NUTCH-1156. > building errors with gora-hbase as

[jira] [Created] (NUTCH-1157) building errors with gora-hbase as a backend; update ivy.xml to use correct dependancies

2011-10-11 Thread Ferdy (Created) (JIRA)
building errors with gora-hbase as a backend; update ivy.xml to use correct dependancies Key: NUTCH-1157 URL: https://issues.apache.org/jira/browse/NUTCH-1157 Pr

[jira] [Created] (NUTCH-1156) building errors with gora-hbase as a backend; update ivy.xml to use correct dependancies

2011-10-11 Thread Ferdy (Created) (JIRA)
building errors with gora-hbase as a backend; update ivy.xml to use correct dependancies Key: NUTCH-1156 URL: https://issues.apache.org/jira/browse/NUTCH-1156 Pr

[jira] [Updated] (NUTCH-1053) Parsing of RSS feeds fails

2011-10-11 Thread Julien Nioche (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Julien Nioche updated NUTCH-1053: - Fix Version/s: (was: 1.4) > Parsing of RSS feeds fails > --- > >

[jira] [Updated] (NUTCH-1053) Parsing of RSS feeds fails

2011-10-11 Thread Julien Nioche (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Julien Nioche updated NUTCH-1053: - Fix Version/s: 1.5 I'd happily give an example of fix it myself if only I could find it :-) Moved

[jira] [Resolved] (NUTCH-1154) Upgrade to Tika 0.10

2011-10-11 Thread Andrzej Bialecki (Resolved) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrzej Bialecki resolved NUTCH-1154. -- Resolution: Fixed Fix Version/s: 1.4 Assignee: Andrzej Bialecki Commit

[jira] [Commented] (NUTCH-1097) application/xhtml+xml should be enabled in plugin.xml of parse-html; allow multiple mimetypes for plugin.xml

2011-10-11 Thread Ferdy (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13124761#comment-13124761 ] Ferdy commented on NUTCH-1097: -- Hi, As far as I know, currently parse-tika is used as a catc