Jenkins build is back to normal : Nutch-nutchgora #615

2013-05-22 Thread Apache Jenkins Server
See

[jira] [Commented] (NUTCH-1569) Upgrade 2.x to Gora 0.3

2013-05-22 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13664828#comment-13664828 ] Hudson commented on NUTCH-1569: --- Integrated in Nutch-nutchgora #615 (See [https://builds.ap

[jira] [Reopened] (NUTCH-1569) Upgrade 2.x to Gora 0.3

2013-05-22 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney reopened NUTCH-1569: - > Upgrade 2.x to Gora 0.3 > --- > > Key: NUTC

[jira] [Commented] (NUTCH-1569) Upgrade 2.x to Gora 0.3

2013-05-22 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13664633#comment-13664633 ] Lewis John McGibbney commented on NUTCH-1569: - Reopened @revision 1485475. The

[jira] [Updated] (NUTCH-1566) bin/nutch to allow whitespace in paths

2013-05-22 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-1566: --- Attachment: NUTCH-1566-v2-trunk.patch New patch including [~tejas.patil]'s suggestions. Also

[jira] [Commented] (NUTCH-1563) FetchSchedule#getFields is never used by GeneratorJob

2013-05-22 Thread Tejas Patil (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13664408#comment-13664408 ] Tejas Patil commented on NUTCH-1563: I think this is relevant to only 2.x and [~amusem

[jira] [Updated] (NUTCH-1575) support solr authentication in nutch 2.x

2013-05-22 Thread lufeng (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] lufeng updated NUTCH-1575: -- Attachment: NUTCH-1575.patch add solr authentication > support solr authentication in nutch 2.

[jira] [Work started] (NUTCH-1575) support solr authentication in nutch 2.x

2013-05-22 Thread lufeng (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1575?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on NUTCH-1575 started by lufeng. > support solr authentication in nutch 2.x > > > Key: NUTCH-1575 >

[jira] [Created] (NUTCH-1575) support solr authentication in nutch 2.x

2013-05-22 Thread lufeng (JIRA)
lufeng created NUTCH-1575: - Summary: support solr authentication in nutch 2.x Key: NUTCH-1575 URL: https://issues.apache.org/jira/browse/NUTCH-1575 Project: Nutch Issue Type: Improvement Co

Build failed in Jenkins: Nutch-nutchgora #614

2013-05-22 Thread Apache Jenkins Server
See Changes: [tejasp] NUTCH-1249 and NUTCH-1275 : Resolve all issues flagged up by adding javac -Xlint argument [lewismc] NUTCH-1569 Upgrade 2.x to Gora 0.3 -- [...truncated 1674 lines...] [

[jira] [Commented] (NUTCH-1569) Upgrade 2.x to Gora 0.3

2013-05-22 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1569?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13663974#comment-13663974 ] Hudson commented on NUTCH-1569: --- Integrated in Nutch-nutchgora #614 (See [https://builds.ap

[jira] [Commented] (NUTCH-1275) Fix [unchecked] javac warnings

2013-05-22 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1275?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13663972#comment-13663972 ] Hudson commented on NUTCH-1275: --- Integrated in Nutch-nutchgora #614 (See [https://builds.ap

[jira] [Commented] (NUTCH-1249) Resolve all issues flagged up by adding javac -Xlint argument

2013-05-22 Thread Hudson (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13663973#comment-13663973 ] Hudson commented on NUTCH-1249: --- Integrated in Nutch-nutchgora #614 (See [https://builds.ap

[jira] [Resolved] (NUTCH-1249) Resolve all issues flagged up by adding javac -Xlint argument

2013-05-22 Thread Tejas Patil (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tejas Patil resolved NUTCH-1249. Resolution: Fixed Fix Version/s: 2.2 Assignee: Tejas Patil (was: Lewis John McGibbn

[jira] [Resolved] (NUTCH-1275) Fix [unchecked] javac warnings

2013-05-22 Thread Tejas Patil (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1275?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tejas Patil resolved NUTCH-1275. Resolution: Fixed Fix Version/s: 2.2 Got resolved with NUTCH-1249 > Fix [un

[jira] [Updated] (NUTCH-1483) Can't crawl filesystem with protocol-file plugin

2013-05-22 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-1483: --- Fix Version/s: 1.7 > Can't crawl filesystem with protocol-file plugin > -

fix version 1.7 removed in Jira

2013-05-22 Thread Sebastian Nagel
Hi, please take care not to remove the fix version when applying bulk changes, e.g., 2.2 => 2.3. Alternative fix versions (1.7) are not kept. Luckily, Jira is quite powerful; I restored the 1.x fix version using this awful filter: project = NUTCH AND fixVersion in ("2.3") AND status = Open AND

[jira] [Updated] (NUTCH-1483) Can't crawl filesystem with protocol-file plugin

2013-05-22 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-1483: --- Priority: Critical (was: Major) > Can't crawl filesystem with protocol-file plugin > ---

[jira] [Updated] (NUTCH-351) Protocol forward proxy

2013-05-22 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-351: -- Fix Version/s: 1.8 > Protocol forward proxy > -- > > Key

[jira] [Updated] (NUTCH-490) Extension point with filters for Neko HTML parser (with patch)

2013-05-22 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-490?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-490: -- Fix Version/s: 1.8 > Extension point with filters for Neko HTML parser (with patch) > --

[jira] [Updated] (NUTCH-1531) URL filtering takes long time for very long URLs

2013-05-22 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-1531: --- Fix Version/s: 1.8 > URL filtering takes long time for very long URLs > -

[jira] [Updated] (NUTCH-710) Support for rel="canonical" attribute

2013-05-22 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-710?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-710: -- Fix Version/s: 1.8 > Support for rel="canonical" attribute > ---

[jira] [Updated] (NUTCH-409) Add "short circuit" notion to filters to speedup mixed site/subsite crawling

2013-05-22 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-409: -- Fix Version/s: 1.8 > Add "short circuit" notion to filters to speedup mixed site/subsite cra

[jira] [Updated] (NUTCH-945) Indexing to multiple SOLR Servers

2013-05-22 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-945?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-945: -- Fix Version/s: 1.8 > Indexing to multiple SOLR Servers > - >

[jira] [Updated] (NUTCH-1250) parse-html does not parse links with empty anchor

2013-05-22 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1250?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-1250: --- Fix Version/s: 1.8 > parse-html does not parse links with empty anchor >

[jira] [Updated] (NUTCH-1562) Order of execution for scoring filters

2013-05-22 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-1562: --- Fix Version/s: 1.8 > Order of execution for scoring filters > ---

[jira] [Updated] (NUTCH-827) HTTP POST Authentication

2013-05-22 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-827: -- Fix Version/s: 1.8 > HTTP POST Authentication > > >

[jira] [Updated] (NUTCH-1486) Upgrade to Solr 4.2.1

2013-05-22 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-1486: --- Fix Version/s: 1.8 > Upgrade to Solr 4.2.1 > - > > Ke

[jira] [Updated] (NUTCH-797) parse-tika is not properly constructing URLs when the target begins with a "?"

2013-05-22 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-797: -- Fix Version/s: 1.8 > parse-tika is not properly constructing URLs when the target begins wit

[jira] [Updated] (NUTCH-566) Sun's URL class has bug in creation of relative query URLs

2013-05-22 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-566: -- Fix Version/s: 1.8 > Sun's URL class has bug in creation of relative query URLs > --

[jira] [Updated] (NUTCH-1190) MoreIndexingFilter refactor: move data formats used to parse "lastModified" to a config file.

2013-05-22 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1190?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-1190: --- Fix Version/s: 1.8 > MoreIndexingFilter refactor: move data formats used to parse "lastMo

[jira] [Updated] (NUTCH-410) Faster RegexNormalize with more features

2013-05-22 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-410: -- Fix Version/s: 1.8 > Faster RegexNormalize with more features >

[jira] [Updated] (NUTCH-1253) Incompatible neko and xerces versions

2013-05-22 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-1253: --- Fix Version/s: 1.8 > Incompatible neko and xerces versions >

[jira] [Updated] (NUTCH-356) Plugin repository cache can lead to memory leak

2013-05-22 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-356: -- Fix Version/s: 1.8 > Plugin repository cache can lead to memory leak > -

[jira] [Updated] (NUTCH-840) Port tests from parse-html to parse-tika

2013-05-22 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel updated NUTCH-840: -- Fix Version/s: 1.8 > Port tests from parse-html to parse-tika >