[jira] [Commented] (NUTCH-2064) URLNormalizer basic to properly encode non-ASCII characters

2015-09-14 Thread Markus Jelsma (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744959#comment-14744959 ] Markus Jelsma commented on NUTCH-2064: I think having it in CC makes sense indeed. I

[jira] [Resolved] (NUTCH-2093) Indexing filters have no signature in CrawlDatum if crawled via FreeGenerator

2015-09-14 Thread Markus Jelsma (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma resolved NUTCH-2093. Resolution: Fixed Assignee: Markus Jelsma Committed to trunk in revision 1703111. > Inde

[jira] [Comment Edited] (NUTCH-2097) Proposal for Nutch 3.x

2015-09-14 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744364#comment-14744364 ] Lewis John McGibbney edited comment on NUTCH-2097 at 9/15/15 6:51 AM:

[jira] [Comment Edited] (NUTCH-2097) Proposal for Nutch 3.x

2015-09-14 Thread Markus Jelsma (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744953#comment-14744953 ] Markus Jelsma edited comment on NUTCH-2097 at 9/15/15 6:50 AM:

[jira] [Commented] (NUTCH-2097) Proposal for Nutch 3.x

2015-09-14 Thread Markus Jelsma (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744953#comment-14744953 ] Markus Jelsma commented on NUTCH-2097: Interesting! What does 'Complete Ant + Ivy bu

[jira] [Resolved] (NUTCH-2094) When stopping a crawl in Nutch 2.3, I was having trouble when I start an already stopped crawl and then stop it again.

2015-09-14 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney resolved NUTCH-2094. Resolution: Not A Problem This issue is already resolved in 2.X branch https://git

[jira] [Commented] (NUTCH-2094) When stopping a crawl in Nutch 2.3, I was having trouble when I start an already stopped crawl and then stop it again.

2015-09-14 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744857#comment-14744857 ] ASF GitHub Bot commented on NUTCH-2094: GitHub user prernasatija opened a pull requ

[GitHub] nutch pull request: fix for NUTCH-2094 contributed by prernasatija

2015-09-14 Thread prernasatija
GitHub user prernasatija opened a pull request: https://github.com/apache/nutch/pull/58 fix for NUTCH-2094 contributed by prernasatija nutch-2094 jira fix You can merge this pull request into a Git repository by running: $ git pull https://github.com/prernasatija/nutch NUTCH-20

[GitHub] nutch pull request: 2.x

2015-09-14 Thread prernasatija
GitHub user prernasatija opened a pull request: https://github.com/apache/nutch/pull/57 2.x You can merge this pull request into a Git repository by running: $ git pull https://github.com/apache/nutch 2.x Alternatively you can review and apply these changes as the patch at:

[jira] [Commented] (NUTCH-2097) Proposal for Nutch 3.x

2015-09-14 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744364#comment-14744364 ] Lewis John McGibbney commented on NUTCH-2097: Hi Folks, After being hooked up

[jira] [Assigned] (NUTCH-2097) Proposal for Nutch 3.x

2015-09-14 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney reassigned NUTCH-2097: Assignee: Lewis John McGibbney > Proposal for Nutch 3.x

[jira] [Created] (NUTCH-2097) Proposal for Nutch 3.x

2015-09-14 Thread Nadeem Douba (JIRA)
Nadeem Douba created NUTCH-2097: Summary: Proposal for Nutch 3.x Key: NUTCH-2097 URL: https://issues.apache.org/jira/browse/NUTCH-2097 Project: Nutch Issue Type: Improvement Affects Versio

Re: Introducing myself (Aron Ahmadia)

2015-09-14 Thread Sebastian Nagel
Welcome, Aron! Thanks for the introduction and the many links! On 09/14/2015 04:25 PM, Aron Ahmadia wrote: > Hi Folks, > > Since I'll be spending some time with the Nutch REST API and the 1.x code > base, I figured I'd send a > quick introduction email to the Nutch developer's mailing list. >

[jira] [Commented] (NUTCH-2064) URLNormalizer basic to properly encode non-ASCII characters

2015-09-14 Thread Julien Nioche (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14744078#comment-14744078 ] Julien Nioche commented on NUTCH-2064: yep, can discuss that post 1.11 > URLNormali

[jira] [Updated] (NUTCH-1800) Documentation for Nutch 1.X and 2.X REST APIs

2015-09-14 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-1800: Summary: Documentation for Nutch 1.X and 2.X REST APIs (was: Substantiate Javadoc f

[jira] [Updated] (NUTCH-1800) Documentation for Nutch 1.X and 2.X REST APIs

2015-09-14 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-1800: Assignee: (was: Fjodor Vershinin) > Documentation for Nutch 1.X and 2.X REST API

[jira] [Comment Edited] (NUTCH-2086) Nutch 1.X Webui

2015-09-14 Thread Sujen Shah (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14743775#comment-14743775 ] Sujen Shah edited comment on NUTCH-2086 at 9/14/15 4:23 PM: Hi

[jira] [Commented] (NUTCH-2086) Nutch 1.X Webui

2015-09-14 Thread Sujen Shah (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14743775#comment-14743775 ] Sujen Shah commented on NUTCH-2086: Hi [~lewismc] and [~chrismattmann], I was facing s

[jira] [Commented] (NUTCH-2086) Nutch 1.X Webui

2015-09-14 Thread Chris A. Mattmann (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14743765#comment-14743765 ] Chris A. Mattmann commented on NUTCH-2086: yep: https://github.com/apache/nutch/

[jira] [Commented] (NUTCH-2086) Nutch 1.X Webui

2015-09-14 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14743749#comment-14743749 ] Lewis John McGibbney commented on NUTCH-2086: Is there a PR for this issue [~

[jira] [Commented] (NUTCH-2064) URLNormalizer basic to properly encode non-ASCII characters

2015-09-14 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14743747#comment-14743747 ] Lewis John McGibbney commented on NUTCH-2064: [~jnioche], what do you think a

[jira] [Commented] (NUTCH-1943) Form authentication should not be global and ignore

2015-09-14 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1943?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14743745#comment-14743745 ] Lewis John McGibbney commented on NUTCH-1943: I'll have a crack at this today

[jira] [Commented] (NUTCH-2093) Indexing filters have no signature in CrawlDatum if crawled via FreeGenerator

2015-09-14 Thread Lewis John McGibbney (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14743743#comment-14743743 ] Lewis John McGibbney commented on NUTCH-2093: [~markus.jel...@openindex.io] c

[jira] [Commented] (NUTCH-2093) Indexing filters have no signature in CrawlDatum if crawled via FreeGenerator

2015-09-14 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2093?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14743717#comment-14743717 ] Sebastian Nagel commented on NUTCH-2093: +1 Also a fetch datum of an injected URL

Re: Introducing myself (Aron Ahmadia)

2015-09-14 Thread Mattmann, Chris A (3980)
Woo hoo welcome Aron!!! Chris Mattmann, Ph.D. Chief Architect Instrument Software and Science Data Systems Section (398) NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA Office: 168-519, Mailstop: 168-527 Email: chris.a.mattm.

Introducing myself (Aron Ahmadia)

2015-09-14 Thread Aron Ahmadia
Hi Folks, Since I'll be spending some time with the Nutch REST API and the 1.x code base, I figured I'd send a quick introduction email to the Nutch developer's mailing list. My name is Aron Ahmadia, and I'm a computational/data scientist with Continuum Analytics. We're working with several acti