[
https://issues.apache.org/jira/browse/NUTCH-1392?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13294730#comment-13294730
]
Lewis John McGibbney commented on NUTCH-1392:
-
Additionally this issue should
Hi Sebastian,
On Wed, Jun 13, 2012 at 11:30 PM, Sebastian Nagel
wrote:
> I managed to perform a crawl with 2.0 and HBase: it rocks, indeed.
> Much simpler than 1.x (no segments!).
:0)
> % ./bin/nutch readdb -stats
> WebTable statistics start
> WebTableReader: java.io.EOFException
> at
Lewis John McGibbney created NUTCH-1394:
---
Summary: backport NUTCH-1232 Remove host field from index-basic
Key: NUTCH-1394
URL: https://issues.apache.org/jira/browse/NUTCH-1394
Project: Nutch
Lewis John McGibbney created NUTCH-1393:
---
Summary: Display consistent usage of GeneratorJob with 1.X
Key: NUTCH-1393
URL: https://issues.apache.org/jira/browse/NUTCH-1393
Project: Nutch
Lewis John McGibbney created NUTCH-1392:
---
Summary: -force and -resume arguments being ignored in ParserJob
Key: NUTCH-1392
URL: https://issues.apache.org/jira/browse/NUTCH-1392
Project: Nutch
Lewis John McGibbney created NUTCH-1391:
---
Summary: readdb -stats fires java.io.EOFException
Key: NUTCH-1391
URL: https://issues.apache.org/jira/browse/NUTCH-1391
Project: Nutch
Issue Ty
Lewis John McGibbney created NUTCH-1390:
---
Summary: readdb -url $url throws NPE with gora-cassandra
Key: NUTCH-1390
URL: https://issues.apache.org/jira/browse/NUTCH-1390
Project: Nutch
I
Dear Wiki user,
You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change
notification.
The "FrontPage" page has been changed by LewisJohnMcgibbney:
http://wiki.apache.org/nutch/FrontPage?action=diff&rev1=242&rev2=243
=== Tutorials ===
* NutchTutorial - How to confi
Hi Lewis,
> Please see http://wiki.apache.org/nutch/Nutch2Tutorial which is an
> update of Julien's (I think) page on GORA_HBase. This will get you
> rocking with HBase. The changes between Cassandra, Accumulo and the
> other data stores are fairly trivial.
I managed to perform a crawl with 2.
+1 to the description w/o experimental too (I agree with Ferdy).
You guys ROCK.
Cheers,
Chris
On Jun 13, 2012, at 5:29 AM, Lewis John Mcgibbney wrote:
> Hi,
>
> Seeing as we have the ball rolling with the 2.0 RC, I thought I'd ask
> about a suitable project descriptor.
>
> So far on trunk we
Hi Guys,
Whilst updating the Nutch2Tutorial I got thinking that within Gora we don't
supply binary distributions of the code, this is because when using Gora a
user may wish/require to recompile the code to accommodate config changes
etc. We only supply src distributions...
Does this principle app
The "Nutch2Tutorial" page has been changed by LewisJohnMcgibbney:
http://wiki.apache.org/nutch/Nutch2Tutorial?action=diff&rev1=3&rev2=4
This document describes how to get Nutch 2.0 to
The "Nutch2Tutorial" page has been changed by LewisJohnMcgibbney:
http://wiki.apache.org/nutch/Nutch2Tutorial?action=diff&rev1=2&rev2=3
gora.datastore.default=org.apache.gora.hbase
The "Nutch2Tutorial" page has been changed by LewisJohnMcgibbney:
http://wiki.apache.org/nutch/Nutch2Tutorial?action=diff&rev1=1&rev2=2
-
+ }}}
+
+ * Ensure t
" and and array other document " looks like a typo, rest is fine
On 13 June 2012 13:45, Ferdy Galema wrote:
> Hi,
>
> I would remove the 'experimental' notion. Aside from that it's fine with
> me.
>
> Ferdy.
>
>
> On Wed, Jun 13, 2012 at 2:29 PM, Lewis John Mcgibbney <
> lewis.mcgibb...@gmail.co
Ferdy
>
> The Nutch job jar is not present in the binary archive. This means
> distributed running of jobs is not supported. I'm not sure if this is a
> problem (since users can always build one themselves), merely pointing it
> out. The recently released 1.5 also lacks this job jar, so at least n
[
https://issues.apache.org/jira/browse/NUTCH-1342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13294429#comment-13294429
]
Ferdy Galema commented on NUTCH-1342:
-
Do you have any clue as to why protocol-httpcli
Hi,
I would remove the 'experimental' notion. Aside from that it's fine with me.
Ferdy.
On Wed, Jun 13, 2012 at 2:29 PM, Lewis John Mcgibbney <
lewis.mcgibb...@gmail.com> wrote:
> Hi,
>
> Seeing as we have the ball rolling with the 2.0 RC, I thought I'd ask
> about a suitable project descriptor
Hi,
Seeing as we have the ball rolling with the 2.0 RC, I thought I'd ask
about a suitable project descriptor.
So far on trunk we have
** Apache Nutch is an open source web-search software project.
Stemming from Apache Lucene, it now builds on Apache Solr adding
web-specifics, such as a crawler,
Hi Seb,
Quick update
On Tue, Jun 12, 2012 at 11:33 PM, Sebastian Nagel
wrote:
>1 some guidance would be nice. README.txt points
> to http://wiki.apache.org/nutch/NutchTutorial which refers to 1.x
Please see http://wiki.apache.org/nutch/Nutch2Tutorial which is an
update of Julien's (I think) pag
The "FrontPage" page has been changed by LewisJohnMcgibbney:
http://wiki.apache.org/nutch/FrontPage?action=diff&rev1=241&rev2=242
* Nutch2Roadmap -- Discussions on the architecture an
The "Nutch2Tutorial" page has been changed by LewisJohnMcgibbney:
http://wiki.apache.org/nutch/Nutch2Tutorial
New page:
= Nutch 2.0 Tutorial =
{{http://www.interadvertising.co.uk/files/n
The "FrontPage" page has been changed by LewisJohnMcgibbney:
http://wiki.apache.org/nutch/FrontPage?action=diff&rev1=240&rev2=241
* [[NutchMavenSupport|Using Nutch as a Maven dependen
The "GORA_HBase" page has been changed by LewisJohnMcgibbney:
http://wiki.apache.org/nutch/GORA_HBase?action=diff&rev1=13&rev2=14
org.apache.gora.hbase.store.HBaseStore
Default cla
The "GORA_HBase" page has been changed by LewisJohnMcgibbney:
http://wiki.apache.org/nutch/GORA_HBase?action=diff&rev1=12&rev2=13
This document describes how to get Nutch 2.0 to use
The "GORA_HBase" page has been changed by LewisJohnMcgibbney:
http://wiki.apache.org/nutch/GORA_HBase?action=diff&rev1=11&rev2=12
- This document describes how to get Nutch to use HBase
Hi Seb,
As Chris said, the issues you highlight well justify another RC.
I can shift it by the end of play today.
Thanks very much for having a look through guys
Lewis
On Tue, Jun 12, 2012 at 11:33 PM, Sebastian Nagel
wrote:
> Hi Lewis,
>
> my first steps with 2.0 (to be continued, still stru
Hmm please ignore "the parse text limited to 100 chars", this is actually
not the case. (Only in our branch that has a fix for limiting anchor texts;
not yet present in the nutchgora branch because it still needs
polishing). So no need to wait for commits on my part.
On Wed, Jun 13, 2012 at 11:
Findings about Nutch-2.0 RC 1.
The Nutch job jar is not present in the binary archive. This means
distributed running of jobs is not supported. I'm not sure if this is a
problem (since users can always build one themselves), merely pointing it
out. The recently released 1.5 also lacks this job jar