[
https://issues.apache.org/jira/browse/NUTCH-1457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13723956#comment-13723956
]
Ferdy Galema commented on NUTCH-1457:
-
Hi,
Thanks for submitting the patch. It s
[
https://issues.apache.org/jira/browse/NUTCH-1457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13711529#comment-13711529
]
Ferdy Galema commented on NUTCH-1457:
-
Ok cool. Like Lewis said it would be bes
[
https://issues.apache.org/jira/browse/NUTCH-1457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13704505#comment-13704505
]
Ferdy Galema commented on NUTCH-1457:
-
That seems like a nice solution, alth
[
https://issues.apache.org/jira/browse/NUTCH-1457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13700978#comment-13700978
]
Ferdy Galema commented on NUTCH-1457:
-
That should work. Can't think of a r
[
https://issues.apache.org/jira/browse/NUTCH-1508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13545766#comment-13545766
]
Ferdy Galema commented on NUTCH-1508:
-
NUTCH-1431 (aka 'distance'
[
https://issues.apache.org/jira/browse/NUTCH-1508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13545748#comment-13545748
]
Ferdy Galema edited comment on NUTCH-1508 at 1/7/13 10:1
[
https://issues.apache.org/jira/browse/NUTCH-1508?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13545748#comment-13545748
]
Ferdy Galema commented on NUTCH-1508:
-
Hi,
Is this related to?
h
[
https://issues.apache.org/jira/browse/NUTCH-1495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13500974#comment-13500974
]
Ferdy Galema commented on NUTCH-1495:
-
Fair enough.
I understand the reasonin
[
https://issues.apache.org/jira/browse/NUTCH-1495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13500895#comment-13500895
]
Ferdy Galema commented on NUTCH-1495:
-
Hi,
Nice one! I took a glance at your p
[
https://issues.apache.org/jira/browse/NUTCH-1370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13493885#comment-13493885
]
Ferdy Galema commented on NUTCH-1370:
-
Hi,
I checked the patch, it seems you
[
https://issues.apache.org/jira/browse/NUTCH-1489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13493829#comment-13493829
]
Ferdy Galema commented on NUTCH-1489:
-
Agree with Lewis, it seems there is alr
[
https://issues.apache.org/jira/browse/NUTCH-1484?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13493820#comment-13493820
]
Ferdy Galema commented on NUTCH-1484:
-
Hi,
I checked the patch (attached in N
[
https://issues.apache.org/jira/browse/NUTCH-1457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13493565#comment-13493565
]
Ferdy Galema commented on NUTCH-1457:
-
There is a limited description of the Nu
[
https://issues.apache.org/jira/browse/NUTCH-1457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13491289#comment-13491289
]
Ferdy Galema commented on NUTCH-1457:
-
Hi,
Not really because with a partial up
[
https://issues.apache.org/jira/browse/NUTCH-1457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13471480#comment-13471480
]
Ferdy Galema commented on NUTCH-1457:
-
Included effort is resolving the conflic
[
https://issues.apache.org/jira/browse/NUTCH-1468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ferdy Galema resolved NUTCH-1468.
-
Resolution: Fixed
Fix Version/s: 2.1
Committed @ Nutch2.x ref 1386526
Thanks for the
2.1 sounds good!
On Sun, Sep 16, 2012 at 12:14 AM, Lewis John Mcgibbney <
lewis.mcgibb...@gmail.com> wrote:
> Hi,
>
> On Sat, Sep 15, 2012 at 10:38 PM, Markus Jelsma
> wrote:
> > Trunk has some unresolved issues that are eligible for 1.6. Someone here
> can create a 1.7 version in Jira? Then we
[
https://issues.apache.org/jira/browse/NUTCH-872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13451910#comment-13451910
]
Ferdy Galema commented on NUTCH-872:
That IS really weird. Not sure why it doe
[
https://issues.apache.org/jira/browse/NUTCH-1468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13451897#comment-13451897
]
Ferdy Galema commented on NUTCH-1468:
-
A nice catch indeed. Looks fine.
I'
[
https://issues.apache.org/jira/browse/NUTCH-1456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ferdy Galema closed NUTCH-1456.
---
Resolution: Fixed
Tested the patch and it works. Thanks Alexander.
Committed @ Nutch2.x ref 1382037
[
https://issues.apache.org/jira/browse/NUTCH-1459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13450514#comment-13450514
]
Ferdy Galema commented on NUTCH-1459:
-
Ok. (If it is still not right, just let me
[
https://issues.apache.org/jira/browse/NUTCH-1459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13450488#comment-13450488
]
Ferdy Galema commented on NUTCH-1459:
-
Hi,
Do you mean "Committed @ Nutc
[
https://issues.apache.org/jira/browse/NUTCH-872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13450461#comment-13450461
]
Ferdy Galema commented on NUTCH-872:
Christian, I ran a testcrawl with Nutch2.x br
[
https://issues.apache.org/jira/browse/NUTCH-1459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ferdy Galema closed NUTCH-1459.
---
Resolution: Fixed
Committed.
> Remove dead code (phase2) from Injector
[
https://issues.apache.org/jira/browse/NUTCH-1459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ferdy Galema updated NUTCH-1459:
Attachment: nutch-1459.txt
> Remove dead code (phase2) from Injector
[
https://issues.apache.org/jira/browse/NUTCH-1461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13446518#comment-13446518
]
Ferdy Galema commented on NUTCH-1461:
-
Added comment in NUTCH-
[
https://issues.apache.org/jira/browse/NUTCH-1448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13446515#comment-13446515
]
Ferdy Galema commented on NUTCH-1448:
-
Yes it does show up as an outlink.
About
[
https://issues.apache.org/jira/browse/NUTCH-872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13446511#comment-13446511
]
Ferdy Galema commented on NUTCH-872:
Yes that is correct.
>
[
https://issues.apache.org/jira/browse/NUTCH-1431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ferdy Galema closed NUTCH-1431.
---
Resolution: Fixed
committed
> Introduce link 'distance' and add con
[
https://issues.apache.org/jira/browse/NUTCH-1456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ferdy Galema updated NUTCH-1456:
Fix Version/s: 2.1
> Updater not setting batchId in markers correc
[
https://issues.apache.org/jira/browse/NUTCH-1448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ferdy Galema closed NUTCH-1448.
---
Resolution: Fixed
Committed.
> Redirected urls should be handled more cleanly (m
[
https://issues.apache.org/jira/browse/NUTCH-1463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ferdy Galema closed NUTCH-1463.
---
Resolution: Fixed
committed.
> Elasticsearch indexer should wait and check respo
[
https://issues.apache.org/jira/browse/NUTCH-1463?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ferdy Galema updated NUTCH-1463:
Attachment: nutch-1463.patch
> Elasticsearch indexer should wait and check response for l
Ferdy Galema created NUTCH-1463:
---
Summary: Elasticsearch indexer should wait and check response for
last flush
Key: NUTCH-1463
URL: https://issues.apache.org/jira/browse/NUTCH-1463
Project: Nutch
[
https://issues.apache.org/jira/browse/NUTCH-1462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ferdy Galema closed NUTCH-1462.
---
Resolution: Fixed
committed
> Elasticsearch not indexing when type==null
[
https://issues.apache.org/jira/browse/NUTCH-1445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13445878#comment-13445878
]
Ferdy Galema commented on NUTCH-1445:
-
Created NUTCH-1462 for a fix. For a quick
[
https://issues.apache.org/jira/browse/NUTCH-1462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ferdy Galema updated NUTCH-1462:
Attachment: nutch-1462.patch
> Elasticsearch not indexing when type==null in NutchDocum
Ferdy Galema created NUTCH-1462:
---
Summary: Elasticsearch not indexing when type==null in
NutchDocument metadata
Key: NUTCH-1462
URL: https://issues.apache.org/jira/browse/NUTCH-1462
Project: Nutch
[
https://issues.apache.org/jira/browse/NUTCH-1445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13445871#comment-13445871
]
Ferdy Galema commented on NUTCH-1445:
-
Ah I got it now.
It's definitely a
[
https://issues.apache.org/jira/browse/NUTCH-1445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13445850#comment-13445850
]
Ferdy Galema commented on NUTCH-1445:
-
("feature requests" should be "
[
https://issues.apache.org/jira/browse/NUTCH-1445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13445849#comment-13445849
]
Ferdy Galema commented on NUTCH-1445:
-
Hi Matt,
Sure we can resolve your issue
[
https://issues.apache.org/jira/browse/NUTCH-1448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ferdy Galema updated NUTCH-1448:
Attachment: nutch-1448.txt
Thank you for your interest, Christian. This issue should indeed prevent
Ferdy Galema created NUTCH-1459:
---
Summary: Remove dead code (phase2) from InjectorJob
Key: NUTCH-1459
URL: https://issues.apache.org/jira/browse/NUTCH-1459
Project: Nutch
Issue Type
Hi,
Yeah, this is something I noticed too a while ago. Although it does not
directly break the crawl, it is not a nice implementation.
Note that the Generator tries to correct for a fetchtime that is too far off
in the future (in the AbstractFetchSchedule shouldFetch method).
As a matter o
Ferdy Galema created NUTCH-1457:
---
Summary: Nutch2 Refactor the update process so that fetched items
are only processed once
Key: NUTCH-1457
URL: https://issues.apache.org/jira/browse/NUTCH-1457
Project
Hi,
This bug was already reported a few posts ago on the mailing list, but
thanks anyway for reporting it.
I have created an issue to keep track of it:
https://issues.apache.org/jira/browse/NUTCH-1456
Ferdy.
On Wed, Aug 15, 2012 at 1:59 PM, lin weijian wrote:
> Hi,
> i find a bug in nu
Ferdy Galema created NUTCH-1456:
---
Summary: Updater not setting batchId in markers correctly.
Key: NUTCH-1456
URL: https://issues.apache.org/jira/browse/NUTCH-1456
Project: Nutch
Issue Type
[
https://issues.apache.org/jira/browse/NUTCH-1434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13435000#comment-13435000
]
Ferdy Galema commented on NUTCH-1434:
-
+1 for removing commandline args and u
Ferdy Galema created NUTCH-1452:
---
Summary: hadoop.job.history.user.location in nutch-default making
job history useless
Key: NUTCH-1452
URL: https://issues.apache.org/jira/browse/NUTCH-1452
Project
FYI I've created a Jira for followup discussion.
https://issues.apache.org/jira/browse/NUTCH-1452
On Tue, Aug 7, 2012 at 11:21 AM, Ferdy Galema wrote:
> Hi,
>
> There still is a property in nutch-default
> 'hadoop.job.history.user.location' that redirects the creation
[
https://issues.apache.org/jira/browse/NUTCH-1365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ferdy Galema closed NUTCH-1365.
---
Resolution: Fixed
committed
> Fix crawlId functionalilty by making using of
[
https://issues.apache.org/jira/browse/NUTCH-1442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ferdy Galema closed NUTCH-1442.
---
> indexingfilter.order is property is misread in c
[
https://issues.apache.org/jira/browse/NUTCH-1442?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13433955#comment-13433955
]
Ferdy Galema commented on NUTCH-1442:
-
Thanks. Looks fine.
Assertions should
[
https://issues.apache.org/jira/browse/NUTCH-1448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ferdy Galema updated NUTCH-1448:
Description: This is specifically for Nutch2.x. Handling a redirect url
like an outlink is much
Ferdy Galema created NUTCH-1448:
---
Summary: Redirected urls should be handled more cleanly (more like
an outlink url)
Key: NUTCH-1448
URL: https://issues.apache.org/jira/browse/NUTCH-1448
Project: Nutch
Cheers!
On Thu, Aug 9, 2012 at 9:56 AM, Julien Nioche wrote:
> Doug Cutting on twitter :
> https://twitter.com/cutting/status/233415059798372353
>
> *RT @StefanGroschupf: Happy 10th birthday#Nutch! Registered at sourceforce
> august 2002. Turned out to be quite a game changer. #Hadoop
> *
> Happ
Hi,
There still is a property in nutch-default
'hadoop.job.history.user.location' that redirects the creation of history
files from job output locations to a custom location. I noticed that the
current value does not work well with CDH, because ${hadoop.log.dir} is not
defined. This actually cause
[
https://issues.apache.org/jira/browse/NUTCH-1444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13429986#comment-13429986
]
Ferdy Galema commented on NUTCH-1444:
-
Just to add:
The following exception is f
Ferdy Galema created NUTCH-1446:
---
Summary: Port NUTCH-1444 to trunk (Indexing should not create
temporary files)
Key: NUTCH-1446
URL: https://issues.apache.org/jira/browse/NUTCH-1446
Project: Nutch
[
https://issues.apache.org/jira/browse/NUTCH-1047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13429104#comment-13429104
]
Ferdy Galema commented on NUTCH-1047:
-
Ah yes I think that is what we should aim
[
https://issues.apache.org/jira/browse/NUTCH-1047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13429095#comment-13429095
]
Ferdy Galema commented on NUTCH-1047:
-
I did not mean to confuse people by u
[
https://issues.apache.org/jira/browse/NUTCH-1047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13429093#comment-13429093
]
Ferdy Galema commented on NUTCH-1047:
-
Changing NutchIndexWriter into an endp
[
https://issues.apache.org/jira/browse/NUTCH-1445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13429067#comment-13429067
]
Ferdy Galema commented on NUTCH-1445:
-
Hi Julien,
Agreed to wait a while be
[
https://issues.apache.org/jira/browse/NUTCH-1445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ferdy Galema closed NUTCH-1445.
---
Resolution: Fixed
> Add ElasticIndexerJob that indexes to elasticsea
[
https://issues.apache.org/jira/browse/NUTCH-1445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ferdy Galema updated NUTCH-1445:
Attachment: NUTCH-1445-addPropsToConfig.patch
Final addition that adds the properties to nutch
[
https://issues.apache.org/jira/browse/NUTCH-1365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13427226#comment-13427226
]
Ferdy Galema commented on NUTCH-1365:
-
Nutch should be updated to Gora
[
https://issues.apache.org/jira/browse/NUTCH-1445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ferdy Galema updated NUTCH-1445:
Attachment: NUTCH-1445-addToNutchScript.patch
Added and committed patch that adds command to Nutch
[
https://issues.apache.org/jira/browse/NUTCH-1445?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13426660#comment-13426660
]
Ferdy Galema commented on NUTCH-1445:
-
committed in Nutch2
[
https://issues.apache.org/jira/browse/NUTCH-1444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ferdy Galema closed NUTCH-1444.
---
Resolution: Fixed
committed
> Indexing should not create temporary files (do
[
https://issues.apache.org/jira/browse/NUTCH-1445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ferdy Galema updated NUTCH-1445:
Attachment: NUTCH-1445.patch
> Add ElasticIndexerJob that indexes to elasticsea
[
https://issues.apache.org/jira/browse/NUTCH-1444?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ferdy Galema updated NUTCH-1444:
Attachment: NUTCH-1444.patch
> Indexing should not create temporary files (do not extend f
Ferdy Galema created NUTCH-1445:
---
Summary: Add ElasticIndexerJob that indexes to elasticsearch
Key: NUTCH-1445
URL: https://issues.apache.org/jira/browse/NUTCH-1445
Project: Nutch
Issue Type
Ferdy Galema created NUTCH-1444:
---
Summary: Indexing should not create temporary files (do not extend
from FileOutputFormat)
Key: NUTCH-1444
URL: https://issues.apache.org/jira/browse/NUTCH-1444
Project
[
https://issues.apache.org/jira/browse/NUTCH-1441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ferdy Galema updated NUTCH-1441:
Attachment: NUTCH-1441-trunk.patch
Patch for trunk. It would be great if you could apply and test
[
https://issues.apache.org/jira/browse/NUTCH-1441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ferdy Galema reopened NUTCH-1441:
-
> AnchorIndexingFilter should use plain Hash
[
https://issues.apache.org/jira/browse/NUTCH-1441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ferdy Galema updated NUTCH-1441:
Patch Info: Patch Available
Fix Version/s: 1.6
> AnchorIndexingFilter should use pl
Ferdy Galema created NUTCH-1442:
---
Summary: indexingfilter.order is property is misread in code
Key: NUTCH-1442
URL: https://issues.apache.org/jira/browse/NUTCH-1442
Project: Nutch
Issue Type
[
https://issues.apache.org/jira/browse/NUTCH-1441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ferdy Galema closed NUTCH-1441.
---
Resolution: Fixed
committed
> AnchorIndexingFilter should use plain Hash
[
https://issues.apache.org/jira/browse/NUTCH-1441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ferdy Galema updated NUTCH-1441:
Attachment: NUTCH-1441.patch
> AnchorIndexingFilter should use plain Hash
Ferdy Galema created NUTCH-1441:
---
Summary: AnchorIndexingFilter should use plain HashSet
Key: NUTCH-1441
URL: https://issues.apache.org/jira/browse/NUTCH-1441
Project: Nutch
Issue Type: Bug
[
https://issues.apache.org/jira/browse/NUTCH-1438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ferdy Galema closed NUTCH-1438.
---
Resolution: Fixed
committed
> ParserJob support for option -repa
Ferdy Galema created NUTCH-1438:
---
Summary: ParserJob support for option -reparse
Key: NUTCH-1438
URL: https://issues.apache.org/jira/browse/NUTCH-1438
Project: Nutch
Issue Type: New Feature
[
https://issues.apache.org/jira/browse/NUTCH-1438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ferdy Galema updated NUTCH-1438:
Attachment: NUTCH-1438.patch
> ParserJob support for option -repa
[
https://issues.apache.org/jira/browse/NUTCH-1365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ferdy Galema updated NUTCH-1365:
Attachment: NUTCH-1365-v4.patch
new patch fixes crawlId functionality for HostInjectorJob too
[
https://issues.apache.org/jira/browse/NUTCH-1437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ferdy Galema closed NUTCH-1437.
---
Resolution: Fixed
reopening/closing to set correct resolve status (FIXED
[
https://issues.apache.org/jira/browse/NUTCH-1437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ferdy Galema closed NUTCH-1437.
---
Resolution: Cannot Reproduce
committed
> HostInjectorJob to accept lines with
[
https://issues.apache.org/jira/browse/NUTCH-1437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ferdy Galema reopened NUTCH-1437:
-
> HostInjectorJob to accept lines with or without proto
[
https://issues.apache.org/jira/browse/NUTCH-1437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ferdy Galema updated NUTCH-1437:
Attachment: NUTCH-1437.patch
> HostInjectorJob to accept lines with or without proto
Ferdy Galema created NUTCH-1437:
---
Summary: HostInjectorJob to accept lines with or without protocol
Key: NUTCH-1437
URL: https://issues.apache.org/jira/browse/NUTCH-1437
Project: Nutch
Issue
[
https://issues.apache.org/jira/browse/NUTCH-1365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ferdy Galema updated NUTCH-1365:
Attachment: NUTCH-1365-v3.patch
Small improvement of the patch by showing the crawlId name in the
[
https://issues.apache.org/jira/browse/NUTCH-1365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ferdy Galema updated NUTCH-1365:
Attachment: NUTCH-1365-v2.patch
Updated patch for new version of GORA-150.
>
Ferdy Galema created NUTCH-1432:
---
Summary: property storage.schema does not work anymore, should be
storage.schema.webpage and storage.schema.host
Key: NUTCH-1432
URL: https://issues.apache.org/jira/browse/NUTCH
[
https://issues.apache.org/jira/browse/NUTCH-1365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ferdy Galema updated NUTCH-1365:
Attachment: NUTCH-1365.patch
The updated patch. (Because of the splitting up of the corresponding
[
https://issues.apache.org/jira/browse/NUTCH-1365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13417120#comment-13417120
]
Ferdy Galema commented on NUTCH-1365:
-
When we update Gora to 0.3, we can commit
[
https://issues.apache.org/jira/browse/NUTCH-1365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ferdy Galema updated NUTCH-1365:
Attachment: (was: NUTCH-1365.patch)
> Fix crawlId functionalilty by making using of
[
https://issues.apache.org/jira/browse/NUTCH-1431?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13417005#comment-13417005
]
Ferdy Galema commented on NUTCH-1431:
-
It is a way to keep the size of a crawl wi
[
https://issues.apache.org/jira/browse/NUTCH-1431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ferdy Galema updated NUTCH-1431:
Attachment: NUTCH-1431.patch
> Introduce link 'distance' and add configurable ma
Ferdy Galema created NUTCH-1431:
---
Summary: Introduce link 'distance' and add configurable max
distance in the generator
Key: NUTCH-1431
URL: https://issues.apache.org/jira/browse/NUTCH-1431
[
https://issues.apache.org/jira/browse/NUTCH-1360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13410884#comment-13410884
]
Ferdy Galema commented on NUTCH-1360:
-
Thanks! Keep up the good
[
https://issues.apache.org/jira/browse/NUTCH-1428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ferdy Galema closed NUTCH-1428.
---
Resolution: Fixed
committed.
> GeneratorMapper should not initialize filt