[
https://issues.apache.org/jira/browse/NUTCH-1047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13596042#comment-13596042
]
Lewis John McGibbney commented on NUTCH-1047:
-
Nice work Julien.
[
https://issues.apache.org/jira/browse/NUTCH-1047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13595809#comment-13595809
]
Hudson commented on NUTCH-1047:
---
Integrated in Nutch-trunk #2144 (See
[https://builds.apach
[
https://issues.apache.org/jira/browse/NUTCH-1047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13595806#comment-13595806
]
Hudson commented on NUTCH-1047:
---
Integrated in Nutch-trunk-Windows #57 (See
[https://builds
[
https://issues.apache.org/jira/browse/NUTCH-1047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13595252#comment-13595252
]
Sebastian Nagel commented on NUTCH-1047:
Hi Julien,
overall, all looks good. A
[
https://issues.apache.org/jira/browse/NUTCH-1047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13583111#comment-13583111
]
Julien Nioche commented on NUTCH-1047:
--
Tejas,
The CleaningJob is backend-neutral an
[
https://issues.apache.org/jira/browse/NUTCH-1047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13583011#comment-13583011
]
Tejas Patil commented on NUTCH-1047:
Hi Julien,
One small change in Java class will b
[
https://issues.apache.org/jira/browse/NUTCH-1047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13582183#comment-13582183
]
Julien Nioche commented on NUTCH-1047:
--
Hi Tejas
Good catch, could do
{color:red}
[
https://issues.apache.org/jira/browse/NUTCH-1047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13582169#comment-13582169
]
Tejas Patil commented on NUTCH-1047:
Hey Julien, One question: Why is this change not
[
https://issues.apache.org/jira/browse/NUTCH-1047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13582163#comment-13582163
]
Tejas Patil commented on NUTCH-1047:
Hey Julien,
While running the solrclean command,
[
https://issues.apache.org/jira/browse/NUTCH-1047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13582034#comment-13582034
]
Julien Nioche commented on NUTCH-1047:
--
Hi Tejas
Thank you for taking the time to ha
[
https://issues.apache.org/jira/browse/NUTCH-1047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13581964#comment-13581964
]
Tejas Patil commented on NUTCH-1047:
Hi Julien,
The crawl command (with solr option)
[
https://issues.apache.org/jira/browse/NUTCH-1047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13581910#comment-13581910
]
lufeng commented on NUTCH-1047:
---
Patch v5 works correctly in Nutch 1.6 with Solr 3.6.
[
https://issues.apache.org/jira/browse/NUTCH-1047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13566291#comment-13566291
]
Julien Nioche commented on NUTCH-1047:
--
[~wastl-nagel] a text based indexer is a good
[
https://issues.apache.org/jira/browse/NUTCH-1047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13565202#comment-13565202
]
Tejas Patil commented on NUTCH-1047:
Hi Julien,
As you suggested, I tried to run solr
[
https://issues.apache.org/jira/browse/NUTCH-1047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13564827#comment-13564827
]
Sebastian Nagel commented on NUTCH-1047:
As some test for the interface started to
[
https://issues.apache.org/jira/browse/NUTCH-1047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13564263#comment-13564263
]
Julien Nioche commented on NUTCH-1047:
--
Tejas
The crawl script and the solr index sh
[
https://issues.apache.org/jira/browse/NUTCH-1047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13564252#comment-13564252
]
Tejas Patil commented on NUTCH-1047:
Hi Julien, The solrindex command and crawl scrip
[
https://issues.apache.org/jira/browse/NUTCH-1047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13564196#comment-13564196
]
Julien Nioche commented on NUTCH-1047:
--
Hi Tejas
It will work every time you set it i
[
https://issues.apache.org/jira/browse/NUTCH-1047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13564187#comment-13564187
]
Tejas Patil commented on NUTCH-1047:
Hi Julien,
After reply from @lufeng, I was able
[
https://issues.apache.org/jira/browse/NUTCH-1047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13564173#comment-13564173
]
Julien Nioche commented on NUTCH-1047:
--
@tejasp can reproduce the issue and am lookin
[
https://issues.apache.org/jira/browse/NUTCH-1047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13564095#comment-13564095
]
Tejas Patil commented on NUTCH-1047:
Hi Lufeng,
You are right. There was a problem wi
[
https://issues.apache.org/jira/browse/NUTCH-1047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13564089#comment-13564089
]
lufeng commented on NUTCH-1047:
---
Hi Julien,
I found in bin/nutch there is a line like this
[
https://issues.apache.org/jira/browse/NUTCH-1047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13564076#comment-13564076
]
lufeng commented on NUTCH-1047:
---
Hi Tejas
Maybe you didn't add the -D option with bin/nutch craw
[
https://issues.apache.org/jira/browse/NUTCH-1047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13563793#comment-13563793
]
Tejas Patil commented on NUTCH-1047:
Hi Julien,
I am trying out the patch and facing a
[
https://issues.apache.org/jira/browse/NUTCH-1047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13563716#comment-13563716
]
Lewis John McGibbney commented on NUTCH-1047:
-
Hi Julien, it will be early nex
[
https://issues.apache.org/jira/browse/NUTCH-1047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13562558#comment-13562558
]
Julien Nioche commented on NUTCH-1047:
--
Hi Lufeng.
The solrindex command in the nut
[
https://issues.apache.org/jira/browse/NUTCH-1047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13562369#comment-13562369
]
lufeng commented on NUTCH-1047:
---
Hi, I applied the patch, but I could not find how to set solrURI
[
https://issues.apache.org/jira/browse/NUTCH-1047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13557322#comment-13557322
]
Markus Jelsma commented on NUTCH-1047:
--
Excellent work my friend! I'll be sure to tes
[
https://issues.apache.org/jira/browse/NUTCH-1047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13556096#comment-13556096
]
Markus Jelsma commented on NUTCH-1047:
--
no, i understood correctly :)
[
https://issues.apache.org/jira/browse/NUTCH-1047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13556091#comment-13556091
]
Julien Nioche commented on NUTCH-1047:
--
my suggestion was that you give NUTCH-1047 a
[
https://issues.apache.org/jira/browse/NUTCH-1047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13556084#comment-13556084
]
Markus Jelsma commented on NUTCH-1047:
--
{quote}which is a good way of reviewing it{qu
[
https://issues.apache.org/jira/browse/NUTCH-1047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13556079#comment-13556079
]
Julien Nioche commented on NUTCH-1047:
--
Should not be a big deal as the classes affec
[
https://issues.apache.org/jira/browse/NUTCH-1047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13556075#comment-13556075
]
Markus Jelsma commented on NUTCH-1047:
--
too bad.
I'm not sure, at least 1480 is read
[
https://issues.apache.org/jira/browse/NUTCH-1047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13556054#comment-13556054
]
Julien Nioche commented on NUTCH-1047:
--
Tried, failed.
Re- other issues : wouldn't i
[
https://issues.apache.org/jira/browse/NUTCH-1047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13556052#comment-13556052
]
Markus Jelsma commented on NUTCH-1047:
--
Alright, i'll skip dedup for NUTCH-1480 and s
[
https://issues.apache.org/jira/browse/NUTCH-1047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13556041#comment-13556041
]
Julien Nioche commented on NUTCH-1047:
--
We definitely need a better mechanism for ded
[
https://issues.apache.org/jira/browse/NUTCH-1047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13556039#comment-13556039
]
Markus Jelsma commented on NUTCH-1047:
--
I had an issue with dedup too in NUTCH-1480,
[
https://issues.apache.org/jira/browse/NUTCH-1047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13556026#comment-13556026
]
Julien Nioche commented on NUTCH-1047:
--
Good point Markus, thanks.
The main issue I a
[
https://issues.apache.org/jira/browse/NUTCH-1047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1376#comment-1376
]
Markus Jelsma commented on NUTCH-1047:
--
Very nice Julien! Can you also add update() t
[
https://issues.apache.org/jira/browse/NUTCH-1047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13429104#comment-13429104
]
Ferdy Galema commented on NUTCH-1047:
-
Ah yes I think that is what we should aim for.
[
https://issues.apache.org/jira/browse/NUTCH-1047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13429101#comment-13429101
]
Julien Nioche commented on NUTCH-1047:
--
Thanks for your comments Ferdy
bq. What I'v
[
https://issues.apache.org/jira/browse/NUTCH-1047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13429095#comment-13429095
]
Ferdy Galema commented on NUTCH-1047:
-
I did not mean to confuse people by using Nutch
[
https://issues.apache.org/jira/browse/NUTCH-1047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13429093#comment-13429093
]
Ferdy Galema commented on NUTCH-1047:
-
Changing NutchIndexWriter into an endpoint look
[
https://issues.apache.org/jira/browse/NUTCH-1047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13163704#comment-13163704
]
Julien Nioche commented on NUTCH-1047:
--
The class NutchIndexWriter and NutchIndexWrit
[
https://issues.apache.org/jira/browse/NUTCH-1047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13163515#comment-13163515
]
Markus Jelsma commented on NUTCH-1047:
--
Ah yes it makes sense now!
If you look at t
[
https://issues.apache.org/jira/browse/NUTCH-1047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13163481#comment-13163481
]
Julien Nioche commented on NUTCH-1047:
--
bq. If you'd need WARC files, for some reason
[
https://issues.apache.org/jira/browse/NUTCH-1047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13162977#comment-13162977
]
Markus Jelsma commented on NUTCH-1047:
--
Hi Julien,
I'm not sure i get your point exa
[
https://issues.apache.org/jira/browse/NUTCH-1047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13162704#comment-13162704
]
Julien Nioche commented on NUTCH-1047:
--
It would be nice to have a plugin implementin
[
https://issues.apache.org/jira/browse/NUTCH-1047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13066484#comment-13066484
]
Julien Nioche commented on NUTCH-1047:
--
{quote}
My interest in your last point is a q
[
https://issues.apache.org/jira/browse/NUTCH-1047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13066179#comment-13066179
]
Lewis John McGibbney commented on NUTCH-1047:
-
I think the suggestion of gener