[
https://issues.apache.org/jira/browse/NUTCH-444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12542819
]
Renaud Richardet commented on NUTCH-444:
hi,
i am travelling and will be offline until january 2008. thanks
[
https://issues.apache.org/jira/browse/NUTCH-540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Renaud Richardet updated NUTCH-540:
---
Priority: Major (was: Blocker)
could you please attach log files and error messages? thanks
[
https://issues.apache.org/jira/browse/NUTCH-444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12475795
]
Renaud Richardet commented on NUTCH-444:
+1 for the transparency interface
thanks,
Renaud
> Possibly us
[
https://issues.apache.org/jira/browse/NUTCH-369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Renaud Richardet updated NUTCH-369:
---
Attachment: remover.diff
just FYI, you can further filter which element neko should keep and
[
https://issues.apache.org/jira/browse/NUTCH-369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Renaud Richardet updated NUTCH-369:
---
Priority: Minor (was: Major)
Affects Version/s: (was: 0.8
[
https://issues.apache.org/jira/browse/NUTCH-369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Renaud Richardet updated NUTCH-369:
---
Attachment: patch.diff
unified diff against head.
- fixes encoding, as described by King
rubdabadub wrote:
On 2/20/07, Renaud Richardet <[EMAIL PROTECTED]> wrote:
Hi Thorsten,
I have quickly looked at the Droid code, and was wondering why you don't
want to completely reuse the Nutch plugin API in Droid. This way, you
could reuse the Nutch parse-* plugins without mo
sted in
such a plugin? Does it make sense?
Please test and report feedback to [EMAIL PROTECTED] I will happily
answer all mails there.
salu2
--
Renaud Richardet +1 617 230 9112
my email is my first name at apache.org http://www.oslutions.com
[
https://issues.apache.org/jira/browse/NUTCH-443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12472733
]
Renaud Richardet commented on NUTCH-443:
hi All,
Glad to see that this patch is moving forward :-)
I have
[
https://issues.apache.org/jira/browse/NUTCH-444?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12471880
]
Renaud Richardet commented on NUTCH-444:
Gal,
Would you be able to share your code with Stax? What license
[
https://issues.apache.org/jira/browse/NUTCH-443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12471878
]
Renaud Richardet commented on NUTCH-443:
Nutch Newbie, Gal, Chris
It's great that you discuss altern
Project: Nutch
Issue Type: Improvement
Components: fetcher
Affects Versions: 0.9.0
Reporter: Renaud Richardet
Priority: Minor
Fix For: 0.9.0
As discussed by Nutch Newbie, Gal, and Chris on NUTCH-443, the current library
[
https://issues.apache.org/jira/browse/NUTCH-443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Renaud Richardet updated NUTCH-443:
---
Attachment: NUTCH-443-draft-v4.patch
Hi Dogacan,
Thanks for merging the patches, good
ly.
Could something like that work?
Doug
--
Renaud Richardet +1 617 230 9112
my email is my first name at apache.org http://www.oslutions.com
[
https://issues.apache.org/jira/browse/NUTCH-443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Renaud Richardet updated NUTCH-443:
---
Attachment: parsers.diff
Great, here's my work-in-progress (not finished, not tested
Chris Mattmann wrote:
Guys,
Sorry to be so thick-headed, but could someone explain to me in really
simple language what this change is requesting that is different from the
current Nutch API? I still don't get it, sorry...
Currently, the RSS parser returns a single Parse object that aggregat
Doug Cutting wrote:
Renaud Richardet wrote:
I see. I was thinking that I could index the feed items without
having to fetch them individually.
Okay, so if Parser#parse returned a Map, then the URL
for each parse should be that of its link, since you don't want to
fetch that separ
Issue Type: New Feature
Components: fetcher
Affects Versions: 0.9.0
Reporter: Renaud Richardet
Priority: Minor
Fix For: 0.9.0
Allow Parser#parse to return a Map. This way, the RSS parser can
return multiple parse objects that will all be
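A minimal sketch of the idea in this request, not the actual Nutch API: a parse step that returns a Map with one entry per feed item, keyed by the item's link URL (as suggested in the discussion of this issue). The names MultiParseSketch, FeedEntry, ParsedItem and parseFeed are hypothetical and exist only for illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class MultiParseSketch {

    /** Hypothetical stand-in for one feed item; not a Nutch class. */
    public static final class FeedEntry {
        public final String link;
        public final String title;
        public final String summary;

        public FeedEntry(String link, String title, String summary) {
            this.link = link;
            this.title = title;
            this.summary = summary;
        }
    }

    /** Hypothetical stand-in for a parse result; not the Nutch Parse class. */
    public static final class ParsedItem {
        public final String title;
        public final String text;

        public ParsedItem(String title, String text) {
            this.title = title;
            this.text = text;
        }
    }

    /**
     * Returns one parse result per feed entry, keyed by the entry's link URL.
     * A LinkedHashMap keeps the entries in feed order.
     */
    public static Map<String, ParsedItem> parseFeed(List<FeedEntry> entries) {
        Map<String, ParsedItem> parses = new LinkedHashMap<>();
        for (FeedEntry entry : entries) {
            parses.put(entry.link, new ParsedItem(entry.title, entry.summary));
        }
        return parses;
    }

    public static void main(String[] args) {
        List<FeedEntry> entries = Arrays.asList(
            new FeedEntry("http://example.org/post/1", "First post", "Summary one"),
            new FeedEntry("http://example.org/post/2", "Second post", "Summary two"));

        // Each feed item becomes its own searchable document.
        for (Map.Entry<String, ParsedItem> e : parseFeed(entries).entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue().title);
        }
    }
}
```

Keying the map by link URL means each feed item can be indexed under its own address without being fetched separately, which is the use case described for this issue.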
Doug Cutting wrote:
Renaud Richardet wrote:
The use case is that you index RSS feeds, but your users can search
each feed entry as a single document. Does it make sense?
But each feed item also contains a link whose content will be indexed
and that's generally a superset of the
don't know how to handle that Configuration-Objects (setConf() etc.)
What should I do to avoid that error? Where does the
Configuration-Object come from?
TIA
Tobias Zahn
--
Renaud Richardet +1 617 230 9112
my email is my first name at
Mailstop: 171-246
___
Disclaimer: The opinions presented within are my own and do not reflect
those of either NASA, JPL, or the California Institute of Technology.
--
Renaud Richardet +1 617 230 9112
my email is my
___
Jet Propulsion Laboratory Pasadena, CA
Office: 171-266B Mailstop: 171-246
--
renaud richardet +1 617 230 9112
renaud oslutions.com http://www.oslutions.com
[ http://issues.apache.org/jira/browse/NUTCH-412?page=all ]
Renaud Richardet updated NUTCH-412:
---
Attachment: plugin_parse-feedUrl2.diff
> plugin to parse the feed-url (rss/atom) of a b
[ http://issues.apache.org/jira/browse/NUTCH-412?page=all ]
Renaud Richardet updated NUTCH-412:
---
Attachment: plugin_parse-feedUrl.diff
unified diff against head (Rev: 481445)
> plugin to parse the feed-url (rss/atom) of a b
Reporter: Renaud Richardet
Priority: Minor
A plugin that extracts the feed-url (rss/atom) of a blog by retrieving the href
from the element (if found), and stores it in metadata.
The meta can be accessed with
parse.getData().getMeta("feedUrl");
you can test this p
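The extraction step described above could look roughly like the following sketch. It is not the attached plugin code. It assumes the feed is advertised in an HTML `<link>` element with a type of application/rss+xml or application/atom+xml (the element name is missing from the description above, so this is a guess), and it uses plain regular expressions rather than the Nutch parser DOM. The class and method names are hypothetical.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FeedUrlSketch {

    // Matches a whole <link ...> tag (assumption: the feed lives in a <link> element).
    private static final Pattern LINK_TAG =
        Pattern.compile("<link\\b[^>]*>", Pattern.CASE_INSENSITIVE);

    // Matches type="application/rss+xml" or type="application/atom+xml".
    private static final Pattern FEED_TYPE =
        Pattern.compile("type\\s*=\\s*[\"']application/(rss|atom)\\+xml[\"']",
            Pattern.CASE_INSENSITIVE);

    // Captures the value of the href attribute.
    private static final Pattern HREF =
        Pattern.compile("href\\s*=\\s*[\"']([^\"']+)[\"']", Pattern.CASE_INSENSITIVE);

    /** Returns the href of the first feed <link> tag, or empty if none is found. */
    public static Optional<String> extractFeedUrl(String html) {
        Matcher tags = LINK_TAG.matcher(html);
        while (tags.find()) {
            String tag = tags.group();
            if (FEED_TYPE.matcher(tag).find()) {
                Matcher href = HREF.matcher(tag);
                if (href.find()) {
                    return Optional.of(href.group(1));
                }
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        String html = "<html><head>"
            + "<link rel=\"stylesheet\" href=\"style.css\">"
            + "<link rel=\"alternate\" type=\"application/rss+xml\""
            + " href=\"http://example.org/feed.xml\">"
            + "</head></html>";
        // In the plugin, the result would be stored in the parse metadata under "feedUrl".
        System.out.println(extractFeedUrl(html).orElse("(no feed found)"));
    }
}
```

A real plugin would more likely walk the parsed DOM than apply regular expressions to raw HTML; the regex version is used here only to keep the sketch self-contained.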
i category on "Nutch Wiki"
for change notification.
The following page has been changed by RenaudRichardet:
http://wiki.apache.org/nutch/RenaudRichardet
New page:
{{{
Renaud Richardet
COO America
Wyona Inc. - Open Source Content Management - Apache Lenya
office +1 857 776-3195
Issue Type: Bug
Components: fetcher
Affects Versions: 0.8
Environment: Ubuntu Dapper
Reporter: Renaud Richardet
Priority: Minor
Attachments: outlink.diff
When Nutch parses the outlinks of a fetched page, the process will fail if a
single
[ http://issues.apache.org/jira/browse/NUTCH-346?page=all ]
Renaud Richardet updated NUTCH-346:
---
Attachment: log4j_plugins.diff
OK, here we go. This patch should be good for 0.8 and trunk.
> Improve readability of logs/hadoop.
dapper
Reporter: Renaud Richardet
Priority: Minor
adding
log4j.logger.org.apache.nutch.plugin.PluginRepository=WARN
to conf/log4j.properties
dramatically improves the readability of the logs in logs/hadoop.log (removes
all INFO)
--
This message is automatically
[
http://issues.apache.org/jira/browse/NUTCH-330?page=comments#action_12426629 ]
Renaud Richardet commented on NUTCH-330:
This bug is obsolete; I just found out that Nutch already allows searching from
the command line via
bin/nutch
[
http://issues.apache.org/jira/browse/NUTCH-266?page=comments#action_12426579 ]
Renaud Richardet commented on NUTCH-266:
KuroSaka, yes you can download the hadoop jar, release 0.5.0 from the project
website: http://lucene.apache.org
[ http://issues.apache.org/jira/browse/NUTCH-266?page=all ]
Renaud Richardet updated NUTCH-266:
---
Attachment: patch_hadoop-0.5.0.diff
Now that Hadoop 0.5 has been released, here's the patch to use hadoop-0.5.0.jar
in Nutch-0.8.x
HTH,
Renaud
>
[ http://issues.apache.org/jira/browse/NUTCH-266?page=all ]
Renaud Richardet updated NUTCH-266:
---
Attachment: patch.diff
Thank you Sami,
We had a similar problem with Win XP and were able to fix it by using
hadoop-nightly.jar. However, because of
[ http://issues.apache.org/jira/browse/NUTCH-208?page=all ]
Renaud Richardet updated NUTCH-208:
---
Attachment: proxy_exception_list-0.8.diff
I updated the patch to 0.8 and corrected a small typo (if
(!"".equals(input[i].trim())){ ). The proxy
[ http://issues.apache.org/jira/browse/NUTCH-330?page=all ]
Renaud Richardet updated NUTCH-330:
---
Attachment: clSearch.diff
forgot the "echo" in sh...
> command line tool to search a
[ http://issues.apache.org/jira/browse/NUTCH-330?page=all ]
Renaud Richardet updated NUTCH-330:
---
Attachment: clSearch.diff
unified diff against head
> command line tool to search a Lucene in
Versions: 0.8-dev
Environment: ubuntu
Reporter: Renaud Richardet
Priority: Minor
Attachments: clSearch.diff
A tool for searching a Lucene index from the command line, which makes development
and testing faster
usage: bin/nutch searchindex [index dir
37 matches