(even though svn allows you to commit directly) -
>> witness the recent situation with Grant. If you wish I can start a vote,
>> and I'm sure it will be positive, and we will have a clean situation
>> from the formal POV. Ok?
>>
> +1
>>
+1, as well.
Chee
tively support it, whether
>> we have enough resources to make any new releases or apply patches that
>> sit in JIRA?
>>
>> My opinion is that we should mark it EOL, and close all JIRA issues that
>> are relevant only to 0.7.x, with the status Won't Fix.
ssue currently in JIRA.
Thanks!
Cheers,
Chris
______
Chris Mattmann, Ph.D.
[EMAIL PROTECTED]
Cognizant Development Engineer
Early Detection Research Network Project
_
Jet Propulsion Laboratory
ignificant
> contribution. Are there any implementation tasks you guys think would
> be appropriate for a small group of undergrad, upperclass CS students?
> I'm looking for ideas for improving Nutch that they could accomplish
> in a few weeks time.
>
> Thanks,
___
love to hear from
> others in the community. What I think would be best is to come to a
> consensus on this and then have a wiki page describing this and other
> processes for committers.
>
> Dennis Kubes
__
Chris Mattmann, Ph.D.
[EMAIL PROTECTED]
Cogniz
sure if I should submit a JIRA
> issue for this, but I'm happy to do so if anyone else has seen this issue.
No problem: let's discuss the JIRA issue once we get an answer to the above
questions.
Thanks for being more descriptive and looking forward to your response.
Cheers,
Chris
> I think there may be a bug in the Content.java when it tries to convert
> the textual representation of the type to a MimeType. It always returns
> null. I'm trying to fix it but I can't find an API for Tika (or even
> src). Can someone point me in the right direction?
:58 AM, "Dennis Kubes" <[EMAIL PROTECTED]> wrote:
> Quick question about Jira. When we commit, are we supposed to first
> resolve and then close the issue? What is the process on this?
>
> Dennis Kubes
__
Chris Mattmann, Ph
eption in EXE parser: "+e.getMessage());
> e.printStackTrace(LogUtil.getWarnStream(LOG));
> }
> return new ParseStatus(ParseStatus.FAILED,
> "Can't be handled as exe document. " +
> e).getEmptyParse(getConf());
> }
>
> /// i'm not sure
velopers that were interested in
that portion of the code started developing in that arena. I'm not
comparing Hadoop to Tika, but certainly there are some similarities here.
-Chris
__
Chris Mattmann, Ph.D.
[EMAIL PROTECTED]
Cognizant Dev
Folks,
Either way is fine with me. I committed the patch for the following
reasons:
1. Though the patch sat for around 36 hrs, the JIRA issue has been around
nearly 2 weeks, without any comment at all. I used this as a baseline for
relative interest in the patch. Though a patch file is ultimate
No problemo!
Thanks!
Cheers,
Chris
On 6/25/07 9:45 PM, "Dennis Kubes" <[EMAIL PROTECTED]> wrote:
> ooops, gotta remember to do that. Done.
>
> Dennis
>
> Chris Mattmann wrote:
>> On 6/25/07 8:34 PM, "[EMAIL PROTECTED]" <[EMAIL PROTECTE
On 6/25/07 8:34 PM, "[EMAIL PROTECTED]" <[EMAIL PROTECTED]> wrote:
> Author: kubes
> Date: Mon Jun 25 20:33:59 2007
> New Revision: 550669
>
> URL: http://svn.apache.org/viewvc?view=rev&rev=550669
> Log:
> NUTCH-497: Fixes problems relating to StackOverflow errors
> and extreme nested tags. Adds
Dennis, +1
On 6/25/07 4:42 PM, "Dennis Kubes" <[EMAIL PROTECTED]> wrote:
> If no one has any objections, I will go ahead and commit this.
>
> Dennis Kubes
>
> Dennis Kubes (JIRA) wrote:
>> [
>> https://issues.apache.org/jira/browse/NUTCH-497?page=com.atlassian.jira.plugi
>> n.system.issu
On 6/20/07 8:17 AM, "Doğacan Güney" <[EMAIL PROTECTED]> wrote:
> Since you are doing compile-core, no plugins get compiled (say,
> urlfilter-prefix), then when you do an ant test in feed only
> protocol-file gets compiled. So, no urlfilter-prefix, no problem :). I
> have to say that I am certain t
On 6/20/07 7:17 AM, "Doğacan Güney" <[EMAIL PROTECTED]> wrote:
> It never passes for me (not even when I do it in src/plugin/feed). If you
> check the output, parseResult only contains a single entry which is
> rsstest.rss.
Okay, please tell me I'm not crazy here. I'm on Mac OS X 10.4, Java
vers
Doğacan,
This is strange indeed. I noticed this during my testing of parse-feed,
but thought it was an anomaly. I got this same strange, cryptic unit
test error message, and then after some frustration figuring it out, I did
ant clean, then ant compile-core test, and miraculously the error se
+1
Welcome to the team, Doğacan!
Cheers,
Chris
On 6/12/07 9:43 AM, "Sami Siren" <[EMAIL PROTECTED]> wrote:
> Doğacan Güney wrote:
>> Hi all,
>> I hope that together we will make nutch rock even harder.
>
> By looking at your earlier efforts there should be no doubt.
>
> Welcome!
>
Hi Folks,
I'd just like to throw out my +1 for Doğacan Güney's committer status. I've
been impressed by several of his contributions and the guy just keeps them
coming and coming. I'm not a member of the Lucene PMC, so I don't have
official voting rights; however, I would like to express my suppo
Hi Folks,
After some hard work from all folks involved, we've managed to push out
Apache Nutch, release 0.9. This is the second release of Nutch based
entirely on the underlying Hadoop platform. This release includes several
critical bug fixes, as well as key speedups described in more detail at
list announcing
the completion of the release.
Thanks!
Cheers,
Chris
On 4/4/07 7:21 PM, "Chris Mattmann" <[EMAIL PROTECTED]> wrote:
> Hi Guys,
>
> I've just moved forward with step 13 in the release process (waiting for
> release to propagate to mirro
Hi Guys,
I've just moved forward with step 13 in the release process (waiting for
release to propagate to mirrors). Should I just go ahead and do the other
steps (update Nutch site, update Lucene site, Update javadoc, create version
in JIRA, etc.)? It seems that I could do these without the relea
thing wrapped up tonight! :-)
Cheers,
Chris
On 4/4/07 8:04 AM, "Sami Siren" <[EMAIL PROTECTED]> wrote:
> Chris Mattmann wrote:
>> Hi Folks,
>>
>> I have posted a candidate for the Apache Nutch 0.9 release at
>>
>> http://people.apache.org/~mat
Folks,
As an FYI, here is a link to the log of the steps that I followed to get to
this point in the release:
http://people.apache.org/~mattmann/NUTCH_0.9_release_log_v2.doc
Cheers,
Chris
On 4/2/07 10:52 PM, "Chris Mattmann" <[EMAIL PROTECTED]> wrote:
> Hi Folks,
Hi Folks,
I have posted a candidate for the Apache Nutch 0.9 release at
http://people.apache.org/~mattmann/nutch_0.9/rc2/
See the included CHANGES-0.9.txt file for details on release
contents and latest changes. The release was made from the 0.9-dev trunk,
including the recent patch applied
issue. Sorry about
> not getting to it sooner.
>
> Dennis Kubes
>
> Chris Mattmann wrote:
>> Hi Dennis,
>>
>> Thanks for taking care of this. :-) Could you update CHANGES.txt as well?
>> Once you take care of that, in about 2 hrs (when I get home), I
Hi Dennis,
Thanks for taking care of this. :-) Could you update CHANGES.txt as well?
Once you take care of that, in about 2 hrs (when I get home), I'll begin the
release process again.
Thanks!
Cheers,
Chris
On 4/2/07 2:40 PM, "[EMAIL PROTECTED]" <[EMAIL PROTECTED]> wrote:
> Author: kubes
Hi Guys,
>> I think we're discussing the same thing (improving the process), I
>> just don't think 0.9 is out yet :)
>>
>>
>> But to wrap it up for me:
>>
>> +1 for creating 0.9 branch after fixing the bug (and removing the tag),
>> creating new rc
>> and starting a vote.
>
>
> +1.
+1.
My +1 for 1.0.0. I already changed it to 0.10.0, but this can be easily
reverted, and was probably something that I should have brought to the
attention of the dev list before I did that (sorry about that). In any case,
I think 1.0.0 makes a lot of sense, politically, and software wise. Nutch is
pr
Well, it's just going to add more work for me, but in the end, it's probably
something that needs to be in there. I could go either way on this though,
as in, if we don't commit it, 0.9.1 shouldn't be far off. Here's my +1 for
going ahead and committing it...
On 3/28/07 10:21 AM, "Dennis Kubes" <
Folks,
Discussing this with Andrzej, and reading his email below, I tend to agree
more with this procedure below. I would like to call for a vote to change
the existing as-documented procedure (on the wiki) to branch first, do
testing in branch (apply patches where needed), and then when the bra
Hey Sami,
>
> Well the sum itself is obviously the same :) The point in this is to use
> same
> conventions in Lucene family, not strictly required, but still IMO it just
> looks better.
Okey dok -- I will run the md5sum command, and generate a .md5 for the nutch
release that matches that.
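
A minimal Java sketch of what such a checksum line contains, assuming the `.md5` file follows the default text-mode output of `md5sum` (hex digest, two spaces, file name). The class name `Md5Sketch`, the helper names, and the file name `sample.txt` are hypothetical and used only for illustration; the exact convention used by the other Lucene releases is not stated in this thread.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of Nutch: computes an MD5 digest and
// formats it the way md5sum prints it in text mode.
public class Md5Sketch {

    /** Returns the lowercase hex MD5 digest of the given bytes. */
    public static String md5Hex(byte[] data) throws NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("MD5");
        byte[] digest = md.digest(data);
        StringBuilder sb = new StringBuilder();
        for (byte b : digest) {
            // %02x renders each byte as two lowercase hex digits
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    /** Builds one checksum line: digest, two spaces, file name (assumed format). */
    public static String md5Line(byte[] data, String fileName)
            throws NoSuchAlgorithmException {
        return md5Hex(data) + "  " + fileName;
    }

    public static void main(String[] args) throws Exception {
        // In-memory sample data; a real release archive would be read from disk.
        byte[] sample = "abc".getBytes(StandardCharsets.UTF_8);
        System.out.println(md5Line(sample, "sample.txt"));
    }
}
```

For an actual release artifact the bytes would come from the archive itself (for example via `java.nio.file.Files.readAllBytes`), and the resulting line would be written to a file named after the archive with a `.md5` suffix.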
I wi
st/lucene/nutch/, using the same
convention as the others. To get the header, I did a gpg --list-keys.
Thanks!
Cheers,
Chris
On 3/27/07 8:14 AM, "Chris Mattmann" <[EMAIL PROTECTED]> wrote:
> Hi Sami,
>
>> A very limited acid test shows that I can do crawling and sear
Hi Sami,
> A very limited acid test shows that I can do crawling and searching
> through web app so that part is ok.
Great! Similar tests of my own showed the same.
>
> About signatures: I can't find your public gpg key anywhere (to verify
> the signature), not in KEYS file nor in keyservers I
Hi Folks,
I have posted a candidate for the Apache Nutch 0.9 release at
http://people.apache.org/~mattmann/nutch_0.9/
See the included CHANGES-0.9.txt file for details on release
contents and latest changes. The release was made from the 0.9-dev trunk.
Please vote on releasing these packages a
!)
Thanks!
Cheers,
Chris
On 3/26/07 10:22 PM, "Sami Siren" <[EMAIL PROTECTED]> wrote:
> Chris Mattmann wrote:
>> Hi Folks,
>>
>> Just to update everyone on progress. I've made it to Step 13 (waiting for
>> release to appear on mirrors) in the R
Hi Folks,
Just to update everyone on progress. I've made it to Step 13 (waiting for
release to appear on mirrors) in the Release Process:
http://wiki.apache.org/nutch/Release_HOWTO
You can view a full log of the fun that I've been having by going to:
http://people.apache.org/~mattmann/NUT
he process goes smoothly, I can
probably get it done on my own. Thanks for the offer: I'll be sure to call
on you if I get stuck. :-)
Cheers,
Chris
On 3/26/07 10:06 AM, "Dennis Kubes" <[EMAIL PROTECTED]> wrote:
> Let me know if I can help in any way?
>
> Dennis
Hi Folks,
As your friendly neighborhood 0.9 release manager, I just wanted to give
you all a heads up that I'd like to begin the release process today. If I
hear no objections by 00:00:00 UTC time, I will begin the release process
then. I will notify the list as soon as I'm done.
Thanks!
Chee
Hey Doug,
Do you think we should do this in Nutch too? I'm in favor of doing this --
how does everyone else feel?
Thanks!
Cheers,
Chris
__
Chris A. Mattmann
[EMAIL PROTECTED]
Staff Member
Modeling and Data Management Systems Section (387)
Data Ma
Dennis,
No probs. Thanks a lot!
Cheers,
Chris
On 3/10/07 5:35 PM, "Dennis Kubes" <[EMAIL PROTECTED]> wrote:
>
>
> Chris Mattmann wrote:
>> Hi Dennis,
>>
>> Not to nit-pick, but the place where you inserted your change isn't at the
>
Hi Dennis,
Not to nit-pick, but the place where you inserted your change isn't at the
end (where they typically should be placed). You inserted in the middle of
the file, throwing off the numbering (there are now 2 sets of 18, and 19 in
the unreleased changes section). Could you please append you
Cheers,
Chris
On 3/8/07 1:55 PM, "Andrzej Bialecki" <[EMAIL PROTECTED]> wrote:
> Chris Mattmann wrote:
>> Hi Andrzej,
>>
>> Yep, +1. I also want to make a small update where, instead of creating a
>> new NutchConf object, we just pass it throug
Hi Andrzej,
Yep, +1. I also want to make a small update where, instead of creating a
new NutchConf object, we just pass it through (maybe via the protocol
layer?). Does this make sense?
Cheers,
Chris
On 3/8/07 1:47 PM, "Andrzej Bialecki (JIRA)" <[EMAIL PROTECTED]> wrote:
>
> [
> h
Hi Folks,
As suggested by Sami, I'm moving this discussion to the nutch-dev list.
Seems like I am the guy that is going to do the Nutch 0.9 release :-)
However, it seems also that there are some issues that need to be sorted out
first. I'd like to follow up to Andrzej's email about loose ends be
ssions
on the nutch list in the future.
Cheers,
Chris
------ Forwarded Message
From: Chris Mattmann <[EMAIL PROTECTED]>
Date: Mon, 05 Mar 2007 21:25:30 -0800
To: Piotr Kosiorowski <[EMAIL PROTECTED]>
Cc: Chris Mattmann <[EMAIL PROTECTED]>, Andrzej Bialecki
<[EMAIL PROTEC
Hi Guys,
> Blocker
>
> * NUTCH-400 (Update & add missing license headers) - I believe this is
> fixed and should be closed
+1, thanks to Sami for closing it.
>
> * NUTCH-353 (pages that serverside forwards will be refetched every
> time) - this was partially fixed in NUTCH-273, but a m
Dennis,
I take my coffee black: with a single creamer ;) Okay, okay, sorry: I
thought we were talking about *real* hazing ;)
Cheers,
Chris
On 2/28/07 12:31 PM, "Dennis Kubes" <[EMAIL PROTECTED]> wrote:
> Hi All,
>
> Thank you Andrzej for your kind words. I am looking forward to working
>
rdinate our efforts?
>
> Dennis Kubes
>
> Jérôme Charron wrote:
>> Hi Chris,
>>
>> The JIRA issue is the 309 : https://issues.apache.org/jira/browse/NUTCH-309
>> Thanks for your help.
>>
>> Jérôme
>>
>> On 2/13/07, Chris Mattman
Hi Doug, and Jerome,
Ah, yes, the log guard conversation. I remember this from a while back.
Hmmm, do you guys know which issue this was recorded as in JIRA? I have some
free time recently, so I will be able to add this to my list of Nutch stuff
to work on, and would be happy to take the lead on
uired, and contacting the
folks who've begun work on this issue.
Thanks!
Cheers,
Chris
On 2/7/07 1:31 PM, "Doug Cutting" <[EMAIL PROTECTED]> wrote:
> Chris Mattmann wrote:
>> Got it. So, the logic behind this is, why bother waiting until the
>> followin
07 11:11 AM, "Doug Cutting" <[EMAIL PROTECTED]> wrote:
> Chris Mattmann wrote:
>> Sorry to be so thick-headed, but could someone explain to me in really
>> simple language what this change is requesting that is different from the
>> current Nutch API? I still don
Guys,
Sorry to be so thick-headed, but could someone explain to me in really
simple language what this change is requesting that is different from the
current Nutch API? I still don't get it, sorry...
Cheers,
Chris
On 2/7/07 9:58 AM, "Doug Cutting" <[EMAIL PROTECTED]> wrote:
> Renaud Richa
Hi Doug,
> Since the target of the link must still be indexed separately from the
> item itself, how much use is all this? If the RSS document is
> considered a single page that changes frequently, and item's links are
> considered ordinary outlinks, isn't much the same effect achieved?
IMHO, ye
in the site.
>
> IMHO the only thing "missing" in the parse-rss plugin is storing the data in
> the CrawlDatum and "parsing" it in the next fetch phase. Maybe adding a new
> flag to CrawlDatum, that would flag the URL as "parsable" not "fetchab
.nutch
> nutch
> >
> >
> >
> >
> http://lucene.apache.org/nutch
> >
> >
> > news
>
> >
> >
> > kauu
> On 1/31/07, Chris Mattmann <[EMAIL PROTECTED]> wrote:
>
> Hi there,
>
> I could most li
el and callback framework for
parsing RSS/Atom/Feed XML documents. When you mention asynchronous above,
are you talking about the protocol for fetching the different RSS documents?
Thanks!
Cheers,
Chris
>
> Thanks
>
>
> -----Original Message-----
> From: Chris Mattmann <[
Hi there,
I could most likely be of assistance, if you gave me some more information.
For instance: I'm wondering if the use case you describe below is already
supported by the current RSS parse plugin?
The current RSS parser, parse-rss, does in fact index individual items that
are pointed to b
> It's at least out-of-date and perhaps obsolete. A quick read of
> Fetcher.java looks like there might be a case where a "fatal" error is
> logged but the fetcher doesn't exit, in FetcherThread#output().
>
So this raises an interesting question:
People (such as Scott G.) out there -- are you f
Hi Doug,
So, does this render the patch that I wrote obsolete?
Cheers,
Chris
On 1/25/07 10:08 AM, "Doug Cutting" <[EMAIL PROTECTED]> wrote:
> Scott Ganyo (JIRA) wrote:
>> ... since Hadoop hijacks and reassigns all log formatters (also a bad
>> practice!) in the org.apache.hadoop.util.LogF
> Before doubling (or after 0.9.0 tripling?) the maintenance/development work
> please consider the following:
>
> One option would be refactoring the code in a way that the parts that are
> usable to other projects like protocols?, parsers (this actually was
> proposed by
> Jukka Zitting some
Hi Dennis,
On 1/21/07 11:47 AM, "Dennis Kubes" <[EMAIL PROTECTED]> wrote:
> All,
>
> I am working on a "How to Become a Nutch Developer" document for the
> wiki and I need some input.
>
> I need an overview of how the process for JIRA works. If I am a
> developer new to Nutch and just startin
Folks,
When would you like to make the release? I've been working on NUTCH-185,
but got a bit bogged down with other work. If there is interest in having
NUTCH-185 included in the release, I could make a push to get out a patch by
week's end...
As for the rest, my +1 for NUTCH-61 being included
org.apache.nutch.metadata that aggregates all
the met key fields from HttpHeaders, and it would be the place that the met
key fields for FileHeaders, etc. could go into.
Let me know what you think, and thanks!
Cheers,
Chris
On 12/9/06 3:53 PM, "Sami Siren" <[EMAIL PROTECTED]> wrote:
> C
Hi Sami,
On 12/9/06 2:27 PM, "[EMAIL PROTECTED]" <[EMAIL PROTECTED]> wrote:
> Author: siren
> Date: Sat Dec 9 14:27:07 2006
> New Revision: 485076
>
> URL: http://svn.apache.org/viewvc?view=rev&rev=485076
> Log:
> Optimize SpellCheckedMetadata further by taking into account the fact that it
> i
Thanks, Andrzej, thanks to the rest of the folks who voted me in! I really
appreciate the honor and pledge to help maintain the high quality of the
Nutch source code.
Best wishes and happy holidays to all the folks on the list!
Cheers,
Chris
On 11/23/06 4:10 AM, "Andrzej Bialecki" <[EMAIL PR
Hi Sami,
On 11/23/06 9:45 AM, "Sami Siren" <[EMAIL PROTECTED]> wrote:
> Couple of points:
>
> 1. You used tabs
I just installed a new version of Eclipse, and forgot to change the default
preference for using tabs versus just whitespaces. I've gone ahead and
changed this in my Eclipse and will
ieldvalue2",...));
Both values, "fieldvalue" and "fieldvalue2", will get stored in the
index for the key "fieldname". So, if I understand you correctly (which I
may not ;) ), then I think you can omit the check that you are talking about
above and just g
Hi Guys,
Can we disable the selection of "released versions" within JIRA for issues
so that people like me don't continue to get confused?
Thanks!
Cheers,
Chris
On 10/13/06 9:32 AM, "Sami Siren (JIRA)" <[EMAIL PROTECTED]> wrote:
> [ http://issues.apache.org/jira/browse/NUTCH-379?page
d really change the email address for JIRA to not use the Apache
incubator one anymore, and to use the Lucene one.
Sound good? If so, could someone with permissions please take care of it?
:-)
Cheers,
Chris
On 10/3/06 9:04 AM, "Sami Siren" <[EMAIL PROTECTED]> wrote:
> Andrzej
>
> The switch to 1.5 format was also logged on jira issue
> http://issues.apache.org/jira/browse/NUTCH-360
> --
> Sami Siren
Ahh, I didn't see this. Way to go Sami, I love it when people actually keep
records of changes! ;)
Cheers,
Chris
__
Chris
Hi Folks,
I noticed that Nutch now requires JDK 5 in order to compile, due to recent
changes to the PluginRepository and some other classes. I think that this is
a good move; however, I wasn't sure that I had seen any "official"
announcement that Nutch now requires 1.5...
Cheers,
Chris
__
Hi Doug,
>
> But the nutch-developers Jira group pretty closely corresponds to
> Nutch's committers, so perhaps all committers should be permitted to
> close, although this should be exercised with caution, only at releases,
> since closes cannot be undone in this workflow.
>
> Another alternati
Hi Doug and Andrzej,
+1. I think that workflow makes a lot of sense. Currently users in the
nutch-developers group can close and resolve issues. In the Hadoop workflow,
would this continue to be the case?
Cheers,
Chris
On 8/30/06 3:14 PM, "Andrzej Bialecki" <[EMAIL PROTECTED]> wrote:
> Do
Hi Chris,
It seems from your email message that your plugin is located in
$NUTCH_HOME/build/custom-meta? Is this where your plugin * code * is
currently stored? If so, this is the wrong location and the most likely
reason that your plugin isn't being loaded.
Plugin code should live in $NUTCH_HO
Hi Steven,
On 8/16/06 7:36 AM, "steven shingler" <[EMAIL PROTECTED]> wrote:
> (This thread moved from the User List.)
>
> OK Lukas, lets open it up to the dev list! :)
>
> Particularly, does the group feel moving to Maven would be _a good thing_ ?
+1
I suggested this (however did not make an
agement on the Nutch mailing lists recently. From
that interest, we have gathered the following list of candidate committers
who have expressed interest in our proposed project. The leader of the
Tika project would be Chris Mattmann. Chris works at NASA's Jet Propulsion
Laboratory as a Member
Hi Jukka,
Thanks for your email. Indeed, there was discussion on the Lucene PMC email
list, about the Tika project. It was decided by the powers that be to
discuss it more on the Nutch mailing list before moving forward with any
vote on making Tika a sub-project of Apache Lucene. With regards to
Hi Guys,
I've seen on the Hadoop mailing list recently that there was a new status
added for issues in JIRA called "Patch Available" to let committers know
that a patch is ready for review to commit. How about we add this to the
Nutch jira instance as well? I tried doing this, but I don't think I
Hey Andrzej,
On 8/3/06 8:19 AM, "Andrzej Bialecki" <[EMAIL PROTECTED]> wrote:
> Chris Mattmann wrote:
>> Hi Marko,
>>
>> Thanks for your question. Basically it was set up as a sort of "last
>> resort" of getting at least * some * i
Hi Marko,
Thanks for your question. Basically it was set up as a sort of "last
resort" of getting at least * some * information from the PDF file, albeit
littered with garbage. If indeed the parse-text does not really make sense
in terms of a backup parser to handle PDF files and get at least s
Hi Jukka,
Thanks for your email. Jerome Charron and I proposed a project with a
similar goal in mind that we wanted to dub "Tika". Tika would effectively be
a Lucene sub-project, and would factor out some of the capabilities you
mention below from Nutch, incl:
1. MimeType repository
2. Parser i
Hi Andrzej,
>
> The main problem, as Scott observed, is that the static flag affects all
> instances of the task executing inside the same JVM. If there are
> several Fetcher tasks (or any other tasks that check for SEVERE flag!),
> belonging to different jobs, all of them will quit. This is cert
Folks,
Before I (or someone else) reopens the issue, I think it's important to
understand the implications:
>1) Having a *side-effect* of the entire system stop processing after merely
> logging a message at a certain event level is a poor practice.
I'm not sure that the Fetcher quitting is a *
Hi Alex,
I also noticed this issue a while back. It's described here:
http://mail-archives.apache.org/mod_mbox/lucene-nutch-dev/200510.mbox/%3c435
[EMAIL PROTECTED]
Cheers,
Chris
On 4/25/06 2:41 PM, "Alex" <[EMAIL PROTECTED]> wrote:
> Hi there,
>
> I'm fairly new to nutch and in working
ts own Stand-alone
library.
Just my two cents, thanks!
Cheers,
Chris
>
> Otis
>
> ----- Original Message -----
> From: Jérôme Charron <[EMAIL PROTECTED]>
> To: nutch-dev@lucene.apache.org
> Sent: Friday, April 7, 2006 4:26:54 AM
> Subject: [Proposal] New Luc
Hi Stefan,
The DTD actually does allow for custom attributes: Jerome factored them
out of the form:
=""
=""
.
>
Into the form:
...
See the difference? Using the parameter tags, we can have a generic DTD that
supports any parameter name and value. The other way, I had to go t
Hi Guys,
Any progress on the 0.8 release? Was there any resolution about which JIRA
issues to complete before the 0.8 release? We had a bit of conversation
there and some ideas, but no definitive answer...
Thanks for your help, and sorry to pester ;)
Cheers,
Chris
__
Hi Andrzej,
On 4/7/06 12:18 PM, "Andrzej Bialecki" <[EMAIL PROTECTED]> wrote:
> Do you guys have any additional insights / suggestions whether NUTCH-240
> and/or NUTCH-61 should be included in this release?
Looking at the JIRA popular issues pane for Nutch (
http://issues.apache.org/jira/browse
+1
On 4/7/06 10:20 AM, "Doug Cutting" <[EMAIL PROTECTED]> wrote:
> Chris Mattmann wrote:
>> +1 for a release sooner rather than later.
>
> I think this is a good plan. There's no reason we can't do another
> release in a month. If it is back-co
+1 for a release sooner rather than later. Several interesting features
contributed since the 0.7 branch I believe are now tested and
production-worthy, at least in my environment. Hats off to the folks who
were able to split the MapReduce and NDFS into Hadoop -- I'm going to be
experimenting with
Thanks Jerome! :-)
Cheers,
Chris
On 3/13/06 4:02 PM, "Jérôme Charron" <[EMAIL PROTECTED]> wrote:
>> I updated to the latest SVN revision (385691) today, and I am now seeing
>> a
>> Null Pointer exception in the AnalyzerFactory.java class.
>
> Fixed (r385702). Thanks Chris.
>
>
>> NOTE:
Hi Folks,
I updated to the latest SVN revision (385691) today, and I am now seeing a
Null Pointer exception in the AnalyzerFactory.java class. It seems that in
some cases, the method:
private Extension getExtension(String lang) {
  Extension extension = (Extension) this.conf.getObject(lang)
if (this.extensionPoint == null) {
  throw new RuntimeException("x point " + Parser.X_POINT_ID + " not found.");
> -Original Message-
> From: Chris Mattmann [mailto:[EMAIL PROTECTED]
> Sent: Monday, March 06, 2006 7:51 PM
> To: 'nutch-dev@lucene.ap
oint == null) {
  throw new RuntimeException("x point " + Parser.X_POINT_ID + " not found.");
Cheers,
Chris
> Cheers,
> Stefan
>
> Am 07.03.2006 um 04:38 schrieb Chris Mattmann:
>
> > Hi Stefan,
> >
> >> after a short time
Hi Stefan,
> after a short time I already had these lines 1602 times in my
> tasktracker log files.
> 060307 022707 task_m_2bu9o4 found resource parse-plugins.xml at
> file:/home/joa/nutch/conf/parse-plugins.xml
>
> Sounds like this file is loaded 1602 (after lets say 3 minutes) I
> guess that was
Hi Andrzej,
> > commons-httpclient-3.0-beta1.jar src/plugin/parse-rss/lib
> > commons-httpclient-3.0.jar src/plugin/protocol-httpclient/lib
>
> Not sure what was the reason to use the beta1, perhaps no reason except
> that it was the latest available at the moment...
Yup, I think that w
Hey Doug,
I think that, at least in the case of parse-rss, parse-pdf, and the nutch
core, there's probably some utility in having lib-xxx plugins (or at least
putting these jars in $NUTCH_HOME/lib) for:
commons-httpclient
log4j
xerces
Then, protocol-httpclient, parse-pdf and the rest of t
:15 PM EST
> Subject: Re: ignore eclipse .project and .classpath
>
> +1
>
> Am 08.02.2006 um 06:16 schrieb Chris Mattmann:
>
>> Hi Folks,
>>
>> Just wondering if someone could add to the svn:ignore property for Nutch
>> the
Hi Folks,
Just wondering if someone could add to the svn:ignore property for Nutch
the files:
.classpath
.project
I happen to use eclipse to do Nutch development and always ignore these
files in my other eclipse projects as well.
Cheers,
Chris
__