date:20060306

RE: compile search.jsp

2006-03-06 Thread Sylvain FURMANEK

Hi,

You must modify the search.jsp page directly in the Tomcat deployment.


-Original Message-
From: Michael Ji [mailto:[EMAIL PROTECTED] 
Sent: Sunday, March 5, 2006 04:04
To: nutch-dev@lucene.apache.org
Subject: compile search.jsp


Hi,
 
I changed search.jsp under /nutch/src/web/jsp and
hoped the change would show up in the skin of the
Nutch search page.
 
I ran "ant war" and replaced ROOT.war in
tomcat/webapp.
 
I also tried shutting down and restarting Tomcat.
 
But the Nutch search page stays the same, and the
bean.LOG.info output is also the same as before, even
though I am writing new information.
 
I wonder if I missed any compile steps.
 
Thanks for your help,
 
Michael,
 


__
Do You Yahoo!?
Tired of spam?  Yahoo! Mail has the best spam
protection around 
http://mail.yahoo.com 


record termination and MapReduce

2006-03-06 Thread Toby DiPasquale

Hi all,

I have a question about the MapReduce and NDFS implementations. When
writing records into an NDFS file, how does one make sure that records
terminate cleanly on block boundaries such that a Map job's input does not
span multiple physical blocks? 

It also appears as if NDFS does not have an explicit "record append"
operation. Is this the case?

-- 
Toby DiPasquale
Senior Software Engineer
Symantec Corporation

Nutch web site

2006-03-06 Thread Piotr Kosiorowski


Hi,
It looks like the Nutch web site was updated with a site built from the 
latest trunk. The only problem is that it contains the tutorial for the 
not-yet-released version 0.8. I think we talked about this and agreed to 
keep the tutorial for the latest release on the web. I have just updated 
the site in svn (branch-0.7) with the latest changes (forrest 0.7 
compatibility and mailing list archives) and rebuilt it using forrest 0.7. 
If there are no objections, I can switch the web site to use the version 
from the branch instead of trunk.

Regards
Piotr

Re: Nutch web site

2006-03-06 Thread Andrzej Bialecki


Piotr Kosiorowski wrote:

Hi,
It looks like Nutch web site was updated with site built from latest 
trunk - the only problem is it contains tutorial for unreleased (yet) 
version 0.8. I think we talked about it and agreed to keep tutorial 
for latest release on the Web. I have just updated site in svn 
(branch-0.7) with latest changes (forrest 0.7 compatibility and 
mailing list archives) and rebuilt it using forrest 0.7. If no 
objections I can switch web site to use version from branch instead of 
trunk.


+1, yes it would be really confusing. Since there are more and more 
people trying 0.8, could we perhaps include a short note that 0.8 and 
later are NOT compatible with this tutorial, and a reference to the 
tutorial for 0.8 (or the trunk/ branch in general)?


--
Best regards,
Andrzej Bialecki <><
___. ___ ___ ___ _ _   __
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com

Re: Nutch web site

2006-03-06 Thread Doug Cutting


Piotr Kosiorowski wrote:
It looks like Nutch web site was updated with site built from latest 
trunk - the only problem is it contains tutorial for unreleased (yet) 
version 0.8. I think we talked about it and agreed to keep tutorial for 
latest release on the Web. I have just updated site in svn (branch-0.7) 
with latest changes (forrest 0.7 compatibility and mailing list 
archives) and rebuilt it using forrest 0.7. If no objections I can 
switch web site to use version from branch instead of trunk.


+1

Thanks!

Doug

[jira] Created: (NUTCH-224) Nutch doesn't handle Korean text at all

2006-03-06 Thread KuroSaka TeruHiko (JIRA)

Nutch doesn't handle Korean text at all
---

 Key: NUTCH-224
 URL: http://issues.apache.org/jira/browse/NUTCH-224
 Project: Nutch
Type: Bug
  Components: indexer  
Versions: 0.7.1
Reporter: KuroSaka TeruHiko


I was browsing NutchAnalysis.jj and found that
Hangul Syllables (U+AC00 ... U+D7AF; U+ means
a Unicode character of the given hex value) are not
part of the LETTER or CJK class.  It seems to me that
Nutch cannot handle Korean documents at all.

I posted the above message at nutch-user ML and Cheolgoo Kang [EMAIL PROTECTED]
replied as:

There was similar issue with Lucene's StandardTokenizer.jj.

http://issues.apache.org/jira/browse/LUCENE-444

and

http://issues.apache.org/jira/browse/LUCENE-461

I have almost no experience with Nutch, but you can handle it like
those issues above.


Both fixes should probably be ported back to NutchAnalysis.jj.
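
To make the affected range concrete, here is a minimal sketch that tests whether a code point falls in the Hangul Syllables block. It relies on the JDK's own Unicode tables rather than on NutchAnalysis.jj, and the class and method names (HangulCheck, isHangulSyllable, countHangul) are made up for illustration only.

```java
// Sketch only: uses the JDK's Unicode block data, not the NutchAnalysis.jj grammar.
final class HangulCheck {
    /** True if the code point lies in the Hangul Syllables block (U+AC00..U+D7AF). */
    static boolean isHangulSyllable(int codePoint) {
        return Character.UnicodeBlock.of(codePoint)
                == Character.UnicodeBlock.HANGUL_SYLLABLES;
    }

    /** Counts the Hangul syllables in a string. */
    static long countHangul(String text) {
        return text.codePoints().filter(HangulCheck::isHangulSyllable).count();
    }

    public static void main(String[] args) {
        System.out.println(isHangulSyllable(0xD55C));          // true
        System.out.println(isHangulSyllable('a'));             // false
        System.out.println(countHangul("\uD55C\uAE00 text"));  // 2
    }
}
```

A tokenizer grammar that omits this range from its letter classes would presumably skip exactly the characters for which isHangulSyllable returns true.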





-- 
This message is automatically generated by JIRA.
-
If you think it was sent incorrectly contact one of the administrators:
   http://issues.apache.org/jira/secure/Administrators.jspa
-
For more information on JIRA, see:
   http://www.atlassian.com/software/jira

Re: record termination and MapReduce

2006-03-06 Thread Doug Cutting


Toby DiPasquale wrote:

I have a question about the MapReduce and NDFS implementations. When
writing records into an NDFS file, how does one make sure that records
terminate cleanly on block boundaries such that a Map job's input does not
span multiple physical blocks? 


We do not currently guarantee that.  A task's input may span multiple 
blocks.  We try to split things into block-sized chunks, but the last 
few records (up to the first sync mark past the split point) may be in 
the next block.  So a bit of i/o will happen over the network, but not 
the vast majority.
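
As a rough illustration of the behaviour described above, here is a toy model of how a task might pick its records: it starts at the first sync mark at or after the split start and keeps reading until the first sync mark at or after the split end. This is not the actual Hadoop reader code; the record offsets, sync positions, 100-byte split size and all names are assumptions chosen for the example.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: offsets, sync positions and names are hypothetical.
final class SplitReaderSketch {
    /** Smallest sync offset >= pos, or fileLength if there is none. syncs must be sorted. */
    static long nextSync(long[] syncs, long pos, long fileLength) {
        for (long s : syncs) {
            if (s >= pos) {
                return s;
            }
        }
        return fileLength;
    }

    /** Offsets of the records a task would read for the split [start, end). */
    static List<Long> recordsForSplit(long[] recordOffsets, long[] syncs,
                                      long start, long end, long fileLength) {
        long from = nextSync(syncs, start, fileLength); // skip ahead to a record boundary
        long to = nextSync(syncs, end, fileLength);     // may lie past the split end
        List<Long> out = new ArrayList<>();
        for (long off : recordOffsets) {
            if (off >= from && off < to) {
                out.add(off);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        long[] records = {0, 30, 60, 90, 120, 150}; // six 30-byte records
        long[] syncs = {0, 60, 120};                // a sync mark precedes these records
        long fileLength = 180;
        // Two 100-byte splits: the first task also reads record 90, which crosses offset 100.
        System.out.println(recordsForSplit(records, syncs, 0, 100, fileLength));   // [0, 30, 60, 90]
        System.out.println(recordsForSplit(records, syncs, 100, 180, fileLength)); // [120, 150]
    }
}
```

In this model every record is read by exactly one task, and only the tail of a split (here record 90) reaches into the next block.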



It also appears as if NDFS does not have an explicit "record append"
operation. Is this the case?


Yes.  DFS currently is write-once.

Please note that the MapReduce and DFS code has moved from Nutch to the 
Hadoop project.  Such questions are more appropriately asked there.


Doug

HttpResponse#readChunkedContent unused?

2006-03-06 Thread Stefan Groschupf


Hi,

What does the HttpResponse#readChunkedContent method do?
It looks like it is never used, or am I overlooking something?
Did we forget to call this method, or did we forget to delete the code? :-?

Thanks for any clarification.
Stefan

found resource parse-plugins.xm?

2006-03-06 Thread Stefan Groschupf


Hi,
after a short time I already had this line 1602 times in my  
tasktracker log files:
060307 022707 task_m_2bu9o4  found resource parse-plugins.xml at  
file:/home/joa/nutch/conf/parse-plugins.xml


It sounds like this file is loaded 1602 times (after, let's say, 3  
minutes). I guess that wasn't the goal, or am I overlooking something?
Loading this file just once could be a serious performance  
improvement.
I was not able to find the code that logs this statement. Does  
anyone have an idea where this happens?


Thanks.
Stefan
-
blog: http://www.find23.org
company: http://www.media-style.com

Re: found resource parse-plugins.xm?

2006-03-06 Thread [EMAIL PROTECTED]


Stefan Groschupf wrote:

Hi,
after a short time I already had 1602 time this lines in my 
tasktracker log files.
060307 022707 task_m_2bu9o4  found resource parse-plugins.xml at 
file:/home/joa/nutch/conf/parse-plugins.xml


Sounds like this file is loaded 1602 (after lets say 3 minutes) I 
guess that wasn't the goal or do I oversee anything?

Is it being loaded by the same task each time Stefan?
St.Ack

Re: found resource parse-plugins.xm?

2006-03-06 Thread Stefan Groschupf


Hi  Stack, :)
Yes! This is while fetching with parsing switched on, on one tasktracker  
that is trying to crawl a 10-million segment with 800 threads.

:-?
Stefan

On 07.03.2006 at 04:27, [EMAIL PROTECTED] wrote:


Stefan Groschupf wrote:

Hi,
after a short time I already had 1602 time this lines in my  
tasktracker log files.
060307 022707 task_m_2bu9o4  found resource parse-plugins.xml at  
file:/home/joa/nutch/conf/parse-plugins.xml


Sounds like this file is loaded 1602 (after lets say 3 minutes) I  
guess that wasn't the goal or do I oversee anything?

Is it being loaded by the same task each time Stefan?
St.Ack



-
blog: http://www.find23.org
company: http://www.media-style.com

RE: found resource parse-plugins.xm?

2006-03-06 Thread Chris Mattmann

Hi Stefan,

> after a short time I already had 1602 time this lines in my
> tasktracker log files.
> 060307 022707 task_m_2bu9o4  found resource parse-plugins.xml at
> file:/home/joa/nutch/conf/parse-plugins.xml
> 
> Sounds like this file is loaded 1602 (after lets say 3 minutes) I
> guess that wasn't the goal or do I oversee anything?

It certainly wasn't the goal at all. After NUTCH-88, Jerome and I had the
following line in the ParserFactory.java class:

  /** List of parser plugins. */
  private static final ParsePluginList PARSE_PLUGIN_LIST =
  new ParsePluginsReader().parse();


(see revision 326889)

Looking at the revision history for the ParserFactory file, after the
application of NUTCH-169, the above changes to:


  private ParsePluginList parsePluginList;

  //... code here

  public ParserFactory(NutchConf nutchConf) {
    this.nutchConf = nutchConf;
    this.extensionPoint = nutchConf.getPluginRepository().getExtensionPoint(
        Parser.X_POINT_ID);
    this.parsePluginList = new ParsePluginsReader().parse(nutchConf);

    if (this.extensionPoint == null) {
      throw new RuntimeException("x point " + Parser.X_POINT_ID + " not found.");
    }
    if (this.parsePluginList == null) {
      throw new RuntimeException(
          "Parse Plugins preferences could not be loaded.");
    }
  }


Thus, every time a ParserFactory is constructed, the parse-plugins.xml
file is read (it's the result of the call to
ParsePluginsReader().parse(nutchConf)). So, if the file is loaded 1602 times,
I'd guess that a ParserFactory is constructed 1602 times. Additionally, I'm
wondering why the parse-plugins.xml configuration parameters aren't declared
as final static anymore?

> That could be a serious performance improvement to just load this
> file once.

Yup, I think that's the reason we made it final static. If there is no
reason not to have it final static, I would suggest that it be put back to
final static. There may be a problem, however: since NUTCH-169, the
loading requires an existing Configuration object, I believe. So, we may need
a static Configuration object as well. Thoughts? 

> I was not able to find the code that is logging this statement, has
> anyone a idea where this happens?

The statement gets logged within the ParsePluginsReader.java class, line 98:

ppInputStream = conf.getConfResourceAsInputStream(
  conf.get(PP_FILE_PROP));

HTH,
  Chris


> 
> Thanks.
> Stefan
> -
> blog: http://www.find23.org
> company: http://www.media-style.com

Re: found resource parse-plugins.xm?

2006-03-06 Thread Stefan Groschupf


Hi Chris,
thanks for the clarification.
Do you think we can somehow cache it in the nutchConf instance,  
since this is the way we are doing this in other places as well?

Cheers,
Stefan


-
blog: http://www.find23.org
company: http://www.media-style.com

RE: found resource parse-plugins.xm?

2006-03-06 Thread Chris Mattmann

Hi Stefan,


> Hi Chris,
> thanks for the clarification.

No probs. 

> Do you think we can somehow cache it in the nutchConf instance,
> since this is the way we are doing this in other places as well?

Yeah I think we can. Here is a small patch to the ParserFactory that should
do the trick. Give it a test and let me know if it works. If it does, I
would say +1 to the committers to get this into the sources ASAP, no?

Index: src/java/org/apache/nutch/parse/ParserFactory.java
===
--- src/java/org/apache/nutch/parse/ParserFactory.java  (revision 383463)
+++ src/java/org/apache/nutch/parse/ParserFactory.java  (working copy)
@@ -55,7 +55,13 @@
     this.conf = conf;
     this.extensionPoint = PluginRepository.get(conf).getExtensionPoint(
         Parser.X_POINT_ID);
-    this.parsePluginList = new ParsePluginsReader().parse(conf);
+
+    if(conf.getObject("parsePluginList") != null){
+      this.parsePluginList = (ParsePluginList)conf.getObject("parsePluginList");
+    }
+    else{
+      this.parsePluginList = new ParsePluginsReader().parse(conf);
+    }
 
     if (this.extensionPoint == null) {
       throw new RuntimeException("x point " + Parser.X_POINT_ID + " not found.");


Cheers,
  Chris


RE: found resource parse-plugins.xm?

2006-03-06 Thread Chris Mattmann

Sorry,

 My last patch was missing one line. Here's the update:

Index: src/java/org/apache/nutch/parse/ParserFactory.java
===
--- src/java/org/apache/nutch/parse/ParserFactory.java  (revision 383463)
+++ src/java/org/apache/nutch/parse/ParserFactory.java  (working copy)
@@ -55,7 +55,14 @@
     this.conf = conf;
     this.extensionPoint = PluginRepository.get(conf).getExtensionPoint(
         Parser.X_POINT_ID);
-    this.parsePluginList = new ParsePluginsReader().parse(conf);
+
+    if(conf.getObject("parsePluginList") != null){
+      this.parsePluginList = (ParsePluginList)conf.getObject("parsePluginList");
+    }
+    else{
+      this.parsePluginList = new ParsePluginsReader().parse(conf);
+      conf.setObject("parsePluginList", this.parsePluginList);
+    }
 
     if (this.extensionPoint == null) {
       throw new RuntimeException("x point " + Parser.X_POINT_ID + " not found.");
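
For anyone who wants to try the idea outside of Nutch, the caching approach can be sketched in isolation roughly as follows. The Conf class below is only a stand-in that models getObject/setObject as used in the patch; the load counter, the placeholder list contents and the synchronized accessor are assumptions added for illustration and are not taken from the Nutch sources.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Sketch only: a self-contained model of "load once per configuration object".
final class ConfCacheSketch {
    /** Hypothetical stand-in for the configuration; only its object cache is modeled. */
    static final class Conf {
        private final Map<String, Object> objects = new HashMap<>();
        Object getObject(String name) { return objects.get(name); }
        void setObject(String name, Object value) { objects.put(name, value); }
    }

    /** Counts how often the expensive load runs (stands in for reading parse-plugins.xml). */
    static int loads = 0;

    static List<String> loadPluginList() {
        loads++;
        return Arrays.asList("plugin-a", "plugin-b"); // placeholder content
    }

    /** Returns the list cached in this conf, loading it on first use only. */
    @SuppressWarnings("unchecked")
    static synchronized List<String> pluginList(Conf conf) {
        List<String> cached = (List<String>) conf.getObject("parsePluginList");
        if (cached == null) {
            cached = loadPluginList();
            conf.setObject("parsePluginList", cached);
        }
        return cached;
    }

    public static void main(String[] args) {
        Conf conf = new Conf();
        for (int i = 0; i < 1602; i++) {
            pluginList(conf); // simulates constructing 1602 factories with the same conf
        }
        System.out.println("loads = " + loads); // loads = 1
    }
}
```

The cache only helps when all factories share the same configuration instance; a fresh configuration object per task would still trigger one load each.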



db.score.injected

2006-03-06 Thread Jeff Ritchie


Developers...

Is the configuration property db.score.injected still used? 

If so in which source file is it used? 


I can't seem to find where it is used in the source anywhere.

Line 70 org.apache.nutch.crawl.Injector.java

      if (url != null) {                          // if it passes
        value.set(url);                           // collect it
->      output.collect(value, new CrawlDatum(CrawlDatum.STATUS_DB_UNFETCHED,
            interval));
      }

Should that be:

->      output.collect(value, new CrawlDatum(CrawlDatum.STATUS_DB_UNFETCHED,
            interval, jobConf.getFloat("db.score.injected", 1.0f)));
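
The lookup-with-default behaviour that the proposed call depends on can be sketched with plain java.util.Properties. This is only an illustration: the getFloat helper below is hypothetical and merely mirrors the shape of the jobConf.getFloat call suggested above; it is not the Nutch configuration API.

```java
import java.util.Properties;

// Sketch only: Properties stands in for the job configuration.
final class InjectedScoreSketch {
    /** Reads a float property, falling back to defaultValue when the property is unset. */
    static float getFloat(Properties conf, String name, float defaultValue) {
        String raw = conf.getProperty(name);
        return raw == null ? defaultValue : Float.parseFloat(raw.trim());
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        System.out.println(getFloat(conf, "db.score.injected", 1.0f)); // 1.0
        conf.setProperty("db.score.injected", "2.5");
        System.out.println(getFloat(conf, "db.score.injected", 1.0f)); // 2.5
    }
}
```

With a lookup like this, injected URLs would get the configured score when the property is set and 1.0 otherwise.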




Jeff

Re: Nutch web site

2006-03-06 Thread Piotr Kosiorowski



Andrzej Bialecki wrote:
+1, yes it would be really confusing. Since there are more and more 
people trying 0.8, could we perhaps include a short  note that 0.8 and 
later is NOT compatible with this tutorial, and a reference to the 
tutorial for 0.8 (or the trunk/ branch in general)?




I can add both tutorials to the Nutch web site, named "Tutorial for 0.7 
version" and "Tutorial for 0.8 version". That should make things clear.

Anyone against it?
Piotr

RE: Nutch web site

2006-03-06 Thread Richard Braman

No, that sounds good to me.  I also think that the whole-web vs. crawl
distinction needs to be better explained.  I will write a bug/patch for it
tomorrow.

-Original Message-
From: Piotr Kosiorowski [mailto:[EMAIL PROTECTED] 
Sent: Tuesday, March 07, 2006 1:13 AM
To: nutch-dev@lucene.apache.org
Subject: Re: Nutch web site

Andrzej Bialecki wrote:
> +1, yes it would be really confusing. Since there are more and more
> people trying 0.8, could we perhaps include a short  note that 0.8 and
> later is NOT compatible with this tutorial, and a reference to the 
> tutorial for 0.8 (or the trunk/ branch in general)?
> 

I can add both tutorials to Nutch web site named Tutorial for 0.7 
version and Tutorial for 0.8 version. It should make things clear.
Anyone against it?
Piotr

Re: Nutch web site

2006-03-06 Thread Matthias Jaekle

I can add both tutorials to Nutch web site named Tutorial for 0.7 
version and Tutorial for 0.8 version. It should make things clear.

Anyone against it?

Hi,
would you add both tutorials directly to the wiki so that we could 
improve them all together?

Thanks
Matthias
