[Nutch-dev] Re: mapred crawling exception - Job failed!

2006-01-03 Thread Andrzej Bialecki
Lukas Vlcek wrote: Hi, I am trying to use the latest nutch-trunk version but I am facing an unexpected "Job failed!" exception. It seems that all crawling work has already been done but some threads are hung, which results in an exception after some timeout. This was fixed (or should be fixe

[Nutch-dev] Re: mapred crawling exception - Job failed!

2006-01-03 Thread Lukas Vlcek
Note: I mistakenly used the nutch-user email for the reply-to value. Feel free to reply to either nutch-dev or nutch-user as I monitor both of them :-) Anyway, can anybody tell me how I can easily change the reply-to value in Gmail? I am struggling with this all the time, especially when replying to multiple mai

[Nutch-dev] mapred crawling exception - Job failed!

2006-01-03 Thread Lukas Vlcek
Hi, I am trying to use the latest nutch-trunk version but I am facing an unexpected "Job failed!" exception. It seems that all crawling work has already been done but some threads are hung, which results in an exception after some timeout. I am not sure whether this is a real nutch issue or just min

[Nutch-dev] Adding some theory & publication links into the Wiki..

2006-01-03 Thread Byron Miller
I figured since I'm in research mode I would start compiling available information & resources and putting them up on the wiki http://wiki.apache.org/nutch/Search_Theory Sorry about all the CVS messages on edits.. I'm not used to the touchpad on this darned laptop :) Anyhow, if you have any resour

[Nutch-dev] NegativeArraySizeException in search server

2006-01-03 Thread Gal Nitzan
When trying to use the search server I get the error below. I am using today's trunk... 060104 025549 13 Server handler 0 on 9004 call error: java.io.IOException: java.lang.NegativeArraySizeException java.io.IOException: java.lang.NegativeArraySizeException at org.apache.lucene.util.PriorityQueue.initi

[Nutch-dev] Re: Nutch-87 Setup

2006-01-03 Thread Matt Kangas
Hi Neal, The code attached to the ticket does indeed work for me, but I'm afraid it's a little rough around the edges. Plus, I think I forgot to include the WhitelistWriter class. :) What timeframe do you need this within? I usually see one request a month for this, so I should clean clea

[Nutch-dev] [jira] Commented: (NUTCH-162) country code "jp" is used instead of language code "ja" for Japanese

2006-01-03 Thread KuroSaka TeruHiko (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-162?page=comments#action_12361683 ] KuroSaka TeruHiko commented on NUTCH-162: - This is causing an undesired behavior for Japanese users. If the Nutch main index.jsp is visited from the browser of which t

[Nutch-dev] [jira] Created: (NUTCH-162) country code "jp" is used instead of language code "ja" for Japanese

2006-01-03 Thread KuroSaka TeruHiko (JIRA)
country code "jp" is used instead of language code "ja" for Japanese Key: NUTCH-162 URL: http://issues.apache.org/jira/browse/NUTCH-162 Project: Nutch Type: Bug Components: web gui Versions:

[Nutch-dev] Re: IndexSorter optimizer

2006-01-03 Thread Byron Miller
On optimizing performance, does anyone know if Google is exporting its entire dataset as an index or only somehow indexing the topN % (since they only show the first 1000 or so results anyway)? With this patch and a top result set in the xml file does that mean it will stop scanning the index at th

[Nutch-dev] Re: Nutch-87 Setup

2006-01-03 Thread Neal Whitley
Sorry, I posted the incorrect error code in my previous messages. Here is the output I get when running ant with the Nutch-87 plugin: [caribmag]$ ant -v Apache Ant version 1.6.5 compiled on June 2 2005 Buildfile: build.xml Detected Java version: 1.4 in: /home/1/caribmag/j2sdk1.4.2_10/jre Detec

[Nutch-dev] [jira] Commented: (NUTCH-87) Efficient site-specific crawling for a large number of sites

2006-01-03 Thread Neal Whitley (JIRA)
[ http://issues.apache.org/jira/browse/NUTCH-87?page=comments#action_12361660 ] Neal Whitley commented on NUTCH-87: --- Matt, Is there a how-to or tutorial on how to get this plugin up and running? I am running into problems (probably mine) on the integrat

[Nutch-dev] Nutch-87 Setup

2006-01-03 Thread Neal Whitley
I am looking to create a vertical/regional search application and the Nutch-87 plugin sounds perfect for what I want to do. However, this is all VERY new to me (java, ant, tomcat, nutch etc.), but I was able to hack my way through the installation and have a working copy of Nutc

[Nutch-dev] Re: [bug?] PRC called method require parameter

2006-01-03 Thread Stefan Groschupf
Different parameters are sent to each address. So params.length should equal addresses.length, and if params.length==0 then addresses.length==0 and there's no call to be made. Make sense? It might be clearer if the test were changed to addresses.length==0. Yes, this would be better, sinc

[Nutch-dev] Re: LogFormatter

2006-01-03 Thread Stefan Groschupf
Hi, I also agree and would love to see things changed. In general I would love to be able to write log files in custom storage types as well. For example, it would be great if it were possible to write log files into the ndfs or into a database. Especially for smaller scaled ho

[Nutch-dev] Re: LogFormatter

2006-01-03 Thread Christopher Burkey
Hi Daniel, I agree with you. Using the LogFormatter is a bad design and is causing us problems as well. We use Spring and set the error handler to any objects that want it. Daniel Feinstein wrote: Hi, This is my first email to the community. In our (rawsugar.com) project we integrated m

[Nutch-dev] LogFormatter

2006-01-03 Thread Daniel Feinstein
Hi, This is my first email to the community. In our (rawsugar.com) project we integrated multiple lucene indexes and a nutch package. We have used the packages for about two years. Lucene is working great and we could use it as a "black box". Unfortunately, we had to modify the Nutch package for our

[Nutch-dev] Re: NullPointerException (new as of Dec 31st)

2006-01-03 Thread Andrzej Bialecki
Rod Taylor wrote: During a fetch I have recently started getting these (pretty consistently). Fixed. Thanks! -- Best regards, Andrzej Bialecki