Everything looks okay in terms of the files.
When you copied everything over from Windows, is anything different in the
software setup other than the operating system?
Maybe you have an old Windows-style path somewhere (C:\Nutch\Crawl)? Also
double-check that the "searcher.dir" property inside your nutch-site.xml
file is correct.
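If it helps, here is a rough sketch of how you could check both points at once. It assumes your nutch-site.xml uses the usual <property><name>/<value> layout. The function names and the sample config are made up for illustration, they are not part of Nutch.

```python
import re
import sys
import xml.etree.ElementTree as ET

# Matches a drive-letter prefix such as C:\ or C:/ (a Windows-style path).
WINDOWS_PATH = re.compile(r"^[A-Za-z]:[\\/]")


def read_properties(xml_data):
    """Return {name: value} for every <property> element in the config."""
    root = ET.fromstring(xml_data)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None:
            props[name.strip()] = (value or "").strip()
    return props


def windows_style_values(props):
    """Return only the properties whose value looks like a Windows path."""
    return {n: v for n, v in props.items() if WINDOWS_PATH.match(v)}


# Hypothetical sample config, only to show what a leftover path looks like.
SAMPLE = """<configuration>
  <property>
    <name>searcher.dir</name>
    <value>C:\\Nutch\\Crawl</value>
  </property>
</configuration>"""


if __name__ == "__main__":
    # Pass the path to your own nutch-site.xml as the first argument,
    # or run without arguments to check the sample above.
    if len(sys.argv) > 1:
        with open(sys.argv[1], "rb") as handle:
            data = handle.read()
    else:
        data = SAMPLE
    props = read_properties(data)
    print("searcher.dir =", props.get("searcher.dir", "(not set in this file)"))
    for name, value in windows_style_values(props).items():
        print("Windows-style path in", name + ":", value)
```

Run it against the copy of nutch-site.xml that Tomcat actually loads (your logs show it under WEB-INF/classes), since that is the one the search webapp reads.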
----- Original Message ----
From: kan001 <[EMAIL PROTECTED]>
To: [email protected]
Sent: Monday, March 5, 2007 11:48:56 PM
Subject: Re: [SOLVED] moving crawled db from windows to linux
Thanks for the immediate reply.
Please find the output of the du -h crawl/ command and the logs below:
32K crawl/crawldb/current/part-00000
36K crawl/crawldb/current
40K crawl/crawldb
120K crawl/index
128K crawl/indexes/part-00000
132K crawl/indexes
52K crawl/linkdb/current/part-00000
56K crawl/linkdb/current
60K crawl/linkdb
40K crawl/segments/20070228143239/content/part-00000
44K crawl/segments/20070228143239/content
20K crawl/segments/20070228143239/crawl_fetch/part-00000
24K crawl/segments/20070228143239/crawl_fetch
12K crawl/segments/20070228143239/crawl_generate
12K crawl/segments/20070228143239/crawl_parse
20K crawl/segments/20070228143239/parse_data/part-00000
24K crawl/segments/20070228143239/parse_data
24K crawl/segments/20070228143239/parse_text/part-00000
28K crawl/segments/20070228143239/parse_text
148K crawl/segments/20070228143239
136K crawl/segments/20070228143249/content/part-00000
140K crawl/segments/20070228143249/content
20K crawl/segments/20070228143249/crawl_fetch/part-00000
24K crawl/segments/20070228143249/crawl_fetch
12K crawl/segments/20070228143249/crawl_generate
28K crawl/segments/20070228143249/crawl_parse
32K crawl/segments/20070228143249/parse_data/part-00000
36K crawl/segments/20070228143249/parse_data
44K crawl/segments/20070228143249/parse_text/part-00000
48K crawl/segments/20070228143249/parse_text
292K crawl/segments/20070228143249
20K crawl/segments/20070228143327/content/part-00000
24K crawl/segments/20070228143327/content
20K crawl/segments/20070228143327/crawl_fetch/part-00000
24K crawl/segments/20070228143327/crawl_fetch
16K crawl/segments/20070228143327/crawl_generate
12K crawl/segments/20070228143327/crawl_parse
20K crawl/segments/20070228143327/parse_data/part-00000
24K crawl/segments/20070228143327/parse_data
20K crawl/segments/20070228143327/parse_text/part-00000
24K crawl/segments/20070228143327/parse_text
128K crawl/segments/20070228143327
20K crawl/segments/20070228143434/content/part-00000
24K crawl/segments/20070228143434/content
20K crawl/segments/20070228143434/crawl_fetch/part-00000
24K crawl/segments/20070228143434/crawl_fetch
16K crawl/segments/20070228143434/crawl_generate
12K crawl/segments/20070228143434/crawl_parse
20K crawl/segments/20070228143434/parse_data/part-00000
24K crawl/segments/20070228143434/parse_data
20K crawl/segments/20070228143434/parse_text/part-00000
24K crawl/segments/20070228143434/parse_text
128K crawl/segments/20070228143434
700K crawl/segments
1.1M crawl/
INFO [TP-Processor1] (Configuration.java:397) - parsing
jar:file:/usr/java/tomcat-5.5/webapps/ROOT/WEB-INF/lib/hadoop-0.4.0.jar!/hadoop-default.xml
INFO [TP-Processor1] (Configuration.java:397) - parsing
file:/usr/java/tomcat-5.5/webapps/ROOT/WEB-INF/classes/nutch-default.xml
INFO [TP-Processor1] (Configuration.java:397) - parsing
file:/usr/java/tomcat-5.5/webapps/ROOT/WEB-INF/classes/nutch-site.xml
INFO [TP-Processor1] (Configuration.java:397) - parsing
file:/usr/java/tomcat-5.5/webapps/ROOT/WEB-INF/classes/hadoop-site.xml
INFO [TP-Processor1] (PluginManifestParser.java:81) - Plugins: looking in:
/usr/java/tomcat-5.5/webapps/ROOT/WEB-INF/classes/plugins
INFO [TP-Processor1] (PluginRepository.java:333) - Plugin Auto-activation
mode: [true]
INFO [TP-Processor1] (PluginRepository.java:334) - Registered Plugins:
INFO [TP-Processor1] (PluginRepository.java:341) - CyberNeko HTML
Parser (lib-nekohtml)
INFO [TP-Processor1] (PluginRepository.java:341) - Site Query Filter
(query-site)
INFO [TP-Processor1] (PluginRepository.java:341) - Html Parse Plug-in
(parse-html)
INFO [TP-Processor1] (PluginRepository.java:341) - Regex URL Filter
Framework (lib-regex-filter)
INFO [TP-Processor1] (PluginRepository.java:341) - Basic Indexing
Filter (index-basic)
INFO [TP-Processor1] (PluginRepository.java:341) - Basic Summarizer
Plug-in (summary-basic)
INFO [TP-Processor1] (PluginRepository.java:341) - Text Parse Plug-in
(parse-text)
INFO [TP-Processor1] (PluginRepository.java:341) - JavaScript Parser
(parse-js)
INFO [TP-Processor1] (PluginRepository.java:341) - Regex URL Filter
(urlfilter-regex)
INFO [TP-Processor1] (PluginRepository.java:341) - Basic Query Filter
(query-basic)
INFO [TP-Processor1] (PluginRepository.java:341) - HTTP Framework
(lib-http)
INFO [TP-Processor1] (PluginRepository.java:341) - URL Query Filter
(query-url)
INFO [TP-Processor1] (PluginRepository.java:341) - Http Protocol
Plug-in (protocol-http)
INFO [TP-Processor1] (PluginRepository.java:341) - the nutch core
extension points (nutch-extensionpoints)
INFO [TP-Processor1] (PluginRepository.java:341) - OPIC Scoring Plug-in
(scoring-opic)
INFO [TP-Processor1] (PluginRepository.java:345) - Registered
Extension-Points:
INFO [TP-Processor1] (PluginRepository.java:352) - Nutch Summarizer
(org.apache.nutch.searcher.Summarizer)
INFO [TP-Processor1] (PluginRepository.java:352) - Nutch Scoring
(org.apache.nutch.scoring.ScoringFilter)
INFO [TP-Processor1] (PluginRepository.java:352) - Nutch Protocol
(org.apache.nutch.protocol.Protocol)
INFO [TP-Processor1] (PluginRepository.java:352) - Nutch URL Filter
(org.apache.nutch.net.URLFilter)
INFO [TP-Processor1] (PluginRepository.java:352) - HTML Parse Filter
(org.apache.nutch.parse.HtmlParseFilter)
INFO [TP-Processor1] (PluginRepository.java:352) - Nutch Online Search
Results Clustering Plugin (org.apache.nutch.clustering.OnlineClusterer)
INFO [TP-Processor1] (PluginRepository.java:352) - Nutch Indexing
Filter (org.apache.nutch.indexer.IndexingFilter)
INFO [TP-Processor1] (PluginRepository.java:352) - Nutch Content Parser
(org.apache.nutch.parse.Parser)
INFO [TP-Processor1] (PluginRepository.java:352) - Ontology Model
Loader (org.apache.nutch.ontology.Ontology)
INFO [TP-Processor1] (PluginRepository.java:352) - Nutch Analysis
(org.apache.nutch.analysis.NutchAnalyzer)
INFO [TP-Processor1] (PluginRepository.java:352) - Nutch Query Filter
(org.apache.nutch.searcher.QueryFilter)
INFO [TP-Processor1] (NutchBean.java:69) - creating new bean
INFO [TP-Processor1] (NutchBean.java:121) - opening indexes in
/home/nutch-0.8/crawl/indexes
INFO [TP-Processor1] (Configuration.java:360) - found resource
common-terms.utf8 at
file:/usr/java/tomcat-5.5/webapps/ROOT/WEB-INF/classes/common-terms.utf8
INFO [TP-Processor1] (NutchBean.java:143) - opening segments in
/home/nutch-0.8/crawl/segments
INFO [TP-Processor1] (SummarizerFactory.java:52) - Using the first
summarizer extension found: Basic Summarizer
INFO [TP-Processor1] (NutchBean.java:154) - opening linkdb in
/home/nutch-0.8/crawl/linkdb
INFO [TP-Processor1] (search_jsp.java:108) - query request from
192.168.1.64
INFO [TP-Processor1] (search_jsp.java:151) - query:
INFO [TP-Processor1] (search_jsp.java:152) - lang:
INFO [TP-Processor1] (NutchBean.java:247) - searching for 20 raw hits
INFO [TP-Processor1] (search_jsp.java:337) - total hits: 0
INFO [TP-Processor5] (search_jsp.java:108) - query request from
192.168.1.64
INFO [TP-Processor5] (search_jsp.java:151) - query: ads
INFO [TP-Processor5] (search_jsp.java:152) - lang: en
INFO [TP-Processor5] (NutchBean.java:247) - searching for 20 raw hits
INFO [TP-Processor5] (search_jsp.java:337) - total hits: 0
kan001 wrote:
>
> When I copied the crawled db from Windows to Linux and tried to search
> through Tomcat on Linux, it returned 0 hits.
> But on Windows it gets results from the search screen. Any idea? I have
> given root permissions to the crawled db.
> The logs show "opening segments...", but there are 0 hits!
>
--
View this message in context:
http://www.nabble.com/moving-crawled-db-from-windows-to-linux-tf3350448.html#a9326034
Sent from the Nutch - User mailing list archive at Nabble.com.