Which version of Nutch are you using?
Which 2.x release exactly?
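
In the meantime, a rough way to see where the URL gets lost is to compare the "Map input records" counter of each job in the quoted output. There the inject job reports Map input records=1, while the generate job and every job after it report Map input records=0, which suggests the generator never read the injected row back from the store. The sketch below is a minimal, hypothetical helper (the function name and the sample text are illustrative and not part of Nutch or Hadoop) that pulls this counter out of console output pasted from a run:

```python
import re

# Patterns for the two JobClient log lines we care about.
RUNNING = re.compile(r"Running job: (job_\d+_\d+)")
MAP_INPUT = re.compile(r"Map input records=(\d+)")


def map_input_records_by_job(log_text):
    """Return (job_id, map_input_records) pairs in the order the jobs ran.

    Hypothetical helper: it assumes each job logs one "Running job:" line
    followed later by one "Map input records=" counter line.
    """
    results = []
    current = None
    for line in log_text.splitlines():
        match = RUNNING.search(line)
        if match:
            current = match.group(1)
            continue
        match = MAP_INPUT.search(line)
        if match and current is not None:
            results.append((current, int(match.group(1))))
            current = None
    return results


# Four lines copied from the quoted output (inject job, then generate job).
sample = """\
> 14/06/05 15:00:44 INFO mapred.JobClient: Running job: job_201406051410_0011
> 14/06/05 15:01:02 INFO mapred.JobClient:     Map input records=1
> 14/06/05 15:01:15 INFO mapred.JobClient: Running job: job_201406051410_0012
> 14/06/05 15:02:14 INFO mapred.JobClient:     Map input records=0
"""

for job_id, records in map_input_records_by_job(sample):
    print(job_id, records)
```

The first job with a zero count is the place to start looking; this only narrows down the stage and does not by itself explain the cause.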

On Thu, Jun 5, 2014 at 12:14 PM, Manikandan Saravanan wrote:

> Dear Lewis,
>
> I’m running Nutch 2 on a Hadoop 1.2.1 cluster (2 nodes). I’m using
> Cassandra as my backend datastore. For now I’m trying to crawl a single link.
> The inject command works properly: I’m able to find one row added to the
> “webpage” keyspace in Cassandra. But the generator doesn’t do a thing, and
> neither does the fetcher. In the end, nothing is indexed in Solr.
>
> Please help me out. The console output of the crawl is:
>
> hduser@nutch-one-qontifi:/usr/local/nutch$ bin/crawl urls/seed.txt
> TestCrawl http://10.130.231.16:8983/solr/nutch 2
> Warning: $HADOOP_HOME is deprecated.
>
> 14/06/05 15:00:34 INFO crawl.InjectorJob: InjectorJob: starting at
> 2014-06-05 15:00:34
> 14/06/05 15:00:34 INFO crawl.InjectorJob: InjectorJob: Injecting urlDir:
> urls/seed.txt
> 14/06/05 15:00:36 INFO connection.CassandraHostRetryService: Downed Host
> Retry service started with queue size -1 and retry delay 10s
> 14/06/05 15:00:40 INFO service.JmxMonitor: Registering JMX
> me.prettyprint.cassandra.service_Qontifi:ServiceType=hector,MonitorType=hector
> 14/06/05 15:00:41 INFO crawl.InjectorJob: InjectorJob: Using class
> org.apache.gora.cassandra.store.CassandraStore as the Gora storage class.
> 14/06/05 15:00:44 INFO input.FileInputFormat: Total input paths to process
> : 1
> 14/06/05 15:00:44 INFO util.NativeCodeLoader: Loaded the native-hadoop
> library
> 14/06/05 15:00:44 WARN snappy.LoadSnappy: Snappy native library not loaded
> 14/06/05 15:00:44 INFO mapred.JobClient: Running job: job_201406051410_0011
> 14/06/05 15:00:45 INFO mapred.JobClient:  map 0% reduce 0%
> 14/06/05 15:01:00 INFO mapred.JobClient:  map 100% reduce 0%
> 14/06/05 15:01:02 INFO mapred.JobClient: Job complete:
> job_201406051410_0011
> 14/06/05 15:01:02 INFO mapred.JobClient: Counters: 19
> 14/06/05 15:01:02 INFO mapred.JobClient:   Job Counters
> 14/06/05 15:01:02 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=14861
> 14/06/05 15:01:02 INFO mapred.JobClient:     Total time spent by all
> reduces waiting after reserving slots (ms)=0
> 14/06/05 15:01:02 INFO mapred.JobClient:     Total time spent by all maps
> waiting after reserving slots (ms)=0
> 14/06/05 15:01:02 INFO mapred.JobClient:     Launched map tasks=1
> 14/06/05 15:01:02 INFO mapred.JobClient:     Data-local map tasks=1
> 14/06/05 15:01:02 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=0
> 14/06/05 15:01:02 INFO mapred.JobClient:   File Output Format Counters
> 14/06/05 15:01:02 INFO mapred.JobClient:     Bytes Written=0
> 14/06/05 15:01:02 INFO mapred.JobClient:   injector
> 14/06/05 15:01:02 INFO mapred.JobClient:     urls_injected=1
> 14/06/05 15:01:02 INFO mapred.JobClient:   FileSystemCounters
> 14/06/05 15:01:02 INFO mapred.JobClient:     HDFS_BYTES_READ=135
> 14/06/05 15:01:02 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=77648
> 14/06/05 15:01:02 INFO mapred.JobClient:   File Input Format Counters
> 14/06/05 15:01:02 INFO mapred.JobClient:     Bytes Read=25
> 14/06/05 15:01:02 INFO mapred.JobClient:   Map-Reduce Framework
> 14/06/05 15:01:02 INFO mapred.JobClient:     Map input records=1
> 14/06/05 15:01:02 INFO mapred.JobClient:     Physical memory (bytes)
> snapshot=122052608
> 14/06/05 15:01:02 INFO mapred.JobClient:     Spilled Records=0
> 14/06/05 15:01:02 INFO mapred.JobClient:     CPU time spent (ms)=1490
> 14/06/05 15:01:02 INFO mapred.JobClient:     Total committed heap usage
> (bytes)=58195968
> 14/06/05 15:01:02 INFO mapred.JobClient:     Virtual memory (bytes)
> snapshot=1119281152
> 14/06/05 15:01:02 INFO mapred.JobClient:     Map output records=1
> 14/06/05 15:01:02 INFO mapred.JobClient:     SPLIT_RAW_BYTES=110
> 14/06/05 15:01:02 INFO crawl.InjectorJob: InjectorJob: total number of
> urls rejected by filters: 0
> 14/06/05 15:01:02 INFO crawl.InjectorJob: InjectorJob: total number of
> urls injected after normalization and filtering: 1
> 14/06/05 15:01:02 INFO crawl.InjectorJob: Injector: finished at 2014-06-05
> 15:01:02, elapsed: 00:00:28
> Thu Jun 5 15:01:02 EDT 2014 : Iteration 1 of 2
> Generating batchId
> Generating a new fetchlist
> Warning: $HADOOP_HOME is deprecated.
>
> 14/06/05 15:01:06 INFO crawl.GeneratorJob: GeneratorJob: starting at
> 2014-06-05 15:01:06
> 14/06/05 15:01:06 INFO crawl.GeneratorJob: GeneratorJob: Selecting
> best-scoring urls due for fetch.
> 14/06/05 15:01:06 INFO crawl.GeneratorJob: GeneratorJob: starting
> 14/06/05 15:01:06 INFO crawl.GeneratorJob: GeneratorJob: filtering: false
> 14/06/05 15:01:06 INFO crawl.GeneratorJob: GeneratorJob: normalizing: false
> 14/06/05 15:01:06 INFO crawl.GeneratorJob: GeneratorJob: topN: 50000
> 14/06/05 15:01:06 INFO crawl.FetchScheduleFactory: Using FetchSchedule
> impl: org.apache.nutch.crawl.DefaultFetchSchedule
> 14/06/05 15:01:06 INFO crawl.AbstractFetchSchedule: defaultInterval=2592000
> 14/06/05 15:01:06 INFO crawl.AbstractFetchSchedule: maxInterval=7776000
> 14/06/05 15:01:07 INFO connection.CassandraHostRetryService: Downed Host
> Retry service started with queue size -1 and retry delay 10s
> 14/06/05 15:01:11 INFO service.JmxMonitor: Registering JMX
> me.prettyprint.cassandra.service_Qontifi:ServiceType=hector,MonitorType=hector
> 14/06/05 15:01:15 INFO mapred.JobClient: Running job: job_201406051410_0012
> 14/06/05 15:01:16 INFO mapred.JobClient:  map 0% reduce 0%
> 14/06/05 15:01:55 INFO mapred.JobClient:  map 100% reduce 0%
> 14/06/05 15:02:05 INFO mapred.JobClient:  map 100% reduce 33%
> 14/06/05 15:02:08 INFO mapred.JobClient:  map 100% reduce 66%
> 14/06/05 15:02:10 INFO mapred.JobClient:  map 100% reduce 83%
> 14/06/05 15:02:11 INFO mapred.JobClient:  map 100% reduce 100%
> 14/06/05 15:02:14 INFO mapred.JobClient: Job complete:
> job_201406051410_0012
> 14/06/05 15:02:14 INFO mapred.JobClient: Counters: 27
> 14/06/05 15:02:14 INFO mapred.JobClient:   Job Counters
> 14/06/05 15:02:14 INFO mapred.JobClient:     Launched reduce tasks=2
> 14/06/05 15:02:14 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=39990
> 14/06/05 15:02:14 INFO mapred.JobClient:     Total time spent by all
> reduces waiting after reserving slots (ms)=0
> 14/06/05 15:02:14 INFO mapred.JobClient:     Total time spent by all maps
> waiting after reserving slots (ms)=0
> 14/06/05 15:02:14 INFO mapred.JobClient:     Launched map tasks=1
> 14/06/05 15:02:14 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=29119
> 14/06/05 15:02:14 INFO mapred.JobClient:   File Output Format Counters
> 14/06/05 15:02:14 INFO mapred.JobClient:     Bytes Written=0
> 14/06/05 15:02:14 INFO mapred.JobClient:   FileSystemCounters
> 14/06/05 15:02:14 INFO mapred.JobClient:     FILE_BYTES_READ=44
> 14/06/05 15:02:14 INFO mapred.JobClient:     HDFS_BYTES_READ=951
> 14/06/05 15:02:14 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=239453
> 14/06/05 15:02:14 INFO mapred.JobClient:   File Input Format Counters
> 14/06/05 15:02:14 INFO mapred.JobClient:     Bytes Read=0
> 14/06/05 15:02:14 INFO mapred.JobClient:   Map-Reduce Framework
> 14/06/05 15:02:14 INFO mapred.JobClient:     Map output materialized
> bytes=28
> 14/06/05 15:02:14 INFO mapred.JobClient:     Map input records=0
> 14/06/05 15:02:14 INFO mapred.JobClient:     Reduce shuffle bytes=28
> 14/06/05 15:02:14 INFO mapred.JobClient:     Spilled Records=0
> 14/06/05 15:02:14 INFO mapred.JobClient:     Map output bytes=0
> 14/06/05 15:02:14 INFO mapred.JobClient:     Total committed heap usage
> (bytes)=333971456
> 14/06/05 15:02:14 INFO mapred.JobClient:     CPU time spent (ms)=9330
> 14/06/05 15:02:14 INFO mapred.JobClient:     Combine input records=0
> 14/06/05 15:02:14 INFO mapred.JobClient:     SPLIT_RAW_BYTES=951
> 14/06/05 15:02:14 INFO mapred.JobClient:     Reduce input records=0
> 14/06/05 15:02:14 INFO mapred.JobClient:     Reduce input groups=0
> 14/06/05 15:02:14 INFO mapred.JobClient:     Combine output records=0
> 14/06/05 15:02:14 INFO mapred.JobClient:     Physical memory (bytes)
> snapshot=486813696
> 14/06/05 15:02:14 INFO mapred.JobClient:     Reduce output records=0
> 14/06/05 15:02:14 INFO mapred.JobClient:     Virtual memory (bytes)
> snapshot=6016212992
> 14/06/05 15:02:14 INFO mapred.JobClient:     Map output records=0
> 14/06/05 15:02:14 INFO crawl.GeneratorJob: GeneratorJob: finished at
> 2014-06-05 15:02:14, time elapsed: 00:01:08
> 14/06/05 15:02:14 INFO crawl.GeneratorJob: GeneratorJob: generated batch
> id: 1401994862-29963
> Fetching :
> Warning: $HADOOP_HOME is deprecated.
>
> 14/06/05 15:02:18 INFO fetcher.FetcherJob: FetcherJob: starting
> 14/06/05 15:02:18 INFO fetcher.FetcherJob: FetcherJob: batchId:
> 1401994862-29963
> 14/06/05 15:02:18 INFO fetcher.FetcherJob: FetcherJob: threads: 50
> 14/06/05 15:02:18 INFO fetcher.FetcherJob: FetcherJob: parsing: false
> 14/06/05 15:02:18 INFO fetcher.FetcherJob: FetcherJob: resuming: false
> 14/06/05 15:02:18 INFO fetcher.FetcherJob: FetcherJob : timelimit set for
> : 1402005738902
> 14/06/05 15:02:19 INFO plugin.PluginRepository: Plugins: looking in:
> /app/hadoop/tmp/hadoop-unjar813633856909664022/classes/plugins
> 14/06/05 15:02:20 INFO plugin.PluginRepository: Plugin Auto-activation
> mode: [true]
> 14/06/05 15:02:20 INFO plugin.PluginRepository: Registered Plugins:
> 14/06/05 15:02:20 INFO plugin.PluginRepository: the nutch core extension
> points (nutch-extensionpoints)
> 14/06/05 15:02:20 INFO plugin.PluginRepository: Regex URL Normalizer
> (urlnormalizer-regex)
> 14/06/05 15:02:20 INFO plugin.PluginRepository: CyberNeko HTML Parser
> (lib-nekohtml)
> 14/06/05 15:02:20 INFO plugin.PluginRepository: OPIC Scoring Plug-in
> (scoring-opic)
> 14/06/05 15:02:20 INFO plugin.PluginRepository: Basic URL Normalizer
> (urlnormalizer-basic)
> 14/06/05 15:02:20 INFO plugin.PluginRepository: Tika Parser Plug-in
> (parse-tika)
> 14/06/05 15:02:20 INFO plugin.PluginRepository: Basic Indexing Filter
> (index-basic)
> 14/06/05 15:02:20 INFO plugin.PluginRepository: Html Parse Plug-in
> (parse-html)
> 14/06/05 15:02:20 INFO plugin.PluginRepository: Anchor Indexing Filter
> (index-anchor)
> 14/06/05 15:02:20 INFO plugin.PluginRepository: HTTP Framework (lib-http)
> 14/06/05 15:02:20 INFO plugin.PluginRepository: Regex URL Filter
> (urlfilter-regex)
> 14/06/05 15:02:20 INFO plugin.PluginRepository: Regex URL Filter
> Framework (lib-regex-filter)
> 14/06/05 15:02:20 INFO plugin.PluginRepository: Pass-through URL
> Normalizer (urlnormalizer-pass)
> 14/06/05 15:02:20 INFO plugin.PluginRepository: Http Protocol Plug-in
> (protocol-http)
> 14/06/05 15:02:20 INFO plugin.PluginRepository: Registered
> Extension-Points:
> 14/06/05 15:02:20 INFO plugin.PluginRepository: Nutch URL Normalizer
> (org.apache.nutch.net.URLNormalizer)
> 14/06/05 15:02:20 INFO plugin.PluginRepository: Nutch Protocol
> (org.apache.nutch.protocol.Protocol)
> 14/06/05 15:02:20 INFO plugin.PluginRepository: Parse Filter
> (org.apache.nutch.parse.ParseFilter)
> 14/06/05 15:02:20 INFO plugin.PluginRepository: Nutch URL Filter
> (org.apache.nutch.net.URLFilter)
> 14/06/05 15:02:20 INFO plugin.PluginRepository: Nutch Indexing Filter
> (org.apache.nutch.indexer.IndexingFilter)
> 14/06/05 15:02:20 INFO plugin.PluginRepository: Nutch Content Parser
> (org.apache.nutch.parse.Parser)
> 14/06/05 15:02:20 INFO plugin.PluginRepository: Nutch Scoring
> (org.apache.nutch.scoring.ScoringFilter)
> 14/06/05 15:02:20 INFO http.Http: http.proxy.host = null
> 14/06/05 15:02:20 INFO http.Http: http.proxy.port = 8080
> 14/06/05 15:02:20 INFO http.Http: http.timeout = 10000
> 14/06/05 15:02:20 INFO http.Http: http.content.limit = 65536
> 14/06/05 15:02:20 INFO http.Http: http.agent = Qontifi/Nutch-2.2.1 (A big
> data analytics and social media intelligence platform; http://qontifi.com;
> manikandan at thesocialpeople dot net)
> 14/06/05 15:02:20 INFO http.Http: http.accept.language =
> en-us,en-gb,en;q=0.7,*;q=0.3
> 14/06/05 15:02:20 INFO http.Http: http.accept =
> text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
> 14/06/05 15:02:20 INFO connection.CassandraHostRetryService: Downed Host
> Retry service started with queue size -1 and retry delay 10s
> 14/06/05 15:02:25 INFO service.JmxMonitor: Registering JMX
> me.prettyprint.cassandra.service_Qontifi:ServiceType=hector,MonitorType=hector
> 14/06/05 15:02:29 INFO mapred.JobClient: Running job: job_201406051410_0013
> 14/06/05 15:02:30 INFO mapred.JobClient:  map 0% reduce 0%
> 14/06/05 15:03:05 INFO mapred.JobClient:  map 100% reduce 0%
> 14/06/05 15:03:14 INFO mapred.JobClient:  map 100% reduce 16%
> 14/06/05 15:03:16 INFO mapred.JobClient:  map 100% reduce 33%
> 14/06/05 15:03:17 INFO mapred.JobClient:  map 100% reduce 50%
> 14/06/05 15:03:19 INFO mapred.JobClient:  map 100% reduce 66%
> 14/06/05 15:03:23 INFO mapred.JobClient:  map 100% reduce 83%
> 14/06/05 15:03:28 INFO mapred.JobClient:  map 100% reduce 100%
> 14/06/05 15:03:31 INFO mapred.JobClient: Job complete:
> job_201406051410_0013
> 14/06/05 15:03:31 INFO mapred.JobClient: Counters: 28
> 14/06/05 15:03:31 INFO mapred.JobClient:   Job Counters
> 14/06/05 15:03:31 INFO mapred.JobClient:     Launched reduce tasks=2
> 14/06/05 15:03:31 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=37163
> 14/06/05 15:03:31 INFO mapred.JobClient:     Total time spent by all
> reduces waiting after reserving slots (ms)=0
> 14/06/05 15:03:31 INFO mapred.JobClient:     Total time spent by all maps
> waiting after reserving slots (ms)=0
> 14/06/05 15:03:31 INFO mapred.JobClient:     Launched map tasks=1
> 14/06/05 15:03:31 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=39755
> 14/06/05 15:03:31 INFO mapred.JobClient:   File Output Format Counters
> 14/06/05 15:03:31 INFO mapred.JobClient:     Bytes Written=0
> 14/06/05 15:03:31 INFO mapred.JobClient:   FileSystemCounters
> 14/06/05 15:03:31 INFO mapred.JobClient:     FILE_BYTES_READ=44
> 14/06/05 15:03:31 INFO mapred.JobClient:     HDFS_BYTES_READ=935
> 14/06/05 15:03:31 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=237923
> 14/06/05 15:03:31 INFO mapred.JobClient:   File Input Format Counters
> 14/06/05 15:03:31 INFO mapred.JobClient:     Bytes Read=0
> 14/06/05 15:03:31 INFO mapred.JobClient:   FetcherStatus
> 14/06/05 15:03:31 INFO mapred.JobClient:     HitByTimeLimit-QueueFeeder=0
> 14/06/05 15:03:31 INFO mapred.JobClient:   Map-Reduce Framework
> 14/06/05 15:03:31 INFO mapred.JobClient:     Map output materialized
> bytes=28
> 14/06/05 15:03:31 INFO mapred.JobClient:     Map input records=0
> 14/06/05 15:03:31 INFO mapred.JobClient:     Reduce shuffle bytes=28
> 14/06/05 15:03:31 INFO mapred.JobClient:     Spilled Records=0
> 14/06/05 15:03:31 INFO mapred.JobClient:     Map output bytes=0
> 14/06/05 15:03:31 INFO mapred.JobClient:     Total committed heap usage
> (bytes)=375914496
> 14/06/05 15:03:31 INFO mapred.JobClient:     CPU time spent (ms)=9820
> 14/06/05 15:03:31 INFO mapred.JobClient:     Combine input records=0
> 14/06/05 15:03:31 INFO mapred.JobClient:     SPLIT_RAW_BYTES=935
> 14/06/05 15:03:31 INFO mapred.JobClient:     Reduce input records=0
> 14/06/05 15:03:31 INFO mapred.JobClient:     Reduce input groups=0
> 14/06/05 15:03:31 INFO mapred.JobClient:     Combine output records=0
> 14/06/05 15:03:31 INFO mapred.JobClient:     Physical memory (bytes)
> snapshot=510382080
> 14/06/05 15:03:31 INFO mapred.JobClient:     Reduce output records=0
> 14/06/05 15:03:31 INFO mapred.JobClient:     Virtual memory (bytes)
> snapshot=6060650496
> 14/06/05 15:03:31 INFO mapred.JobClient:     Map output records=0
> 14/06/05 15:03:31 INFO fetcher.FetcherJob: FetcherJob: done
> Parsing :
> Warning: $HADOOP_HOME is deprecated.
>
> 14/06/05 15:03:34 INFO parse.ParserJob: ParserJob: starting
> 14/06/05 15:03:34 INFO parse.ParserJob: ParserJob: resuming: false
> 14/06/05 15:03:34 INFO parse.ParserJob: ParserJob: forced reparse: false
> 14/06/05 15:03:34 INFO parse.ParserJob: ParserJob: batchId:
> 1401994862-29963
> 14/06/05 15:03:35 INFO plugin.PluginRepository: Plugins: looking in:
> /app/hadoop/tmp/hadoop-unjar8143815380567453850/classes/plugins
> 14/06/05 15:03:36 INFO plugin.PluginRepository: Plugin Auto-activation
> mode: [true]
> 14/06/05 15:03:36 INFO plugin.PluginRepository: Registered Plugins:
> 14/06/05 15:03:36 INFO plugin.PluginRepository: the nutch core extension
> points (nutch-extensionpoints)
> 14/06/05 15:03:36 INFO plugin.PluginRepository: Regex URL Normalizer
> (urlnormalizer-regex)
> 14/06/05 15:03:36 INFO plugin.PluginRepository: CyberNeko HTML Parser
> (lib-nekohtml)
> 14/06/05 15:03:36 INFO plugin.PluginRepository: OPIC Scoring Plug-in
> (scoring-opic)
> 14/06/05 15:03:36 INFO plugin.PluginRepository: Basic URL Normalizer
> (urlnormalizer-basic)
> 14/06/05 15:03:36 INFO plugin.PluginRepository: Tika Parser Plug-in
> (parse-tika)
> 14/06/05 15:03:36 INFO plugin.PluginRepository: Basic Indexing Filter
> (index-basic)
> 14/06/05 15:03:36 INFO plugin.PluginRepository: Html Parse Plug-in
> (parse-html)
> 14/06/05 15:03:36 INFO plugin.PluginRepository: Anchor Indexing Filter
> (index-anchor)
> 14/06/05 15:03:36 INFO plugin.PluginRepository: HTTP Framework (lib-http)
> 14/06/05 15:03:36 INFO plugin.PluginRepository: Regex URL Filter
> (urlfilter-regex)
> 14/06/05 15:03:36 INFO plugin.PluginRepository: Regex URL Filter
> Framework (lib-regex-filter)
> 14/06/05 15:03:36 INFO plugin.PluginRepository: Pass-through URL
> Normalizer (urlnormalizer-pass)
> 14/06/05 15:03:36 INFO plugin.PluginRepository: Http Protocol Plug-in
> (protocol-http)
> 14/06/05 15:03:36 INFO plugin.PluginRepository: Registered
> Extension-Points:
> 14/06/05 15:03:36 INFO plugin.PluginRepository: Nutch URL Normalizer
> (org.apache.nutch.net.URLNormalizer)
> 14/06/05 15:03:36 INFO plugin.PluginRepository: Nutch Protocol
> (org.apache.nutch.protocol.Protocol)
> 14/06/05 15:03:36 INFO plugin.PluginRepository: Parse Filter
> (org.apache.nutch.parse.ParseFilter)
> 14/06/05 15:03:36 INFO plugin.PluginRepository: Nutch URL Filter
> (org.apache.nutch.net.URLFilter)
> 14/06/05 15:03:36 INFO plugin.PluginRepository: Nutch Indexing Filter
> (org.apache.nutch.indexer.IndexingFilter)
> 14/06/05 15:03:36 INFO plugin.PluginRepository: Nutch Content Parser
> (org.apache.nutch.parse.Parser)
> 14/06/05 15:03:36 INFO plugin.PluginRepository: Nutch Scoring
> (org.apache.nutch.scoring.ScoringFilter)
> 14/06/05 15:03:36 INFO conf.Configuration: found resource
> parse-plugins.xml at
> file:/app/hadoop/tmp/hadoop-unjar8143815380567453850/parse-plugins.xml
> 14/06/05 15:03:36 INFO crawl.SignatureFactory: Using Signature impl:
> org.apache.nutch.crawl.MD5Signature
> 14/06/05 15:03:37 INFO connection.CassandraHostRetryService: Downed Host
> Retry service started with queue size -1 and retry delay 10s
> 14/06/05 15:03:41 INFO service.JmxMonitor: Registering JMX
> me.prettyprint.cassandra.service_Qontifi:ServiceType=hector,MonitorType=hector
> 14/06/05 15:03:45 INFO mapred.JobClient: Running job: job_201406051410_0014
> 14/06/05 15:03:46 INFO mapred.JobClient:  map 0% reduce 0%
> 14/06/05 15:04:22 INFO mapred.JobClient:  map 100% reduce 0%
> 14/06/05 15:04:24 INFO mapred.JobClient: Job complete:
> job_201406051410_0014
> 14/06/05 15:04:25 INFO mapred.JobClient: Counters: 17
> 14/06/05 15:04:25 INFO mapred.JobClient:   Job Counters
> 14/06/05 15:04:25 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=36653
> 14/06/05 15:04:25 INFO mapred.JobClient:     Total time spent by all
> reduces waiting after reserving slots (ms)=0
> 14/06/05 15:04:25 INFO mapred.JobClient:     Total time spent by all maps
> waiting after reserving slots (ms)=0
> 14/06/05 15:04:25 INFO mapred.JobClient:     Launched map tasks=1
> 14/06/05 15:04:25 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=0
> 14/06/05 15:04:25 INFO mapred.JobClient:   File Output Format Counters
> 14/06/05 15:04:25 INFO mapred.JobClient:     Bytes Written=0
> 14/06/05 15:04:25 INFO mapred.JobClient:   FileSystemCounters
> 14/06/05 15:04:25 INFO mapred.JobClient:     HDFS_BYTES_READ=979
> 14/06/05 15:04:25 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=78853
> 14/06/05 15:04:25 INFO mapred.JobClient:   File Input Format Counters
> 14/06/05 15:04:25 INFO mapred.JobClient:     Bytes Read=0
> 14/06/05 15:04:25 INFO mapred.JobClient:   Map-Reduce Framework
> 14/06/05 15:04:25 INFO mapred.JobClient:     Map input records=0
> 14/06/05 15:04:25 INFO mapred.JobClient:     Physical memory (bytes)
> snapshot=129826816
> 14/06/05 15:04:25 INFO mapred.JobClient:     Spilled Records=0
> 14/06/05 15:04:25 INFO mapred.JobClient:     CPU time spent (ms)=2330
> 14/06/05 15:04:25 INFO mapred.JobClient:     Total committed heap usage
> (bytes)=60817408
> 14/06/05 15:04:25 INFO mapred.JobClient:     Virtual memory (bytes)
> snapshot=2000629760
> 14/06/05 15:04:25 INFO mapred.JobClient:     Map output records=0
> 14/06/05 15:04:25 INFO mapred.JobClient:     SPLIT_RAW_BYTES=979
> 14/06/05 15:04:25 INFO parse.ParserJob: ParserJob: success
> CrawlDB update for TestCrawl
> Warning: $HADOOP_HOME is deprecated.
>
> 14/06/05 15:04:28 INFO crawl.DbUpdaterJob: DbUpdaterJob: starting
> 14/06/05 15:04:29 INFO plugin.PluginRepository: Plugins: looking in:
> /app/hadoop/tmp/hadoop-unjar4238316120015868426/classes/plugins
> 14/06/05 15:04:29 INFO plugin.PluginRepository: Plugin Auto-activation
> mode: [true]
> 14/06/05 15:04:29 INFO plugin.PluginRepository: Registered Plugins:
> 14/06/05 15:04:29 INFO plugin.PluginRepository: the nutch core extension
> points (nutch-extensionpoints)
> 14/06/05 15:04:29 INFO plugin.PluginRepository: Regex URL Normalizer
> (urlnormalizer-regex)
> 14/06/05 15:04:29 INFO plugin.PluginRepository: CyberNeko HTML Parser
> (lib-nekohtml)
> 14/06/05 15:04:29 INFO plugin.PluginRepository: OPIC Scoring Plug-in
> (scoring-opic)
> 14/06/05 15:04:29 INFO plugin.PluginRepository: Basic URL Normalizer
> (urlnormalizer-basic)
> 14/06/05 15:04:29 INFO plugin.PluginRepository: Tika Parser Plug-in
> (parse-tika)
> 14/06/05 15:04:29 INFO plugin.PluginRepository: Basic Indexing Filter
> (index-basic)
> 14/06/05 15:04:29 INFO plugin.PluginRepository: Html Parse Plug-in
> (parse-html)
> 14/06/05 15:04:29 INFO plugin.PluginRepository: Anchor Indexing Filter
> (index-anchor)
> 14/06/05 15:04:29 INFO plugin.PluginRepository: HTTP Framework (lib-http)
> 14/06/05 15:04:29 INFO plugin.PluginRepository: Regex URL Filter
> (urlfilter-regex)
> 14/06/05 15:04:29 INFO plugin.PluginRepository: Regex URL Filter
> Framework (lib-regex-filter)
> 14/06/05 15:04:29 INFO plugin.PluginRepository: Pass-through URL
> Normalizer (urlnormalizer-pass)
> 14/06/05 15:04:29 INFO plugin.PluginRepository: Http Protocol Plug-in
> (protocol-http)
> 14/06/05 15:04:29 INFO plugin.PluginRepository: Registered
> Extension-Points:
> 14/06/05 15:04:29 INFO plugin.PluginRepository: Nutch URL Normalizer
> (org.apache.nutch.net.URLNormalizer)
> 14/06/05 15:04:29 INFO plugin.PluginRepository: Nutch Protocol
> (org.apache.nutch.protocol.Protocol)
> 14/06/05 15:04:29 INFO plugin.PluginRepository: Parse Filter
> (org.apache.nutch.parse.ParseFilter)
> 14/06/05 15:04:29 INFO plugin.PluginRepository: Nutch URL Filter
> (org.apache.nutch.net.URLFilter)
> 14/06/05 15:04:29 INFO plugin.PluginRepository: Nutch Indexing Filter
> (org.apache.nutch.indexer.IndexingFilter)
> 14/06/05 15:04:29 INFO plugin.PluginRepository: Nutch Content Parser
> (org.apache.nutch.parse.Parser)
> 14/06/05 15:04:29 INFO plugin.PluginRepository: Nutch Scoring
> (org.apache.nutch.scoring.ScoringFilter)
> 14/06/05 15:04:30 INFO connection.CassandraHostRetryService: Downed Host
> Retry service started with queue size -1 and retry delay 10s
> 14/06/05 15:04:34 INFO service.JmxMonitor: Registering JMX
> me.prettyprint.cassandra.service_Qontifi:ServiceType=hector,MonitorType=hector
> 14/06/05 15:04:38 INFO mapred.JobClient: Running job: job_201406051410_0015
> 14/06/05 15:04:39 INFO mapred.JobClient:  map 0% reduce 0%
> 14/06/05 15:05:21 INFO mapred.JobClient:  map 100% reduce 0%
> 14/06/05 15:05:31 INFO mapred.JobClient:  map 100% reduce 33%
> 14/06/05 15:05:34 INFO mapred.JobClient:  map 100% reduce 66%
> 14/06/05 15:05:37 INFO mapred.JobClient:  map 100% reduce 100%
> 14/06/05 15:05:39 INFO mapred.JobClient: Job complete:
> job_201406051410_0015
> 14/06/05 15:05:39 INFO mapred.JobClient: Counters: 27
> 14/06/05 15:05:39 INFO mapred.JobClient:   Job Counters
> 14/06/05 15:05:39 INFO mapred.JobClient:     Launched reduce tasks=2
> 14/06/05 15:05:39 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=39898
> 14/06/05 15:05:39 INFO mapred.JobClient:     Total time spent by all
> reduces waiting after reserving slots (ms)=0
> 14/06/05 15:05:39 INFO mapred.JobClient:     Total time spent by all maps
> waiting after reserving slots (ms)=0
> 14/06/05 15:05:39 INFO mapred.JobClient:     Launched map tasks=1
> 14/06/05 15:05:39 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=30439
> 14/06/05 15:05:39 INFO mapred.JobClient:   File Output Format Counters
> 14/06/05 15:05:39 INFO mapred.JobClient:     Bytes Written=0
> 14/06/05 15:05:39 INFO mapred.JobClient:   FileSystemCounters
> 14/06/05 15:05:39 INFO mapred.JobClient:     FILE_BYTES_READ=44
> 14/06/05 15:05:39 INFO mapred.JobClient:     HDFS_BYTES_READ=1028
> 14/06/05 15:05:39 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=237914
> 14/06/05 15:05:39 INFO mapred.JobClient:   File Input Format Counters
> 14/06/05 15:05:39 INFO mapred.JobClient:     Bytes Read=0
> 14/06/05 15:05:39 INFO mapred.JobClient:   Map-Reduce Framework
> 14/06/05 15:05:39 INFO mapred.JobClient:     Map output materialized
> bytes=28
> 14/06/05 15:05:39 INFO mapred.JobClient:     Map input records=0
> 14/06/05 15:05:39 INFO mapred.JobClient:     Reduce shuffle bytes=28
> 14/06/05 15:05:39 INFO mapred.JobClient:     Spilled Records=0
> 14/06/05 15:05:39 INFO mapred.JobClient:     Map output bytes=0
> 14/06/05 15:05:39 INFO mapred.JobClient:     Total committed heap usage
> (bytes)=375914496
> 14/06/05 15:05:39 INFO mapred.JobClient:     CPU time spent (ms)=8880
> 14/06/05 15:05:39 INFO mapred.JobClient:     Combine input records=0
> 14/06/05 15:05:39 INFO mapred.JobClient:     SPLIT_RAW_BYTES=1028
> 14/06/05 15:05:39 INFO mapred.JobClient:     Reduce input records=0
> 14/06/05 15:05:39 INFO mapred.JobClient:     Reduce input groups=0
> 14/06/05 15:05:39 INFO mapred.JobClient:     Combine output records=0
> 14/06/05 15:05:39 INFO mapred.JobClient:     Physical memory (bytes)
> snapshot=490651648
> 14/06/05 15:05:39 INFO mapred.JobClient:     Reduce output records=0
> 14/06/05 15:05:39 INFO mapred.JobClient:     Virtual memory (bytes)
> snapshot=6002880512
> 14/06/05 15:05:39 INFO mapred.JobClient:     Map output records=0
> 14/06/05 15:05:39 INFO crawl.DbUpdaterJob: DbUpdaterJob: done
> Indexing TestCrawl on SOLR index -> http://10.130.231.16:8983/solr/nutch
> Warning: $HADOOP_HOME is deprecated.
>
> 14/06/05 15:05:43 INFO solr.SolrIndexerJob: SolrIndexerJob: starting
> 14/06/05 15:05:44 INFO plugin.PluginRepository: Plugins: looking in:
> /app/hadoop/tmp/hadoop-unjar7543842044056940295/classes/plugins
> 14/06/05 15:05:44 INFO plugin.PluginRepository: Plugin Auto-activation
> mode: [true]
> 14/06/05 15:05:44 INFO plugin.PluginRepository: Registered Plugins:
> 14/06/05 15:05:44 INFO plugin.PluginRepository: the nutch core extension
> points (nutch-extensionpoints)
> 14/06/05 15:05:44 INFO plugin.PluginRepository: Regex URL Normalizer
> (urlnormalizer-regex)
> 14/06/05 15:05:44 INFO plugin.PluginRepository: CyberNeko HTML Parser
> (lib-nekohtml)
> 14/06/05 15:05:44 INFO plugin.PluginRepository: OPIC Scoring Plug-in
> (scoring-opic)
> 14/06/05 15:05:44 INFO plugin.PluginRepository: Basic URL Normalizer
> (urlnormalizer-basic)
> 14/06/05 15:05:44 INFO plugin.PluginRepository: Tika Parser Plug-in
> (parse-tika)
> 14/06/05 15:05:44 INFO plugin.PluginRepository: Basic Indexing Filter
> (index-basic)
> 14/06/05 15:05:44 INFO plugin.PluginRepository: Html Parse Plug-in
> (parse-html)
> 14/06/05 15:05:44 INFO plugin.PluginRepository: Anchor Indexing Filter
> (index-anchor)
> 14/06/05 15:05:44 INFO plugin.PluginRepository: HTTP Framework (lib-http)
> 14/06/05 15:05:44 INFO plugin.PluginRepository: Regex URL Filter
> (urlfilter-regex)
> 14/06/05 15:05:44 INFO plugin.PluginRepository: Regex URL Filter
> Framework (lib-regex-filter)
> 14/06/05 15:05:44 INFO plugin.PluginRepository: Pass-through URL
> Normalizer (urlnormalizer-pass)
> 14/06/05 15:05:44 INFO plugin.PluginRepository: Http Protocol Plug-in
> (protocol-http)
> 14/06/05 15:05:44 INFO plugin.PluginRepository: Registered
> Extension-Points:
> 14/06/05 15:05:44 INFO plugin.PluginRepository: Nutch URL Normalizer
> (org.apache.nutch.net.URLNormalizer)
> 14/06/05 15:05:44 INFO plugin.PluginRepository: Nutch Protocol
> (org.apache.nutch.protocol.Protocol)
> 14/06/05 15:05:44 INFO plugin.PluginRepository: Parse Filter
> (org.apache.nutch.parse.ParseFilter)
> 14/06/05 15:05:44 INFO plugin.PluginRepository: Nutch URL Filter
> (org.apache.nutch.net.URLFilter)
> 14/06/05 15:05:44 INFO plugin.PluginRepository: Nutch Indexing Filter
> (org.apache.nutch.indexer.IndexingFilter)
> 14/06/05 15:05:44 INFO plugin.PluginRepository: Nutch Content Parser
> (org.apache.nutch.parse.Parser)
> 14/06/05 15:05:44 INFO plugin.PluginRepository: Nutch Scoring
> (org.apache.nutch.scoring.ScoringFilter)
> 14/06/05 15:05:44 INFO basic.BasicIndexingFilter: Maximum title length for
> indexing set to: 100
> 14/06/05 15:05:44 INFO indexer.IndexingFilters: Adding
> org.apache.nutch.indexer.basic.BasicIndexingFilter
> 14/06/05 15:05:44 INFO anchor.AnchorIndexingFilter: Anchor deduplication
> is: off
> 14/06/05 15:05:44 INFO indexer.IndexingFilters: Adding
> org.apache.nutch.indexer.anchor.AnchorIndexingFilter
> 14/06/05 15:05:45 INFO connection.CassandraHostRetryService: Downed Host
> Retry service started with queue size -1 and retry delay 10s
> 14/06/05 15:05:49 INFO service.JmxMonitor: Registering JMX
> me.prettyprint.cassandra.service_Qontifi:ServiceType=hector,MonitorType=hector
> 14/06/05 15:05:52 INFO mapred.JobClient: Running job: job_201406051410_0016
> 14/06/05 15:05:53 INFO mapred.JobClient:  map 0% reduce 0%
> 14/06/05 15:06:29 INFO mapred.JobClient:  map 100% reduce 0%
> 14/06/05 15:06:32 INFO mapred.JobClient: Job complete:
> job_201406051410_0016
> 14/06/05 15:06:32 INFO mapred.JobClient: Counters: 17
> 14/06/05 15:06:32 INFO mapred.JobClient:   Job Counters
> 14/06/05 15:06:32 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=36879
> 14/06/05 15:06:32 INFO mapred.JobClient:     Total time spent by all
> reduces waiting after reserving slots (ms)=0
> 14/06/05 15:06:32 INFO mapred.JobClient:     Total time spent by all maps
> waiting after reserving slots (ms)=0
> 14/06/05 15:06:32 INFO mapred.JobClient:     Launched map tasks=1
> 14/06/05 15:06:32 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=0
> 14/06/05 15:06:32 INFO mapred.JobClient:   File Output Format Counters
> 14/06/05 15:06:32 INFO mapred.JobClient:     Bytes Written=0
> 14/06/05 15:06:32 INFO mapred.JobClient:   FileSystemCounters
> 14/06/05 15:06:32 INFO mapred.JobClient:     HDFS_BYTES_READ=962
> 14/06/05 15:06:32 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=78923
> 14/06/05 15:06:32 INFO mapred.JobClient:   File Input Format Counters
> 14/06/05 15:06:32 INFO mapred.JobClient:     Bytes Read=0
> 14/06/05 15:06:32 INFO mapred.JobClient:   Map-Reduce Framework
> 14/06/05 15:06:32 INFO mapred.JobClient:     Map input records=0
> 14/06/05 15:06:32 INFO mapred.JobClient:     Physical memory (bytes)
> snapshot=114335744
> 14/06/05 15:06:32 INFO mapred.JobClient:     Spilled Records=0
> 14/06/05 15:06:32 INFO mapred.JobClient:     CPU time spent (ms)=2670
> 14/06/05 15:06:32 INFO mapred.JobClient:     Total committed heap usage
> (bytes)=60293120
> 14/06/05 15:06:32 INFO mapred.JobClient:     Virtual memory (bytes)
> snapshot=1990189056
> 14/06/05 15:06:32 INFO mapred.JobClient:     Map output records=0
> 14/06/05 15:06:32 INFO mapred.JobClient:     SPLIT_RAW_BYTES=962
> 14/06/05 15:06:32 INFO solr.SolrIndexerJob: SolrIndexerJob: done.
>
> When I run readdb -stats, I get:
>
> hduser@nutch-one-qontifi:/usr/local/nutch$ bin/nutch readdb TestCrawl
> -stats
> Warning: $HADOOP_HOME is deprecated.
>
> 14/06/05 15:13:19 INFO crawl.WebTableReader: WebTable statistics start
> 14/06/05 15:13:21 INFO connection.CassandraHostRetryService: Downed Host
> Retry service started with queue size -1 and retry delay 10s
> 14/06/05 15:13:25 INFO service.JmxMonitor: Registering JMX
> me.prettyprint.cassandra.service_Qontifi:ServiceType=hector,MonitorType=hector
> 14/06/05 15:13:29 INFO mapred.JobClient: Running job: job_201406051410_0019
> 14/06/05 15:13:30 INFO mapred.JobClient:  map 0% reduce 0%
> 14/06/05 15:14:06 INFO mapred.JobClient:  map 100% reduce 0%
> 14/06/05 15:14:15 INFO mapred.JobClient:  map 100% reduce 33%
> 14/06/05 15:14:17 INFO mapred.JobClient:  map 100% reduce 100%
> 14/06/05 15:14:19 INFO mapred.JobClient: Job complete:
> job_201406051410_0019
> 14/06/05 15:14:19 INFO mapred.JobClient: Counters: 28
> 14/06/05 15:14:19 INFO mapred.JobClient:   Job Counters
> 14/06/05 15:14:19 INFO mapred.JobClient:     Launched reduce tasks=1
> 14/06/05 15:14:19 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=36697
> 14/06/05 15:14:19 INFO mapred.JobClient:     Total time spent by all
> reduces waiting after reserving slots (ms)=0
> 14/06/05 15:14:19 INFO mapred.JobClient:     Total time spent by all maps
> waiting after reserving slots (ms)=0
> 14/06/05 15:14:19 INFO mapred.JobClient:     Launched map tasks=1
> 14/06/05 15:14:19 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=10302
> 14/06/05 15:14:19 INFO mapred.JobClient:   File Output Format Counters
> 14/06/05 15:14:19 INFO mapred.JobClient:     Bytes Written=86
> 14/06/05 15:14:19 INFO mapred.JobClient:   FileSystemCounters
> 14/06/05 15:14:19 INFO mapred.JobClient:     FILE_BYTES_READ=6
> 14/06/05 15:14:19 INFO mapred.JobClient:     HDFS_BYTES_READ=1135
> 14/06/05 15:14:19 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=157112
> 14/06/05 15:14:19 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=86
> 14/06/05 15:14:19 INFO mapred.JobClient:   File Input Format Counters
> 14/06/05 15:14:19 INFO mapred.JobClient:     Bytes Read=0
> 14/06/05 15:14:19 INFO mapred.JobClient:   Map-Reduce Framework
> 14/06/05 15:14:19 INFO mapred.JobClient:     Map output materialized
> bytes=6
> 14/06/05 15:14:19 INFO mapred.JobClient:     Map input records=0
> 14/06/05 15:14:19 INFO mapred.JobClient:     Reduce shuffle bytes=6
> 14/06/05 15:14:19 INFO mapred.JobClient:     Spilled Records=0
> 14/06/05 15:14:19 INFO mapred.JobClient:     Map output bytes=0
> 14/06/05 15:14:19 INFO mapred.JobClient:     Total committed heap usage
> (bytes)=216530944
> 14/06/05 15:14:19 INFO mapred.JobClient:     CPU time spent (ms)=2450
> 14/06/05 15:14:19 INFO mapred.JobClient:     Combine input records=0
> 14/06/05 15:14:19 INFO mapred.JobClient:     SPLIT_RAW_BYTES=1135
> 14/06/05 15:14:19 INFO mapred.JobClient:     Reduce input records=0
> 14/06/05 15:14:19 INFO mapred.JobClient:     Reduce input groups=0
> 14/06/05 15:14:19 INFO mapred.JobClient:     Combine output records=0
> 14/06/05 15:14:19 INFO mapred.JobClient:     Physical memory (bytes)
> snapshot=320630784
> 14/06/05 15:14:19 INFO mapred.JobClient:     Reduce output records=0
> 14/06/05 15:14:19 INFO mapred.JobClient:     Virtual memory (bytes)
> snapshot=2254024704
> 14/06/05 15:14:19 INFO mapred.JobClient:     Map output records=0
> 14/06/05 15:14:19 INFO crawl.WebTableReader: Statistics for WebTable:
> 14/06/05 15:14:19 INFO crawl.WebTableReader: jobs: 
> {db_stats-job_201406051410_0019={jobID=job_201406051410_0019,
> jobName=db_stats, counters={File Input Format Counters ={BYTES_READ=0}, Job
> Counters ={TOTAL_LAUNCHED_REDUCES=1, SLOTS_MILLIS_MAPS=36697,
> FALLOW_SLOTS_MILLIS_REDUCES=0, FALLOW_SLOTS_MILLIS_MAPS=0,
> TOTAL_LAUNCHED_MAPS=1, SLOTS_MILLIS_REDUCES=10302}, Map-Reduce
> Framework={MAP_OUTPUT_MATERIALIZED_BYTES=6, MAP_INPUT_RECORDS=0,
> REDUCE_SHUFFLE_BYTES=6, SPILLED_RECORDS=0, MAP_OUTPUT_BYTES=0,
> COMMITTED_HEAP_BYTES=216530944, CPU_MILLISECONDS=2450,
> SPLIT_RAW_BYTES=1135, COMBINE_INPUT_RECORDS=0, REDUCE_INPUT_RECORDS=0,
> REDUCE_INPUT_GROUPS=0, COMBINE_OUTPUT_RECORDS=0,
> PHYSICAL_MEMORY_BYTES=320630784, REDUCE_OUTPUT_RECORDS=0,
> VIRTUAL_MEMORY_BYTES=2254024704, MAP_OUTPUT_RECORDS=0},
> FileSystemCounters={FILE_BYTES_READ=6, HDFS_BYTES_READ=1135,
> FILE_BYTES_WRITTEN=157112, HDFS_BYTES_WRITTEN=86}, File Output Format
> Counters ={BYTES_WRITTEN=86}}}}
> 14/06/05 15:14:19 INFO crawl.WebTableReader: TOTAL urls: 0
> 14/06/05 15:14:19 INFO crawl.WebTableReader: WebTable statistics: done
> 14/06/05 15:14:19 INFO crawl.WebTableReader: jobs: 
> {db_stats-job_201406051410_0019={jobID=job_201406051410_0019,
> jobName=db_stats, counters={File Input Format Counters ={BYTES_READ=0}, Job
> Counters ={TOTAL_LAUNCHED_REDUCES=1, SLOTS_MILLIS_MAPS=36697,
> FALLOW_SLOTS_MILLIS_REDUCES=0, FALLOW_SLOTS_MILLIS_MAPS=0,
> TOTAL_LAUNCHED_MAPS=1, SLOTS_MILLIS_REDUCES=10302}, Map-Reduce
> Framework={MAP_OUTPUT_MATERIALIZED_BYTES=6, MAP_INPUT_RECORDS=0,
> REDUCE_SHUFFLE_BYTES=6, SPILLED_RECORDS=0, MAP_OUTPUT_BYTES=0,
> COMMITTED_HEAP_BYTES=216530944, CPU_MILLISECONDS=2450,
> SPLIT_RAW_BYTES=1135, COMBINE_INPUT_RECORDS=0, REDUCE_INPUT_RECORDS=0,
> REDUCE_INPUT_GROUPS=0, COMBINE_OUTPUT_RECORDS=0,
> PHYSICAL_MEMORY_BYTES=320630784, REDUCE_OUTPUT_RECORDS=0,
> VIRTUAL_MEMORY_BYTES=2254024704, MAP_OUTPUT_RECORDS=0},
> FileSystemCounters={FILE_BYTES_READ=6, HDFS_BYTES_READ=1135,
> FILE_BYTES_WRITTEN=157112, HDFS_BYTES_WRITTEN=86}, File Output Format
> Counters ={BYTES_WRITTEN=86}}}}
> 14/06/05 15:14:19 INFO crawl.WebTableReader: TOTAL urls: 0
>
> --
> Manikandan Saravanan
> Architect - Technology
> TheSocialPeople <http://thesocialpeople.net>
>



-- 
*Lewis*
