Hi Kaveh,

I'm not sure your problem is the same one at all. Your problem stems from the Solr mapping configuration used by AnchorIndexingFilter in the index-anchor plugin. When this works properly, you should see a list of all of the source --> destination field mappings in hadoop.log. Unfortunately that is not the case here (your log shows a java.net.MalformedURLException from SolrMappingReader instead), and this needs to be resolved before you can progress.
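If it helps, here is a rough sketch of a quick check for whether the mappings were loaded. It only assumes the log line format quoted further down in this thread; the function name scan_mapping_log is made up for illustration and is not part of Nutch.

```python
import re

# Patterns modelled on the SolrMappingReader lines quoted in this thread.
# Assumption: each log record sits on a single line in hadoop.log.
MAPPING_RE = re.compile(
    r"INFO\s+solr\.SolrMappingReader\s+-\s+source:\s*(\S+)\s+dest:\s*(\S+)"
)
WARN_RE = re.compile(r"WARN\s+solr\.SolrMappingReader\s+-\s+(.*)")


def scan_mapping_log(lines):
    """Return (mappings, warnings) found in an iterable of log lines.

    mappings is a list of (source, dest) field pairs; warnings is a list
    of the messages SolrMappingReader logged at WARN level.
    """
    mappings = []
    warnings = []
    for line in lines:
        match = MAPPING_RE.search(line)
        if match:
            mappings.append((match.group(1), match.group(2)))
            continue
        warn = WARN_RE.search(line)
        if warn:
            warnings.append(warn.group(1).strip())
    return mappings, warnings
```

You would call it with an open file handle on your hadoop.log. An empty mappings list together with a warning such as java.net.MalformedURLException suggests the mapping file was never read.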
Maybe once this is sorted you can address the MR NPE.

hth

On Thu, Jan 26, 2012 at 1:02 AM, kaveh minooie <[email protected]> wrote:

> Hi
>
> I think I am having a similar problem. This is what I got in the
> hadoop.log file (the Nutch log file) after running this command:
>
> bin/nutch crawl urls/ -solr http://solr3:8983/solr/core8 -dir mycrawldir -threads 2 -depth 2 -topN 20
>
> and here is the result (from hadoop.log):
>
> 2012-01-25 16:42:37,174 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.anchor.AnchorIndexingFilter
> 2012-01-25 16:42:40,151 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.basic.BasicIndexingFilter
> 2012-01-25 16:42:40,151 INFO anchor.AnchorIndexingFilter - Anchor deduplication is: off
> 2012-01-25 16:42:40,151 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.anchor.AnchorIndexingFilter
> 2012-01-25 16:42:40,167 WARN solr.SolrMappingReader - java.net.MalformedURLException
> 2012-01-25 16:42:40,341 INFO solr.SolrWriter - Indexing 21 documents
> 2012-01-25 16:42:44,137 INFO solr.SolrIndexer - SolrIndexer: finished at 2012-01-25 16:42:44, elapsed: 00:00:34
> 2012-01-25 16:42:44,143 INFO solr.SolrDeleteDuplicates - SolrDeleteDuplicates: starting at 2012-01-25 16:42:44
> 2012-01-25 16:42:44,144 INFO solr.SolrDeleteDuplicates - SolrDeleteDuplicates: Solr url: http://solr3:8983/solr/core8
> 2012-01-25 16:42:44,295 WARN mapred.FileOutputCommitter - Output path is null in cleanup
> 2012-01-25 16:42:44,296 WARN mapred.LocalJobRunner - job_local_0015
> java.lang.NullPointerException
>     at org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrRecord.readSolrDocument(SolrDeleteDuplicates.java:131)
>     at org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrInputFormat$1.next(SolrDeleteDuplicates.java:271)
>     at org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrInputFormat$1.next(SolrDeleteDuplicates.java:241)
>     at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:236)
>     at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:216)
>     at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
>     at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:436)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
>     at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:212)
>
> What is it talking about in this line:
>
> 2012-01-25 16:42:44,295 WARN mapred.FileOutputCommitter - Output path is null in cleanup
>
> What output path is it talking about?
>
> (I am running this locally, not on Hadoop.)
>
> On 01/24/2012 05:13 AM, Denis Sinner wrote:
>
>> hadoop.log:
>>
>> 2012-01-24 14:09:37,156 INFO solr.SolrMappingReader - source: content dest: content
>> 2012-01-24 14:09:37,156 INFO solr.SolrMappingReader - source: site dest: site
>> 2012-01-24 14:09:37,156 INFO solr.SolrMappingReader - source: title dest: teaser
>> 2012-01-24 14:09:37,156 INFO solr.SolrMappingReader - source: boost dest: boost
>> 2012-01-24 14:09:37,156 INFO solr.SolrMappingReader - source: tstamp dest: changed
>> 2012-01-24 14:09:37,156 INFO solr.SolrMappingReader - source: tstamp dest: created
>> 2012-01-24 14:09:37,370 INFO solr.SolrWriter - Adding 2 documents
>> 2012-01-24 14:09:38,095 INFO solr.SolrIndexer - SolrIndexer: finished at 2012-01-24 14:09:38, elapsed: 00:00:02
>> 2012-01-24 14:09:38,097 INFO solr.SolrDeleteDuplicates - SolrDeleteDuplicates: starting at 2012-01-24 14:09:38
>> 2012-01-24 14:09:38,097 INFO solr.SolrDeleteDuplicates - SolrDeleteDuplicates: Solr url: http://192.168.0.47:8080/solr/core_en/
>> 2012-01-24 14:09:38,457 WARN mapred.LocalJobRunner - job_local_0010
>> java.lang.NullPointerException
>>     at org.apache.hadoop.io.Text.encode(Text.java:388)
>>     at org.apache.hadoop.io.Text.set(Text.java:178)
>>     at org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrInputFormat$1.next(SolrDeleteDuplicates.java:284)
>>     at org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrInputFormat$1.next(SolrDeleteDuplicates.java:249)
>>     at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:192)
>>     at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:176)
>>     at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
>>     at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
>>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
>>     at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
>>
>> Solr (running out of eclipse with jetty):
>>
>> 24.01.2012 14:09:37 org.apache.solr.core.SolrDeletionPolicy onInit
>> INFO: SolrDeletionPolicy.onInit: commits:num=1
>> commit{dir=/Users/dkd-sinner/Documents/solr/SolrTypo3Plugin/solr/typo3cores/data/core_en/index,segFN=segments_p,version=1326882792610,generation=25,filenames=[_1.frq, _b.nrm, _b.tvx, _2.tii, _1.fnm, _2.tvx, _2.tvd, _1.tii, _2.tvf, _1.tvx, _1.tis, _2.prx, _b.prx, _2.fdt, _2.frq, _b.tis, _2.fdx, _2.fnm, _b.tii, _b.frq, _1.prx, _1.fdx, _2.tis, _1.tvf, _b.tvd, _1.fdt, segments_p, _b.fnm, _b.fdt, _b.tvf, _1.tvd, _b.fdx, _1.nrm, _2.nrm]
>> 24.01.2012 14:09:37 org.apache.solr.core.SolrDeletionPolicy updateCommits
>> INFO: newest commit = 1326882792610
>> 24.01.2012 14:09:37 org.apache.solr.update.processor.LogUpdateProcessor finish
>> INFO: {add=[045756f6efde46c27a8e1016756bf99cc8153d51/nutch_external/http://www.dkd.de/, 5648ab376b909bc402c4ecbf45c26b4546e69f04/nutch_external/http://www.typo3-solr.com/]} 0 71
>> 24.01.2012 14:09:37 org.apache.solr.core.SolrCore execute
>> INFO: [core_en] webapp=/solr path=/update params={wt=javabin&version=2} status=0 QTime=71
>> 24.01.2012 14:09:37 org.apache.solr.update.DirectUpdateHandler2 commit
>> INFO: start commit(optimize=false,waitFlush=true,waitSearcher=true,expungeDeletes=false)
>> 24.01.2012 14:09:38 org.apache.solr.core.SolrDeletionPolicy onCommit
>> INFO: SolrDeletionPolicy.onCommit: commits:num=2
>> commit{dir=/Users/dkd-sinner/Documents/solr/SolrTypo3Plugin/solr/typo3cores/data/core_en/index,segFN=segments_p,version=1326882792610,generation=25,filenames=[_1.frq, _b.nrm, _b.tvx, _2.tii, _1.fnm, _2.tvx, _2.tvd, _1.tii, _2.tvf, _1.tvx, _1.tis, _2.prx, _b.prx, _2.fdt, _2.frq, _b.tis, _2.fdx, _2.fnm, _b.tii, _b.frq, _1.prx, _1.fdx, _2.tis, _1.tvf, _b.tvd, _1.fdt, segments_p, _b.fnm, _b.fdt, _b.tvf, _1.tvd, _b.fdx, _1.nrm, _2.nrm]
>> commit{dir=/Users/dkd-sinner/Documents/solr/SolrTypo3Plugin/solr/typo3cores/data/core_en/index,segFN=segments_q,version=1326882792614,generation=26,filenames=[_1.frq, _2.tii, _c.tii, _c.fdx, _c.tvx, _1.fnm, _2.tvx, _c.fdt, _2.tvd, _c.tis, _c.nrm, _1.tii, _2.tvf, _1.tvx, _1.tis, _2.prx, _c.prx, _2.fdt, _2.frq, _2.fdx, _2.fnm, _1.prx, _1.fdx, _2.tis, _1.tvf, _1.fdt, segments_q, _c.tvf, _c.tvd, _c.fnm, _1.tvd, _c.frq, _1.nrm, _2.nrm]
>> 24.01.2012 14:09:38 org.apache.solr.core.SolrDeletionPolicy updateCommits
>> INFO: newest commit = 1326882792614
>> 24.01.2012 14:09:38 org.apache.solr.search.SolrIndexSearcher <init>
>> INFO: Opening Searcher@2a44fec1 main
>> 24.01.2012 14:09:38 org.apache.solr.update.DirectUpdateHandler2 commit
>> INFO: end_commit_flush
>> 24.01.2012 14:09:38 org.apache.solr.search.SolrIndexSearcher warm
>> INFO: autowarming Searcher@2a44fec1 main from Searcher@3d78cd7b main
>> fieldValueCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
>> 24.01.2012 14:09:38 org.apache.solr.search.SolrIndexSearcher warm
>> INFO: autowarming result for Searcher@2a44fec1 main
>> fieldValueCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
>> 24.01.2012 14:09:38 org.apache.solr.search.SolrIndexSearcher warm
>> INFO: autowarming Searcher@2a44fec1 main from Searcher@3d78cd7b main
>> filterCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
>> 24.01.2012 14:09:38 org.apache.solr.search.SolrIndexSearcher warm
>> INFO: autowarming result for Searcher@2a44fec1 main
>> filterCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
>> 24.01.2012 14:09:38 org.apache.solr.search.SolrIndexSearcher warm
>> INFO: autowarming Searcher@2a44fec1 main from Searcher@3d78cd7b main
>> queryResultCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=44,cumulative_hits=32,cumulative_hitratio=0.72,cumulative_inserts=22,cumulative_evictions=0}
>> 24.01.2012 14:09:38 org.apache.solr.search.SolrIndexSearcher warm
>> INFO: autowarming result for Searcher@2a44fec1 main
>> queryResultCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=44,cumulative_hits=32,cumulative_hitratio=0.72,cumulative_inserts=22,cumulative_evictions=0}
>> 24.01.2012 14:09:38 org.apache.solr.search.SolrIndexSearcher warm
>> INFO: autowarming Searcher@2a44fec1 main from Searcher@3d78cd7b main
>> documentCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=1136,cumulative_hits=618,cumulative_hitratio=0.54,cumulative_inserts=518,cumulative_evictions=0}
>> 24.01.2012 14:09:38 org.apache.solr.search.SolrIndexSearcher warm
>> INFO: autowarming result for Searcher@2a44fec1 main
>> documentCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=1136,cumulative_hits=618,cumulative_hitratio=0.54,cumulative_inserts=518,cumulative_evictions=0}
>> 24.01.2012 14:09:38 org.apache.solr.core.QuerySenderListener newSearcher
>> INFO: QuerySenderListener sending requests to Searcher@2a44fec1 main
>> 24.01.2012 14:09:38 org.apache.solr.core.QuerySenderListener newSearcher
>> INFO: QuerySenderListener done.
>> 24.01.2012 14:09:38 org.apache.solr.handler.component.SpellCheckComponent$SpellCheckerListener buildSpellIndex
>> INFO: Building spell index for spell checker: default
>> 24.01.2012 14:09:38 org.apache.solr.core.SolrCore registerSearcher
>> INFO: [core_en] Registered new searcher Searcher@2a44fec1 main
>> 24.01.2012 14:09:38 org.apache.solr.search.SolrIndexSearcher close
>> INFO: Closing Searcher@3d78cd7b main
>> fieldValueCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
>> filterCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
>> queryResultCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=44,cumulative_hits=32,cumulative_hitratio=0.72,cumulative_inserts=22,cumulative_evictions=0}
>> documentCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=1136,cumulative_hits=618,cumulative_hitratio=0.54,cumulative_inserts=518,cumulative_evictions=0}
>> 24.01.2012 14:09:38 org.apache.solr.update.processor.LogUpdateProcessor finish
>> INFO: {commit=} 0 212
>> 24.01.2012 14:09:38 org.apache.solr.core.SolrCore execute
>> INFO: [core_en] webapp=/solr path=/update params={waitSearcher=true&waitFlush=true&wt=javabin&commit=true&version=2} status=0 QTime=212
>> 24.01.2012 14:09:38 org.apache.solr.core.SolrCore execute
>> INFO: [core_en] webapp=/solr path=/select params={fl=id&wt=javabin&q=*:*&rows=1&version=2} hits=52 status=0 QTime=2
>> 24.01.2012 14:09:38 org.apache.solr.core.SolrCore execute
>> INFO: [core_en] webapp=/solr path=/select params={fl=id&wt=javabin&q=*:*&rows=1&version=2} hits=52 status=0 QTime=1
>> 24.01.2012 14:09:38 org.apache.solr.core.SolrCore execute
>> INFO: [core_en] webapp=/solr path=/select params={fl=id,boost,tstamp,digest&start=0&q=*:*&wt=javabin&rows=52&version=2} hits=52 status=0 QTime=2
>>
>
> --
> Kaveh Minooie
>
> www.plutoz.com

--
Lewis

