I am running Nutch as the root user. When I check under /crawl/segments/20131017194821/, crawl_fetch doesn't exist. The segment is incomplete: it contains only _temporary and crawl_generate.

What can I do? Should I copy fresh binary files from the Nutch 1.7 release?

Thanks in advance,
Luis Armando
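A quick way to see how far a segment got is to list which of its usual subdirectories are present. The sketch below is a minimal check, assuming the Nutch 1.x segment layout (crawl_generate, crawl_fetch, content, crawl_parse, parse_data, parse_text) and that fetched content is being stored. The function names are hypothetical, and the path in the usage lines is the one from the log quoted in this thread.

```python
from pathlib import Path

# Subdirectories a Nutch 1.x segment normally holds once generate, fetch
# and parse have all finished. Assumption: fetched content is stored and
# parsing is run; adjust the tuple if your setup differs.
EXPECTED_PARTS = (
    "crawl_generate",
    "crawl_fetch",
    "content",
    "crawl_parse",
    "parse_data",
    "parse_text",
)


def missing_segment_parts(segment_dir):
    """Return the expected subdirectories that are absent from segment_dir."""
    segment = Path(segment_dir)
    return [name for name in EXPECTED_PARTS if not (segment / name).is_dir()]


def has_leftover_temporary(segment_dir):
    """Return True if a _temporary directory is still present in the segment."""
    return (Path(segment_dir) / "_temporary").is_dir()


if __name__ == "__main__":
    # Path taken from the hadoop.log line quoted in this thread.
    seg = "/opt/apache-nutch-1.7/crawl/segments/20131017194821"
    print("missing:", missing_segment_parts(seg))
    print("_temporary present:", has_leftover_temporary(seg))
```

For a segment that holds only _temporary and crawl_generate, this would list crawl_fetch among the missing parts, which is consistent with a fetch step that never completed rather than with a damaged Nutch installation.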
________________________________________
From: Talat UYARER [[email protected]]
Sent: Friday, 18 October 2013 11:04 a.m.
To: [email protected]
Subject: Re: Nutch 1.7 and Solr 4.4.0 Integrate

Did you check your privileges? Can you check whether this path exists?

2013-10-18 13:19:49,020 ERROR security.UserGroupInformation - PriviledgedActionException as:root cause:org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: file:/opt/apache-nutch-1.7/crawl/segments/20131017194821/crawl_fetch

On 18-10-2013 18:22, Luis Armando Roca Fumero wrote:
> Oops, sorry, Talat UYARER:
> This is the link to the hadoop.log file: http://pastebin.com/F6qBQhSA
>
> ________________________________________
> From: Talat UYARER [[email protected]]
> Sent: Friday, 18 October 2013 10:06 a.m.
> To: [email protected]
> Subject: Re: Nutch 1.7 and Solr 4.4.0 Integrate
>
> Mailing lists don't accept attachments. Can you share it on pastebin or similar?
>
> On 18-10-2013 17:59, Luis Armando Roca Fumero wrote:
>> Here is the hadoop.log file.
>> Thanks for your time,
>> Luis Armando
>> ________________________________________
>> From: Talat UYARER [[email protected]]
>> Sent: Friday, 18 October 2013 09:51 a.m.
>> To: [email protected]
>> Subject: Re: Nutch 1.7 and Solr 4.4.0 Integrate
>>
>> Hi Luis,
>> Can you share your hadoop.log file? We need verbose log output to
>> understand the problem. But if I understand correctly, you don't have
>> any problem with the IndexerJob.
>>
>> Talat
>>
>> On 18-10-2013 17:36, Luis Armando Roca Fumero wrote:
>>> Hello,
>>> I added the lines that Mourad suggested, but I am still getting the same
>>> errors:
>>>
>>> SOLRIndexWriter
>>> solr.server.url : URL of the SOLR instance (mandatory)
>>> solr.commit.size : buffer size when sending to SOLR (default 1000)
>>> solr.mapping.file : name of the mapping file for fields (default solrindex-mapping.xml)
>>> solr.auth : use authentication (default false)
>>> solr.auth.username : use authentication (default false)
>>> solr.auth : username for authentication
>>> solr.auth.password : password for authentication
>>>
>>>
>>> Indexer: finished at 2013-10-18 14:39:23, elapsed: 00:00:04
>>> SolrDeleteDuplicates: starting at 2013-10-18 14:39:23
>>> SolrDeleteDuplicates: Solr url: http://localhost:8983/solr/
>>> Exception in thread "main" java.io.IOException: Job failed!
>>> at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1357)
>>> at org.apache.nutch.indexer.solr.SolrDeleteDuplicates.dedup(SolrDeleteDuplicates.java:373)
>>> at org.apache.nutch.indexer.solr.SolrDeleteDuplicates.dedup(SolrDeleteDuplicates.java:353)
>>> at org.apache.nutch.crawl.Crawl.run(Crawl.java:160)
>>> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>>> at org.apache.nutch.crawl.Crawl.main(Crawl.java:55)
>>>
>>> Any other ideas?
>>> Thanks for your time,
>>> Luis Armando
>>>
>>> ________________________________________
>>> From: Mouradk [[email protected]]
>>> Sent: Friday, 18 October 2013 09:08 a.m.
>>> To: [email protected]
>>> Subject: Re: Nutch 1.7 and Solr 4.4.0 Integrate
>>>
>>> Hi Luis,
>>>
>>> In your nutch-site.xml configuration file you need to add the Solr
>>> indexer plugin:
>>>
>>> <property>
>>>   <name>plugin.includes</name>
>>>   <value>protocol-http|parse-(html|tika)|index-(basic|anchor)|indexer-solr</value>
>>> </property>
>>>
>>> Hope this helps,
>>>
>>> Mourad
>>>
>>>
>>> On 18 Oct 2013, at 15:05, Luis Armando Roca Fumero <[email protected]>
>>> wrote:
>>>
>>>> Hello friends:
>>>> I have configured Nutch 1.7 and Solr 4.4.0 to work together, following
>>>> the Nutch Tutorial.
>>>> When I run the command:
>>>> ./bin/nutch crawl urls -solr http://localhost:8983/solr/ -depth 3 -topN 5 > test.txt
>>>> everything works, but at the end, when the Indexer starts, I get
>>>> output like this:
>>>>
>>>> Indexer: starting at 2013-10-18 13:57:32
>>>> Indexer: deleting gone documents: false
>>>> Indexer: URL filtering: false
>>>> Indexer: URL normalizing: false
>>>> Active IndexWriters :
>>>> SOLRIndexWriter
>>>> solr.server.url : URL of the SOLR instance (mandatory)
>>>> solr.commit.size : buffer size when sending to SOLR (default 1000)
>>>> solr.mapping.file : name of the mapping file for fields (default solrindex-mapping.xml)
>>>> solr.auth : use authentication (default false)
>>>> solr.auth.username : use authentication (default false)
>>>> solr.auth : username for authentication
>>>> solr.auth.password : password for authentication
>>>>
>>>>
>>>>
>>>> What can I do? What is wrong? I have no idea. I also tried Nutch
>>>> 2.2.1, and it doesn't work with Solr 4.4.0 either. I need a tutorial for
>>>> integrating Nutch with Solr, in baby steps :)
>>>> Thanks in advance
>>>>
>>>> The Universidad Central "Marta Abreu" de Las Villas on its 60th anniversary.
>>>> Founded on 30 November 1952. Visit us at: http://www.uclv.edu.cu
>>>> Take part in Universidad 2014, 10 to 14 February 2014. Havana,
>>>> Cuba.
>>>> http://www.congresouniversidad.cu/
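
Mourad's plugin.includes suggestion in the thread above can be verified before re-running the crawl. The sketch below is a minimal check, assuming nutch-site.xml uses the usual <configuration>/<property> layout and that plugin.includes is a regular expression that has to match a whole plugin id. The helper names are hypothetical, and Python's regex syntax only stands in for Java's, which is close enough for a pattern this simple.

```python
import re
import xml.etree.ElementTree as ET

# Sample configuration built around the property quoted in this thread.
# The <configuration> wrapper is an assumption about the rest of the file.
SAMPLE_NUTCH_SITE = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>plugin.includes</name>
    <value>protocol-http|parse-(html|tika)|index-(basic|anchor)|indexer-solr</value>
  </property>
</configuration>
"""


def get_property(conf_xml, name):
    """Return the value of the named property, or None if it is not set."""
    root = ET.fromstring(conf_xml.encode("utf-8"))
    for prop in root.iter("property"):
        if (prop.findtext("name") or "").strip() == name:
            return (prop.findtext("value") or "").strip()
    return None


def plugin_enabled(includes, plugin_id):
    """Return True if the includes pattern matches the whole plugin id."""
    return re.fullmatch(includes, plugin_id) is not None


if __name__ == "__main__":
    includes = get_property(SAMPLE_NUTCH_SITE, "plugin.includes")
    print("plugin.includes =", includes)
    print("indexer-solr enabled:", plugin_enabled(includes, "indexer-solr"))
```

To check a real installation, read the text of conf/nutch-site.xml and pass it to get_property instead of the sample string. If indexer-solr is reported as enabled, the plugin setting is probably not the cause, and the missing crawl_fetch directory is the more likely lead.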

