Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Nutch Wiki" for change 
notification.

The following page has been changed by SteveSeverance:
http://wiki.apache.org/nutch/Getting_Started

------------------------------------------------------------------------------
   2. Then you need to submit your job to Hadoop to be run. This is done by 
calling JobClient.runJob. JobClient.runJob submits the job and then handles 
receiving status updates back from it. It starts by creating an instance of 
JobClient, then pushes the job toward execution by calling 
JobClient.submitJob.
   3. JobClient.submitJob handles splitting the input files and generating the 
MapReduce task.
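
The submission flow in steps 2 and 3 can be sketched with a toy model. This is only an illustration of the control flow, assuming a simple fixed-size split rule; ToyJobClient, ToyRunningJob, and their methods are hypothetical stand-ins and are not Hadoop's real JobClient API.

```java
import java.util.ArrayList;
import java.util.List;

// Toy stand-ins only: these classes mimic the control flow described above
// and are NOT Hadoop's real JobClient or RunningJob.
class ToyJobClient {

    /** Hypothetical handle to a submitted job; one task finishes per poll. */
    static class ToyRunningJob {
        private final List<String> tasks;
        private int finished = 0;

        ToyRunningJob(List<String> tasks) { this.tasks = tasks; }

        boolean isComplete() { return finished >= tasks.size(); }

        // Simulates one more task finishing; returns progress in [0, 1].
        float poll() {
            if (!isComplete()) finished++;
            return tasks.isEmpty() ? 1.0f : (float) finished / tasks.size();
        }

        int taskCount() { return tasks.size(); }
    }

    /** Step 3: split the input files and generate one toy task per split. */
    ToyRunningJob submitJob(long[] fileSizes, long splitSize) {
        List<String> tasks = new ArrayList<>();
        for (int f = 0; f < fileSizes.length; f++) {
            // Ceiling division: a partial trailing split still needs a task.
            long splits = (fileSizes[f] + splitSize - 1) / splitSize;
            for (long s = 0; s < splits; s++) {
                tasks.add("map-" + f + "-" + s);
            }
        }
        return new ToyRunningJob(tasks);
    }

    /** Step 2: create a client, submit the job, relay status until done. */
    static ToyRunningJob runJob(long[] fileSizes, long splitSize) {
        ToyJobClient client = new ToyJobClient();
        ToyRunningJob job = client.submitJob(fileSizes, splitSize);
        while (!job.isComplete()) {
            float progress = job.poll();
            System.out.printf("map %.0f%%%n", progress * 100);
        }
        return job;
    }

    public static void main(String[] args) {
        // Two input files of 250 and 100 units with a split size of 100
        // yield 3 + 1 = 4 toy map tasks.
        ToyRunningJob job = runJob(new long[] {250, 100}, 100);
        System.out.println("tasks: " + job.taskCount());
    }
}
```

The static runJob wraps an instance-level submitJob, which mirrors the description above: the caller makes one blocking call, while the client object handles submission and status reporting.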
  
+ === How do I open Nutch's data files? ===
+ To read Nutch's data files, use Hadoop's MapFile and SequenceFile classes. 
This simple code sample opens a MapFile and prints each key and value it 
contains.
+ 
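+ {{{
+ // Assumes fs (FileSystem), conf (Configuration), seqFile (the name of the
+ // MapFile directory) and out (a PrintStream) are already defined.
+ MapFile.Reader reader = new MapFile.Reader(fs, seqFile, conf);
+ 
+ // The file records which key and value classes it contains.
+ Class keyC = reader.getKeyClass();
+ Class valueC = reader.getValueClass();
+ 
+ while (true) {
+     // Create empty key and value objects for the reader to fill in.
+     WritableComparable key = null;
+     Writable value = null;
+     try {
+         key = (WritableComparable)keyC.newInstance();
+         value = (Writable)valueC.newInstance();
+     } catch (Exception ex) {
+         ex.printStackTrace();
+         System.exit(-1);
+     }
+ 
+     try {
+         // next() returns false once the end of the file is reached.
+         if (!reader.next(key, value)) {
+             break;
+         }
+ 
+         out.println(key);
+         out.println(value);
+     } catch (Exception e) {
+         e.printStackTrace();
+         out.println("Exception occurred. " + e);
+         break;
+     }
+ }
+ 
+ // Release the underlying file handles.
+ reader.close();
+ }}}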
+ 
  == Tutorials ==
   * CountLinks Counting outbound links with MapReduce
  
