I am not saying there is a slowdown caused by Hadoop. I was wondering whether there are any other techniques that optimize speed (i.e. reading a little at a time and writing to the local disk).

Ananth T Sarathy

On Wed, Aug 19, 2009 at 1:26 AM, Raghu Angadi <rang...@yahoo-inc.com> wrote:

> Ananth T. Sarathy wrote:
>
>> I am trying to download binary files stored in Hadoop but there is like a
>> 2 minute wait on a 20mb file when I try to execute the in.read(buf).
>>
>
> What does this mean: 2 min to pipe 20mb, or one of your in.read() calls
> took 2 minutes? Your code actually measures time for read and write.
>
> There is nothing in FSInputStream to cause this slowdown. Do you think
> anyone would use Hadoop otherwise? It would be as fast as the underlying
> filesystem goes.
>
> Raghu.
>
>
>> is there a better way to be doing this?
>>
>> private void pipe(InputStream in, OutputStream out) throws IOException
>> {
>>     System.out.println(System.currentTimeMillis()+" Starting to Pipe Data");
>>     byte[] buf = new byte[1024];
>>     int read = 0;
>>     while ((read = in.read(buf)) >= 0)
>>     {
>>         out.write(buf, 0, read);
>>         System.out.println(System.currentTimeMillis()+" Piping Data");
>>     }
>>     out.flush();
>>     System.out.println(System.currentTimeMillis()+" Finished Piping Data");
>> }
>>
>> public void readFile(String fileToRead, OutputStream out)
>>         throws IOException
>> {
>>     System.out.println(System.currentTimeMillis()+" Start Read File");
>>     Path inFile = new Path(fileToRead);
>>     System.out.println(System.currentTimeMillis()+" Set Path");
>>     // Validate the input/output paths before reading/writing.
>>
>>     if (!fs.exists(inFile))
>>     {
>>         throw new HadoopFileException("Specified file " + fileToRead
>>                 + " not found.");
>>     }
>>     if (!fs.isFile(inFile))
>>     {
>>         throw new HadoopFileException("Specified file " + fileToRead
>>                 + " is not a regular file.");
>>     }
>>     // Open inFile for reading.
>>     System.out.println(System.currentTimeMillis()+" Opening Data Stream");
>>     FSDataInputStream in = fs.open(inFile);
>>
>>     System.out.println(System.currentTimeMillis()+" Opened Data Stream");
>>     // Open outFile for writing.
>>
>>     // Read from input stream and write to output stream until EOF.
>>     pipe(in, out);
>>
>>     // Close the streams when done.
>>     out.close();
>>     in.close();
>> }
>>
>> Ananth T Sarathy
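
On the point that the quoted pipe() measures read and write time together: below is a minimal sketch, assuming plain java.io streams, of a copy loop that times the two sides separately, uses a larger buffer than 1024 bytes, and does not print on every iteration. TimedPipe, Stats and the 64 KB buffer size are hypothetical names and values chosen for illustration; they are not from this thread or from Hadoop.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

/** Hypothetical helper (not part of Hadoop): copies a stream and times reads and writes separately. */
final class TimedPipe {

    /** Total bytes copied plus cumulative nanoseconds spent in read() and in write()/flush(). */
    static final class Stats {
        final long bytes;
        final long readNanos;
        final long writeNanos;

        Stats(long bytes, long readNanos, long writeNanos) {
            this.bytes = bytes;
            this.readNanos = readNanos;
            this.writeNanos = writeNanos;
        }
    }

    static Stats pipe(InputStream in, OutputStream out, int bufferSize) throws IOException {
        if (bufferSize <= 0) {
            throw new IllegalArgumentException("bufferSize must be positive");
        }
        byte[] buf = new byte[bufferSize];
        long bytes = 0, readNanos = 0, writeNanos = 0;
        while (true) {
            // Time the read side on its own.
            long t0 = System.nanoTime();
            int n = in.read(buf);
            readNanos += System.nanoTime() - t0;
            if (n < 0) {
                break; // end of stream
            }
            // Time the write side on its own.
            long t1 = System.nanoTime();
            out.write(buf, 0, n);
            writeNanos += System.nanoTime() - t1;
            bytes += n;
        }
        long t2 = System.nanoTime();
        out.flush();
        writeNanos += System.nanoTime() - t2;
        return new Stats(bytes, readNanos, writeNanos);
    }

    public static void main(String[] args) throws IOException {
        // 20 MB of zeros stands in for the real file; no Hadoop involved here.
        byte[] data = new byte[20 * 1024 * 1024];
        ByteArrayOutputStream sink = new ByteArrayOutputStream(data.length);
        Stats s = pipe(new ByteArrayInputStream(data), sink, 64 * 1024);
        System.out.println(s.bytes + " bytes, read " + s.readNanos / 1_000_000
                + " ms, write " + s.writeNanos / 1_000_000 + " ms");
    }
}
```

If the FSDataInputStream returned by fs.open(inFile) is passed as `in`, a large read total would point at the HDFS side and a large write total at whatever `out` is connected to. Hadoop also provides org.apache.hadoop.io.IOUtils.copyBytes(in, out, buffSize, close) and FileSystem.copyToLocalFile(src, dst), which may be worth trying in place of a hand-written loop.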