Re: Does HDFS read blocks simultaneously in multi-threaded way?

2019-06-26 Thread Jeff Hubbs
I'm not sure I see the point of doing so, though. At a 128 MB block size, your only-1-GB file gets cut up into just 8 blocks; with replication set to the default of three, that is a mere 24 block replicas spread among some number of your worker nodes. The "multithreaded" part comes in when the various worker nodes are reading these blocks at onc
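
The block figures discussed in this thread (a 1 GB file, a 128 MB block size, replication of three) can be checked with a few lines of arithmetic. This is only an illustrative sketch; the helper name `block_counts` is hypothetical and is not part of any Hadoop API.

```python
def block_counts(file_size: int, block_size: int, replication: int) -> tuple[int, int]:
    """Return (logical blocks, total block replicas) for a file."""
    # Ceiling division: a final partial block still counts as one block.
    blocks = -(-file_size // block_size)
    return blocks, blocks * replication


MB = 1024 * 1024
blocks, replicas = block_counts(1024 * MB, 128 * MB, 3)
print(blocks, replicas)  # 8 logical blocks, 24 replicas stored across the cluster
```

A reader of the file only ever needs the 8 logical blocks; the 24 replicas describe how much is stored, not how much is read.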

Re: Does HDFS read blocks simultaneously in multi-threaded way?

2019-06-26 Thread Arpit Agarwal
Correct. The blocks will be read sequentially.

> On Jun 26, 2019, at 10:51 AM, Daegyu Han wrote:
>
> Thank you for your response.
>
> Assuming HDFS blocks (blk1~blk8) for file input.dat are on the local data node,
> does the map task read these blocks sequentially when trying to read local

Re: Does HDFS read blocks simultaneously in multi-threaded way?

2019-06-26 Thread Daegyu Han
Thank you for your response.

Assuming HDFS blocks (blk1~blk8) for file input.dat are on the local data node, does the map task read these blocks sequentially when trying to read local blocks?

On Thu, Jun 27, 2019 at 02:45, Arpit Agarwal wrote:
> HDFS reads blocks sequentially. We can implement a multi

Re: Does HDFS read blocks simultaneously in multi-threaded way?

2019-06-26 Thread Arpit Agarwal
HDFS reads blocks sequentially. We can implement a multi-threaded block reader in theory.

> On Jun 26, 2019, at 5:05 AM, Daegyu Han wrote:
>
> Hi all,
>
> Assuming HDFS has a 1GB file input.dat and a block size of 128MB.
>
> Can the user read multithreaded when reading the input.dat file?
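
To make the contrast between a sequential and a multi-threaded block read concrete, here is a minimal sketch that uses an ordinary local file as a stand-in for input.dat. It is not HDFS client code and does not reflect how the HDFS client is implemented; the function names (`block_ranges`, `read_range`, `read_sequential`, `read_parallel`) and the worker count are assumptions made for illustration. Each worker reads one block-sized byte range through its own file handle, and the chunks are joined in file order.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def block_ranges(file_size: int, block_size: int) -> list[tuple[int, int]]:
    """Return (offset, length) for each block-sized range of the file."""
    return [(off, min(block_size, file_size - off))
            for off in range(0, file_size, block_size)]


def read_range(path: str, offset: int, length: int) -> bytes:
    # Each call opens its own handle, so threads never share a seek position.
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length)


def read_sequential(path: str, block_size: int) -> bytes:
    """Read the ranges one after another, as a single stream reader would."""
    size = os.path.getsize(path)
    return b"".join(read_range(path, off, n)
                    for off, n in block_ranges(size, block_size))


def read_parallel(path: str, block_size: int, workers: int = 4) -> bytes:
    """Read the ranges concurrently and reassemble them in file order."""
    size = os.path.getsize(path)
    ranges = block_ranges(size, block_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in submission order, not completion order.
        chunks = pool.map(lambda r: read_range(path, r[0], r[1]), ranges)
        return b"".join(chunks)
```

Both functions return identical bytes; only the order in which the ranges are fetched differs. Whether the parallel version is actually faster depends on where the blocks live: ranges served by the same disk may gain little, while ranges served by different nodes can overlap their transfer time.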