Thanks Raghu:

Here is where it gets stuck:

"DataStreamer for file
/hbasetrunk2/.logs/aa0-000-13.u.powerset.com_1241988169615_60021/hlog.dat.1242020985471
block blk_-1659539029802462400_12649" daemon prio=10 tid=0x00007f10a0000c00
nid=0x660 in Object.wait() [0x0000000043a33000..0x0000000043a33c80]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        at java.lang.Object.wait(Object.java:485)
        at
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2322)
        - locked <0x00007f10e2b0c588> (a java.util.LinkedList)

Which corresponds to the wait below, in the middle of DataStreamer.run:

          // Is this block full?
          if (one.lastPacketInBlock) {
            synchronized (ackQueue) {
              while (!hasError && ackQueue.size() != 0 && clientRunning) {
                try {
                  ackQueue.wait();   // wait for acks to arrive from datanodes
                } catch (InterruptedException e) {
                }
              }
            }
          }

Sounds like, if we set the replication down from 3 to 2, it should write a
little faster.
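
If we go that route, here is a rough sketch of the two places the
replication could be dropped for a client like ours -- the client-wide
dfs.replication default and the per-file create call.  The path below is
just a placeholder, not our actual log path:

  import org.apache.hadoop.conf.Configuration;
  import org.apache.hadoop.fs.FSDataOutputStream;
  import org.apache.hadoop.fs.FileSystem;
  import org.apache.hadoop.fs.Path;

  public class ReplicationTwoSketch {
    public static void main(String[] args) throws Exception {
      Configuration conf = new Configuration();
      // Client-wide default replication for files created through this conf.
      conf.setInt("dfs.replication", 2);
      FileSystem fs = FileSystem.get(conf);

      // Or set it per file at create time:
      // (path, overwrite, bufferSize, replication, blockSize).
      Path p = new Path("/tmp/example-hlog.dat");  // placeholder path
      FSDataOutputStream out = fs.create(p, true, 64 * 1024, (short) 2,
          fs.getDefaultBlockSize());
      out.writeBytes("test");
      out.close();

      // Existing files can be dropped down after the fact too.
      fs.setReplication(p, (short) 2);
    }
  }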

Regards increasing the size of the ackQueue, are you thinking maxPackets?
Currently it's hardcoded at 80 -- a queue of 5MB (packets are 64k).  You
thinking I should experiment with that?  I suppose that won't help much w/
getting my writes onto the datanode.  Maybe I should be digging on the
datanode side to figure out why it's slow getting back to the client?
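
Just to have the numbers in front of us, here is the back-of-envelope
arithmetic for what bumping maxPackets would cost in client heap per open
file (the candidate values beyond 80 are made up for illustration; the
experiment itself would mean swapping the hardcoded 80 for a configurable
value):

  public class PacketQueueMath {
    public static void main(String[] args) {
      int packetSizeBytes = 64 * 1024;             // packet size quoted above
      int[] maxPacketsCandidates = {80, 160, 320}; // 80 is today's hardcoded cap
      for (int maxPackets : maxPacketsCandidates) {
        long bytes = (long) maxPackets * packetSizeBytes;
        System.out.printf("maxPackets=%d -> %d MB of unacked packets buffered%n",
            maxPackets, bytes >> 20);
      }
    }
  }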

Thanks,
St.Ack

On Sun, May 10, 2009 at 7:49 PM, Raghu Angadi <rang...@yahoo-inc.com> wrote:

>
> It should not be waiting unnecessarily. But the client has to, if any of
> the datanodes in the pipeline is not able to receive the data as fast as
> the client is writing. IOW writing goes as fast as the slowest of the
> nodes involved in the pipeline (1 client and 3 datanodes).
>
> But based on what your case is, you probably could benefit by increasing
> the buffer (number of unacked packets). It would depend on where the
> DataStreamer thread is blocked.
>
> Raghu.
>
>
> stack wrote:
>
>> Writing a file, our application spends a load of time here:
>>
>>        at java.lang.Object.wait(Native Method)
>>        at java.lang.Object.wait(Object.java:485)
>>        at
>>
>> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.writeChunk(DFSClient.java:2964)
>>        - locked <0x00007f11054c2b68> (a java.util.LinkedList)
>>        - locked <0x00007f11054c24c0> (a
>> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream)
>>        at
>>
>> org.apache.hadoop.fs.FSOutputSummer.writeChecksumChunk(FSOutputSummer.java:150)
>>        at
>> org.apache.hadoop.fs.FSOutputSummer.flushBuffer(FSOutputSummer.java:132)
>>        - locked <0x00007f11054c24c0> (a
>> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream)
>>        at
>> org.apache.hadoop.fs.FSOutputSummer.flushBuffer(FSOutputSummer.java:121)
>>        - locked <0x00007f11054c24c0> (a
>> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream)
>>        at
>> org.apache.hadoop.fs.FSOutputSummer.write1(FSOutputSummer.java:112)
>>        at
>> org.apache.hadoop.fs.FSOutputSummer.write(FSOutputSummer.java:86)
>>        - locked <0x00007f11054c24c0> (a
>> org.apache.hadoop.hdfs.DFSClient$DFSOutputStream)
>>        at
>>
>> org.apache.hadoop.fs.FSDataOutputStream$PositionCache.write(FSDataOutputStream.java:49)
>>        at java.io.DataOutputStream.write(DataOutputStream.java:90)
>>        - locked <0x00007f1105694f28> (a
>> org.apache.hadoop.fs.FSDataOutputStream)
>>        at
>> org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:1020)
>>        - locked <0x00007f1105694e98> (a
>> org.apache.hadoop.io.SequenceFile$Writer)
>>        at
>> org.apache.hadoop.io.SequenceFile$Writer.append(SequenceFile.java:984)
>>
>> Here is the code from around line 2964 in writeChunk.
>>
>>        // If queue is full, then wait till we can create enough space
>>        while (!closed && dataQueue.size() + ackQueue.size() > maxPackets) {
>>          try {
>>            dataQueue.wait();
>>          } catch (InterruptedException e) {
>>          }
>>        }
>>
>> The queue of packets is full and we're waiting for it to be cleared.
>>
>> Any suggestions for how I might get the DataStreamer to act more promptly
>> clearing the packet queue?
>>
>> This is the hadoop 0.20 branch.  It's a small cluster but relatively
>> lightly loaded (so says ganglia).
>>
>> Thanks,
>> St.Ack
>>
>>
>
