[jira] Created: (HDFS-895) Allow hflush/sync to occur in parallel with new writes to the file

2010-01-13 Thread dhruba borthakur (JIRA)
Allow hflush/sync to occur in parallel with new writes to the file
--

 Key: HDFS-895
 URL: https://issues.apache.org/jira/browse/HDFS-895
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: hdfs client
Reporter: dhruba borthakur


In the current trunk, the HDFS client methods writeChunk() and hflush/sync are 
synchronized. This means that if an hflush/sync is in progress, an application 
cannot write data to the HDFS client buffer. This reduces the write throughput 
of the transaction log in HBase. 

The hflush/sync should allow new writes to happen to the HDFS client even when 
an hflush/sync is in progress. It can record the seqno of the message for which 
it should receive the ack, indicate to the DataStream thread to start flushing 
those messages, exit the synchronized section, and just wait for that ack to 
arrive.
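The decoupling described above can be sketched as follows. This is a hypothetical illustration of the proposed design, not the actual DFSOutputStream code; all names (OutputStreamSketch, lastQueuedSeqno, ackReceived, waitForAckedSeqno) are made up for the example.

```java
// Hypothetical sketch: hflush records the seqno to wait for inside the
// writeChunk lock, then waits for the ack outside it, so new writes can
// proceed while the flush is in flight.
class OutputStreamSketch {
    private long lastQueuedSeqno = 0;
    private long lastAckedSeqno = 0;
    private final Object ackMonitor = new Object();

    synchronized void writeChunk(byte[] data) {
        // Buffer the data and assign it the next sequence number.
        lastQueuedSeqno++;
    }

    void hflush() {
        long toWaitFor;
        synchronized (this) {
            // Record the seqno to wait for and (in the real client)
            // signal the DataStreamer thread to start flushing.
            toWaitFor = lastQueuedSeqno;
        }
        // Wait outside the writeChunk lock, so writers are not blocked.
        waitForAckedSeqno(toWaitFor);
    }

    private void waitForAckedSeqno(long seqno) {
        synchronized (ackMonitor) {
            while (lastAckedSeqno < seqno) {
                try {
                    ackMonitor.wait();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
            }
        }
    }

    // Called by the (hypothetical) response-processing thread when a
    // pipeline ack arrives for the given sequence number.
    void ackReceived(long seqno) {
        synchronized (ackMonitor) {
            lastAckedSeqno = seqno;
            ackMonitor.notifyAll();
        }
    }
}
```

The key point is that only the seqno bookkeeping happens under the writeChunk lock; the potentially long wait for the pipeline ack does not.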

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



Synchronization problem in FSNameSystemMetrics

2010-01-13 Thread Zlatin.Balevsky
Hello all,

Unless I'm missing something, the synchronization in FSNameSystemMetrics
on the 0.20 branch does not appear correct.  It locks on itself while pulling
various metrics from the FSNameSystem object.  For example,
FSNameSystem.getBlocksTotal() delegates to BlocksMap.size(), which
delegates to the contained HashMap object.  None of those methods are
synchronized on FSNameSystem, so one cannot get accurate data for that
metric.  Is this intentional?
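The mismatch being described can be illustrated with a simplified stand-in; the class and method names below mirror the report but are not the real 0.20 code.

```java
// Illustrative sketch: the metrics object synchronizes on itself, which
// does nothing to make the read of an unsynchronized HashMap safe, since
// writers do not hold (or even know about) that lock.
import java.util.HashMap;
import java.util.Map;

class FSNamesystemSketch {
    private final Map<Long, Object> blocksMap = new HashMap<>();

    // NOT synchronized: delegates straight to HashMap.size().
    int getBlocksTotal() {
        return blocksMap.size();
    }

    void addBlock(long id) {
        blocksMap.put(id, new Object());
    }
}

class MetricsSketch {
    private final FSNamesystemSketch fs;

    MetricsSketch(FSNamesystemSketch fs) {
        this.fs = fs;
    }

    // Locks on the metrics object itself; a concurrent addBlock() on
    // another thread is not excluded, so this read can observe the map
    // mid-update.
    synchronized int snapshotBlocksTotal() {
        return fs.getBlocksTotal();
    }
}
```

For the snapshot to be accurate, the read would have to synchronize on the same monitor the writers use (FSNameSystem), or the map would have to be a thread-safe structure.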

Zlatin 


[jira] Created: (HDFS-897) ReplicasMap remove has a bug in generation stamp comparison

2010-01-13 Thread Suresh Srinivas (JIRA)
ReplicasMap remove has a bug in generation stamp comparison
---

 Key: HDFS-897
 URL: https://issues.apache.org/jira/browse/HDFS-897
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: data-node
Affects Versions: 0.21.0, 0.22.0
Reporter: Suresh Srinivas
Assignee: Suresh Srinivas
 Fix For: 0.21.0, 0.22.0


In {{ReplicasMap.remove(block)}}, instead of comparing generation stamp of the 
entry in the map to that of the given block, the generation stamp of entry in 
the map is compared to itself. 
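The bug pattern can be shown with a minimal stand-in; this is a simplified illustration, not the actual ReplicasMap code, and the class and field names are made up for the example.

```java
// Minimal illustration of the comparison bug: the stored entry's
// generation stamp is compared to itself instead of to the caller's
// stamp, so the guard always passes.
import java.util.HashMap;
import java.util.Map;

class ReplicasMapSketch {
    static class Replica {
        final long blockId;
        final long genStamp;

        Replica(long blockId, long genStamp) {
            this.blockId = blockId;
            this.genStamp = genStamp;
        }
    }

    private final Map<Long, Replica> map = new HashMap<>();

    void add(Replica r) {
        map.put(r.blockId, r);
    }

    // Buggy: entry.genStamp == entry.genStamp is trivially true, so a
    // replica is removed even when the caller passes a stale stamp.
    Replica removeBuggy(long blockId, long genStamp) {
        Replica entry = map.get(blockId);
        if (entry != null && entry.genStamp == entry.genStamp) {
            return map.remove(blockId);
        }
        return null;
    }

    // Fixed: compare the entry's stamp to the caller-supplied stamp.
    Replica removeFixed(long blockId, long genStamp) {
        Replica entry = map.get(blockId);
        if (entry != null && entry.genStamp == genStamp) {
            return map.remove(blockId);
        }
        return null;
    }
}
```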

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Reopened: (HDFS-520) Create new tests for block recovery

2010-01-13 Thread Hairong Kuang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hairong Kuang reopened HDFS-520:



I am reopening this. Block recovery truncates data and thus carries the risk 
of losing data. Making sure that the tests cover all the cases is important.

 Create new tests for block recovery
 ---

 Key: HDFS-520
 URL: https://issues.apache.org/jira/browse/HDFS-520
 Project: Hadoop HDFS
  Issue Type: Sub-task
  Components: test
Reporter: Konstantin Boudnik
Assignee: Konstantin Boudnik

 According to the test plan, a number of new features are going to be 
 implemented as part of this umbrella JIRA (HDFS-265).
 These new features have to be tested properly. Block recovery is one of 
 the new pieces of functionality that requires new tests to be developed.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



Re: libhdfs install error

2010-01-13 Thread Eli Collins
            Any idea of what is wrong in the process?

I'd build using ant:

ant -Dcompile.c++=true -Dlibhdfs=true compile-c++-libhdfs

Thanks,
Eli


[jira] Created: (HDFS-898) Sequential generation of block ids

2010-01-13 Thread Konstantin Shvachko (JIRA)
Sequential generation of block ids
--

 Key: HDFS-898
 URL: https://issues.apache.org/jira/browse/HDFS-898
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: name-node
Affects Versions: 0.20.1
Reporter: Konstantin Shvachko
Assignee: Konstantin Shvachko
 Fix For: 0.22.0


This is a proposal to replace random generation of block ids with a sequential 
generator in order to avoid block id reuse in the future.
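A minimal sketch of what such a generator could look like; this is an assumption about the proposal's shape, not the actual implementation, and the class name is invented for the example.

```java
// Hypothetical sketch: a monotonically increasing counter for block ids.
// Starting above the highest id already allocated (e.g. recovered from
// the fsimage at namenode startup) guarantees ids are never reused,
// unlike random generation, which can collide with past ids.
import java.util.concurrent.atomic.AtomicLong;

class SequentialBlockIdGenerator {
    private final AtomicLong lastId;

    SequentialBlockIdGenerator(long lastAllocatedId) {
        this.lastId = new AtomicLong(lastAllocatedId);
    }

    long nextBlockId() {
        return lastId.incrementAndGet();
    }
}
```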

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (HDFS-900) Corrupt replicas are not tracked correctly through block report from DN

2010-01-13 Thread Todd Lipcon (JIRA)
Corrupt replicas are not tracked correctly through block report from DN
---

 Key: HDFS-900
 URL: https://issues.apache.org/jira/browse/HDFS-900
 Project: Hadoop HDFS
  Issue Type: Bug
Affects Versions: 0.22.0
Reporter: Todd Lipcon
Priority: Critical
 Attachments: log-commented, to-reproduce.patch

This one is tough to describe, but essentially the following order of events is 
seen to occur:

# A client marks one replica of a block to be corrupt by telling the NN about it
# Replication is then scheduled to make a new replica of this block
# The replication completes, such that there are now 3 good replicas and 1 
corrupt replica
# The DN holding the corrupt replica sends a block report. Rather than telling 
this DN to delete the node, the NN instead marks this as a new *good* replica 
of the block, and schedules deletion on one of the good replicas.

I don't know if this is a dataloss bug in the case of 1 corrupt replica with 
dfs.replication=2, but it seems feasible. I will attach a debug log with some 
commentary marked by '', plus a unit test patch which I can get to 
reproduce this behavior reliably. (it's not a proper unit test, just some edits 
to an existing one to show it)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.