[jira] Updated: (HDFS-923) libhdfs hdfs_read example uses hdfsRead wrongly

2010-04-02 Thread Ruyue Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruyue Ma updated HDFS-923:
--

Status: Open  (was: Patch Available)

 libhdfs hdfs_read example uses hdfsRead wrongly
 ---

 Key: HDFS-923
 URL: https://issues.apache.org/jira/browse/HDFS-923
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: contrib/libhdfs
Reporter: Ruyue Ma
Assignee: Ruyue Ma
 Fix For: 0.21.0

 Attachments: hdfs-923.patch


 In the libhdfs examples, hdfs_read.c uses hdfsRead incorrectly:
 {noformat}
 // read from the file
 tSize curSize = bufferSize;
 for (; curSize == bufferSize;) {
 curSize = hdfsRead(fs, readFile, (void*)buffer, curSize);
 }
 {noformat}
 The loop condition curSize == bufferSize is wrong: hdfsRead, like read(), may return fewer bytes than requested before reaching EOF, so this loop can stop reading before the end of the file.
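
A corrected loop (a minimal sketch only, not the attached patch itself) would keep calling hdfsRead until it reports EOF or an error, rather than stopping on the first short read. It assumes the usual libhdfs contract that hdfsRead returns the number of bytes read, 0 at end of file, and -1 on error, and reuses the fs, readFile, buffer, and bufferSize variables from the example above:

{noformat}
// Keep reading until hdfsRead reports EOF (0) or an error (-1);
// a return value smaller than bufferSize does not by itself mean EOF.
tSize curSize;
do {
    curSize = hdfsRead(fs, readFile, (void*)buffer, bufferSize);
    if (curSize == -1) {
        fprintf(stderr, "hdfsRead failed, errno = %d\n", errno);
        break;
    }
    // ... process curSize bytes from buffer here ...
} while (curSize > 0);
{noformat}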

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-923) libhdfs hdfs_read example uses hdfsRead wrongly

2010-04-02 Thread Ruyue Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruyue Ma updated HDFS-923:
--

Affects Version/s: (was: 0.20.1)
   Status: Patch Available  (was: Open)

 libhdfs hdfs_read example uses hdfsRead wrongly
 ---

 Key: HDFS-923
 URL: https://issues.apache.org/jira/browse/HDFS-923
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: contrib/libhdfs
Reporter: Ruyue Ma
Assignee: Ruyue Ma
 Fix For: 0.21.0

 Attachments: hdfs-923.patch


 In the libhdfs examples, hdfs_read.c uses hdfsRead incorrectly:
 {noformat}
 // read from the file
 tSize curSize = bufferSize;
 for (; curSize == bufferSize;) {
 curSize = hdfsRead(fs, readFile, (void*)buffer, curSize);
 }
 {noformat}
 The loop condition curSize == bufferSize is wrong: hdfsRead, like read(), may return fewer bytes than requested before reaching EOF, so this loop can stop reading before the end of the file.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-923) libhdfs hdfs_read example uses hdfsRead wrongly

2010-04-02 Thread Ruyue Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruyue Ma updated HDFS-923:
--

Attachment: hdfs-923.patch

trunk patch

 libhdfs hdfs_read example uses hdfsRead wrongly
 ---

 Key: HDFS-923
 URL: https://issues.apache.org/jira/browse/HDFS-923
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: contrib/libhdfs
Reporter: Ruyue Ma
Assignee: Ruyue Ma
 Fix For: 0.21.0

 Attachments: hdfs-923.patch


 In the libhdfs examples, hdfs_read.c uses hdfsRead incorrectly:
 {noformat}
 // read from the file
 tSize curSize = bufferSize;
 for (; curSize == bufferSize;) {
 curSize = hdfsRead(fs, readFile, (void*)buffer, curSize);
 }
 {noformat}
 The loop condition curSize == bufferSize is wrong: hdfsRead, like read(), may return fewer bytes than requested before reaching EOF, so this loop can stop reading before the end of the file.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Updated: (HDFS-923) libhdfs hdfs_read example uses hdfsRead wrongly

2010-04-02 Thread Ruyue Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-923?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruyue Ma updated HDFS-923:
--

Status: Patch Available  (was: Open)

 libhdfs hdfs_read example uses hdfsRead wrongly
 ---

 Key: HDFS-923
 URL: https://issues.apache.org/jira/browse/HDFS-923
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: contrib/libhdfs
Reporter: Ruyue Ma
Assignee: Ruyue Ma
 Fix For: 0.21.0

 Attachments: hdfs-923.patch


 In the libhdfs examples, hdfs_read.c uses hdfsRead incorrectly:
 {noformat}
 // read from the file
 tSize curSize = bufferSize;
 for (; curSize == bufferSize;) {
 curSize = hdfsRead(fs, readFile, (void*)buffer, curSize);
 }
 {noformat}
 The loop condition curSize == bufferSize is wrong: hdfsRead, like read(), may return fewer bytes than requested before reaching EOF, so this loop can stop reading before the end of the file.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-923) libhdfs hdfs_read example uses hdfsRead wrongly

2010-01-27 Thread Ruyue Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12805809#action_12805809
 ] 

Ruyue Ma commented on HDFS-923:
---

The latest modification of the hdfsRead API implementation:

{noformat}
tSize hdfsRead(hdfsFS fs, hdfsFile f, void* buffer, tSize length)
{
    // JAVA EQUIVALENT:
    //  byte [] bR = new byte[length];
    //  fis.read(bR);

    //Get the JNIEnv* corresponding to current thread
    JNIEnv* env = getJNIEnv();
    if (env == NULL) {
      errno = EINTERNAL;
      return -1;
    }

    //Parameters
    jobject jInputStream = (jobject)(f ? f->file : NULL);

    jbyteArray jbRarray;
    jint noReadBytes = 0;
    jvalue jVal;
    jthrowable jExc = NULL;

    int hasReadBytes = 0;

    //Sanity check
    if (!f || f->type == UNINITIALIZED) {
        errno = EBADF;
        return -1;
    }

    //Error checking... make sure that this file is 'readable'
    if (f->type != INPUT) {
        fprintf(stderr, "Cannot read from a non-InputStream object!\n");
        errno = EINVAL;
        return -1;
    }

    /////////////////////////////////////////////
    //  OUR MODIFICATION
    int exception = 0;
    jbRarray = (*env)->NewByteArray(env, length);
    while (hasReadBytes < length) {
        if (invokeMethod(env, &jVal, &jExc, INSTANCE, jInputStream, HADOOP_ISTRM,
                         "read", JMETHOD3("[B", "I", "I", "I"), jbRarray,
                         hasReadBytes, length - hasReadBytes) != 0) {
            errno = errnoFromException(jExc, env, "org.apache.hadoop.fs."
                                       "FSDataInputStream::read");
            exception = 1;
            break;
        }
        else {
            noReadBytes = jVal.i;
            if (noReadBytes >= 0) {
                (*env)->GetByteArrayRegion(env, jbRarray, 0, noReadBytes,
                                           buffer + hasReadBytes);
                hasReadBytes += noReadBytes;
            } else {
                //This is a valid case: there aren't any bytes left to read!
                break;
            }
            errno = 0;
        }
    }
    /////////////////////////////////////////////
    //  OUR MODIFICATION

    destroyLocalReference(env, jbRarray);
    if (exception == 1) return -1;
    return hasReadBytes;
}
{noformat}

 libhdfs hdfs_read example uses hdfsRead wrongly
 ---

 Key: HDFS-923
 URL: https://issues.apache.org/jira/browse/HDFS-923
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: contrib/libhdfs
Affects Versions: 0.20.1
Reporter: Ruyue Ma
Assignee: Ruyue Ma
 Fix For: 0.21.0


 In the libhdfs examples, hdfs_read.c uses hdfsRead incorrectly:
 {noformat}
 // read from the file
 tSize curSize = bufferSize;
 for (; curSize == bufferSize;) {
 curSize = hdfsRead(fs, readFile, (void*)buffer, curSize);
 }
 {noformat}
 The loop condition curSize == bufferSize is wrong: hdfsRead, like read(), may return fewer bytes than requested before reaching EOF, so this loop can stop reading before the end of the file.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (HDFS-923) libhdfs hdfs_read example uses hdfsRead wrongly

2010-01-26 Thread Ruyue Ma (JIRA)
libhdfs hdfs_read example uses hdfsRead wrongly
---

 Key: HDFS-923
 URL: https://issues.apache.org/jira/browse/HDFS-923
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: contrib/libhdfs
Affects Versions: 0.20.1
Reporter: Ruyue Ma
Assignee: Ruyue Ma
 Fix For: 0.21.0


In the libhdfs examples, hdfs_read.c uses hdfsRead incorrectly:

{noformat}
// read from the file
tSize curSize = bufferSize;
for (; curSize == bufferSize;) {
curSize = hdfsRead(fs, readFile, (void*)buffer, curSize);
}
{noformat}

The loop condition curSize == bufferSize is wrong: hdfsRead, like read(), may return fewer bytes than requested before reaching EOF, so this loop can stop reading before the end of the file.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-923) libhdfs hdfs_read example uses hdfsRead wrongly

2010-01-26 Thread Ruyue Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12804975#action_12804975
 ] 

Ruyue Ma commented on HDFS-923:
---

Our modification for hdfs.c is: 

{noformat}
tSize hdfsRead(hdfsFS fs, hdfsFile f, void* buffer, tSize length)
{
    // JAVA EQUIVALENT:
    //  byte [] bR = new byte[length];
    //  fis.read(bR);

    //Get the JNIEnv* corresponding to current thread
    JNIEnv* env = getJNIEnv();
    if (env == NULL) {
      errno = EINTERNAL;
      return -1;
    }

    //Parameters
    jobject jInputStream = (jobject)(f ? f->file : NULL);

    jbyteArray jbRarray;
    jint noReadBytes = 0;
    jvalue jVal;
    jthrowable jExc = NULL;

    int hasReadBytes = 0;

    //Sanity check
    if (!f || f->type == UNINITIALIZED) {
        errno = EBADF;
        return -1;
    }

    //Error checking... make sure that this file is 'readable'
    if (f->type != INPUT) {
        fprintf(stderr, "Cannot read from a non-InputStream object!\n");
        errno = EINVAL;
        return -1;
    }

    /////////////////////////////////////////////
    //  OUR MODIFICATION
    jbRarray = (*env)->NewByteArray(env, length);
    while (hasReadBytes < length) {
        if (invokeMethod(env, &jVal, &jExc, INSTANCE, jInputStream, HADOOP_ISTRM,
                         "read", JMETHOD3("[B", "I", "I", "I"), jbRarray,
                         hasReadBytes, length - hasReadBytes) != 0) {
            errno = errnoFromException(jExc, env, "org.apache.hadoop.fs."
                                       "FSDataInputStream::read");
            noReadBytes = -1;
        }
        else {
            noReadBytes = jVal.i;
            if (noReadBytes >= 0) {
                (*env)->GetByteArrayRegion(env, jbRarray, 0, noReadBytes,
                                           buffer + hasReadBytes);
                hasReadBytes += noReadBytes;
            } else {
                //This is a valid case: there aren't any bytes left to read!
                break;
            }
            errno = 0;
        }
    }
    //  OUR MODIFICATION
    /////////////////////////////////////////////

    destroyLocalReference(env, jbRarray);
    return hasReadBytes;
}
{noformat}

 libhdfs hdfs_read example uses hdfsRead wrongly
 ---

 Key: HDFS-923
 URL: https://issues.apache.org/jira/browse/HDFS-923
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: contrib/libhdfs
Affects Versions: 0.20.1
Reporter: Ruyue Ma
Assignee: Ruyue Ma
 Fix For: 0.21.0


 In the libhdfs examples, hdfs_read.c uses hdfsRead incorrectly:
 {noformat}
 // read from the file
 tSize curSize = bufferSize;
 for (; curSize == bufferSize;) {
 curSize = hdfsRead(fs, readFile, (void*)buffer, curSize);
 }
 {noformat}
 The loop condition curSize == bufferSize is wrong: hdfsRead, like read(), may return fewer bytes than requested before reaching EOF, so this loop can stop reading before the end of the file.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (HDFS-924) Support reading and writing sequencefile in libhdfs

2010-01-26 Thread Ruyue Ma (JIRA)
Support reading and writing sequencefile in libhdfs 


 Key: HDFS-924
 URL: https://issues.apache.org/jira/browse/HDFS-924
 Project: Hadoop HDFS
  Issue Type: New Feature
Reporter: Ruyue Ma
Assignee: Ruyue Ma
Priority: Minor


Some use cases may need to read and write SequenceFiles through libhdfs.

We should provide SequenceFile read and write APIs in libhdfs.
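
As a rough illustration of the shape such an API could take (every name below is hypothetical and not an existing libhdfs function), the header might expose an opaque handle plus open/next/append/close calls, mirroring hdfsFile and the Java SequenceFile.Reader/Writer:

{noformat}
/* Hypothetical prototypes only -- a sketch of a possible SequenceFile API for libhdfs. */
typedef void* hdfsSequenceFile;   /* opaque handle wrapping a SequenceFile.Reader or Writer via JNI */

hdfsSequenceFile hdfsSeqFileOpenForRead(hdfsFS fs, const char* path);
hdfsSequenceFile hdfsSeqFileOpenForWrite(hdfsFS fs, const char* path,
                                         const char* keyClassName, const char* valueClassName);
/* Returns 1 if a record was read, 0 at end of file, -1 on error. */
int hdfsSeqFileNext(hdfsSequenceFile file, void* keyBuf, int keyBufLen,
                    void* valueBuf, int valueBufLen);
int hdfsSeqFileAppend(hdfsSequenceFile file, const void* key, int keyLen,
                      const void* value, int valueLen);
int hdfsSeqFileClose(hdfsSequenceFile file);
{noformat}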

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-924) Support reading and writing sequencefile in libhdfs

2010-01-26 Thread Ruyue Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12804979#action_12804979
 ] 

Ruyue Ma commented on HDFS-924:
---

We have implemented it using JNI.

 Support reading and writing sequencefile in libhdfs 
 

 Key: HDFS-924
 URL: https://issues.apache.org/jira/browse/HDFS-924
 Project: Hadoop HDFS
  Issue Type: New Feature
Reporter: Ruyue Ma
Assignee: Ruyue Ma
Priority: Minor

 Some use cases may need to read and write SequenceFiles through libhdfs.
 We should provide SequenceFile read and write APIs in libhdfs.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-923) libhdfs hdfs_read example uses hdfsRead wrongly

2010-01-26 Thread Ruyue Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-923?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12805333#action_12805333
 ] 

Ruyue Ma commented on HDFS-923:
---

First, to resolve this problem, we should decide whether the current
hdfsRead API behaves correctly.

I suggest the following:

If the length returned by hdfsRead is smaller than the requested buffer length, the caller
should be able to conclude that the file has reached EOF.

Alternatively, we could provide another API: hdfsReadFully().


Any suggestions?
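
To make the hdfsReadFully() idea above concrete, here is a minimal client-side sketch layered on the existing hdfsRead call; the function name and signature are only the proposal being discussed here, not an existing libhdfs API, and it assumes hdfsRead returns 0 at EOF and -1 on error:

{noformat}
/* Sketch only: read exactly `length` bytes unless EOF or an error occurs.
 * Returns the number of bytes actually read, or -1 on error. */
tSize hdfsReadFully(hdfsFS fs, hdfsFile f, void* buffer, tSize length)
{
    tSize total = 0;
    while (total < length) {
        tSize n = hdfsRead(fs, f, (char*)buffer + total, length - total);
        if (n == -1) {
            return -1;      /* error; errno has been set by hdfsRead */
        }
        if (n == 0) {
            break;          /* EOF reached before `length` bytes were available */
        }
        total += n;
    }
    return total;
}
{noformat}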

 libhdfs hdfs_read example uses hdfsRead wrongly
 ---

 Key: HDFS-923
 URL: https://issues.apache.org/jira/browse/HDFS-923
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: contrib/libhdfs
Affects Versions: 0.20.1
Reporter: Ruyue Ma
Assignee: Ruyue Ma
 Fix For: 0.21.0


 In the libhdfs examples, hdfs_read.c uses hdfsRead incorrectly:
 {noformat}
 // read from the file
 tSize curSize = bufferSize;
 for (; curSize == bufferSize;) {
 curSize = hdfsRead(fs, readFile, (void*)buffer, curSize);
 }
 {noformat}
 The loop condition curSize == bufferSize is wrong: hdfsRead, like read(), may return fewer bytes than requested before reaching EOF, so this loop can stop reading before the end of the file.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (HDFS-630) In DFSOutputStream.nextBlockOutputStream(), the client can exclude specific datanodes when locating the next block.

2009-09-18 Thread Ruyue Ma (JIRA)
In DFSOutputStream.nextBlockOutputStream(), the client can exclude specific 
datanodes when locating the next block.
---

 Key: HDFS-630
 URL: https://issues.apache.org/jira/browse/HDFS-630
 Project: Hadoop HDFS
  Issue Type: New Feature
  Components: hdfs client
Affects Versions: 0.20.1, 0.21.0
Reporter: Ruyue Ma
Assignee: Ruyue Ma
Priority: Minor
 Fix For: 0.21.0


Created from HDFS-200.

If, during a write, the dfsclient sees that a block replica location for a newly
allocated block is not connectable, it re-requests the NN to get a fresh set of
replica locations for the block. It tries this dfs.client.block.write.retries
times (default 3), sleeping 6 seconds between each retry (see
DFSClient.nextBlockOutputStream).

This setting works well when you have a reasonably sized cluster; if you have only a few
datanodes in the cluster, every retry may pick the dead datanode and the
above logic bails out.

Our solution: when requesting block locations from the namenode, the client passes the NN the
list of datanodes to exclude. The list of dead datanodes applies only to a single block allocation.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-495) Hadoop FSNamesystem startFileInternal() getLease() has bug

2009-09-17 Thread Ruyue Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12756979#action_12756979
 ] 

Ruyue Ma commented on HDFS-495:
---

Sorry, I am only seeing this now.

I will provide the patch next week.

 Hadoop FSNamesystem startFileInternal() getLease() has bug
 --

 Key: HDFS-495
 URL: https://issues.apache.org/jira/browse/HDFS-495
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: name-node
Affects Versions: 0.20.1
Reporter: Ruyue Ma
Priority: Minor
 Fix For: 0.20.2


 Original Code:
 //
 // If the file is under construction, then it must be in our
 // leases. Find the appropriate lease record.
 //
 Lease lease = leaseManager.getLease(new StringBytesWritable(holder));
 //
 // We found the lease for this file. And surprisingly the original
 // holder is trying to recreate this file. This should never occur.
 //
 if (lease != null) {
   throw new AlreadyBeingCreatedException(
       "failed to create file " + src + " for " + holder +
       " on client " + clientMachine +
       " because current leaseholder is trying to recreate file.");
 }
 Problem: if another client (which already holds some file leases) tries to recreate the
 under-construction file, it cannot trigger lease recovery.
 Reason: we should instead do:
 if (new StringBytesWritable(holder).equals(pendingFile.clientName)) {
   throw new AlreadyBeingCreatedException(
       "failed to create file " + src + " for " + holder +
       " on client " + clientMachine +
       " because current leaseholder is trying to recreate file.");
 }

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HDFS-200) In HDFS, sync() not yet guarantees data available to the new readers

2009-07-21 Thread Ruyue Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12733502#action_12733502
 ] 

Ruyue Ma commented on HDFS-200:
---

to: dhruba borthakur 

 This is not related to HDFS-4379. Let me explain why.
 The problem is actually related to HDFS-xxx. The namenode waits for 10
 minutes after losing heartbeats from a datanode before declaring it dead. During
 these 10 minutes, the NN is free to choose the dead datanode as a possible
 replica for a newly allocated block.

 If, during a write, the dfsclient sees that a block replica location for a
 newly allocated block is not connectable, it re-requests the NN to get a
 fresh set of replica locations for the block. It tries this
 dfs.client.block.write.retries times (default 3), sleeping 6 seconds between
 each retry (see DFSClient.nextBlockOutputStream). This setting works well
 when you have a reasonably sized cluster; if you have only 4 datanodes in the
 cluster, every retry picks the dead datanode and the above logic bails out.

 One solution is to change the value of dfs.client.block.write.retries to a
 much, much larger value, say 200 or so. Better still, increase the number of
 nodes in your cluster.

Our modification: when requesting block locations from the namenode, we pass the NN the
list of excluded datanodes. The list of dead datanodes applies only to a single block
allocation.

+++ hadoop-new/src/hdfs/org/apache/hadoop/hdfs/DFSClient.java   2009-07-20 00:19:03.0 +0800
@@ -2734,6 +2734,7 @@
       LocatedBlock lb = null;
       boolean retry = false;
       DatanodeInfo[] nodes;
+      DatanodeInfo[] exludedNodes = null;
       int count = conf.getInt("dfs.client.block.write.retries", 3);
       boolean success;
       do {
@@ -2745,7 +2746,7 @@
         success = false;

         long startTime = System.currentTimeMillis();
-        lb = locateFollowingBlock(startTime);
+        lb = locateFollowingBlock(startTime, exludedNodes);
         block = lb.getBlock();
         nodes = lb.getLocations();

@@ -2755,6 +2756,19 @@
         success = createBlockOutputStream(nodes, clientName, false);

         if (!success) {
+
+          LOG.info("Excluding node: " + nodes[errorIndex]);
+          // Mark datanode as excluded
+          DatanodeInfo errorNode = nodes[errorIndex];
+          if (exludedNodes != null) {
+            DatanodeInfo[] newExcludedNodes = new DatanodeInfo[exludedNodes.length + 1];
+            System.arraycopy(exludedNodes, 0, newExcludedNodes, 0, exludedNodes.length);
+            newExcludedNodes[exludedNodes.length] = errorNode;
+            exludedNodes = newExcludedNodes;
+          } else {
+            exludedNodes = new DatanodeInfo[] { errorNode };
+          }
+
           LOG.info("Abandoning block " + block);
           namenode.abandonBlock(block, src, clientName);

 In HDFS, sync() not yet guarantees data available to the new readers
 

 Key: HDFS-200
 URL: https://issues.apache.org/jira/browse/HDFS-200
 Project: Hadoop HDFS
  Issue Type: New Feature
Reporter: Tsz Wo (Nicholas), SZE
Assignee: dhruba borthakur
Priority: Blocker
 Attachments: 4379_20081010TC3.java, fsyncConcurrentReaders.txt, 
 fsyncConcurrentReaders11_20.txt, fsyncConcurrentReaders12_20.txt, 
 fsyncConcurrentReaders3.patch, fsyncConcurrentReaders4.patch, 
 fsyncConcurrentReaders5.txt, fsyncConcurrentReaders6.patch, 
 fsyncConcurrentReaders9.patch, 
 hadoop-stack-namenode-aa0-000-12.u.powerset.com.log.gz, 
 hypertable-namenode.log.gz, namenode.log, namenode.log, Reader.java, 
 Reader.java, reopen_test.sh, ReopenProblem.java, Writer.java, Writer.java


 In the append design doc 
 (https://issues.apache.org/jira/secure/attachment/12370562/Appends.doc), it 
 says
 * A reader is guaranteed to be able to read data that was 'flushed' before 
 the reader opened the file
 However, this feature is not yet implemented.  Note that the operation 
 'flushed' is now called sync.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (HDFS-495) Hadoop FSNamesystem startFileInternal() getLease() has bug

2009-07-20 Thread Ruyue Ma (JIRA)
Hadoop FSNamesystem startFileInternal() getLease() has bug
--

 Key: HDFS-495
 URL: https://issues.apache.org/jira/browse/HDFS-495
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: name-node
Affects Versions: 0.20.1
Reporter: Ruyue Ma
Priority: Minor
 Fix For: 0.20.1


Original Code:

//
// If the file is under construction, then it must be in our
// leases. Find the appropriate lease record.
//
Lease lease = leaseManager.getLease(new StringBytesWritable(holder));
//
// We found the lease for this file. And surprisingly the original
// holder is trying to recreate this file. This should never occur.
//
if (lease != null) {
  throw new AlreadyBeingCreatedException(
      "failed to create file " + src + " for " + holder +
      " on client " + clientMachine +
      " because current leaseholder is trying to recreate file.");
}

Problem: if another client (which already holds some file leases) tries to recreate the under-construction file, it cannot trigger lease recovery.
Reason: we should instead do:

if (new StringBytesWritable(holder).equals(pendingFile.clientName)) {
  throw new AlreadyBeingCreatedException(
      "failed to create file " + src + " for " + holder +
      " on client " + clientMachine +
      " because current leaseholder is trying to recreate file.");
}


-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.