[jira] Created: (HADOOP-7086) Retrying socket connection failure times can be made as configurable

2011-01-05 Thread Devaraj K (JIRA)
Retrying socket connection failure times can be made as configurable


 Key: HADOOP-7086
 URL: https://issues.apache.org/jira/browse/HADOOP-7086
 Project: Hadoop Common
  Issue Type: Improvement
  Components: conf
Affects Versions: 0.21.0
 Environment: NA
Reporter: Devaraj K
Priority: Minor
 Fix For: 0.22.0


The number of retries on socket connection failure is hard-coded as 45, so the client 
logs the retrying message 45 times, as below. 

2011-01-04 15:14:30,700 INFO ipc.Client 
(Client.java:handleConnectionFailure(487)) - Retrying connect to server: 
/10.18.52.124:50020. Already tried 1 time(s).

This can be made configurable while keeping the default value of 45. Users who want 
to decrease or increase it can set the property; otherwise the default value 
continues to apply.

common\src\java\org\apache\hadoop\ipc\Client.java:
---

private synchronized void setupConnection() throws IOException {
  short ioFailures = 0;
  short timeoutFailures = 0;
  while (true) {
    try {
      this.socket = socketFactory.createSocket();
      this.socket.setTcpNoDelay(tcpNoDelay);
      // connection time out is 20s
      NetUtils.connect(this.socket, remoteId.getAddress(), 20000);
      if (rpcTimeout > 0) {
        pingInterval = rpcTimeout;  // rpcTimeout overwrites pingInterval
      }
      this.socket.setSoTimeout(pingInterval);
      return;
    } catch (SocketTimeoutException toe) {
      /*
       * The max number of retries is 45, which amounts to 20s*45 = 15
       * minutes of retries.
       */
      handleConnectionFailure(timeoutFailures++, 45, toe);
    } catch (IOException ie) {
      handleConnectionFailure(ioFailures++, maxRetries, ie);
    }
  }
}
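
For illustration, a minimal sketch of what the configurable variant could look like; 
the property key below is a placeholder chosen for this example, not a settled name, 
and the default stays at 45:

// Sketch only: read the timeout-retry limit from the client's Configuration,
// falling back to the current hard-coded value of 45. The key
// "ipc.client.connect.max.retries.on.timeouts" is a placeholder name.
this.maxRetriesOnSocketTimeouts =
    conf.getInt("ipc.client.connect.max.retries.on.timeouts", 45);

// ... and in setupConnection():
} catch (SocketTimeoutException toe) {
  // 20s connect timeout * maxRetriesOnSocketTimeouts (45 by default) = 15 minutes
  handleConnectionFailure(timeoutFailures++, maxRetriesOnSocketTimeouts, toe);
}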

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



RE: Hadoop use direct I/O in Linux?

2011-01-05 Thread Segel, Mike
You are mixing a few things up.

You're testing your I/O using C. 
What do you see if you try testing your direct I/O from Java?
I'm guessing that you'll keep your i/o piece in place and wrap it within some 
JNI code and then re-write the test in Java? 

Also are you testing large streams or random i/o blocks? (Hopefully both)

I think that when you test out the system, you'll find that you won't see much, 
if any, performance improvement.



-Original Message-
From: Da Zheng [mailto:zhen...@cs.jhu.edu] 
Sent: Tuesday, January 04, 2011 11:11 PM
To: common-dev@hadoop.apache.org
Subject: Re: Hadoop use direct I/O in Linux?

On 1/4/11 5:17 PM, Christopher Smith wrote:
> If you use direct I/O to reduce CPU time, that means you are saving CPU via
> DMA. If you are using Java's heap though, you can kiss that goodbye.
The buffer for direct I/O cannot be allocated from Java's heap anyway, so I don't
understand what you mean.
> 
> That said, I'm surprised that the Atom can't keep up with magnetic disk
> unless you have a striped array. 100MB/s shouldn't be too taxing. Is it
> possible you're doing something wrong or your CPU is otherwise occupied?
Yes, my C program can reach 100MB/s or even 110MB/s when writing data to the
disk sequentially, but with direct I/O enabled, the maximal throughput is about
140MB/s. The biggest difference, though, is CPU usage.
Without direct I/O, the operating system uses a lot of CPU time (the data below
was collected with top; this is a dual-core processor with hyperthreading enabled).
Cpu(s):  3.4%us, 32.8%sy,  0.0%ni, 50.0%id, 12.1%wa,  0.0%hi,  1.6%si,  0.0%st
But with direct I/O, the system time can be as little as 3%.

Best,
Da
> 
> On Tue, Jan 4, 2011 at 9:58 AM, Da Zheng  wrote:
> 
>> The most important reason for me to use direct I/O is that the Atom
>> processor is too weak. If I wrote a simple program to write data to the
>> disk, CPU is almost 100% but the disk hasn't reached its maximal bandwidth.
>> When I write data to SSD, the difference is even larger. Even if the program
>> has saturated the two cores of the CPU, it cannot even get to the half of
>> the maximal bandwidth of SSD.
>>
>> I don't know how much benefit direct I/O can bring to the normal processor
>> such as Xeon, but I have a feeling I have to use direct I/O in order to have
>> good performance on Atom processors.
>>
>> Best,
>> Da
> 
> 





Re: Hadoop use direct I/O in Linux?

2011-01-05 Thread Jay Booth
On Tue, Jan 4, 2011 at 12:58 PM, Da Zheng  wrote:

> The most important reason for me to use direct I/O is that the Atom
> processor is too weak. If I wrote a simple program to write data to the
> disk, CPU is almost 100% but the disk hasn't reached its maximal bandwidth.
> When I write data to SSD, the difference is even larger. Even if the program
> has saturated the two cores of the CPU, it cannot even get to the half of
> the maximal bandwidth of SSD.
>
>

The issue here is most likely checksumming.  Hadoop computes a CRC32 for
every 512 bytes it reads or writes.  On most processors, the CPU can easily
keep up and still saturate the pipe, but your Atom is probably falling behind.
There's a config somewhere to disable checksums; I'd suggest trying that.
Checksums are much more expensive CPU-wise than simply funneling bytes on the
client side (and the wire protocol, which interleaves checksum data with file
data, is the reason it would be really difficult to rewrite the client to use
direct I/O).
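
For what it's worth, a minimal sketch of turning off client-side CRC verification 
programmatically (read path only; FileSystem.setVerifyChecksum is the relevant hook, 
and whether it actually helps on the Atom box is exactly what would need measuring):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class NoCrcReadSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);
    // Skip per-512-byte CRC32 verification on the client for subsequent reads.
    fs.setVerifyChecksum(false);
    FSDataInputStream in = fs.open(new Path(args[0]));
    byte[] buf = new byte[64 * 1024];
    while (in.read(buf) > 0) {
      // byte funneling only; no client-side checksum work
    }
    in.close();
  }
}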





> I don't know how much benefit direct I/O can bring to the normal processor
> such as Xeon, but I have a feeling I have to use direct I/O in order to have
> good performance on Atom processors.
>
> Best,
> Da
>
>
> On 01/04/2011 10:12 AM, Segel, Mike wrote:
>
>> All,
>> While this is an interesting topic for debate, I think it's a moot point.
>> A lot of DBAs (Especially Informix DBAs) don't agree with Linus. (I'm
>> referring to an earlier post in this thread that referenced a quote from
>> Linus T.) Direct I/O is a good thing. But if Linus is removing it from
>> Linux...
>>
>> But with respect to Hadoop... disk i/o shouldn't be a major topic. I mean
>> if it were, then why isn't anyone pushing the use of SSDs? Or if they are
>> too expensive for your budget, why not SAS drives that spin at 15K?
>> Ok, those points are rhetorical. The simple solution is that if you're i/o
>> bound, you add more nodes with more disk to further distribute the load,
>> right?
>>
>> Also, I may be wrong, but do all OS(s) that one can run Hadoop, handle
>> Direct I/O? And handle it in a common way? So won't you end up having
>> machine/OS specific classes?
>>
>> IMHO there are other features that don't yet exist in Hadoop/HBase that
>> will yield a better ROI.
>>
>> Ok, so I may be way off base, so I'll shut up now... ;-P
>>
>> -Mike
>>
>>
>>
>> -Original Message-
>> From: Christopher Smith [mailto:cbsm...@gmail.com]
>> Sent: Tuesday, January 04, 2011 8:56 AM
>> To: common-dev@hadoop.apache.org
>> Subject: Re: Hadoop use direct I/O in Linux?
>>
>> On Mon, Jan 3, 2011 at 7:15 PM, Brian Bockelman wrote:
>>
>>> The I/O pattern isn't truly random.  To convert from physicist terms to
>>> CS terms, the application is iterating through the rows of a
>>> column-oriented store, reading out somewhere between 1 and 10% of the
>>> columns.  The twist is that the columns are compressed, meaning the size
>>> of a set of rows on disk is variable.
>>>
>> We're getting pretty far off topic here, but this is an interesting
>> problem. It *sounds* to me like a "compressed bitmap index" problem,
>> possibly with bloom filters for joins (basically what
>> HBase/Cassandra/Hypertable get into, or in a less distributed case:
>> MonetDB). Is that on the money?
>>
>>
>>> This prevents any sort of OS page cache stride detection from helping -
>>> the OS sees everything as random.
>>>
>> It seems though like if you organized the data a certain way, the OS page
>> cache could help.
>>
>>
>>> However, the application also has an index of where each row is located,
>>> meaning if it knows the active set of columns, it can predict the reads
>>> the client will perform and do a read-ahead.
>>>
>> Yes, this is the kind of advantage where O_DIRECT might help, although I'd
>> hope in this kind of circumstance the OS buffer cache would mostly give up
>> anyway and just give as much of the available RAM as possible to the app.
>> In that case memory mapped files with a thread doing a bit of read-ahead
>> would seem like not that much slower than using O_DIRECT.
>>
>> That said, I have to wonder how often this problem devolves into a
>> straightforward column scan. I mean, with a 1-10% hit rate, you need SSD
>> seek times for it to make sense to seek to specific records vs. just
>> scanning through the whole column, or to put it another way: "disk is the
>> new tape". ;-)
>>
>>
>>> Some days, it does feel like "building a better TCP using UDP".  However,
>>> we got a 3x performance improvement by building it (and multiplying by
>>> 10-15k cores for just our LHC experiment, that's real money!), so it's a
>>> particular monstrosity we are stuck with.
>>>
>> It sure sounds like a problem better suited to C++ than Java though. What
>> benefits do you yield from doing all this with a JVM?
>>
>>
>


Re: Hadoop use direct I/O in Linux?

2011-01-05 Thread Da Zheng
On 1/5/11 12:44 AM, Christopher Smith wrote:
> On Tue, Jan 4, 2011 at 9:11 PM, Da Zheng  wrote:
> 
>> On 1/4/11 5:17 PM, Christopher Smith wrote:
>>> If you use direct I/O to reduce CPU time, that means you are saving CPU
>> via
>>> DMA. If you are using Java's heap though, you can kiss that goodbye.
>> The buffer for direct I/O cannot be allocated from Java's heap anyway, I
>> don't
>> understand what you mean?
> 
> 
> The DMA buffer cannot be on Java's heap, but in the typical use case (say
> Hadoop), it would certainly have to get copied either into or out from
> Java's heap, and that's going to get the CPU involved whether you like it
> or not. If you stay entirely off the Java heap, you really don't get to use
> much of Java's object model or capabilities, so you have to wonder why use
> Java in the first place.
True. I wrote the code with JNI and found it's still very close to its best
performance even when doing one or two memory copies.
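
To make the copy being described concrete, here is a minimal sketch (a direct 
ByteBuffer stands in for the O_DIRECT/JNI buffer; the heap array is the extra copy 
the CPU still pays for):

import java.nio.ByteBuffer;

public class CopySketch {
  public static void main(String[] args) {
    // Off-heap buffer: this is where DMA / JNI / O_DIRECT data would land.
    ByteBuffer direct = ByteBuffer.allocateDirect(1024 * 1024);
    // (imagine the buffer being filled via JNI or a channel read here)
    // To use the data as ordinary Java arrays/objects it still has to be
    // copied onto the heap, and that copy costs CPU cycles.
    byte[] onHeap = new byte[direct.remaining()];
    direct.get(onHeap);
  }
}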
> 
> 
>>> That said, I'm surprised that the Atom can't keep up with magnetic disk
>>> unless you have a striped array. 100MB/s shouldn't be too taxing. Is it
>>> possible you're doing something wrong or your CPU is otherwise occupied?
>> Yes, my C program can reach 100MB/s or even 110MB/s when writing data to
>> the
>> disk sequentially, but with direct I/O enabled, the maximal throughput is
>> about
>> 140MB/s. But the biggest difference is CPU usage.
>> Without direct I/O, operating system uses a lot of CPU time (the data below
>> is
>> got with top, and this is a dual-core processor with hyperthread enabled).
>> Cpu(s):  3.4%us, 32.8%sy,  0.0%ni, 50.0%id, 12.1%wa,  0.0%hi,  1.6%si,
>>  0.0%st
>> But with direct I/O, the system time can be as little as 3%.
>>
> 
> I'm surprised that system time is really that high. We did Atom experiments
> where it wasn't even close to that. Are you using a memory mapped file? If
No, I don't. I just fill a large buffer in memory and write it to the file; the code
is attached below. Right now the buffer size is 1MB, which I think is big enough to
get the best performance.
> not are you buffering your writes? Is there perhaps
> something dysfunctional about the drive controller/driver you are using?
I'm not sure. It's also odd to me, but I thought it was what I could get from an Atom
processor. I guess I need to do some profiling.
Also, which Atom processors did you use? Do you have hyperthreading enabled?

Best,
Da

/* Note: the headers and the globals/helpers referenced below (buf, bufsize,
 * start_time, tot_size, fill_data, sighandler) were not part of the excerpt;
 * minimal versions are filled in here so the test compiles as posted. */
#define _GNU_SOURCE           /* for MAP_ANONYMOUS */
#include <fcntl.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <time.h>
#include <unistd.h>

static char *buf;
static size_t bufsize = 1024 * 1024;   /* 1MB write buffer, as described above */
static time_t start_time;
static long long tot_size = 0;

/* Fill the buffer with data so the writes are not all-zero pages. */
static void fill_data (int *p, size_t nbytes)
{
    size_t i;
    for (i = 0; i < nbytes / sizeof (int); i++)
        p[i] = (int) i;
}

/* On Ctrl-C, print the overall total and exit. */
static void sighandler (int sig)
{
    time_t now = time (NULL);
    printf ("total: %lld bytes in %ld s\n", tot_size, (long) (now - start_time));
    exit (0);
}

int main (int argc, char *argv[])
{
    char *out_file;
    int outfd;
    ssize_t size;
    time_t start_time2;
    long size1 = 0;

    out_file = argv[1];

    outfd = open (out_file, O_CREAT | O_WRONLY, S_IWUSR | S_IRUSR);
    if (outfd < 0) {
        perror ("open");
        return -1;
    }

    buf = mmap (0, bufsize, PROT_READ | PROT_WRITE,
                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    start_time2 = start_time = time (NULL);
    signal (SIGINT, sighandler);
    int offset = 0;

    while (1) {
        fill_data ((int *) buf, bufsize);
        size = write (outfd, buf, bufsize);
        if (size < 0) {
            perror ("write");
            return 1;
        }
        offset += size;
        tot_size += size;
        size1 += size;
        /* if (posix_fadvise (outfd, 0, offset, POSIX_FADV_NOREUSE) < 0)
               perror ("posix_fadvise"); */

        time_t end_time = time (NULL);
        if (end_time - start_time2 > 5) {
            printf ("current rate: %ld\n",
                    (long) (size1 / (end_time - start_time2)));
            size1 = 0;
            start_time2 = end_time;
        }
    }
}


Re: Hadoop use direct I/O in Linux?

2011-01-05 Thread Da Zheng
On 1/5/11 9:50 AM, Segel, Mike wrote:
> You are mixing a few things up.
> 
> You're testing your I/O using C. 
> What do you see if you try testing your direct I/O from Java?
> I'm guessing that you'll keep your i/o piece in place and wrap it within some 
> JNI code and then re-write the test in Java? 
I tested both.
> 
> Also are you testing large streams or random i/o blocks? (Hopefully both)
I only test large streams. For MapReduce, the only random I/O access is between
mapping and reducing, right? There the output from mappers is sorted, spilled to
disk and merge-sorted, and the reducers need another merge sort after they pull
the data. None of these operations is completely random. Maybe there is some
random access for metadata, but that should be small.
> 
> I think that when you test out the system, you'll find that you won't see 
> much, if any performance improvement.
> 
> 
> 
> -Original Message-
> From: Da Zheng [mailto:zhen...@cs.jhu.edu] 
> Sent: Tuesday, January 04, 2011 11:11 PM
> To: common-dev@hadoop.apache.org
> Subject: Re: Hadoop use direct I/O in Linux?
> 
> On 1/4/11 5:17 PM, Christopher Smith wrote:
>> If you use direct I/O to reduce CPU time, that means you are saving CPU via
>> DMA. If you are using Java's heap though, you can kiss that goodbye.
> The buffer for direct I/O cannot be allocated from Java's heap anyway, I don't
> understand what you mean?
>>
>> That said, I'm surprised that the Atom can't keep up with magnetic disk
>> unless you have a striped array. 100MB/s shouldn't be too taxing. Is it
>> possible you're doing something wrong or your CPU is otherwise occupied?
> Yes, my C program can reach 100MB/s or even 110MB/s when writing data to the
> disk sequentially, but with direct I/O enabled, the maximal throughput is 
> about
> 140MB/s. But the biggest difference is CPU usage.
> Without direct I/O, operating system uses a lot of CPU time (the data below is
> got with top, and this is a dual-core processor with hyperthread enabled).
> Cpu(s):  3.4%us, 32.8%sy,  0.0%ni, 50.0%id, 12.1%wa,  0.0%hi,  1.6%si,  0.0%st
> But with direct I/O, the system time can be as little as 3%.
> 
> Best,
> Da
>>
>> On Tue, Jan 4, 2011 at 9:58 AM, Da Zheng  wrote:
>>
>>> The most important reason for me to use direct I/O is that the Atom
>>> processor is too weak. If I wrote a simple program to write data to the
>>> disk, CPU is almost 100% but the disk hasn't reached its maximal bandwidth.
>>> When I write data to SSD, the difference is even larger. Even if the program
>>> has saturated the two cores of the CPU, it cannot even get to the half of
>>> the maximal bandwidth of SSD.
>>>
>>> I don't know how much benefit direct I/O can bring to the normal processor
>>> such as Xeon, but I have a feeling I have to use direct I/O in order to have
>>> good performance on Atom processors.
>>>
>>> Best,
>>> Da
>>
>>
> 
> 
> 
> 



Re: Hadoop use direct I/O in Linux?

2011-01-05 Thread Greg Roelofs
Da Zheng wrote:

> I already did "ant compile-c++-libhdfs -Dlibhdfs=1", but it seems nothing is
> compiled as it prints the following:

> check-c++-libhdfs:

> check-c++-makefile-libhdfs:

> create-c++-libhdfs-makefile:

> compile-c++-libhdfs:

> BUILD SUCCESSFUL
> Total time: 2 seconds

You may need to add -Dcompile.native=true in there.

Switching lists.

Greg


Using git grafts to merge history across project split

2011-01-05 Thread Todd Lipcon
I know many people use git, so wanted to share a neat tip I figured out this
morning that lets you graft the pre-split history into the post-split
repositories. I'm using git 1.7.1, not sure how new these features are. Here
are the steps:

1) Check out the git repos from git.apache.org into git/hadoop-common,
git/hadoop-mapreduce, and git/hadoop-hdfs

2) Set up the common repo as an "alternate object store" for mr and hdfs:

$ echo "/path/to/git/hadoop-common/.git/objects" >
/path/to/git/hadoop-hdfs/.git/objects/info/alternates
$ echo "/path/to/git/hadoop-common/.git/objects" >
/path/to/git/hadoop-mapreduce/.git/objects/info/alternates
This allows you to look at hashes from common from within your MR or HDFS
repos. Note that if you move the paths later you'll have to update this
file!

3) Set up grafts for the beginning of MR/HDFS history to the pre-split
commit in common:
echo 546d96754ffee3142bcbbf4563c624c053d0ed0d
6c16dc8cf2b28818c852e95302920a278d07ad0c >
git/hadoop-mapreduce/.git/info/grafts
echo 6a3ac690e493c7da45bbf2ae2054768c427fd0e1
6c16dc8cf2b28818c852e95302920a278d07ad0c
> git/hadoop-hdfs/.git/info/grafts

Now when you use commands like git log --follow or git blame, it will pick
up changes from pre-split as if it were one repository.

Hope others find this useful as well!
-Todd
-- 
Todd Lipcon
Software Engineer, Cloudera


Build failed, hudson broken?

2011-01-05 Thread Niels Basjes
Hi,

I just submitted a patch for the feature I've been working on.
https://issues.apache.org/jira/browse/HADOOP-7076

This patch works fine on my system and passes all the unit tests.

Now, some 30 minutes later, it seems the build on Hudson has failed.
https://hudson.apache.org/hudson/job/PreCommit-HADOOP-Build/155/

I'm not sure, but it seems to me that there are issues with Hudson.
None of the errors in the log are related to my fixes, and not only my
build (155) but also builds 154 and 153 have failed with errors that are
(at first glance) the same.

Does someone here know how/where to get this problem fixed?

-- 
Met vriendelijke groeten,

Niels Basjes


Re: Build failed, hudson broken?

2011-01-05 Thread Niels Basjes
I found where to report this ... so I did:
https://issues.apache.org/jira/browse/INFRA-3340

2011/1/5 Niels Basjes :
> Hi,
>
> I just submitted a patch for the feature I've been working on.
> https://issues.apache.org/jira/browse/HADOOP-7076
>
> This patch works fine on my system and passes all the unit tests.
>
> Now some 30 minutes later it seems the build on the hudson has failed.
> https://hudson.apache.org/hudson/job/PreCommit-HADOOP-Build/155/
>
> I'm not sure but to me it seems that there are issues with the hudson.
> None of the errors in the log are related to my fixes and not only my
> build (155)
> but also builds 154 and 153 have failed with errors that are (at first
> glance) the same.
>
> Someone here knows how/where to get this problem fixed?
>
> --
> Met vriendelijke groeten,
>
> Niels Basjes
>



-- 
Met vriendelijke groeten,

Niels Basjes


Re: Using git grafts to merge history across project split

2011-01-05 Thread Chris Douglas
This is great. Thanks, Todd. -C

On Wed, Jan 5, 2011 at 12:36 PM, Todd Lipcon  wrote:
> I know many people use git, so wanted to share a neat tip I figured out this
> morning that lets you graft the pre-split history into the post-split
> repositories. I'm using git 1.7.1, not sure how new these features are. Here
> are the steps:
>
> 1) Check out the git repos from git.apache.org into git/hadoop-common,
> git/hadoop-mapreduce, and git/hadoop-hdfs
>
> 2) Set up the common repo as an "alternate object store" for mr and hdfs:
>
> $ echo "/path/to/git/hadoop-common/.git/objects" >
> /path/to/git/hadoop-hdfs/.git/objects/info/alternates
> $ echo "/path/to/git/hadoop-common/.git/objects" >
> /path/to/git/hadoop-mapreduce/.git/objects/info/alternates
> This allows you to look at hashes from common from within your MR or HDFS
> repos. Note that if you move the paths later you'll have to update this
> file!
>
> 3) Set up grafts for the beginning of MR/HDFS history to the pre-split
> commit in common:
> echo 546d96754ffee3142bcbbf4563c624c053d0ed0d
> 6c16dc8cf2b28818c852e95302920a278d07ad0c >
> git/hadoop-mapreduce/.git/info/grafts
> echo 6a3ac690e493c7da45bbf2ae2054768c427fd0e1
> 6c16dc8cf2b28818c852e95302920a278d07ad0c
>> git/hadoop-hdfs/.git/info/grafts
>
> Now when you use commands like git log --follow or git blame, it will pick
> up changes from pre-split as if it were one repository.
>
> Hope others find this useful as well!
> -Todd
> --
> Todd Lipcon
> Software Engineer, Cloudera
>


Re: Hadoop use direct I/O in Linux?

2011-01-05 Thread Milind Bhandarkar
I agree with Jay B. Checksumming is usually the culprit for high CPU on clients 
and datanodes. Plus, a checksum of 4 bytes for every 512 bytes means that for a 
64MB block the checksum data will be 512KB, i.e. 128 ext3 blocks. Changing it to 
generate one ext3-sized checksum block per DFS block would speed up reads/writes 
without any loss of reliability.

- milind

---
Milind Bhandarkar
(mbhandar...@linkedin.com)
(650-776-3236)








Re: Hadoop use direct I/O in Linux?

2011-01-05 Thread Brian Bockelman

On Jan 5, 2011, at 4:03 PM, Milind Bhandarkar wrote:

> I agree with Jay B. Checksumming is usually the culprit for high CPU on 
> clients and datanodes. Plus, a checksum of 4 bytes for every 512, means for 
> 64MB block, the checksum will be 512KB, i.e. 128 ext3 blocks. Changing it to 
> generate 1 ext3 checksum block per DFS block will speedup read/write without 
> any loss of reliability.
> 

But (speaking to non-MapReduce users) make sure this doesn't adversely affect 
your usage patterns.  If your checksum size is 64KB, then the minimum read size 
is 64KB.  So, an extremely unlucky read of 2 bytes might cause 128KB+overhead 
to travel across the network.

Know thine usage scenarios.

Brian
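
A quick worked example of the trade-off Brian points out (numbers only; the class 
and variable names are just for illustration):

public class ReadAmplificationSketch {
  public static void main(String[] args) {
    int bytesPerChecksum = 64 * 1024; // proposed larger checksum chunk
    int requestedBytes = 2;           // an extremely unlucky tiny read
    // Worst case: the 2 bytes straddle a chunk boundary, so two full chunks
    // must be read (and verified) to serve the request.
    int chunksTouched = 2;
    System.out.println("bytes moved >= " + chunksTouched * bytesPerChecksum
        + " for a " + requestedBytes + "-byte read");  // 131072
  }
}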





Re: Hadoop use direct I/O in Linux?

2011-01-05 Thread Milind Bhandarkar
> 
> Know thine usage scenarios.


Yup.

- milind

---
Milind Bhandarkar
(mbhandar...@linkedin.com)
(650-776-3236)








Re: Hadoop use direct I/O in Linux?

2011-01-05 Thread Da Zheng
I'm not sure of that. I wrote a small checksum program for testing. Once 
the size of a checksummed block gets larger than 8192 bytes, I don't see 
much performance improvement; see the code below. I don't think 64MB can 
bring us any benefit.
I did change io.bytes.per.checksum to 131072 in Hadoop, and the program 
ran about 4 or 5 minutes faster (the total time for reducing is about 35 
minutes).


import java.util.zip.CRC32;
import java.util.zip.Checksum;


public class Test1 {
    public static void main(String args[]) {
        Checksum sum = new CRC32();
        byte[] bs = new byte[512];
        final int tot_size = 64 * 1024 * 1024;
        long time = System.nanoTime();
        for (int k = 0; k < tot_size / bs.length; k++) {
            for (int i = 0; i < bs.length; i++)
                bs[i] = (byte) i;
            sum.update(bs, 0, bs.length);
        }
        System.out.println("takes " + (System.nanoTime() - time) / 1000 / 1000);
    }
}


On 01/05/2011 05:03 PM, Milind Bhandarkar wrote:

I agree with Jay B. Checksumming is usually the culprit for high CPU on clients 
and datanodes. Plus, a checksum of 4 bytes for every 512, means for 64MB block, 
the checksum will be 512KB, i.e. 128 ext3 blocks. Changing it to generate 1 
ext3 checksum block per DFS block will speedup read/write without any loss of 
reliability.

- milind

---
Milind Bhandarkar
(mbhandar...@linkedin.com)
(650-776-3236)










Re: svn commit: r1055684 - /hadoop/common/branches/branch-0.20/CHANGES.txt

2011-01-05 Thread Ian Holsman
Is 20.3 a 'dead' release?

I haven't seen any discussion on the Apache lists about creating a 20.3 release, 
and it kind of goes against all the discussion that we recently had with StAck 
about creating an 'append' release on 0.20.


I'm not against 20.3, but I would like to see some discussion and not have 
things reverted out of it without discussion.



On Jan 6, 2011, at 10:12 AM, omal...@apache.org wrote:

> Author: omalley
> Date: Wed Jan  5 23:12:49 2011
> New Revision: 1055684
> 
> URL: http://svn.apache.org/viewvc?rev=1055684&view=rev
> Log:
> HADOOP-1734. Move to release 0.20.4 since I already made the tag 0.20.3.
> 
> Modified:
>hadoop/common/branches/branch-0.20/CHANGES.txt
> 
> Modified: hadoop/common/branches/branch-0.20/CHANGES.txt
> URL: 
> http://svn.apache.org/viewvc/hadoop/common/branches/branch-0.20/CHANGES.txt?rev=1055684&r1=1055683&r2=1055684&view=diff
> ==
> --- hadoop/common/branches/branch-0.20/CHANGES.txt (original)
> +++ hadoop/common/branches/branch-0.20/CHANGES.txt Wed Jan  5 23:12:49 2011
> @@ -8,6 +8,9 @@ Release 0.20.4 - Unreleased
> 
>   IMPROVEMENTS
> 
> +MAPREDUCE-1734. Un-deprecate the old MapReduce API in the 0.20 branch.
> +(todd)
> +
> Release 0.20.3 - 2011-1-5
> 
>   NEW FEATURES
> @@ -89,9 +92,6 @@ Release 0.20.3 - 2011-1-5
> 
> MAPREDUCE-1832. Allow file sizes less than 1MB in DFSIO benchmark. (shv)
> 
> -MAPREDUCE-1734. Un-deprecate the old MapReduce API in the 0.20 branch.
> -(todd)
> -
> Release 0.20.2 - 2010-2-19
> 
>   NEW FEATURES
> 
> 



Re: setting "mapred.task.cache.levels" to 0 makes Hadoop stall

2011-01-05 Thread Greg Roelofs
Zhenhua Guo  wrote:

> It seems that mapred.task.cache.levels is used by JobTracker to create
> task caches for nodes at various levels. This makes data-locality
> scheduling possible.
> If I set mapred.task.cache.levels to 0 and use default network
> topology, then mapreduce job will stall forever. The reason is
> JobInProgress::findNewMapTask always returns -1. Field
> "nonRunningMapCache" is empty and field "nonLocalMaps" is also empty.
> I wonder whether it is designed to behave like that. Or when
> mapred.task.cache.levels is set 0, Hadoop should fall back to some
> default caching strategy. E.g. put all tasks into
> JobInProgress::nonLocalMaps.

I think there should either be a fallback mechanism or the code should
disallow/ignore values less than 1.  Can you file a JIRA issue for this?
https://issues.apache.org/jira/secure/CreateIssue!default.jspa

Thanks,
  Greg
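
As a rough illustration of that guard (a sketch only, not the actual patch; the 
shipped default is assumed here to be 2):

import org.apache.hadoop.conf.Configuration;

public class CacheLevelGuardSketch {
  // Clamp mapred.task.cache.levels to at least 1 so the locality caches
  // (and therefore JobInProgress::findNewMapTask) always have a level to use.
  static int resolveCacheLevels(Configuration conf) {
    int requested = conf.getInt("mapred.task.cache.levels", 2); // 2 = assumed default
    return Math.max(1, requested);
  }
}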


Re: Hadoop use direct I/O in Linux?

2011-01-05 Thread Milind Bhandarkar
Have you tried with org.apache.hadoop.util.DataChecksum and 
org.apache.hadoop.util.PureJavaCrc32 ?

- Milind

On Jan 5, 2011, at 3:42 PM, Da Zheng wrote:

> I'm not sure of that. I wrote a small checksum program for testing. After the 
> size of a block gets to larger than 8192 bytes, I don't see much performance 
> improvement. See the code below. I don't think 64MB can bring us any benefit.
> I did change io.bytes.per.checksum to 131072 in hadoop, and the program ran 
> about 4 or 5 minutes faster (the total time for reducing is about 35 minutes).
> 
> import java.util.zip.CRC32;
> import java.util.zip.Checksum;
> 
> 
> public class Test1 {
>public static void main(String args[]) {
>Checksum sum = new CRC32();
>byte[] bs = new byte[512];
>final int tot_size = 64 * 1024 * 1024;
>long time = System.nanoTime();
>for (int k = 0; k < tot_size / bs.length; k++) {
>for (int i = 0; i < bs.length; i++)
>bs[i] = (byte) i;
>sum.update(bs, 0, bs.length);
>}
>System.out.println("takes " + (System.nanoTime() - time) / 1000 / 
> 1000);
>}
> }
> 
> 
> On 01/05/2011 05:03 PM, Milind Bhandarkar wrote:
>> I agree with Jay B. Checksumming is usually the culprit for high CPU on 
>> clients and datanodes. Plus, a checksum of 4 bytes for every 512, means for 
>> 64MB block, the checksum will be 512KB, i.e. 128 ext3 blocks. Changing it to 
>> generate 1 ext3 checksum block per DFS block will speedup read/write without 
>> any loss of reliability.
>> 
>> - milind
>> 
>> ---
>> Milind Bhandarkar
>> (mbhandar...@linkedin.com)
>> (650-776-3236)
>> 
>> 
>> 
>> 
>> 
>> 
> 

---
Milind Bhandarkar
(mbhandar...@linkedin.com)
(650-776-3236)








[jira] Created: (HADOOP-7087) SequenceFile.createWriter ignores FileSystem parameter

2011-01-05 Thread Todd Lipcon (JIRA)
SequenceFile.createWriter ignores FileSystem parameter
--

 Key: HADOOP-7087
 URL: https://issues.apache.org/jira/browse/HADOOP-7087
 Project: Hadoop Common
  Issue Type: Bug
  Components: io
Affects Versions: 0.22.0
Reporter: Todd Lipcon
Assignee: Todd Lipcon


The SequenceFile.createWriter methods that take a FileSystem ignore this 
parameter after HADOOP-6856. This is causing some MR tests to fail and is a 
breaking change when users pass unqualified paths to these calls.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (HADOOP-7088) JMX Bean that exposes version and build information

2011-01-05 Thread Dmytro Molkov (JIRA)
JMX Bean that exposes version and build information
---

 Key: HADOOP-7088
 URL: https://issues.apache.org/jira/browse/HADOOP-7088
 Project: Hadoop Common
  Issue Type: New Feature
Reporter: Dmytro Molkov


It would be great if each daemon in the cluster had a JMX bean that exposed the 
build version, Hadoop version, and information of this sort.
This would make it easier for cluster management tools to identify the versions of 
the software running on the different components.
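
For illustration, a minimal sketch of such a bean (the interface, ObjectName, and 
returned strings below are hypothetical placeholders, not a committed API):

import java.lang.management.ManagementFactory;
import javax.management.ObjectName;

public class VersionInfoJmxSketch {
  // Hypothetical MXBean interface; names are illustrative only.
  public interface BuildInfoMXBean {
    String getHadoopVersion();
    String getBuildVersion();
  }

  public static class BuildInfo implements BuildInfoMXBean {
    public String getHadoopVersion() { return "0.22.0-SNAPSHOT"; } // placeholder
    public String getBuildVersion()  { return "unknown"; }         // placeholder
  }

  public static void main(String[] args) throws Exception {
    // Each daemon would register a bean like this at startup.
    ManagementFactory.getPlatformMBeanServer().registerMBean(
        new BuildInfo(), new ObjectName("Hadoop:service=Daemon,name=BuildInfo"));
  }
}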

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Created: (HADOOP-7089) Use readlink to get absolute paths in the scripts

2011-01-05 Thread Eli Collins (JIRA)
Use readlink to get absolute paths in the scripts 
--

 Key: HADOOP-7089
 URL: https://issues.apache.org/jira/browse/HADOOP-7089
 Project: Hadoop Common
  Issue Type: Improvement
  Components: scripts
Reporter: Eli Collins
Assignee: Eli Collins
Priority: Minor
 Fix For: 0.22.0, 0.23.0


The manual link resolution logic in bin/hadoop-config.sh can be replaced with 
readlink -m -n.  Ditto with other uses of cd + pwd to get an absolute path, 
which can be fragile.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



Re: Hadoop use direct I/O in Linux?

2011-01-05 Thread Da Zheng
Isn't DataChecksum just a wrapper around CRC32?
I'm still using Hadoop 0.20.2; there is no PureJavaCrc32 there.
Da
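
On a version that does ship it, swapping the implementation into the Test1 benchmark 
from earlier in the thread would only touch one line, since PureJavaCrc32 implements 
java.util.zip.Checksum (sketch only, untimed here):

import java.util.zip.Checksum;
import org.apache.hadoop.util.PureJavaCrc32; // not available in 0.20.2, as noted

public class Test2 {
    public static void main(String args[]) {
        Checksum sum = new PureJavaCrc32(); // the only change vs. Test1
        byte[] bs = new byte[512];
        final int tot_size = 64 * 1024 * 1024;
        long time = System.nanoTime();
        for (int k = 0; k < tot_size / bs.length; k++) {
            for (int i = 0; i < bs.length; i++)
                bs[i] = (byte) i;
            sum.update(bs, 0, bs.length);
        }
        System.out.println("takes " + (System.nanoTime() - time) / 1000 / 1000);
    }
}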

On 1/5/11 7:44 PM, Milind Bhandarkar wrote:
> Have you tried with org.apache.hadoop.util.DataChecksum and 
> org.apache.hadoop.util.PureJavaCrc32 ?
> 
> - Milind
> 
> On Jan 5, 2011, at 3:42 PM, Da Zheng wrote:
> 
>> I'm not sure of that. I wrote a small checksum program for testing. After 
>> the size of a block gets to larger than 8192 bytes, I don't see much 
>> performance improvement. See the code below. I don't think 64MB can bring us 
>> any benefit.
>> I did change io.bytes.per.checksum to 131072 in hadoop, and the program ran 
>> about 4 or 5 minutes faster (the total time for reducing is about 35 
>> minutes).
>>
>> import java.util.zip.CRC32;
>> import java.util.zip.Checksum;
>>
>>
>> public class Test1 {
>>public static void main(String args[]) {
>>Checksum sum = new CRC32();
>>byte[] bs = new byte[512];
>>final int tot_size = 64 * 1024 * 1024;
>>long time = System.nanoTime();
>>for (int k = 0; k < tot_size / bs.length; k++) {
>>for (int i = 0; i < bs.length; i++)
>>bs[i] = (byte) i;
>>sum.update(bs, 0, bs.length);
>>}
>>System.out.println("takes " + (System.nanoTime() - time) / 1000 / 
>> 1000);
>>}
>> }
>>
>>
>> On 01/05/2011 05:03 PM, Milind Bhandarkar wrote:
>>> I agree with Jay B. Checksumming is usually the culprit for high CPU on 
>>> clients and datanodes. Plus, a checksum of 4 bytes for every 512, means for 
>>> 64MB block, the checksum will be 512KB, i.e. 128 ext3 blocks. Changing it 
>>> to generate 1 ext3 checksum block per DFS block will speedup read/write 
>>> without any loss of reliability.
>>>
>>> - milind
>>>
>>> ---
>>> Milind Bhandarkar
>>> (mbhandar...@linkedin.com)
>>> (650-776-3236)
>>>
>>>
>>>
>>>
>>>
>>>
>>
> 
> ---
> Milind Bhandarkar
> (mbhandar...@linkedin.com)
> (650-776-3236)
> 
> 
> 
> 
> 
> 



[jira] Created: (HADOOP-7090) Possible resource leaks in hadoop core code

2011-01-05 Thread Gokul (JIRA)
Possible resource leaks in hadoop core code
---

 Key: HADOOP-7090
 URL: https://issues.apache.org/jira/browse/HADOOP-7090
 Project: Hadoop Common
  Issue Type: Bug
Affects Versions: 0.21.0
Reporter: Gokul


It is always good practice to close I/O streams in a finally block.

For example, look at the following piece of code in the Writer class of 
BloomMapFile 

{code:title=BloomMapFile .java|borderStyle=solid}
public synchronized void close() throws IOException {
  super.close();
  DataOutputStream out = fs.create(new Path(dir, BLOOM_FILE_NAME), true);
  bloomFilter.write(out);
  out.flush();
  out.close();
}
{code} 

If an exception occurs in fs.create, bloomFilter.write, or out.flush, out.close() 
will not be executed.

The following reduces the scope of the resource leak:
{code:title=BloomMapFile .java|borderStyle=solid}
public synchronized void close() throws IOException {
  super.close();
  DataOutputStream out = null;
  try {
    out = fs.create(new Path(dir, BLOOM_FILE_NAME), true);
    bloomFilter.write(out);
    out.flush();
  } finally {
    IOUtils.closeStream(out);
  }
}
{code} 



-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Resolved: (HADOOP-6872) ChecksumFs#listStatus should filter out .crc files

2011-01-05 Thread Konstantin Shvachko (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-6872?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Shvachko resolved HADOOP-6872.
-

Resolution: Duplicate

Fixed as a part of HADOOP-6906.

> ChecksumFs#listStatus should filter out .crc files
> --
>
> Key: HADOOP-6872
> URL: https://issues.apache.org/jira/browse/HADOOP-6872
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: fs
>Affects Versions: 0.21.0
>Reporter: Hairong Kuang
> Fix For: 0.22.0
>
>
> Currently ChecksumFs#listStatus lists not only regular files but also .crc 
> files. The .crc files should be filtered out.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Resolved: (HADOOP-6718) Client does not close connection when an exception happens during SASL negotiation

2011-01-05 Thread Konstantin Shvachko (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-6718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Konstantin Shvachko resolved HADOOP-6718.
-

Resolution: Duplicate

Incorporated in HADOOP-6706 for 0.22.

> Client does not close connection when an exception happens during SASL 
> negotiation
> --
>
> Key: HADOOP-6718
> URL: https://issues.apache.org/jira/browse/HADOOP-6718
> Project: Hadoop Common
>  Issue Type: Bug
>  Components: ipc
>Affects Versions: 0.22.0
>Reporter: Devaraj Das
>Assignee: Devaraj Das
> Fix For: 0.22.0
>
> Attachments: 6718-bp20.patch
>
>
> setupSaslConnection in the RPC client might fail to successfully set up a 
> sasl connection (e.g. if the principal is wrongly configured). It throws an 
> exception back to the caller (setupIOstreams). setupIOstreams marks the 
> connection as closed but it doesn't really close the socket. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.