Build failed in Jenkins: Hadoop-Common-trunk #403

2012-05-11 Thread Apache Jenkins Server
See 

Changes:

[atm] HDFS-3026. HA: Handle failure during HA state transition. Contributed by 
Aaron T. Myers.

[eli] HDFS-3400. DNs should be able to start with jsvc even if security is 
disabled. Contributed by Aaron T. Myers

[szetszwo] HDFS-3385. The last block of INodeFileUnderConstruction is not 
necessarily a BlockInfoUnderConstruction, so do not cast it in 
FSNamesystem.recoverLeaseInternal(..).

[eli] HDFS-3401. Cleanup DatanodeDescriptor creation in the tests. Contributed 
by Eli Collins

[eli] HADOOP-8388. Remove unused BlockLocation serialization. Contributed by 
Colin Patrick McCabe

[eli] Remove SHORT_STRING_MAX, left out of the previous commit.

[eli] HADOOP-8361. Avoid out-of-memory problems when deserializing strings. 
Contributed by Colin Patrick McCabe

[eli] HDFS-3134. harden edit log loader against malformed or malicious input. 
Contributed by Colin Patrick McCabe

[szetszwo] HDFS-3369. Rename {get|set|add}INode(..) methods in BlockManager and 
BlocksMap to {get|set|add}BlockCollection(..).  Contributed by John George

[bobby] HADOOP-8375. test-patch should stop immediately once it has found 
compilation errors (bobby)

[eli] HDFS-3230. Cleanup DatanodeID creation in the tests. Contributed by Eli 
Collins

[atm] HDFS-3395. NN doesn't start with HA+security enabled and HTTP address set 
to 0.0.0.0. Contributed by Aaron T. Myers.

[umamahesh] Reverting (need to re-do the patch; new BlockInfo does not set 
iNode) HDFS-3157. Error in deleting block keeps on coming from DN even after 
the block report and directory scanning has happened.

[eli] HDFS-3396. FUSE build fails on Ubuntu 12.04. Contributed by Colin Patrick 
McCabe

[eli] HADOOP-7868. Hadoop native fails to compile when default linker option is 
-Wl,--as-needed. Contributed by Trevor Robinson

[eli] HDFS-3328. NPE in DataNode.getIpcPort. Contributed by Eli Collins

[todd] HDFS-3341, HADOOP-8340. SNAPSHOT build versions should compare as less 
than their eventual release. Contributed by Todd Lipcon.

[suresh] HADOOP-8372. NetUtils.normalizeHostName() incorrectly handles hostname 
starting with a numeric character. Contributed by Junping Du.

[bobby] MAPREDUCE-4237. TestNodeStatusUpdater can fail if localhost has a 
domain associated with it (bobby)

[atm] HDFS-3390. DFSAdmin should print full stack traces of errors when DEBUG 
logging is enabled. Contributed by Aaron T. Myers.

[bobby] HADOOP-8373. Port RPC.getServerAddress to 0.23 (Daryn Sharp via bobby)

[bobby] HADOOP-8354. test-patch findbugs may fail if a dependent module is 
changed. Contributed by Tom White and Robert Evans.

--
[...truncated 45274 lines...]
[DEBUG]   (f) reactorProjects = [MavenProject: 
org.apache.hadoop:hadoop-annotations:3.0.0-SNAPSHOT @ 

 MavenProject: org.apache.hadoop:hadoop-auth:3.0.0-SNAPSHOT @ 

 MavenProject: org.apache.hadoop:hadoop-auth-examples:3.0.0-SNAPSHOT @ 

 MavenProject: org.apache.hadoop:hadoop-common:3.0.0-SNAPSHOT @ 

 MavenProject: org.apache.hadoop:hadoop-common-project:3.0.0-SNAPSHOT @ 

[DEBUG]   (f) useDefaultExcludes = true
[DEBUG]   (f) useDefaultManifestFile = false
[DEBUG] -- end configuration --
[INFO] 
[INFO] --- maven-enforcer-plugin:1.0:enforce (dist-enforce) @ 
hadoop-common-project ---
[DEBUG] Configuring mojo 
org.apache.maven.plugins:maven-enforcer-plugin:1.0:enforce from plugin realm 
ClassRealm[plugin>org.apache.maven.plugins:maven-enforcer-plugin:1.0, parent: 
sun.misc.Launcher$AppClassLoader@126b249]
[DEBUG] Configuring mojo 
'org.apache.maven.plugins:maven-enforcer-plugin:1.0:enforce' with basic 
configurator -->
[DEBUG]   (s) fail = true
[DEBUG]   (s) failFast = false
[DEBUG]   (f) ignoreCache = false
[DEBUG]   (s) project = MavenProject: 
org.apache.hadoop:hadoop-common-project:3.0.0-SNAPSHOT @ 

[DEBUG]   (s) version = [3.0.2,)
[DEBUG]   (s) version = 1.6
[DEBUG]   (s) rules = 
[org.apache.maven.plugins.enforcer.RequireMavenVersion@6d9538, 
org.apache.maven.plugins.enforcer.RequireJavaVersion@5fd060]
[DEBUG]   (s) session = org.apache.maven.execution.MavenSession@cf68af
[DEBUG]   (s) skip = false
[DEBUG] -- end configuration --
[DEBUG] Executing rule: org.apache.maven.plugins.enforcer.RequireMavenVersion
[DEBUG] Rule org.apache.maven.plugins.enforcer.RequireMavenVersion is cacheable.
[DEBUG] Key org.apache.ma

Build failed in Jenkins: Hadoop-Common-0.23-Build #249

2012-05-11 Thread Apache Jenkins Server
See 

Changes:

[bobby] svn merge -c 1336399. FIXES: MAPREDUCE-4237. TestNodeStatusUpdater can 
fail if localhost has a domain associated with it (bobby)

[bobby] MAPREDUCE-4162. Correctly set token service (Daryn Sharp via bobby)

[bobby] HADOOP-8373. Port RPC.getServerAddress to 0.23 (Daryn Sharp via bobby)

--
[...truncated 12313 lines...]
  [javadoc] Loading source files for package org.apache.hadoop.fs.local...
  [javadoc] Loading source files for package org.apache.hadoop.fs.permission...
  [javadoc] Loading source files for package org.apache.hadoop.fs.s3...
  [javadoc] Loading source files for package org.apache.hadoop.fs.s3native...
  [javadoc] Loading source files for package org.apache.hadoop.fs.shell...
  [javadoc] Loading source files for package org.apache.hadoop.fs.viewfs...
  [javadoc] Loading source files for package org.apache.hadoop.http...
  [javadoc] Loading source files for package org.apache.hadoop.http.lib...
  [javadoc] Loading source files for package org.apache.hadoop.io...
  [javadoc] Loading source files for package org.apache.hadoop.io.compress...
  [javadoc] Loading source files for package 
org.apache.hadoop.io.compress.bzip2...
  [javadoc] Loading source files for package 
org.apache.hadoop.io.compress.lz4...
  [javadoc] Loading source files for package 
org.apache.hadoop.io.compress.snappy...
  [javadoc] Loading source files for package 
org.apache.hadoop.io.compress.zlib...
  [javadoc] Loading source files for package org.apache.hadoop.io.file.tfile...
  [javadoc] Loading source files for package org.apache.hadoop.io.nativeio...
  [javadoc] Loading source files for package org.apache.hadoop.io.retry...
  [javadoc] Loading source files for package org.apache.hadoop.io.serializer...
  [javadoc] Loading source files for package 
org.apache.hadoop.io.serializer.avro...
  [javadoc] Loading source files for package org.apache.hadoop.ipc...
  [javadoc] Loading source files for package org.apache.hadoop.ipc.metrics...
  [javadoc] Loading source files for package org.apache.hadoop.jmx...
  [javadoc] Loading source files for package org.apache.hadoop.log...
  [javadoc] Loading source files for package org.apache.hadoop.log.metrics...
  [javadoc] Loading source files for package org.apache.hadoop.metrics...
  [javadoc] Loading source files for package org.apache.hadoop.metrics.file...
  [javadoc] Loading source files for package 
org.apache.hadoop.metrics.ganglia...
  [javadoc] Loading source files for package org.apache.hadoop.metrics.jvm...
  [javadoc] Loading source files for package org.apache.hadoop.metrics.spi...
  [javadoc] Loading source files for package org.apache.hadoop.metrics.util...
  [javadoc] Loading source files for package org.apache.hadoop.metrics2...
  [javadoc] Loading source files for package 
org.apache.hadoop.metrics2.annotation...
  [javadoc] Loading source files for package 
org.apache.hadoop.metrics2.filter...
  [javadoc] Loading source files for package org.apache.hadoop.metrics2.impl...
  [javadoc] Loading source files for package org.apache.hadoop.metrics2.lib...
  [javadoc] Loading source files for package org.apache.hadoop.metrics2.sink...
  [javadoc] Loading source files for package 
org.apache.hadoop.metrics2.sink.ganglia...
  [javadoc] Loading source files for package 
org.apache.hadoop.metrics2.source...
  [javadoc] Loading source files for package org.apache.hadoop.metrics2.util...
  [javadoc] Loading source files for package org.apache.hadoop.net...
  [javadoc] Loading source files for package org.apache.hadoop.record...
  [javadoc] Loading source files for package 
org.apache.hadoop.record.compiler...
  [javadoc] Loading source files for package 
org.apache.hadoop.record.compiler.ant...
  [javadoc] Loading source files for package 
org.apache.hadoop.record.compiler.generated...
  [javadoc] Loading source files for package org.apache.hadoop.record.meta...
  [javadoc] Loading source files for package org.apache.hadoop.security...
  [javadoc] Loading source files for package 
org.apache.hadoop.security.authorize...
  [javadoc] Loading source files for package org.apache.hadoop.security.token...
  [javadoc] Loading source files for package 
org.apache.hadoop.security.token.delegation...
  [javadoc] Loading source files for package org.apache.hadoop.tools...
  [javadoc] Loading source files for package org.apache.hadoop.util...
  [javadoc] Loading source files for package org.apache.hadoop.util.bloom...
  [javadoc] Loading source files for package org.apache.hadoop.util.hash...
  [javadoc] 2 errors
 [xslt] Processing 

 to 

 [xslt] Loading stylesheet 
/home/jenkins/tools/findbugs/latest/s

Is it possible to execute Hive queries in parallel by writing mapper and reducer

2012-05-11 Thread Bhavesh Shah
Hello all,
I have a question about improving the performance of Hive. I tried tuning
the mappers and reducers, but I did not see any difference in execution
time. I don't know why; perhaps I did it in a way that is not correct, or
there is some other reason.

What I am wondering is: is it possible to execute Hive queries in parallel?
Normally the queries execute one after another, in a queue:
query1
query2
query3
.
.
.
queryN

I am thinking that if I use a MapReduce program together with a Hive JDBC
program, it might be possible to execute the queries in parallel. I don't
know whether it will work or not; that is why I am asking.
My questions are:
1) If it is possible, does it perhaps require multiple Hive Thrift Servers?
2) Is it possible to run multiple Hive Thrift Servers?
3) I think it is not possible to run multiple Hive Thrift Servers on the
same port?
4) Can we run multiple Hive Thrift Servers on different ports?

Please suggest a solution to this. If you have an idea other than this,
please share it with me.
I will also try that.
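
Roughly, what I have in mind is something like the sketch below. It is only
an illustration: it assumes the HiveServer (Thrift) JDBC driver is on the
classpath and one Thrift server is listening on each port; the hosts, ports,
table names and queries are made-up placeholders.

// Sketch only: submit independent Hive queries concurrently over JDBC,
// one connection per query, each pointed at its own HiveServer instance.
// Hosts, ports and queries below are placeholders.
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ParallelHiveQueries {
  public static void main(String[] args) throws Exception {
    Class.forName("org.apache.hadoop.hive.jdbc.HiveDriver");

    // One JDBC URL per HiveServer instance, each on its own port.
    List<String> urls = Arrays.asList(
        "jdbc:hive://localhost:10000/default",
        "jdbc:hive://localhost:10001/default");
    List<String> queries = Arrays.asList(
        "SELECT COUNT(*) FROM table_a",
        "SELECT COUNT(*) FROM table_b");

    ExecutorService pool = Executors.newFixedThreadPool(queries.size());
    for (int i = 0; i < queries.size(); i++) {
      final String url = urls.get(i % urls.size());
      final String query = queries.get(i);
      pool.submit(new Runnable() {
        public void run() {
          try {
            // Separate connection per thread; connections are not shared.
            Connection conn = DriverManager.getConnection(url, "", "");
            try {
              Statement stmt = conn.createStatement();
              stmt.execute(query);
            } finally {
              conn.close();
            }
          } catch (Exception e) {
            e.printStackTrace();
          }
        }
      });
    }
    pool.shutdown();
  }
}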
Thanks


-- 
Regards,
Bhavesh Shah


Re: Sailfish

2012-05-11 Thread Robert Evans
That makes perfect sense to me, especially because it really is a new 
implementation of shuffle that is optimized for very large jobs.  I am happy to 
see anything go in that improves the performance of Hadoop, and I look forward 
to running some benchmarks on the changes.  I am not super familiar with 
Sailfish, but from what I remember it is the modified version of KFS that 
actually does the sorting: the maps output data to "chunks" (aka blocks), each 
chunk is sorted once it is full, and once a chunk's sorting is finished the 
reducers are free to pull the sorted data from it and run.  I have a few 
concerns with it though.


 1.  How do we securely handle different comparators?  Currently comparators 
run as the user that launched the job, not as a privileged user.  Sailfish 
seems to require that comparators run as a privileged user, or we only support 
pure bitwise sorting of keys.
 2.  How does this work in a mixed environment?  Sailfish, as I understand it, 
is optimized for large map/reduce jobs, and can be slower on small jobs than 
the current implementation.  How do we make it so that large jobs run faster 
without negatively impacting the more common small jobs?  We could run both in 
parallel and switch between them depending on the size of the job's input, or 
on a config key of some sort, but then the RAM needed to make these big jobs 
run fast would not be available for smaller jobs to use when no really big job 
is running.
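
To make the chunk behavior I described above concrete, here is a rough sketch 
(invented class and method names; this is not the Sailfish/KFS code, just how I 
picture the lifecycle of a chunk):

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only -- not the Sailfish/KFS implementation.
// Map tasks append records to a chunk; once the chunk is full it is sorted,
// and only then can a reducer pull the sorted records from it.
public class Chunk {
  private final int capacity;
  private final List<String> records = new ArrayList<String>();
  private boolean sorted = false;

  public Chunk(int capacity) {
    this.capacity = capacity;
  }

  // Called by map-side writers.
  public synchronized boolean append(String record) {
    if (records.size() >= capacity) {
      return false;              // chunk is full; writer moves to a new chunk
    }
    records.add(record);
    if (records.size() == capacity) {
      // Sort once, when the chunk fills. Natural string ordering stands in
      // for the (bitwise) key comparison discussed in concern 1 above.
      Collections.sort(records);
      sorted = true;
    }
    return true;
  }

  // Called by reduce-side readers; only full, sorted chunks may be read.
  public synchronized List<String> readSorted() {
    if (!sorted) {
      throw new IllegalStateException("chunk not yet full/sorted");
    }
    return new ArrayList<String>(records);
  }
}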

--Bobby Evans

On 5/11/12 1:32 AM, "Todd Lipcon"  wrote:

Hey Sriram,

We discussed this before, but for the benefit of the wider audience: :)

It seems like the requirements imposed on KFS by Sailfish are in most
ways much simpler than the requirements of a full distributed
filesystem. The one thing we need is atomic record append -- but we
don't need anything else, like filesystem metadata/naming,
replication, corrupt data scanning, etc. All of the data is
transient/short-lived and at replication count 1.

So I think building something specific to this use case would be
pretty practical - and my guess is it might even have some benefits
over trying to use a full DFS.

In the MR2 architecture, I'd probably try to build this as a service
plugin in the NodeManager (similar to the way that the ShuffleHandler
in the current implementation works).
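
As a strawman, the storage contract for such a plugin could be as narrow as the 
hypothetical interface below (all names are invented; this is only meant to 
show how small the required surface is compared to a full DFS):

import java.io.Closeable;
import java.io.IOException;

// Hypothetical sketch of the narrow storage contract described above:
// atomic record append only -- no namespace, no replication, no scanning.
public interface RecordAppendStore extends Closeable {

  // Atomically appends one record to the named intermediate file and returns
  // the offset at which it was written. Concurrent appenders must never
  // interleave partial records.
  long append(String intermediateFile, byte[] record) throws IOException;

  // Reads back the record written at the given offset, used by reducers once
  // the map-side data is complete.
  byte[] read(String intermediateFile, long offset) throws IOException;
}

An implementation of something like that could then sit behind the 
NodeManager-side service plugin, in the same spirit as the existing 
ShuffleHandler.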

-Todd

On Thu, May 10, 2012 at 11:01 PM, Sriram Rao  wrote:
> Srivas,
>
> Sailfish builds upon record append (a feature not present in HDFS).
>
> The software that is currently released is based on Hadoop-0.20.2.  You use
> the Sailfish version of Hadoop-0.20.2, KFS for the intermediate data, and
> then HDFS (or KFS) for storing the job/input.  Since the changes are all in
> the handling of map output/reduce input, it is transparent to existing jobs.
>
> What is being proposed below is to bolt all the starting/stopping of the
> related daemons into YARN as a first step.  There are other approaches that 
> are possible, which have a similar effect.
>
> Hope this helps.
>
> Sriram
>
>
> On Thu, May 10, 2012 at 10:50 PM, M. C. Srivas  wrote:
>
>> Sriram,   Sailfish depends on append. I just noticed that HDFS disabled
>> append. How does one use this with Hadoop?
>>
>>
>> On Wed, May 9, 2012 at 9:00 AM, Otis Gospodnetic <
>> otis_gospodne...@yahoo.com
>> > wrote:
>>
>> > Hi Sriram,
>> >
>> > >> The I-file concept could possibly be implemented here in a fairly self
>> > contained way. One
>> > >> could even colocate/embed a KFS filesystem with such an alternate
>> > >> shuffle, like how MR task temporary space is usually colocated with
>> > >> HDFS storage.
>> >
>> > >  Exactly.
>> >
>> > >> Does this seem reasonable in any way?
>> >
>> > > Great. Where do we go from here?  How do we get a collaborative effort
>> > > going?
>> >
>> >
>> > Sounds like a JIRA issue should be opened, the approach briefly
>> described,
>> > and the first implementation attempt made.  Then iterate.
>> >
>> > I look forward to seeing this! :)
>> >
>> > Otis
>> > --
>> >
>> > Performance Monitoring for Solr / ElasticSearch / HBase -
>> > http://sematext.com/spm
>> >
>> >
>> >
>> > >
>> > > From: Sriram Rao 
>> > >To: common-dev@hadoop.apache.org
>> > >Sent: Tuesday, May 8, 2012 6:48 PM
>> > >Subject: Re: Sailfish
>> > >
>> > >Dear Andy,
>> > >
>> > >> From: Andrew Purtell 
>> > >> ...
>> > >
>> > >> Do you intend this to be a joint project with the Hadoop community or
>> > >> a technology competitor?
>> > >
>> > >As I had said in my email, we are looking for folks to collaborate
>> > >with us to help get us integrated with Hadoop.  So, to be explicitly
>> > >clear, we are intending for this to be a joint project with the
>> > >community.
>> > >
>> > >> Regrettably, KFS is not a "drop in replacement" for HDFS.
>> > >> Hypothetically: I have several petabytes of data in an existing HDFS
>> > >> deployment, which is the norm, and a continuous MapReduce workflow.
>> 

[jira] [Created] (HADOOP-8391) Hadoop-auth should use log4j

2012-05-11 Thread Eli Collins (JIRA)
Eli Collins created HADOOP-8391:
---

 Summary: Hadoop-auth should use log4j
 Key: HADOOP-8391
 URL: https://issues.apache.org/jira/browse/HADOOP-8391
 Project: Hadoop Common
  Issue Type: Improvement
  Components: conf
Affects Versions: 2.0.0
Reporter: Eli Collins


Per HADOOP-8086, hadoop-auth uses slf4j; I don't see why it shouldn't use log4j, 
to be consistent with the rest of Hadoop.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HADOOP-8392) Add YARN audit logging to log4j.properties

2012-05-11 Thread Eli Collins (JIRA)
Eli Collins created HADOOP-8392:
---

 Summary: Add YARN audit logging to log4j.properties
 Key: HADOOP-8392
 URL: https://issues.apache.org/jira/browse/HADOOP-8392
 Project: Hadoop Common
  Issue Type: Improvement
Affects Versions: 2.0.0
Reporter: Eli Collins


MAPREDUCE-2655 added MR/NM audit logging, but it's not hooked up to 
log4j.properties or the bin and env scripts like the other audit logs, so you 
have to modify the deployed binary to change it. Let's add the relevant 
plumbing that the other audit loggers have, and update log4j.properties with a 
sample configuration that's disabled by default, e.g. see [this 
comment|https://issues.apache.org/jira/browse/MAPREDUCE-2655?focusedCommentId=13084191&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13084191].

Also, it looks like mapred.AuditLogger and its plumbing can be removed.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HADOOP-8393) hadoop-config.sh missing variable exports, causes Yarn jobs to fail with ClassNotFoundException MRAppMaster

2012-05-11 Thread Patrick Hunt (JIRA)
Patrick Hunt created HADOOP-8393:


 Summary: hadoop-config.sh missing variable exports, causes Yarn 
jobs to fail with ClassNotFoundException MRAppMaster
 Key: HADOOP-8393
 URL: https://issues.apache.org/jira/browse/HADOOP-8393
 Project: Hadoop Common
  Issue Type: Bug
  Components: scripts
Reporter: Patrick Hunt
Assignee: Patrick Hunt


If you start a pseudo-distributed YARN cluster using "start-yarn.sh", you need 
to specify exports for HADOOP_COMMON_HOME, HADOOP_HDFS_HOME, YARN_HOME, 
YARN_CONF_DIR, and HADOOP_MAPRED_HOME in hadoop-env.sh (or elsewhere), 
otherwise the spawned node manager will be missing these in its environment. 
This is due to start-yarn using yarn-daemons. With this fix it's possible to 
start YARN (etc...) with only HADOOP_CONF_DIR specified in the environment. It 
took some time to track down this failure, so it seems worthwhile to fix.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HADOOP-8394) Add flag in RPC requests indicating when a call is a retry

2012-05-11 Thread Todd Lipcon (JIRA)
Todd Lipcon created HADOOP-8394:
---

 Summary: Add flag in RPC requests indicating when a call is a retry
 Key: HADOOP-8394
 URL: https://issues.apache.org/jira/browse/HADOOP-8394
 Project: Hadoop Common
  Issue Type: Improvement
  Components: ipc
Affects Versions: 2.0.0
Reporter: Todd Lipcon
Priority: Minor


For idempotent operations, the IPC client transparently retries calls. For 
operations which aren't inherently idempotent, we often have to use some tricky 
logic to make them idempotent -- see HDFS-3031 for example. It would be nice if 
the RPC request had a flag indicating that the client was making a retry. Then, 
in the server side logic, we can add sanity checks that, when the logic 
indicates a call is an idempotent retry, the RPC call agrees.

One example where this is useful is the close() RPC. We can make it idempotent 
by saying that close() on an already-closed file should succeed. But, it's 
really an error to call close() twice outside the context of retries. Having 
this property set on the call would allow us to enable the "double close is OK" 
semantics only for retries.
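
As a sketch (the class, field and method names below are illustrative only and 
do not correspond to actual RPC or NameNode code), the server-side check could 
look roughly like this:

import java.io.IOException;

// Illustrative sketch only; names are invented.
public class CloseHandler {

  // True if the file has already been completed/closed (placeholder).
  private boolean alreadyClosed(String path) {
    return false;
  }

  /**
   * @param isRetry value of the proposed "this call is a retry" RPC flag
   */
  public void close(String path, boolean isRetry) throws IOException {
    if (alreadyClosed(path)) {
      if (isRetry) {
        return;        // retried close of an already-closed file is OK
      }
      throw new IOException("close() called twice outside of a retry: " + path);
    }
    // ... normal close logic ...
  }
}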

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Resolved] (HADOOP-8224) Don't hardcode hdfs.audit.logger in the scripts

2012-05-11 Thread Eli Collins (JIRA)

 [ 
https://issues.apache.org/jira/browse/HADOOP-8224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eli Collins resolved HADOOP-8224.
-

   Resolution: Fixed
Fix Version/s: 2.0.0
 Hadoop Flags: Reviewed

I've committed this and merged to branch-2. Thanks Tomohiko!

> Don't hardcode hdfs.audit.logger in the scripts
> ---
>
> Key: HADOOP-8224
> URL: https://issues.apache.org/jira/browse/HADOOP-8224
> Project: Hadoop Common
>  Issue Type: Improvement
>  Components: conf
>Affects Versions: 2.0.0
>Reporter: Eli Collins
>Assignee: Tomohiko Kinebuchi
> Fix For: 2.0.0
>
> Attachments: HADOOP-8224.txt, HADOOP-8224.txt, hadoop-8224.txt
>
>
> The HADOOP_*OPTS defined for HDFS in hadoop-env.sh hard-code the 
> hdfs.audit.logger (it is explicitly set via "-Dhdfs.audit.logger=INFO,RFAAUDIT"), 
> so it's not overridable. Let's allow someone to override it, as we do for the 
> other parameters, by introducing HADOOP_AUDIT_LOGGER.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HADOOP-8395) Text shell command unnecessarily demands that a SequenceFile's key class be WritableComparable

2012-05-11 Thread Harsh J (JIRA)
Harsh J created HADOOP-8395:
---

 Summary: Text shell command unnecessarily demands that a 
SequenceFile's key class be WritableComparable
 Key: HADOOP-8395
 URL: https://issues.apache.org/jira/browse/HADOOP-8395
 Project: Hadoop Common
  Issue Type: Bug
  Components: util
Affects Versions: 2.0.0
Reporter: Harsh J
Priority: Trivial


Text, from the Display set of shell commands (hadoop fs -text), has a strict 
check that the key class loaded from a sequence file's header be a subclass of 
WritableComparable.

The sequence file writer itself has no such check (one can create sequence 
files with plain Writable keys; comparability is needed only by the sequence 
file sorter, which not all files use), and hence it is not reasonable for the 
Text command to enforce it either.

We should relax the check and simply require "Writable", not 
"WritableComparable".

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HADOOP-8396) DataStreamer, OutOfMemoryError, unable to create new native thread

2012-05-11 Thread Catalin Alexandru Zamfir (JIRA)
Catalin Alexandru Zamfir created HADOOP-8396:


 Summary: DataStreamer, OutOfMemoryError, unable to create new 
native thread
 Key: HADOOP-8396
 URL: https://issues.apache.org/jira/browse/HADOOP-8396
 Project: Hadoop Common
  Issue Type: Bug
  Components: io
Affects Versions: 1.0.2
 Environment: Ubuntu 64bit, 4GB of RAM, Core Duo processors, commodity 
hardware.
Reporter: Catalin Alexandru Zamfir
Priority: Blocker


We're trying to write about a few billion records via Avro. Then we got this 
error, which is unrelated to our code:

10725984 [Main] INFO net.gameloft.RnD.Hadoop.App - ## At: 2:58:43.290 # 
Written: 52100 records
Exception in thread "DataStreamer for file 
/Streams/Cubed/Stuff/objGame/aRandomGame/objType/aRandomType/2012/05/11/20/29/Shard.avro
 block blk_3254486396346586049_75838" java.lang.OutOfMemoryError: unable to 
create new native thread
at java.lang.Thread.start0(Native Method)
at java.lang.Thread.start(Thread.java:657)
at 
org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:612)
at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:184)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1202)
at org.apache.hadoop.ipc.Client.call(Client.java:1046)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
at $Proxy8.getProtocolVersion(Unknown Source)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:396)
at 
org.apache.hadoop.hdfs.DFSClient.createClientDatanodeProtocolProxy(DFSClient.java:160)
at 
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:3117)
at 
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.access$2200(DFSClient.java:2586)
at 
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2790)
10746169 [Main] INFO net.gameloft.RnD.Hadoop.App - ## At: 2:59:03.474 # 
Written: 52200 records
Exception in thread "ResponseProcessor for block blk_4201760269657070412_73948" 
java.lang.OutOfMemoryError
at sun.misc.Unsafe.allocateMemory(Native Method)
at java.nio.DirectByteBuffer.<init>(DirectByteBuffer.java:117)
at java.nio.ByteBuffer.allocateDirect(ByteBuffer.java:305)
at sun.nio.ch.Util.getTemporaryDirectBuffer(Util.java:75)
at sun.nio.ch.IOUtil.read(IOUtil.java:223)
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:254)
at 
org.apache.hadoop.net.SocketInputStream$Reader.performIO(SocketInputStream.java:55)
at 
org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:142)
at 
org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:155)
at 
org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:128)
at java.io.DataInputStream.readFully(DataInputStream.java:195)
at java.io.DataInputStream.readLong(DataInputStream.java:416)
at 
org.apache.hadoop.hdfs.protocol.DataTransferProtocol$PipelineAck.readFields(DataTransferProtocol.java:124)
at 
org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$ResponseProcessor.run(DFSClient.java:2964)
#
# There is insufficient memory for the Java Runtime Environment to continue.
# Native memory allocation (malloc) failed to allocate 32 bytes for intptr_t in 
/build/buildd/openjdk-6-6b23~pre11/build/openjdk/hotspot/src/share/vm/runtime/deoptimization.cpp
[thread 1587264368 also had an error]
[thread 309168 also had an error]
[thread 1820371824 also had an error]
[thread 1343454064 also had an error]
[thread 1345444720 also had an error]
# An error report file with more information is saved as:
# [thread 1345444720 also had an error]
[thread -1091290256 also had an error]
[thread 678165360 also had an error]
[thread 678497136 also had an error]
[thread 675511152 also had an error]
[thread 1385937776 also had an error]
[thread 911969136 also had an error]
[thread -1086207120 also had an error]
[thread -1088251024 also had an error]
[thread -1088914576 also had an error]
[thread -1086870672 also had an error]
[thread 441797488 also had an error][thread 445778800 also had an error]

[thread 440400752 also had an error]
[thread 444119920 also had an error][thread 1151298416 also had an error]

[thread 443124592 also had an error]
[thread 1152625520 also had an error]
[thread 913628016 also had an error]
[thread -1095345296 also had an error][thread 1390799728 also had an error]

[thread 443788144 also had an error]
[thread 676506480 also had an error]
[thread 1630595952 also had an error]
pure virtual method called
terminate called without an active exception
pure virtual method called
Aborted

It seems to be a memory leak. We were opening 5 - 10 buffers to different paths 
when writing and closing them. We've tested that those buffers do not overrun. 
And th