Re: Re: hadoop Exception: java.io.IOException: Couldn't set up IO streams

2014-02-28 Thread shashwat shriparv
Great ... Warm Regards, Shashwat Shriparv

Re: What if file format is dependent upon first few lines?

2014-02-28 Thread Fengyun RAO
thanks, Jay, it really helps. 2014-02-28 10:32 GMT+08:00 Jay Vyas : > -- method 1 -- > > You could, I think, just extend FileInputFormat, with isSplitable = > false. Then each file won't be broken up into separate blocks, and > will be processed as a whole per mapper. This is probably the easiest thin
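A minimal sketch of the approach Jay describes, subclassing TextInputFormat (chosen here as an assumption; any FileInputFormat subclass works) and overriding isSplitable so each file reaches one mapper intact:

```java
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.JobContext;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;

public class WholeFileTextInputFormat extends TextInputFormat {
    @Override
    protected boolean isSplitable(JobContext context, Path file) {
        // Returning false keeps each input file as a single split,
        // so the header lines at the top of the file are always
        // available to the mapper that processes the rest of it.
        return false;
    }
}
```

Note the method in the mapreduce API is spelled isSplitable (one "t"); misspelling it silently fails to override anything.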

MultipleOutputs and Custom Writers

2014-02-28 Thread Josh Smith
I need to output 3 different XML files from my MapReduce job. I currently have it working for a single file, but when I try to add in the logic for MultipleOutputs to the reducer to support splitting the files, I get errors from the writer attempting to write to the same file. I assume that's cause
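One common cause of that collision is several named outputs sharing the default base name. A hedged reducer sketch giving each write its own baseOutputPath (the names "xmlA" and the key/value types here are assumptions, not from the original post):

```java
import java.io.IOException;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.output.MultipleOutputs;

public class ThreeXmlReducer extends Reducer<Text, Text, NullWritable, Text> {
    private MultipleOutputs<NullWritable, Text> mos;

    @Override
    protected void setup(Context context) {
        mos = new MultipleOutputs<>(context);
    }

    @Override
    protected void reduce(Text key, Iterable<Text> values, Context context)
            throws IOException, InterruptedException {
        for (Text v : values) {
            // A distinct baseOutputPath per named output keeps two
            // writers from opening the same part file.
            mos.write("xmlA", NullWritable.get(), v, "xmlA/part");
        }
    }

    @Override
    protected void cleanup(Context context) throws IOException, InterruptedException {
        mos.close(); // required, or buffered output may be lost
    }
}
```

The driver would register each named output once, e.g. MultipleOutputs.addNamedOutput(job, "xmlA", yourXmlOutputFormatClass, NullWritable.class, Text.class), repeated for the other two.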

Reduce side join of similar records

2014-02-28 Thread João Paulo Forny
I'm implementing a join between two datasets A and B by a String key, which is the name attribute. I need to match similar names in this join. My first thought, given that I was implementing secondary sort to get the values extracted from database A before the values from database B, was to create
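A minimal composite-key sketch of the secondary-sort idea described above: group by the (possibly normalized) name, and order dataset A's records before dataset B's via a source tag. All field names here are assumptions for illustration:

```java
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import org.apache.hadoop.io.WritableComparable;

public class NameSourceKey implements WritableComparable<NameSourceKey> {
    public String name;  // join key, normalized if fuzzy matching is needed
    public int source;   // 0 = dataset A, 1 = dataset B

    @Override
    public void write(DataOutput out) throws IOException {
        out.writeUTF(name);
        out.writeInt(source);
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        name = in.readUTF();
        source = in.readInt();
    }

    @Override
    public int compareTo(NameSourceKey o) {
        // Sort by name first, then source, so A's rows arrive
        // before B's within each reduce group.
        int c = name.compareTo(o.name);
        return c != 0 ? c : Integer.compare(source, o.source);
    }
}
```

The partitioner and grouping comparator would then use only the name field, so both sources for one name land in the same reduce call.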

apache-hadoop-2.2.0 linux task-controller

2014-02-28 Thread Tim Randles
I downloaded the source tarball for apache-hadoop-2.2.0 but it doesn't seem to include the linux task-controller source. Is there some place else I can download the task-controller source? Has this functionality been superseded by something new? Thanks, Tim -- Tim Randles Los Alamos National

Re: YARN -- Debug messages in logs

2014-02-28 Thread Xuan Gong
Hey, Kishore: you can change the log level from INFO to DEBUG in log4j.properties Thanks, Xuan Gong On Fri, Feb 28, 2014 at 10:09 AM, Krishna Kishore Bonagiri < write2kish...@gmail.com> wrote: > Hi, > > How can I get the debug log messages from RM and other daemons? > > For example, > >
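A sketch of what Xuan suggests, in etc/hadoop/log4j.properties (the exact path and logger names vary by distribution; the ResourceManager package below is an assumption about which daemon you want):

```properties
# Raise everything to DEBUG via the root logger...
hadoop.root.logger=DEBUG,console

# ...or target just one daemon, e.g. the ResourceManager:
log4j.logger.org.apache.hadoop.yarn.server.resourcemanager=DEBUG
```

Daemons read this file at startup, so a restart is needed for the change to take effect.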

YARN -- Debug messages in logs

2014-02-28 Thread Krishna Kishore Bonagiri
Hi, How can I get the debug log messages from RM and other daemons? For example, Currently I could see messages from LOG.info() only, i.e. something like this: LOG.info(event.getContainerId() + " Container Transitioned from " + oldState + " to " + getState()); How can I get those from

Re: HBase Exception: org.apache.hadoop.hbase.UnknownRowLockException

2014-02-28 Thread Shailesh Samudrala
The version I'm using is 0.90.6. We are trying to implement rowLock & rowUnLock on an HBase table to support our multi-operation transactions on a JSON object (receive value from user -> read HBase row -> calculate new value based on received value and current value in HBase -> put new value to HBas

Re: HBase Exception: org.apache.hadoop.hbase.UnknownRowLockException

2014-02-28 Thread Ted Yu
In newer releases, there are multiple mechanisms where your scenario can be implemented. Please consider upgrading your deployment. Some references: https://blogs.apache.org/hbase/entry/coprocessor_introduction src/main/java/org/apache/hadoop/hbase/coprocessor/MultiRowMutationEndpoint.java (0.94)

Re: HBase Exception: org.apache.hadoop.hbase.UnknownRowLockException

2014-02-28 Thread Shailesh Samudrala
Hi Ted, Thank you for the references. Unfortunately, the next planned environment upgrade is towards the 2nd half of this year. Also, the transactions I talked about in my earlier email have already been implemented & we are currently trying to eliminate possibilities for multiple processes perfor

Using a specific local path in fs.defaultFS (e.g. file:///local/)

2014-02-28 Thread Chris Mildebrandt
Hello, I'm trying to figure out if I can use a specific path in the fs.defaultFS property. For example, I have a local directory at /local and would like the root path to start under /local. So I used the value file:///local/ as the value of fs.defaultFS. This had no effect and still used the root
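The behavior described is consistent with fs.defaultFS naming a filesystem (scheme and authority) rather than a working directory: it is not a chroot, so a path component like /local/ is not treated as a new root. A hedged core-site.xml fragment showing the usual form:

```xml
<!-- core-site.xml: fs.defaultFS selects which filesystem is the
     default; the path component of the URI does not relocate the
     filesystem's root, which is why file:///local/ behaved like
     file:/// here. -->
<property>
  <name>fs.defaultFS</name>
  <value>file:///</value>
</property>
```

Relative paths are instead resolved against the filesystem's working directory (for the local filesystem, typically the process's current directory), so prefixing paths explicitly, or running from /local, may get closer to the intended effect.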

Re: apache-hadoop-2.2.0 linux task-controller

2014-02-28 Thread Benoy Antony
Hi Tim, The equivalent functionality is in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/container-executor.c task-controller => container-executor Benoy Antony On Fri, Feb 28, 2014 at 10:18 AM, Tim Randles wrote: >

Re: apache-hadoop-2.2.0 linux task-controller

2014-02-28 Thread Tim Randles
Thank you! On 02/28/2014 02:11 PM, Benoy Antony wrote: Hi Tim, The equivalent functionality is in hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/native/container-executor/impl/container-executor.c task-controller => container-executor Benoy Antony

Hadoop "Spill Failed" Exception in an ec2 instance with 420 GB of instance storage

2014-02-28 Thread S.L
Hi All, I am using Hadoop 2.3.0 and have installed it as a single-node cluster (pseudo-distributed mode) on a CentOS 6.4 Amazon EC2 instance with 420GB of instance storage and 7.5GB of RAM. My understanding is that the "Spill Failed" exception only occurs when the node runs out of disk spac

Re: YARN - Running Client with third party jars

2014-02-28 Thread Fengyun RAO
read this: http://stackoverflow.com/questions/574594/how-can-i-create-an-executable-jar-with-dependencies-using-maven 2014-02-26 13:22 GMT+08:00 Anand Mundada : > Hi I want to use json jar in client code. > I tried to create runnable jar which include all required jars. > But I am getting follow
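The linked answer boils down to building an "uber" jar that bundles dependencies (such as the JSON jar) into the client jar. A minimal pom.xml sketch using the Maven Shade plugin, which is one of the approaches described there (the version number is illustrative):

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <version>2.2</version>
  <executions>
    <execution>
      <!-- Runs during "mvn package" and rewrites the jar to
           include all compile/runtime dependencies. -->
      <phase>package</phase>
      <goals><goal>shade</goal></goals>
    </execution>
  </executions>
</plugin>
```

After mvn package, the resulting jar carries its dependencies, so the client no longer hits NoClassDefFoundError for the bundled classes at launch.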

Map-Reduce: How to make MR output one file an hour?

2014-02-28 Thread Fengyun RAO
It's a common web log analysis situation. The original web log is saved every hour on multiple servers. Now we would like the parsed log results to be saved as one file per hour. How can we achieve that? In our MR job, the input is a directory with many files across many hours, let's say 4X files in X hours. If the
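One way to sketch this, assuming the map side emits the record's hour bucket (epoch millis truncated to the hour) as the key: each reduce group is then one hour, and MultipleOutputs routes it to an hourly base path. Key/value types and the path pattern are assumptions, not from the original post:

```java
import java.io.IOException;
import java.text.SimpleDateFormat;
import java.util.Date;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.output.MultipleOutputs;

public class HourlyLogReducer extends Reducer<LongWritable, Text, LongWritable, Text> {
    private MultipleOutputs<LongWritable, Text> mos;
    private final SimpleDateFormat hourFmt = new SimpleDateFormat("yyyyMMdd-HH");

    @Override
    protected void setup(Context ctx) {
        mos = new MultipleOutputs<>(ctx);
    }

    @Override
    protected void reduce(LongWritable hourMillis, Iterable<Text> lines, Context ctx)
            throws IOException, InterruptedException {
        // One reduce group == one hour == one output file.
        String hour = hourFmt.format(new Date(hourMillis.get()));
        for (Text line : lines) {
            mos.write(hourMillis, line, "hour-" + hour + "/part");
        }
    }

    @Override
    protected void cleanup(Context ctx) throws IOException, InterruptedException {
        mos.close();
    }
}
```

Pairing this with LazyOutputFormat in the driver avoids empty part files from the default output path; if one file per hour (not per reducer per hour) is required, the partitioner must also send each hour bucket to a single reducer.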