Hello,
we are trying to configure Hadoop HDP 2.2 running on the Azure cloud to use an
Azure Storage blob instead of regular HDFS.
The cluster is up and running and we can list files in Azure blob storage via
hadoop fs commands. But when trying to run the teragen MapReduce smoke test we
are getting the following excep
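(For reference: in HDP the Azure blob storage (WASB) driver is configured through core-site.xml. A minimal sketch, assuming a hypothetical storage account "myaccount" and container "mycontainer"; only the two standard hadoop-azure properties are shown, not the original poster's settings:)

<property>
  <name>fs.defaultFS</name>
  <value>wasb://mycontainer@myaccount.blob.core.windows.net</value>
</property>
<property>
  <!-- access key for the (hypothetical) storage account -->
  <name>fs.azure.account.key.myaccount.blob.core.windows.net</name>
  <value>YOUR_STORAGE_ACCOUNT_KEY</value>
</property>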
Wow, pretty awesome documentation!
Thx
On 15 January 2015 at 19:53, Wangda Tan wrote:
> You can check HDP 2.2's document:
>
> http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.2.0/YARN_RM_v22/capacity_scheduler/index.html
>
> HTH,
> Wangda
>
> On Thu, Jan 15, 20
Hello,
I am configuring the Capacity Scheduler and all seems OK, but I cannot find the
meaning of the following property:
yarn.scheduler.capacity.root.unfunded.capacity
I have only found that it is set to 50 everywhere and its description is "No
description".
Can anybody clarify, or point to where to find rel
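(For reference: per-queue shares in capacity-scheduler.xml follow the pattern yarn.scheduler.capacity.<queue-path>.capacity, and the capacities of the queues directly under root must add up to 100. A minimal sketch with only the stock "default" queue, for comparison; the "unfunded" queue asked about above is not part of this layout:)

<!-- capacity-scheduler.xml: a single child queue under root taking the full capacity -->
<property>
  <name>yarn.scheduler.capacity.root.queues</name>
  <value>default</value>
</property>
<property>
  <name>yarn.scheduler.capacity.root.default.capacity</name>
  <value>100</value>
</property>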
Hello experienced users,
we are new to Hadoop and hence using a nearly default configuration, including
the scheduler - which I guess is the Capacity Scheduler by default.
Lately we were confronted with the following behaviour on the cluster. We are
using Apache Oozie for job submission of various data pipelines. We ha
then we can, I think. We do have this property mapreduce.job.maps.
>
> Regards,
> Shahab
>
> On Tue, Oct 21, 2014 at 2:42 AM, Jakub Stransky
> wrote:
>
>> Hello,
>>
>> as far as I understand, the number of mappers you cannot drive. The number of
>> reducers y
requested (and used, of course) by my pig-script (not as a YARN queue
> configuration or some such stuff... I want to limit it from the outside on a
> per-job basis. I would ideally like to set the number in my pig-script.)
> Can I do this?
> Thanks,
> Sunil.
>
--
Jakub Stransky
cz.linkedin.com/in/jakubstransky
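(For reference: one way to do this from inside the script itself is Pig's SET statement; a sketch only. default_parallel caps the number of reducers Pig requests per job, while the map count is ultimately driven by the input splits, so mapreduce.job.maps is only a hint:)

-- inside the pig-script
SET default_parallel 4;          -- upper bound on reducers requested per MR job
SET mapreduce.job.maps '10';     -- a hint only; actual map count follows the input splits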
Distcp?
On 17 Oct 2014 20:51, "Alexander Pivovarov" wrote:
> try to run on dest cluster datanode
> $ hadoop fs -cp hdfs://from_cluster/ hdfs://to_cluster/
>
>
>
> On Fri, Oct 17, 2014 at 11:26 AM, Shivram Mani wrote:
>
>> What is your approx input size?
>> Do you have multiple files
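(For reference: for anything beyond a small copy, DistCp runs the transfer as a MapReduce job instead of a single client-side fs -cp; a sketch with placeholder NameNode hosts and paths:)

$ hadoop distcp hdfs://source-nn:8020/data/input hdfs://dest-nn:8020/data/input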
Hello experienced users,
I tried to use profiling of tasks during MapReduce with the following configuration:
mapreduce.task.profile = true
mapreduce.task.profile.maps = 0-5
mapreduce.task.profile.params = -agentlib:hprof=cpu=samples,heap=sites,depth=6,force=n,thread=y,
u run reduce task, you need 1024 MB (mapreduce.reduce.memory.mb).
> If you run the MapReduce app master, you need 1024 MB (
> yarn.app.mapreduce.am.resource.mb).
>
> Therefore, when you run a MapReduce job, you can run only 2 containers per
> NodeManager (3 x 768 = 2304 > 2048) on your setup.
>
> 2014-09
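(For reference: the figures quoted in this thread correspond to settings along the following lines; a sketch reconstructed from the values mentioned (768 MB maps, 1024 MB reduces and AM, 2048 MB per NodeManager), not the poster's actual files:)

<!-- yarn-site.xml -->
<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>2048</value>
</property>
<!-- mapred-site.xml -->
<property>
  <name>mapreduce.map.memory.mb</name>
  <value>768</value>
</property>
<property>
  <name>mapreduce.reduce.memory.mb</name>
  <value>1024</value>
</property>
<property>
  <name>yarn.app.mapreduce.am.resource.mb</name>
  <value>1024</value>
</property>

(With 2048 MB per NodeManager, two 768 MB map containers fit (1536 MB) but a third does not (2304 MB), which is where the two-containers-per-node figure above comes from.)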
ee higher CPU utilization than 30%.
>
> Cheers!
> Adam
>
> 2014-09-12 17:51 GMT+02:00 Jakub Stransky :
>
>> Hello experienced hadoop users,
>>
>> I have one beginner's question regarding CPU utilization on datanodes when
>> running an MR job. Cluster of 5 mach
e your response.
>
> Thanks,
> Siddhi
>
>
>
--
Jakub Stransky
cz.linkedin.com/in/jakubstransky
Hello experienced Hadoop users,
I have one beginner's question regarding CPU utilization on datanodes when
running an MR job. A cluster of 5 machines, 2 NN + 3 DN, really inexpensive
hardware, using the following parameters:
# hadoop - yarn-site.xml
yarn.nodemanager.resource.memory-mb : 2048
yarn.scheduler.minimum-a
Hello experienced Hadoop users,
I have a data pipeline consisting of two Java MR jobs coordinated by the
Oozie scheduler. Both of them process the same data, but the first one is
more than 10 times slower than the second one. The job counters on the RM
page are not much help in that matter. I have verified
map memory as 768M and reduce memory as 1024M and AM as
> 1024M.
>
> With the AM and a single map task that is 1.7G, and you cannot start another
> container for the reducer.
> Reduce these values and check.
>
> On 9/11/14, Jakub Stransky wrote:
> > Hello hadoop users,
> >
>
Hello Hadoop users,
I am facing the following issue when running an M/R job, during the reduce phase:
Container [pid=22961,containerID=container_1409834588043_0080_01_10] is
running beyond virtual memory limits. Current usage: 636.6 MB of 1 GB
physical memory used; 2.1 GB of 2.1 GB virtual memory used.
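(For reference: the 2.1 GB cap is the container's 1 GB of physical memory multiplied by yarn.nodemanager.vmem-pmem-ratio, which defaults to 2.1. The two yarn-site.xml knobs usually involved, shown with their default values as a sketch:)

<property>
  <name>yarn.nodemanager.vmem-pmem-ratio</name>
  <value>2.1</value> <!-- raise to allow more virtual memory per MB of physical memory -->
</property>
<property>
  <name>yarn.nodemanager.vmem-check-enabled</name>
  <value>true</value> <!-- set to false to disable the virtual memory check entirely -->
</property>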
Hello,
I am getting the following error when running on a 500 MB dataset compressed
in the Avro data format.
Container [pid=22961,containerID=container_1409834588043_0080_01_10] is
running beyond virtual memory limits. Current usage: 636.6 MB of 1 GB
physical memory used; 2.1 GB of 2.1 GB virtual memory
Hello,
we are using Hadoop 2.2.0 (HDP 2.0) and Avro 1.7.4, running on CentOS 6.3.
I am facing the following issue when using AvroMultipleOutputs with dynamic
output files. My M/R job works fine for a smaller amount of data, or at
least the error hasn't appeared there so far. With a bigger amount of data I