Configuring hadoop in Azure on linux using Azure BLOB storage

2016-01-28 Thread Jakub Stransky
Hello, we are trying to configure hadoop HDP 2.2 running on the Azure cloud to use an Azure Storage BLOB instead of regular HDFS. The cluster is up and running, and we can list files in Azure blob storage via hadoop fs commands. But when trying to run the smoke test mapreduce teragen we are getting the following excep
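
For reference, a minimal core-site.xml sketch for the wasb filesystem (the account name, container name, and key below are placeholders, and this assumes a Hadoop build that ships the hadoop-azure module; none of it is quoted from the thread):

    <!-- core-site.xml: point the default filesystem at an Azure Blob container -->
    <property>
      <name>fs.defaultFS</name>
      <value>wasb://mycontainer@myaccount.blob.core.windows.net</value>
    </property>
    <!-- credentials for the storage account -->
    <property>
      <name>fs.azure.account.key.myaccount.blob.core.windows.net</name>
      <value>YOUR_STORAGE_ACCOUNT_KEY</value>
    </property>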

Re: Capacity scheduler properties

2015-01-15 Thread Jakub Stransky
Wow, pretty awesome documentation! Thx On 15 January 2015 at 19:53, Wangda Tan wrote: > You can check HDP 2.2's document: > > http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.2.0/YARN_RM_v22/capacity_scheduler/index.html > > HTH, > Wangda > > On Thu, Jan 15, 20

Capacity scheduler properties

2015-01-15 Thread Jakub Stransky
Hello, I am configuring the capacity scheduler. All seems OK, but I cannot find the meaning of the following property: yarn.scheduler.capacity.root.unfunded.capacity. I just found that it is set to 50 everywhere and the description is "No description". Can anybody clarify or point to where to find rel
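
For context, the capacity-scheduler.xml properties that do carry documented meaning follow this pattern (queue names and values below are placeholder examples; the unfunded entry asked about above appears to be a distribution default rather than a core scheduler setting):

    <!-- capacity-scheduler.xml: declare the child queues of root -->
    <property>
      <name>yarn.scheduler.capacity.root.queues</name>
      <value>default</value>
    </property>
    <!-- guaranteed share of cluster capacity for the default queue, in percent -->
    <property>
      <name>yarn.scheduler.capacity.root.default.capacity</name>
      <value>100</value>
    </property>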

Memory consumption by AM

2014-10-23 Thread Jakub Stransky
Hello experienced users, we are new to hadoop, hence using a nearly default configuration, including the scheduler - which I guess is the Capacity Scheduler by default. Lately we were confronted with the following behaviour on the cluster. We are using apache oozie for job submission of various data pipes. We ha
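
One setting often relevant when many application masters (e.g. Oozie launcher jobs, each of which runs its own AM) eat into cluster memory is the Capacity Scheduler's AM resource cap. A hedged sketch; the 0.2 is an arbitrary example, not a value from the thread:

    <!-- capacity-scheduler.xml: fraction of cluster resources that may be
         occupied by application masters (default 0.1, i.e. 10%) -->
    <property>
      <name>yarn.scheduler.capacity.maximum-am-resource-percent</name>
      <value>0.2</value>
    </property>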

Re: How to limit the number of containers requested by a pig script?

2014-10-21 Thread Jakub Stransky
then we can, I think. We do have this property mapreduce.job.maps. > > Regards, > Shahab > > On Tue, Oct 21, 2014 at 2:42 AM, Jakub Stransky > wrote: > >> Hello, >> >> as far as I understand, the number of mappers is not something you can drive directly. The number of >> reducers y
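
A sketch of what in-script limits can look like in Pig (the numbers are arbitrary examples, not values from the thread): reducer parallelism is directly settable, while the mapper count follows from the input splits, so it can only be lowered indirectly by enlarging the splits:

    -- reducers: set the default parallelism for reduce-side operators
    set default_parallel 4;
    -- mappers: raise the minimum split size (bytes) so fewer map tasks are created
    set mapreduce.input.fileinputformat.split.minsize 268435456;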

Re: How to limit the number of containers requested by a pig script?

2014-10-20 Thread Jakub Stransky
requested (and used of course) by my pig-script (not as a yarn queue > configuration or some such stuff.. I want to limit it from outside on a > per-job basis. I would ideally like to set the number in my pig-script.) > Can I do this? > Thanks, > Sunil. > -- Jakub Stransky cz.linkedin.com/in/jakubstransky

Re: how to copy data between two hdfs cluster fastly?

2014-10-17 Thread Jakub Stransky
Distcp? On 17 Oct 2014 20:51, "Alexander Pivovarov" wrote: > try to run on dest cluster datanode > $ hadoop fs -cp hdfs://from_cluster/ hdfs://to_cluster/ > > > > On Fri, Oct 17, 2014 at 11:26 AM, Shivram Mani wrote: >> What is your approx input size ? >> Do you have multiple files
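
For the archive, the shape of a distcp invocation (host names, port, and paths are placeholders; -m caps the number of map tasks, i.e. parallel copiers):

    $ hadoop distcp -m 20 hdfs://source-nn:8020/data hdfs://dest-nn:8020/data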

Cannot find profiling log file

2014-09-23 Thread Jakub Stransky
Hello experienced users, I tried to use profiling of tasks during mapreduce: mapreduce.task.profile=true, mapreduce.task.profile.maps=0-5, mapreduce.task.profile.params=-agentlib:hprof=cpu=samples,heap=sites,depth=6,force=n,thread=y,
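
Reconstructed as a mapred-site.xml (or per-job) sketch. The excerpt cuts off after thread=y, so the tail of the agent string below (verbose=n,file=%s, where the framework substitutes the per-task profile output path) follows the stock Hadoop default rather than the thread:

    <property>
      <name>mapreduce.task.profile</name>
      <value>true</value>
    </property>
    <!-- profile only the first few map task attempts -->
    <property>
      <name>mapreduce.task.profile.maps</name>
      <value>0-5</value>
    </property>
    <property>
      <name>mapreduce.task.profile.params</name>
      <value>-agentlib:hprof=cpu=samples,heap=sites,depth=6,force=n,thread=y,verbose=n,file=%s</value>
    </property>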

Re: CPU utilization

2014-09-12 Thread Jakub Stransky
u run a reduce task, you need 1024 MB (mapreduce.reduce.memory.mb). > If you run the MapReduce app master, you need 1024 MB ( > yarn.app.mapreduce.am.resource.mb). > > Therefore, when you run a MapReduce job, you can run only 2 containers per > NodeManager (3 x 768 = 2304 > 2048) on your setup. > > 2014-09
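
Spelled out, the container arithmetic from this reply (NodeManager capacity 2048 MB, container sizes as quoted in the thread):

    AM container                 1024 MB
    + one map container           768 MB -> 1792 MB <= 2048 MB, fits
    + another 768 MB container           -> 2560 MB >  2048 MB, does not fit
    => at most 2 containers per NodeManager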

Re: CPU utilization

2014-09-12 Thread Jakub Stransky
ee higher CPU utilization than 30%. > > Cheers! > Adam > > 2014-09-12 17:51 GMT+02:00 Jakub Stransky : > >> Hello experienced hadoop users, >> >> I have a beginner's question regarding cpu utilization on datanodes when >> running an MR job. Cluster of 5 mach

Re: Enable Debug logging for a job

2014-09-12 Thread Jakub Stransky
e your response. > > Thanks, > Siddhi > > > -- Jakub Stransky cz.linkedin.com/in/jakubstransky
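
On this thread's subject, a sketch of requesting DEBUG logging on a per-job basis (assuming the driver goes through ToolRunner/GenericOptionsParser so that -D options are picked up; job.jar and MyJob are hypothetical names):

    $ hadoop jar job.jar MyJob \
        -Dmapreduce.map.log.level=DEBUG \
        -Dmapreduce.reduce.log.level=DEBUG \
        -Dyarn.app.mapreduce.am.log.level=DEBUG \
        input output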

CPU utilization

2014-09-12 Thread Jakub Stransky
Hello experienced hadoop users, I have a beginner's question regarding cpu utilization on datanodes when running an MR job. Cluster of 5 machines, 2 NN + 3 DN, really inexpensive hw, using the following parameters: # hadoop - yarn-site.xml yarn.nodemanager.resource.memory-mb : 2048 yarn.scheduler.minimum-a
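
The flattened settings above, restored to their usual shape. The 2048 is quoted from the message; the remaining values are examples consistent with the replies in this thread rather than confirmed settings:

    <!-- yarn-site.xml: memory the NodeManager offers to containers -->
    <property>
      <name>yarn.nodemanager.resource.memory-mb</name>
      <value>2048</value>
    </property>
    <property>
      <name>yarn.scheduler.minimum-allocation-mb</name>
      <value>512</value>
    </property>
    <!-- mapred-site.xml: per-task container sizes -->
    <property>
      <name>mapreduce.map.memory.mb</name>
      <value>768</value>
    </property>
    <property>
      <name>mapreduce.reduce.memory.mb</name>
      <value>1024</value>
    </property>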

task slowness

2014-09-11 Thread Jakub Stransky
Hello experienced hadoop users, I have a data pipeline consisting of two java MR jobs coordinated by the oozie scheduler. Both of them process the same data, but the first one is more than 10 times slower than the second one. Job counters on the RM page are not much help in this matter. I have verified

Re: virtual memory consumption

2014-09-11 Thread Jakub Stransky
map memory as 768M and reduce memory as 1024M and am as > 1024M. > > With AM and a single map task it is 1.7G and cannot start another > container for the reducer. > Reduce these values and check. > > On 9/11/14, Jakub Stransky wrote: > > Hello hadoop users, > > >

virtual memory consumption

2014-09-11 Thread Jakub Stransky
Hello hadoop users, I am facing the following issue when running an M/R job during the reduce phase: Container [pid=22961,containerID=container_1409834588043_0080_01_10] is running beyond virtual memory limits. Current usage: 636.6 MB of 1 GB physical memory used; 2.1 GB of 2.1 GB virtual memory used.

running beyond virtual memory limits

2014-09-10 Thread Jakub Stransky
Hello, I am getting the following error when running on a 500MB dataset compressed in the avro data format. Container [pid=22961,containerID=container_1409834588043_0080_01_10] is running beyond virtual memory limits. Current usage: 636.6 MB of 1 GB physical memory used; 2.1 GB of 2.1 GB virtual memory
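
The numbers in this error (and in the virtual memory consumption thread above) line up with YARN's defaults: the NodeManager multiplies each container's physical allocation (1 GB here) by yarn.nodemanager.vmem-pmem-ratio, which defaults to 2.1, hence the 2.1 GB virtual ceiling. A hedged yarn-site.xml sketch of the two usual remedies (the 3.5 is an arbitrary example):

    <!-- raise the virtual-to-physical memory ratio used by the check -->
    <property>
      <name>yarn.nodemanager.vmem-pmem-ratio</name>
      <value>3.5</value>
    </property>
    <!-- or switch the virtual memory check off entirely -->
    <property>
      <name>yarn.nodemanager.vmem-check-enabled</name>
      <value>false</value>
    </property>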

Error could only be replicated to 0 nodes instead of minReplication (=1)

2014-08-28 Thread Jakub Stransky
Hello, we are using Hadoop 2.2.0 (HDP 2.0), avro 1.7.4, running on CentOS 6.3. I am facing the following issue when using AvroMultipleOutputs with dynamic output files. My M/R job works fine for a smaller amount of data, or at least the error hasn't appeared there so far. With a bigger amount of data I
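
One frequently cited trigger for this error when a job holds many HDFS writers open at once (as AvroMultipleOutputs with dynamic output files does) is exhausting the datanodes' transfer-thread limit. A sketch of the knob, offered as a possible lead rather than a confirmed diagnosis for this thread:

    <!-- hdfs-site.xml: concurrent transfer threads per datanode
         (default 4096; known in older releases as dfs.datanode.max.xcievers) -->
    <property>
      <name>dfs.datanode.max.transfer.threads</name>
      <value>8192</value>
    </property>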