Re: Capacity scheduler properties
You can check HDP 2.2's documentation: http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.2.0/YARN_RM_v22/capacity_scheduler/index.html

HTH,
Wangda

On Thu, Jan 15, 2015 at 4:22 AM, Jakub Stransky stransky...@gmail.com wrote:
> Hello, I am configuring the capacity scheduler. Everything seems OK, but I
> cannot find the meaning of the following property:
> yarn.scheduler.capacity.root.unfunded.capacity. I have only found that it is
> set to 50 everywhere, and its description reads "No description". Can anybody
> clarify, or point me to the relevant documentation?
> Thx, Jakub
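For comparison, the properties that the linked documentation does describe follow the usual capacity-scheduler.xml pattern. A minimal sketch (the queue names and capacity values here are illustrative, not taken from the thread; note that unfunded.capacity is not among the documented keys):

```xml
<!-- capacity-scheduler.xml: minimal two-queue sketch.
     Queue names "dev"/"prod" and the 30/70 split are illustrative. -->
<configuration>
  <property>
    <name>yarn.scheduler.capacity.root.queues</name>
    <value>dev,prod</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.dev.capacity</name>
    <value>30</value>
  </property>
  <property>
    <name>yarn.scheduler.capacity.root.prod.capacity</name>
    <value>70</value>
  </property>
</configuration>
```

Sibling queues' capacities under a parent must sum to 100, which is why the undocumented unfunded.capacity entry set to a flat 50 everywhere looks like an internal or leftover key rather than a tunable.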
Re: Capacity scheduler properties
Wow, pretty awesome documentation! Thx

On 15 January 2015 at 19:53, Wangda Tan wheele...@gmail.com wrote:
> You can check HDP 2.2's documentation:
> http://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.2.0/YARN_RM_v22/capacity_scheduler/index.html
> HTH, Wangda

--
Jakub Stransky
cz.linkedin.com/in/jakubstransky
Re: Capacity scheduler properties
Actually this made me curious, but I don't see any reference to that specific conf entry in the doc, at least on a first text search. Since the Hortonworks page appears to be the only real documentation, I would suggest downloading the source code and finding where and how that particular parameter is used (unfortunately, that is often the only way). It must be a new entry in the CS config, so hopefully it will have a better description in future versions of Hadoop.

Regards,
Fabio

On 01/15/2015 10:52 PM, Jakub Stransky wrote:
> Wow, pretty awesome documentation! Thx
Re: How to partition a file to smaller size for performing KNN in hadoop mapreduce
Is there any way? Waiting for a reply. I have posted the question everywhere, but no one is responding. I feel like this is the right place to ask, as some of you may have come across the same issue and got stuck.

On Thu, Jan 15, 2015 at 12:34 PM, unmesha sreeveni unmeshab...@gmail.com wrote:
> Yes, one of my friends is implementing the same. I know global sharing of data
> is not possible across Hadoop MapReduce, but I need to check whether it can be
> done somehow in Hadoop MapReduce as well, because I found some papers on KNN on
> Hadoop, and I am trying to compare the performance too. Hope some pointers can
> help me.

On Thu, Jan 15, 2015 at 12:17 PM, Ted Dunning ted.dunn...@gmail.com wrote:
> Have you considered implementing this using something like Spark? That could be
> much easier than raw map-reduce.

On Wed, Jan 14, 2015 at 10:06 PM, unmesha sreeveni unmeshab...@gmail.com wrote:
> In a KNN-like algorithm we need to load the model data into the cache for
> predicting records. (A KNN example was attached as an inline image.) So if the
> model is a large file, say 1 or 2 GB, we will not be able to load it into the
> distributed cache. One way is to split/partition the model into several files,
> perform the distance calculation for all records against each partition, and
> then find the minimum distance and the most frequent class label to predict the
> outcome. How can we partition the file and perform the operation on those
> partitions? I.e.
>   1st record: distance against partition1, partition2, ...
>   2nd record: distance against partition1, partition2, ...
> This is what came to my mind. Is there any further way? Any pointers would help
> me.
--
Thanks & Regards
Unmesha Sreeveni U.B
Hadoop, Bigdata Developer
Centre for Cyber Security | Amrita Vishwa Vidyapeetham
http://www.unmeshasreeveni.blogspot.in/
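The partition-and-merge scheme described in the question can be sketched outside MapReduce to check that it gives the same answer as a full scan: each partition contributes its k best candidates, and merging the partial results preserves the global k nearest. This is a minimal illustration, not the poster's actual job; function names and the squared-Euclidean distance are assumptions.

```python
import heapq
from collections import Counter


def dist(a, b):
    # Squared Euclidean distance (monotone in the true distance,
    # so rankings are unchanged and we can skip the sqrt).
    return sum((p - q) ** 2 for p, q in zip(a, b))


def knn_partitioned(test_point, partitions, k=3):
    """Classify test_point against model data split into partitions.

    Each partition is an iterable of (features, label) pairs. Keeping
    only the k best candidates per partition and merging them yields
    the same k nearest neighbours as scanning the whole model at once,
    which is the point of splitting a model too big for one cache.
    """
    candidates = []
    for part in partitions:
        # Per-partition pass: keep the k smallest distances locally.
        local = heapq.nsmallest(
            k, ((dist(test_point, x), label) for x, label in part)
        )
        candidates.extend(local)
    # Merge step: global k nearest among partial results, then vote.
    nearest = heapq.nsmallest(k, candidates)
    vote = Counter(label for _, label in nearest)
    return vote.most_common(1)[0][0]
```

In a MapReduce setting, the per-partition pass would be the mapper (one partition of the model per task) and the merge/vote step the reducer keyed by test record id; only k candidates per partition cross the shuffle, not the whole model.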
Capacity scheduler properties
Hello, I am configuring the capacity scheduler. Everything seems OK, but I cannot find the meaning of the following property: yarn.scheduler.capacity.root.unfunded.capacity. I have only found that it is set to 50 everywhere, and its description reads "No description". Can anybody clarify, or point me to the relevant documentation?

Thx,
Jakub