Re: How to partition a file to smaller size for performing KNN in hadoop mapreduce

2015-01-15 Thread unmesha sreeveni
Is there any way? Still waiting for a reply. I have posted the question everywhere, but no one is responding. I feel this is the right place to ask doubts, as some of you may have come across the same issue and got stuck. On Thu, Jan 15, 2015 at 12:34 PM, unmesha sreeveni unmeshab...@gmail.com wrote:

How to partition a file to smaller size for performing KNN in hadoop mapreduce

2015-01-14 Thread unmesha sreeveni
In a KNN-like algorithm we need to load the model data into cache for predicting the records. Here is the example for KNN. [image: Inline image 1] So if the model is a large file, say 1 or 2 GB, we will not be able to load it into the distributed cache. The one way is to split/partition the model
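The splitting step described above can be sketched in plain Python; this is only an illustration of the partitioning idea (the chunk size and record names are made-up assumptions, not part of the original question), with each chunk intended to be written out as its own small part file for the distributed cache:

```python
# Sketch: partition a large model (a sequence of records) into
# fixed-size chunks small enough to ship to each task.
# Chunk size and record names are illustrative assumptions.

def split_model(records, max_records_per_chunk):
    """Yield the model as successive chunks of at most the given size."""
    chunk = []
    for record in records:
        chunk.append(record)
        if len(chunk) == max_records_per_chunk:
            yield chunk
            chunk = []
    if chunk:  # emit the final, possibly smaller, chunk
        yield chunk

# Usage: 10 records with a chunk size of 4 give chunks of 4, 4 and 2.
records = [f"feature_vector_{i}" for i in range(10)]
parts = list(split_model(records, 4))
```

In a real job, each chunk would be written to its own HDFS part file so a mapper can load one partition at a time instead of the whole model.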

Re: How to partition a file to smaller size for performing KNN in hadoop mapreduce

2015-01-14 Thread Ted Dunning
Have you considered implementing this using something like Spark? That could be much easier than raw map-reduce. On Wed, Jan 14, 2015 at 10:06 PM, unmesha sreeveni unmeshab...@gmail.com wrote: In KNN like algorithm we need to load model Data into cache for predicting the records. Here is the

Re: How to partition a file to smaller size for performing KNN in hadoop mapreduce

2015-01-14 Thread unmesha sreeveni
Yes, one of my friends is implementing the same. I know global sharing of data is not possible across Hadoop MapReduce, but I need to check whether that can be done somehow in Hadoop MapReduce as well, because I found some papers on KNN in Hadoop too. And I am trying to compare the performance as well. Hope some
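One way to reconcile a partitioned model with KNN, in the spirit of the papers mentioned, is to scan the model one partition at a time while keeping only a running top-k of nearest neighbours, so memory stays proportional to k rather than to the model size. A hedged sketch (partition contents and k are illustrative assumptions, not from the thread):

```python
# Sketch: KNN over a model split into chunks. A bounded max-heap
# (negated distances) keeps just the k best neighbours seen so far,
# so no single task ever holds the full model in memory.
# Chunk contents and k are illustrative assumptions.
import heapq
import math

def topk_over_chunks(query, model_chunks, k=3):
    """Return the k nearest (distance, label) pairs across all chunks."""
    best = []  # max-heap of (-distance, label), size capped at k
    for chunk in model_chunks:
        for vec, label in chunk:
            d = math.dist(query, vec)
            if len(best) < k:
                heapq.heappush(best, (-d, label))
            elif d < -best[0][0]:
                heapq.heapreplace(best, (-d, label))
    return sorted((-neg_d, label) for neg_d, label in best)

# Usage: three model partitions, query at the origin.
chunks = [
    [((0.0, 0.0), "a"), ((9.0, 9.0), "b")],
    [((0.5, 0.5), "a"), ((8.0, 8.0), "b")],
    [((1.0, 1.0), "a")],
]
neighbours = topk_over_chunks((0.0, 0.0), chunks, k=2)
```

Per-chunk top-k lists produced by different mappers can then be merged in a reducer the same way, which is the standard shape of the MapReduce KNN formulations in the literature.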