job.setInputFormatClass(SequenceFileInputFormat.class);

You just need to follow the Hadoop API from the Apache web site.

Hints:

1)     Create the sequence file prior to the job (a plain Java program).

Example POC (you will have to change it based on your requirements):



import java.io.IOException;
import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

// White, Tom (2012-05-10). Hadoop: The Definitive Guide (Kindle Locations
// 5375-5384). O'Reilly Media. Kindle Edition.
public class SequenceFileWriteDemo {

    private static final String[] DATA = {
        "One, two, buckle my shoe",
        "Three, four, shut the door",
        "Five, six, pick up sticks",
        "Seven, eight, lay them straight",
        "Nine, ten, a big fat hen"
    };

    public static void main(String[] args) throws IOException {
        // local file path
        String uri = "/home/hadoop/Desktop/Image/test_02.txt";
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(URI.create(uri), conf);
        Path path = new Path(uri);
        IntWritable key = new IntWritable();
        Text value = new Text();
        SequenceFile.Writer writer = null;
        try {
            writer = SequenceFile.createWriter(fs, conf, path,
                    key.getClass(), value.getClass());
            for (int i = 0; i < 100; i++) {
                key.set(100 - i);
                value.set(DATA[i % DATA.length]);
                // System.out.printf("[%s]\t%s\t%s\n", writer.getLength(), key, value);
                writer.append(key, value);
            }
        } finally {
            IOUtils.closeStream(writer);
        }
    }
}



Note: you have to convert all the image files into one sequence file.
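For the image case, the note above can be sketched as follows. This is a minimal, illustrative variant of the demo above, not a definitive implementation: the class name, the directory path, and the output path are assumptions you would replace. It keys each record by file name (Text) and stores the raw image bytes as a BytesWritable.

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.io.SequenceFile;
import org.apache.hadoop.io.Text;

public class ImagesToSequenceFile {
    public static void main(String[] args) throws IOException {
        // Illustrative paths: a directory of images, and the single output sequence file.
        Path imageDir = new Path("/home/hadoop/Desktop/Image");
        Path outFile  = new Path("/user/hadoop/images.seq");

        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        Text key = new Text();                     // key   = image file name
        BytesWritable value = new BytesWritable(); // value = raw image bytes

        SequenceFile.Writer writer = null;
        try {
            writer = SequenceFile.createWriter(fs, conf, outFile,
                    key.getClass(), value.getClass());
            for (FileStatus status : fs.listStatus(imageDir)) {
                // Read the whole image into memory, then append it as one record.
                byte[] bytes = new byte[(int) status.getLen()];
                FSDataInputStream in = fs.open(status.getPath());
                try {
                    in.readFully(bytes);
                } finally {
                    IOUtils.closeStream(in);
                }
                key.set(status.getPath().getName());
                value.set(bytes, 0, bytes.length);
                writer.append(key, value);
            }
        } finally {
            IOUtils.closeStream(writer);
        }
    }
}
```

Because each image is one record, a later map task always sees a complete image, never a fragment.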



2)     Put it into HDFS.

3)     Write the Map/Reduce job based on the logic you need.
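Step 3 can be sketched as below, assuming the sequence file holds Text keys (file names) and BytesWritable values (raw image bytes). The class names and the mapper's placeholder logic (emitting each image's name and byte count) are illustrative only; substitute your own image-processing logic in map().

```java
import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.SequenceFileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class ImageJob {

    // SequenceFileInputFormat hands each record to map() as-is,
    // so every call receives one complete image.
    public static class ImageMapper
            extends Mapper<Text, BytesWritable, Text, IntWritable> {
        @Override
        protected void map(Text fileName, BytesWritable imageBytes, Context context)
                throws IOException, InterruptedException {
            // Placeholder logic: emit (file name, image size in bytes).
            context.write(fileName, new IntWritable(imageBytes.getLength()));
        }
    }

    public static void main(String[] args) throws Exception {
        Job job = new Job(new Configuration(), "image job");
        job.setJarByClass(ImageJob.class);
        job.setInputFormatClass(SequenceFileInputFormat.class);
        job.setMapperClass(ImageMapper.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```

This is also where the `job.setInputFormatClass(SequenceFileInputFormat.class);` line from the top of this mail fits in.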



From: AMARNATH, Balachandar [mailto:balachandar.amarn...@airbus.com]
Sent: 06 March 2013 11:24
To: user@hadoop.apache.org
Subject: RE: Map reduce technique

Thanks for the mail,

Can you please share a few links to start with?


Regards
Bala

From: Samir Kumar Das Mohapatra [mailto:dasmo...@adobe.com]
Sent: 06 March 2013 11:21
To: user@hadoop.apache.org
Subject: RE: Map reduce technique

I think you have to look at the sequence file as the input format.

Basically, the way this works is: you have a separate Java process that takes several image files, reads the raw bytes into memory, then stores the data as key-value pairs in a SequenceFile. Keep going and keep writing into HDFS. This may take a while, but you'll only have to do it once.

Regards,
Samir.

From: AMARNATH, Balachandar [mailto:balachandar.amarn...@airbus.com]
Sent: 06 March 2013 11:07
To: user@hadoop.apache.org
Subject: Map reduce technique

Hi,

I am new to the map-reduce paradigm. I read in a tutorial that the 'map' function splits the data into key-value pairs. Does this mean the map-reduce framework splits the data into pieces automatically, or do we need to explicitly provide the method to split the data? If it does this automatically, how does it split an image file (by size, etc.)? I expect that processing an image file as a whole will give different results than processing it in chunks.



With thanks and regards
Balachandar




The information in this e-mail is confidential. The contents may not be 
disclosed or used by anyone other than the addressee. Access to this e-mail by 
anyone else is unauthorised.

If you are not the intended recipient, please notify Airbus immediately and 
delete this e-mail.

Airbus cannot accept any responsibility for the accuracy or completeness of 
this e-mail as it has been sent over public networks. If you have any concerns 
over the content of this message or its Accuracy or Integrity, please contact 
Airbus immediately.

All outgoing e-mails from Airbus are checked using regularly updated virus 
scanning software but you should take whatever measures you deem to be 
appropriate to ensure that this message and any attachments are virus free.
