Custom output value for map function

2013-02-27 Thread Paul van Hoven
In most Hadoop examples, the output value of the map function is declared
something like this:

public static class Map extends Mapper<LongWritable, Text, outputKey, outputValue>

Normally outputValue is something like Text or IntWritable.

I have a custom class with its own properties, like

public class Dog {
   String name;
   Date birthday;
   double weight;
}

Now how would I declare the following map function:

public static class Map extends Mapper<LongWritable, Text, IntWritable, Dog>

?


Re: Custom output value for map function

2013-02-27 Thread Sandy Ryza
Hi Paul,

To do this, you need to make your Dog class implement Hadoop's Writable
interface, so that it can be serialized to and deserialized from bytes.
http://hadoop.apache.org/docs/r1.1.1/api/org/apache/hadoop/io/Writable.html

The methods you implement would look something like this:

@Override
public void write(DataOutput out) throws IOException {
  out.writeDouble(weight);
  out.writeUTF(name);
  out.writeLong(birthday.getTime());
}

@Override
public void readFields(DataInput in) throws IOException {
  weight = in.readDouble();
  name = in.readUTF();
  birthday = new Date(in.readLong());
}
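
Putting those two methods into the full class, together with the rest of what
Hadoop expects, might look roughly like this (untested sketch; note the no-arg
constructor, which Hadoop needs so it can instantiate a Dog before calling
readFields()):

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import java.util.Date;
import org.apache.hadoop.io.Writable;

public class Dog implements Writable {
  private String name;
  private Date birthday;
  private double weight;

  // Required by Hadoop: the framework creates an empty instance
  // and then fills it in via readFields().
  public Dog() {}

  public Dog(String name, Date birthday, double weight) {
    this.name = name;
    this.birthday = birthday;
    this.weight = weight;
  }

  @Override
  public void write(DataOutput out) throws IOException {
    out.writeDouble(weight);
    out.writeUTF(name);
    out.writeLong(birthday.getTime());
  }

  @Override
  public void readFields(DataInput in) throws IOException {
    weight = in.readDouble();
    name = in.readUTF();
    birthday = new Date(in.readLong());
  }
}

In the driver you would then register it with something like
job.setMapOutputValueClass(Dog.class).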

hope that helps,
Sandy


Re: Custom output value for map function

2013-02-27 Thread Paul van Hoven
Great! Thank you.

I guess the order in which the data is written and read matters. I mean, for

out.writeUTF("blabla");
out.writeInt(12);

the following would be correct

text = in.readUTF();
number = in.readInt();

and this would fail:

number = in.readInt();
text = in.readUTF();

?


Re: Custom output value for map function

2013-02-27 Thread Sandy Ryza
That's right, the data needs to be written and read in the same order.
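
If you want to convince yourself outside of Hadoop, you can reproduce the
effect with plain java.io streams; a quick standalone sketch (nothing
Hadoop-specific in it):

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;

public class OrderDemo {
  public static void main(String[] args) throws IOException {
    // Write a String and then an int.
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    DataOutputStream out = new DataOutputStream(buffer);
    out.writeUTF("blabla");
    out.writeInt(12);
    out.flush();

    // Read them back in the same order they were written.
    DataInputStream in = new DataInputStream(
        new ByteArrayInputStream(buffer.toByteArray()));
    String text = in.readUTF();   // fine: the UTF string was written first
    int number = in.readInt();    // fine: the int was written second
    System.out.println(text + " " + number);

    // Calling readInt() first would consume the 2-byte length prefix of the
    // UTF string plus two of its characters as an int, so you'd get a
    // nonsense number and the following readUTF() would read garbage or throw.
  }
}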
