Re: How do I convert DataInput and ResultSet to array of String?

2009-06-04 Thread Aaron Kimball
e.g. for readFields(),

myItems = new ArrayList<String>();
int numItems = dataInput.readInt();
for (int i = 0; i < numItems; i++) {
  myItems.add(Text.readString(dataInput));
}

then on the serialization (write) side, send:

dataOutput.writeInt(myItems.size());
for (int i = 0; i < myItems.size(); i++) {
  Text.writeString(dataOutput, myItems.get(i));
}

You should look at the source code for ArrayWritable, IntWritable, and Text
-- specifically their write() and readFields() methods -- to get a feel for
how to write such methods for your own types.
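
A minimal, self-contained sketch of the count-then-strings pattern follows. It is an illustration under stated assumptions, not Hadoop code: the class name StringListRecord and its helper methods are hypothetical, and the JDK's writeUTF()/readUTF() stand in for Text.writeString()/Text.readString() so that it runs without Hadoop on the classpath. In a real job the class would also implement Writable.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical record type; in Hadoop it would also declare "implements Writable". */
public class StringListRecord {
  private List<String> items = new ArrayList<String>();

  public List<String> getItems() {
    return items;
  }

  // Serialize: the element count first, then each string in order.
  public void write(DataOutput out) throws IOException {
    out.writeInt(items.size());
    for (String item : items) {
      out.writeUTF(item); // stand-in for Text.writeString(out, item)
    }
  }

  // Deserialize: read the count, then exactly that many strings.
  public void readFields(DataInput in) throws IOException {
    int numItems = in.readInt();
    items = new ArrayList<String>(numItems);
    for (int i = 0; i < numItems; i++) {
      items.add(in.readUTF()); // stand-in for Text.readString(in)
    }
  }

  // Round-trip helpers so the sketch can be exercised without Hadoop.
  public static byte[] toBytes(StringListRecord record) {
    try {
      ByteArrayOutputStream buffer = new ByteArrayOutputStream();
      record.write(new DataOutputStream(buffer));
      return buffer.toByteArray();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static StringListRecord fromBytes(byte[] bytes) {
    try {
      StringListRecord record = new StringListRecord();
      record.readFields(new DataInputStream(new ByteArrayInputStream(bytes)));
      return record;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    StringListRecord record = new StringListRecord();
    record.getItems().add("alpha");
    record.getItems().add("beta");
    StringListRecord copy = fromBytes(toBytes(record));
    System.out.println(copy.getItems());
  }
}
```

Because readFields() consumes exactly the bytes that write() produced, several such records can sit back to back in one stream and each call reads only its own fields.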

- Aaron


On Wed, Jun 3, 2009 at 4:19 PM, dealmaker vin...@gmail.com wrote:


 Would you provide a code sample of it?  I don't know how to write a
 serializer in Hadoop.

 I am using the following class as the type of my value object:

   private static class StringArrayWritable extends ArrayWritable {
     private StringArrayWritable(String[] aSString) {
       super(aSString);
     }
   }

 Thanks.


 --
 View this message in context:
 http://www.nabble.com/How-do-I-convert-DataInput-and-ResultSet-to-array-of-String--tp23770747p23861270.html
 Sent from the Hadoop core-user mailing list archive at Nabble.com.




Re: How do I convert DataInput and ResultSet to array of String?

2009-06-03 Thread Aaron Kimball
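You don't need to track the size of each string yourself. Text.writeString()
writes a length prefix ahead of the UTF-8 bytes, and Text.readString() uses
that prefix to pull out the entire string.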

If you need to know the number of string objects, though, you'll have to
serialize that before the strings, then use a for loop to decode the rest of
them.
- Aaron
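
To make the string-boundary point concrete, here is a small sketch of length-prefixed framing. It is not Hadoop's implementation: to my knowledge Text.writeString() stores the byte count as a variable-length integer, whereas this sketch assumes a plain 4-byte int for simplicity. The class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class LengthPrefixedStrings {
  // Write the UTF-8 byte count, then the bytes themselves.
  public static void writeString(DataOutput out, String s) throws IOException {
    byte[] bytes = s.getBytes(StandardCharsets.UTF_8);
    out.writeInt(bytes.length);
    out.write(bytes);
  }

  // Read the byte count, then exactly that many bytes, and decode them.
  public static String readString(DataInput in) throws IOException {
    int length = in.readInt();
    byte[] bytes = new byte[length];
    in.readFully(bytes);
    return new String(bytes, StandardCharsets.UTF_8);
  }

  // Encodes the values back to back into one buffer, then decodes the same number.
  public static String[] roundTrip(String... values) {
    try {
      ByteArrayOutputStream buffer = new ByteArrayOutputStream();
      DataOutputStream out = new DataOutputStream(buffer);
      for (String value : values) {
        writeString(out, value);
      }
      DataInputStream in =
          new DataInputStream(new ByteArrayInputStream(buffer.toByteArray()));
      String[] decoded = new String[values.length];
      for (int i = 0; i < decoded.length; i++) {
        decoded[i] = readString(in);
      }
      return decoded;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    for (String s : roundTrip("one", "two words", "")) {
      System.out.println("[" + s + "]");
    }
  }
}
```

The reader never scans for a terminator; each prefix tells it how many bytes belong to the next string, so strings of any length (including empty ones) can follow one another in the same stream.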





Re: How do I convert DataInput and ResultSet to array of String?

2009-06-02 Thread Aaron Kimball
Hi,

You can't just turn either of these two types into arrays of strings
automatically, because they are interfaces to underlying streams of data.
You are required to know what protocol you are implementing -- i.e., how
many fields you are transmitting -- and manually read through that many
fields yourself. For example, a DataInput object is effectively a pointer
into a byte array. There may be many records in that byte array, but you
only want to read the fields of the first record out.

For DataInput / DataOutput, you can UTF8-decode the next field by calling
Text.readString(dataInput), and encode a field with
Text.writeString(dataOutput, string).
For ResultSet, you want resultSet.getString(fieldNum), where field numbers
start at 1.
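
For the ResultSet side, a sketch along these lines may help. It assumes you want every column of the current row as a string and takes the column count from the result set's metadata; the class and method names are hypothetical, and only standard java.sql calls are used.

```java
import java.sql.ResultSet;
import java.sql.SQLException;

public class ResultSetRows {
  /** Copies every column of the current row into a String array. */
  public static String[] currentRowToStrings(ResultSet rs) throws SQLException {
    int numColumns = rs.getMetaData().getColumnCount();
    String[] fields = new String[numColumns];
    for (int i = 0; i < numColumns; i++) {
      fields[i] = rs.getString(i + 1); // JDBC column indexes are 1-based
    }
    return fields;
  }
}
```

Inside readFields(ResultSet rs), the assignment would then be this.aSAssoc = ResultSetRows.currentRowToStrings(rs). SQL NULL columns come back as null entries in the array.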

As (yet another) shameless plug ( :smile: ), check out the tool we just
released, which automates database import tasks. It auto-generates the
classes necessary for your tables, too.
http://www.cloudera.com/blog/2009/06/01/introducing-sqoop/

At the very least, you might want to play with it a bit and read its source
code so you have a better idea of how to implement your own class (since
you're doing some more creative stuff like building up associative arrays
for each field).

Cheers,
- Aaron





Re: How do I convert DataInput and ResultSet to array of String?

2009-06-02 Thread dealmaker

Thanks.  The number of elements in this array of String is unknown until run
time.  If DataInput treats it as a byte array, I still have to know the size
of each String.  How do I do that?  Could you suggest some code samples or
links that deal with a similar situation?  The only examples I have found
are the word-count ones, which deal with integers.
Thanks.



-- 
View this message in context: 
http://www.nabble.com/How-do-I-convert-DataInput-and-ResultSet-to-array-of-String--tp23770747p23843679.html
Sent from the Hadoop core-user mailing list archive at Nabble.com.



Re: How do I convert DataInput and ResultSet to array of String?

2009-06-01 Thread dealmaker

bump.  Does anyone know?

I am using the following ArrayWritable subclass:

  private static class StringArrayWritable extends ArrayWritable {
    private StringArrayWritable(String[] aSString) {
      super(aSString);
    }
  }



-- 
View this message in context: 
http://www.nabble.com/How-do-I-convert-DataInput-and-ResultSet-to-array-of-String--tp23770747p23826464.html
Sent from the Hadoop core-user mailing list archive at Nabble.com.



How do I convert DataInput and ResultSet to array of String?

2009-05-28 Thread dealmaker

Hi,
  How do I convert DataInput to array of String?
  How do I convert ResultSet to array of String?
Thanks.  Following is the code:

  static class Record implements Writable, DBWritable {
    String[] aSAssoc;

    public void write(DataOutput arg0) throws IOException {
      throw new UnsupportedOperationException("Not supported yet.");
    }

    public void readFields(DataInput in) throws IOException {
      this.aSAssoc = // How to convert DataInput to String Array?
    }

    public void write(PreparedStatement arg0) throws SQLException {
      throw new UnsupportedOperationException("Not supported yet.");
    }

    public void readFields(ResultSet rs) throws SQLException {
      this.aSAssoc = // How to convert ResultSet to String Array?
    }
  }

-- 
View this message in context: 
http://www.nabble.com/How-do-I-convert-DataInput-and-ResultSet-to-array-of-String--tp23770747p23770747.html
Sent from the Hadoop core-user mailing list archive at Nabble.com.