My models are simple Java POJOs with business fields and functions, plus
several custom mappings embedded: fixed-length lines, CSV, binary, and
SQL (for export). I use them in MapReduce jobs that read data from HDFS.
Since the models already hold the field and column names, as well as the
field order for SQL serialization/deserialization, I would really like
to reuse that information and be able to cover it with tests. Another
solution would be to implement a Sqoop-like MR job, but I don't want to
reimplement that, and I still want to be able to test the --direct
option. So is there any way to plug my models into Sqoop? I can
implement the DBWritable interface and any other interface required.
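To make the description above concrete, here is a minimal sketch of such a model. The class name CustomerModel, its three fields, and the column names are hypothetical and only for illustration. The readFields and write methods use the same signatures as Hadoop's DBWritable interface; in a real job the class would additionally declare "implements org.apache.hadoop.mapreduce.lib.db.DBWritable", which is left out here so the sketch compiles with the JDK alone. It does not show how to plug the class into Sqoop itself, only how the column order can live in one place.

```java
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical model: class, field, and column names are illustrative only.
class CustomerModel {
    // Single source of truth for the column names and their order.
    static final List<String> COLUMNS =
            Collections.unmodifiableList(Arrays.asList("ID", "NAME", "BALANCE"));

    private long id;
    private String name;
    private double balance;

    CustomerModel() { }

    CustomerModel(long id, String name, double balance) {
        this.id = id;
        this.name = name;
        this.balance = balance;
    }

    // Same signature as DBWritable.readFields(ResultSet);
    // reads the columns in the order declared in COLUMNS.
    public void readFields(ResultSet rs) throws SQLException {
        id = rs.getLong(1);
        name = rs.getString(2);
        balance = rs.getDouble(3);
    }

    // Same signature as DBWritable.write(PreparedStatement);
    // binds the parameters in the order declared in COLUMNS.
    public void write(PreparedStatement ps) throws SQLException {
        ps.setLong(1, id);
        ps.setString(2, name);
        ps.setDouble(3, balance);
    }

    // Comma-separated column list, e.g. as the value of Sqoop's --columns argument.
    static String columnsArg() {
        return String.join(",", COLUMNS);
    }

    // INSERT statement built from the same column list (table name supplied by the caller).
    static String insertSql(String table) {
        String placeholders = String.join(",", Collections.nCopies(COLUMNS.size(), "?"));
        return "INSERT INTO " + table + " (" + columnsArg() + ") VALUES (" + placeholders + ")";
    }
}
```

With something like this, a JUnit test could compare CustomerModel.columnsArg() with the column list passed to 'sqoop import --columns ...', so the field sequence is defined only once.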
On 2014-09-23 11:13, Abraham Elmahrek wrote:
Hey Denis,
Could you describe your models a bit? Do they have a special structure
and require the output format to be different? Would they exist in
HDFS? HBase? etc.
Whatever it may be, you could potentially hack it into Sqoop1 or you
could wait for Sqoop2 and write a connector. The code generated by
Sqoop is just a Writable that describes how to read fields and write
fields from/to your database. I don't think it's a good idea to modify
the generated code as it would work only for that single instance and
is kind of a mess to keep track of. Until I understand your models a
bit more... I think that's the best advice I can give though.
-Abe
On Mon, Sep 22, 2014 at 9:04 AM, Denis <[email protected]> wrote:
Hi,
I am looking for a good way to integrate my model classes with Sqoop.
The only solution I see right now is to import with the
'sqoop import ...' command and then run a map job to convert the
result into my model. I don't like this approach because: 1 - I need
to duplicate the field sequence information when executing
'sqoop import ...'; 2 - I don't see any easy way to write a JUnit test
that checks the imported data can be uploaded back to the DB without
errors (there is a custom upload procedure, not Sqoop). So ideally I
would like to implement some interface, do some tricks, and plug my
model into Sqoop (I still want to be able to leverage --direct mode).
Any help is highly appreciated. If my ideal case would cause me a lot
of pain, please share some resources that describe how I can use the
'sqoop codegen' results later (again, ideally as a MapReduce job
config).