My models are simple Java POJOs with business fields and functions, plus several custom mappings embedded in them: fixed-length lines, CSV, binary, and SQL (for export). I use them in map-reduce jobs that read data from HDFS. Since the models already carry the field and column names, as well as the field order for SQL serialization/deserialization, I would really like to reuse that and be able to cover it with tests. Another option would be to implement a Sqoop-like MR job, but I don't want to reimplement that, and I still want to be able to test the --direct option. So is there any way to plug my models into Sqoop? I can implement the DBWritable interface and any other interface required.
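
To make the testing goal concrete, here is a rough sketch of the kind of model and round-trip check I have in mind. Everything in it is hypothetical: `Customer`, its two columns, `insertSql`, and the `JdbcFakes` helper are made up for illustration, and `DbWritableLike` is only a local stand-in that mirrors the two methods of Hadoop's `DBWritable` so the snippet compiles without Hadoop on the classpath. It says nothing about how Sqoop itself would load such a class.

```java
import java.lang.reflect.Proxy;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.Collections;

// Local stand-in (assumption) mirroring the two methods of
// org.apache.hadoop.mapreduce.lib.db.DBWritable.
interface DbWritableLike {
    void write(PreparedStatement statement) throws SQLException;
    void readFields(ResultSet resultSet) throws SQLException;
}

// Hypothetical model: column names and order live in one place.
class Customer implements DbWritableLike {
    static final String[] COLUMNS = {"id", "name"};

    long id;
    String name;

    // Binds fields to the statement in COLUMNS order.
    public void write(PreparedStatement statement) throws SQLException {
        statement.setLong(1, id);
        statement.setString(2, name);
    }

    // Reads fields from the current row in COLUMNS order.
    public void readFields(ResultSet resultSet) throws SQLException {
        id = resultSet.getLong(1);
        name = resultSet.getString(2);
    }

    // Builds a parameterized INSERT from the same column list.
    static String insertSql(String table) {
        String placeholders =
            String.join(", ", Collections.nCopies(COLUMNS.length, "?"));
        return "INSERT INTO " + table + " (" + String.join(", ", COLUMNS)
            + ") VALUES (" + placeholders + ")";
    }
}

// Minimal in-memory fakes so the mapping can be tested without a database.
class JdbcFakes {
    // Records every setXxx(index, value) call into row[index - 1].
    static PreparedStatement recordingStatement(Object[] row) {
        return (PreparedStatement) Proxy.newProxyInstance(
            JdbcFakes.class.getClassLoader(),
            new Class<?>[] {PreparedStatement.class},
            (proxy, method, args) -> {
                if (method.getName().startsWith("set") && args != null
                        && args.length == 2 && args[0] instanceof Integer) {
                    int index = (Integer) args[0];
                    row[index - 1] = args[1];
                    return null;
                }
                throw new UnsupportedOperationException(method.getName());
            });
    }

    // Serves every getXxx(index) call from row[index - 1].
    static ResultSet rowResultSet(Object[] row) {
        return (ResultSet) Proxy.newProxyInstance(
            JdbcFakes.class.getClassLoader(),
            new Class<?>[] {ResultSet.class},
            (proxy, method, args) -> {
                if (method.getName().startsWith("get") && args != null
                        && args.length == 1 && args[0] instanceof Integer) {
                    int index = (Integer) args[0];
                    return row[index - 1];
                }
                throw new UnsupportedOperationException(method.getName());
            });
    }

    // Writes a model into a fake row and reads it back into a new instance.
    static Customer roundTrip(Customer in) {
        try {
            Object[] row = new Object[Customer.COLUMNS.length];
            in.write(recordingStatement(row));
            Customer out = new Customer();
            out.readFields(rowResultSet(row));
            return out;
        } catch (SQLException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

In a real job the model would implement Hadoop's actual `DBWritable` (and `Writable`) instead of the stand-in, and a JUnit test would call something like `JdbcFakes.roundTrip(...)` and compare the fields.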

On 2014-09-23 11:13, Abraham Elmahrek wrote:
Hey Denis,

Could you describe your models a bit? Do they have a special structure and require the output format to be different? Would they exist in HDFS? HBase? etc.

Whatever it may be, you could potentially hack it into Sqoop1, or you could wait for Sqoop2 and write a connector. The code generated by Sqoop is just a Writable that describes how to read fields from and write fields to your database. I don't think it's a good idea to modify the generated code, as it would work only for that single instance and is kind of a mess to keep track of. Until I understand your models a bit more, I think that's the best advice I can give.

-Abe

On Mon, Sep 22, 2014 at 9:04 AM, Denis <[email protected]> wrote:

    Hi,

    I am looking for a good solution to integrate my model classes
    with Sqoop. The only solution I see right now is to import with
    the 'sqoop import ...' command and then run a map job to convert
    the result into my model. I don't like this approach because: 1 - I
    need to duplicate the field sequence information when executing
    'sqoop import ...', 2 - I don't see any easy way to write a JUnit
    test that checks the imported data can be uploaded back to the DB
    without errors (there is a custom upload procedure, not Sqoop). So
    ideally I would like to implement some interface, do some tricks,
    and plug my model into Sqoop (I still want to be able to
    leverage --direct mode). Any help is highly appreciated. If my
    ideal case would cause me a lot of pain, please share some
    resources that describe how I can use the 'sqoop codegen' results
    later (again, ideally as a map-reduce job config).


