Hi Liz,

I tried running the following command (create a job and then exec it) to incrementally fetch data to S3 (on an AWS EMR cluster with EMRFS consistent view):

sqoop job --create incre_reservation -- import \
  --connect "jdbc:postgresql://rds-replica-hmssync.XXX.rds.amazonaws.com/hms" \
  --username XXX --password XXX \
  --table reservationbooking \
  --incremental lastmodified \
  --check-column modified_at \
  --target-dir "s3://platform-poc/sqoop/reservation/incre"
The error I get says that the FS should be HDFS and not S3. I came up with an *alternate* approach: "delta fetch" the data to HDFS and then run the Sqoop merge command (a rough sketch of this two-step flow is below, after the quoted thread). I wanted to check whether the "hop" through HDFS can be avoided and the merge could happen directly on S3.

I have another question, unrelated to the above:
-> Is there a way I can use wildcards to exclude tables (without specifying the exact table names) while importing all the tables?

Thanks for your time!

Wishes,
Sneh
8884383482

On Fri, Feb 3, 2017 at 5:24 PM, Erzsebet Szilagyi <[email protected]> wrote:

> Hi Sneh,
> Could you give us a sample command that you are trying to run?
> Thanks,
> Liz
>
> On Thu, Jan 19, 2017 at 1:36 PM, Sneh <[email protected]> wrote:
>
>> Dear Sqoop users,
>>
>> I've spawned an EMR cluster with Sqoop 1.4.6 and am trying to
>> "incrementally fetch" data from RDS to S3.
>> The error I get is that the FS should be HDFS and not S3.
>>
>> My EMR cluster is enabled for EMRFS consistent view.
>> I am trying to build a pipeline from RDS to S3. I need direction on how
>> to proceed when the incremental Sqoop job is unable to write to S3.
>>
>> Please help!
>>
>> Wishes,
>> Sneh
>> 8884383482
>
> --
> Erzsebet Szilagyi
> Software Engineer
> www.cloudera.com
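For reference, here is a rough sketch of the two-step flow I described above. The HDFS paths, the --last-value timestamp, the merge key id, and the jar/class names (the record class generated by the original import) are placeholders, not my real values:

# 1) "Delta fetch": incremental import into a staging directory on HDFS
sqoop import \
  --connect "jdbc:postgresql://rds-replica-hmssync.XXX.rds.amazonaws.com/hms" \
  --username XXX --password XXX \
  --table reservationbooking \
  --incremental lastmodified \
  --check-column modified_at \
  --last-value "2017-02-01 00:00:00" \
  --target-dir hdfs:///staging/reservationbooking_delta

# 2) Merge the delta into the previously imported base dataset on HDFS
sqoop merge \
  --new-data hdfs:///staging/reservationbooking_delta \
  --onto hdfs:///warehouse/reservationbooking \
  --target-dir hdfs:///warehouse/reservationbooking_merged \
  --jar-file reservationbooking.jar \
  --class-name reservationbooking \
  --merge-key id

# 3) Copy the merged result out to S3 (s3-dist-cp on EMR, or hadoop distcp)
s3-dist-cp \
  --src hdfs:///warehouse/reservationbooking_merged \
  --dest s3://platform-poc/sqoop/reservation/incre

As far as I can tell, it is the incremental import in step 1 that refuses an s3:// --target-dir, which is what forces the extra hop through HDFS and the copy step at the end.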
