Hi Liz,

I tried running the following command (create a job and then exec it) to incrementally fetch data to S3 (on an AWS EMR cluster with EMRFS consistent view):

sqoop job --create incre_reservation -- import \
  --connect "jdbc:postgresql://rds-replica-hmssync.XXX.rds.amazonaws.com/hms" \
  --username XXX --password XXX \
  --table reservationbooking \
  --incremental lastmodified \
  --check-column modified_at \
  --target-dir "s3://platform-poc/sqoop/reservation/incre"
The error I get says that the FS should be HDFS and not S3. I came up with an *alternate* approach: "delta fetch" the data to HDFS and then run the Sqoop merge command (a rough sketch of this two-step flow is below, after the quoted thread). I wanted to check whether the "hop" through HDFS can be avoided and the merge could happen directly on S3.

I have another question, unrelated to the above:
-> Is there a way I can use wildcards to exclude tables (without specifying the exact table names) while importing all the tables?

Thanks for your time!

Wishes,
Sneh
8884383482

On Fri, Feb 3, 2017 at 5:24 PM, Erzsebet Szilagyi <[email protected]> wrote:

> Hi Sneh,
> Could you give us a sample command that you are trying to run?
> Thanks,
> Liz
>
> On Thu, Jan 19, 2017 at 1:36 PM, Sneh <[email protected]> wrote:
>
>> Dear Sqoop users,
>>
>> I've spawned an EMR cluster with Sqoop 1.4.6 and am trying to
>> "incrementally fetch" data from RDS to S3.
>> The error I get is that the FS should be HDFS and not S3.
>>
>> My EMR cluster is enabled for EMRFS consistent view.
>> I am trying to build a pipeline from RDS to S3. I need direction on how
>> to proceed when the incremental Sqoop job is unable to write to S3.
>>
>> Please help!
>>
>> Wishes,
>> Sneh
>> 8884383482
>
> --
> Erzsebet Szilagyi
> Software Engineer
> www.cloudera.com
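For reference, here is a rough sketch of the two-step flow I described above. The HDFS paths, the --last-value timestamp, the merge key id, and the jar/class names (the record class generated by the original import) are placeholders, not my real values:

# 1) "Delta fetch": incremental import into a staging directory on HDFS
sqoop import \
  --connect "jdbc:postgresql://rds-replica-hmssync.XXX.rds.amazonaws.com/hms" \
  --username XXX --password XXX \
  --table reservationbooking \
  --incremental lastmodified \
  --check-column modified_at \
  --last-value "2017-02-01 00:00:00" \
  --target-dir hdfs:///staging/reservationbooking_delta

# 2) Merge the delta into the previously imported base dataset on HDFS
sqoop merge \
  --new-data hdfs:///staging/reservationbooking_delta \
  --onto hdfs:///warehouse/reservationbooking \
  --target-dir hdfs:///warehouse/reservationbooking_merged \
  --jar-file reservationbooking.jar \
  --class-name reservationbooking \
  --merge-key id

# 3) Copy the merged result out to S3 (s3-dist-cp on EMR, or hadoop distcp)
s3-dist-cp \
  --src hdfs:///warehouse/reservationbooking_merged \
  --dest s3://platform-poc/sqoop/reservation/incre

As far as I can tell, it is the incremental import in step 1 that refuses an s3:// --target-dir, which is what forces the extra hop through HDFS and the copy step at the end.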
