I got the solution. I explicitly committed the changes in the Oracle database. After that, Sqoop identified the updates and wrote them to the filesystem location. I then used the sqoop merge command, which replaced the old file with the new one and removed the duplicates based on the primary key.
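For reference, the merge step described above might look roughly like the sketch below. It is only an illustration under assumptions: the three directory paths, the jar file, and the class name are hypothetical placeholders (the jar and class are the record class generated by the earlier import or by sqoop codegen), and col1 is taken as the merge key because it is the key column used in the import commands quoted further down. By default the script only prints the command; set SQOOP_RUN=1 to actually run it.

```shell
#!/bin/sh
# Sketch of a "sqoop merge" call. All paths and names below are
# hypothetical placeholders, not values from this thread.
: "${OLD_DIR:=/data/table1/base}"       # output of the first import
: "${NEW_DIR:=/data/table1/delta}"      # output of the incremental import
: "${MERGED_DIR:=/data/table1/merged}"  # where the merged result is written
: "${JAR_FILE:=table1.jar}"             # jar generated by the import / codegen
: "${CLASS_NAME:=table1}"               # record class inside that jar
: "${MERGE_KEY:=col1}"                  # key column used to match rows

sqoop_merge() {
  # Rows in --new-data take precedence over rows in --onto
  # that have the same merge key value.
  set -- sqoop merge \
    --new-data "$NEW_DIR" \
    --onto "$OLD_DIR" \
    --target-dir "$MERGED_DIR" \
    --jar-file "$JAR_FILE" \
    --class-name "$CLASS_NAME" \
    --merge-key "$MERGE_KEY"
  if [ "${SQOOP_RUN:-0}" = "1" ]; then
    "$@"          # really invoke sqoop
  else
    echo "$*"     # dry run: just show the command
  fi
}

sqoop_merge
```

Printing the command first makes it easy to check the directories before anything is written to the target location.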
But one question: the --split-by parameter in Sqoop accepts only one key column. What if my primary key is a combination of columns? I get an error when I give multiple fields in the --split-by parameter. Please clarify.

On Thu, Nov 14, 2013 at 1:13 AM, Anas Mosaad <[email protected]> wrote:

> Hi all,
>
> I'm not experienced with Sqoop but I'm trying to help. Is it possible to
> see the SQL statements executed by Sqoop? I believe if the statements are
> debugged anywhere, Blues will be able to pinpoint the issue.
>
> Best Regards
> Anas Mosaad
>
> From: burberry blues <[email protected]>
> To: [email protected],
> Date: 11/13/2013 07:02 PM
> Subject: Re: Sqoop not picking up immediate changes
> ------------------------------
>
> Hi Jarek,
> Initially in the DB I have
>
> col1  col2  col3
> 1     a     08-NOV-2013
> 2     b     08-NOV-2013
> 3     c     08-NOV-2013
>
> First time sqoop import command
> ===========================
> sqoop import --connect
> jdbc:oracle:thin:@//url:driver/database --username <username>
> --password <password> --table table1 --columns col1,col2,col3
> --incremental lastmodified --check-column col3 --last-value "2013-11-07
> 00.00.00.0" --split-by col1 --target-dir <outputdir>
>
> When I ran the above sqoop import I was able to successfully get all 3
> records.
>
> Now I made 2 updates in the DB
>
> col1  col2  col3
> 1     d     10-NOV-2013
> 2     e     10-NOV-2013
> 3     c     08-NOV-2013
>
> Second time Sqoop Command
> ========================
> I read that sqoop is currently unable to merge the records of updates, so
> I am trying to get the updates in a new directory and then use "sqoop
> merge" to merge this new one and the previous import output.
>
> So the command I ran is
>
> sqoop import --connect
> jdbc:oracle:thin:@//url:driver/database --username <username>
> --password <password> --table table1 --columns col1,col2,col3
> --incremental lastmodified --check-column col3 --last-value "2013-11-09
> 00.00.00.0" --split-by col1 --target-dir <outputdir1>
>
> This time, according to the updates, I should get the records with col1
> values 1 and 2, as they were updated.
> But the second sqoop import returns zero records in the output. (Even
> during the job execution it reports map input records or reduce output
> records as 0.)
>
> The changes are happening in the DB (I checked them by running a
> select * query in the DB), so why can't sqoop find them? It seems like
> sqoop didn't find any updates from 9th Nov. Please assist me with this
> issue.
>
> Thanks,
> Blues.
>
> On Wed, Nov 13, 2013 at 8:32 AM, Jarek Jarcec Cecho
> <[email protected]> wrote:
> Hi Blues,
> would you mind sharing details about your use case? Table schemas, exact
> commands (both on database and in command line) and associated logs?
>
> Wild guess - when you are changing the rows in the database, are you
> committing the ongoing transaction? Sqoop will create a new connection with
> new transaction, so due to ACID it won't pick up any uncommitted changes.
>
> Jarcec
>
> On Tue, Nov 12, 2013 at 10:36:10PM -0800, burberry blues wrote:
> > Hi Team,
> >
> > I am having a problem with the following scenario.
> >
> > In the DB I update column1 of a row and column2 gets modified with the
> > current timestamp.
> > But when I try to import those changes through sqoop using --incremental
> > lastmodified --check-column column2 --last-value <less than current
> > date>, it shows 0 records imported for the changed rows.
> >
> > There are changes in the DB but sqoop works as if it couldn't find the
> > updated ones and is still pointing to the old records.
> >
> > i.e. before updating I have 3 records with date as 10th Nov. I asked
> > sqoop to import records after 9th Nov. It imports all 3 records.
> > Now I change 1 row and the date is updated to 12 Nov. Immediately I ask
> > sqoop to import records after 11th Nov, but it imports 0 records now.
> > If I run the same import with date as 9th Nov again it works fine and
> > also *gives me duplicate records*.
> >
> > Please help me with this issue at the earliest.
> >
> > Thanks,
> > Blues
