On Wed, May 20, 2009 at 10:52 AM, Aaron Kimball <aa...@cloudera.com> wrote:
> You said that you're concerned with the performance of DELETE, but I don't
> know a better way around this if all your input sources are forced to write
> to the same table. Ideally you could have a "current" table and a "frozen"
> table; writes always go to the current table and the import is done from the
> frozen table. Then you can DROP TABLE frozen relatively quickly post-import.
> At the time of the next import you change which table is current and which
> is frozen, and repeat. In MySQL you can create updateable views, so you
> might want to use a view as an indirection pointer to synchronously change
> all your writers from one underlying table to the other.

You can also do an atomic table swap in MySQL. I've used a pattern like this
before:

  CREATE TABLE current_staging LIKE current;
  DROP TABLE IF EXISTS old;
  RENAME TABLE current TO old, current_staging TO current;

If you're using MySQL 5.1 by any chance, you can also use table partitions
to very quickly select or drop portions of tables.

-Todd

> I'll put a shameless plug here -- I'm developing a tool called sqoop,
> designed to import from databases into HDFS; a patch is available at
> http://issues.apache.org/jira/browse/hadoop-5815. It doesn't currently have
> support for WHERE clauses, but it's on the roadmap. Please check it out and
> let me know what you think.
>
> Cheers,
> - Aaron
>
> On Wed, May 20, 2009 at 9:48 AM, dealmaker <vin...@gmail.com> wrote:
> >
> > No, my prime objective is not to back up the db. I am trying to move the
> > records from the mysql db to hadoop for processing. Hadoop itself doesn't
> > keep any records. After that, I will remove from the mysql db the same
> > records that were processed in hadoop. The main point isn't getting the
> > mysql records; the main point is removing from the mysql db the records
> > that hadoop has already processed.
> >
> > Edward J. Yoon-2 wrote:
> > >
> > > Oh, as I understand it, you want to delete and back up the old records
> > > to keep the DB at a steady size. If so, I guess you can do that
> > > continuously using WHERE and LIMIT clauses and reduce the I/O costs.
> > > Does it have to be dumped all at once?
> > >
> > > On Thu, May 21, 2009 at 12:48 AM, dealmaker <vin...@gmail.com> wrote:
> > >>
> > >> Other parts of the non-hadoop system will continue to add records to
> > >> the mysql db while I move records to hadoop for processing (and remove
> > >> those very same records from the mysql db at the same time). That's
> > >> why I am running those mysql commands.
> > >>
> > >> What are you suggesting? If I do it like you suggest and dump all the
> > >> records from the mysql db to a file in hdfs, how do I remove those
> > >> very same records from the mysql db at the same time? Just rename the
> > >> table first, then dump the records and read them from the hdfs file?
> > >>
> > >> Or should I do it my way? Which way is faster?
> > >> Thanks.
> > >>
> > >> Edward J. Yoon-2 wrote:
> > >>>
> > >>> Hadoop is a distributed filesystem. If you want to back up your
> > >>> table data to hdfs, you can use SELECT * INTO OUTFILE 'file_name'
> > >>> FROM tbl_name; and then put the file into the hadoop dfs.
> > >>>
> > >>> Edward
> > >>>
> > >>> On Thu, May 21, 2009 at 12:08 AM, dealmaker <vin...@gmail.com> wrote:
> > >>>>
> > >>>> No, actually I am using mysql, so it doesn't belong to Hive, I think.
> > >>>>
> > >>>> owen.omalley wrote:
> > >>>>>
> > >>>>> On May 19, 2009, at 11:48 PM, dealmaker wrote:
> > >>>>>
> > >>>>>> Hi,
> > >>>>>> I want to back up a table and then create a new empty one with the
> > >>>>>> following commands in Hadoop. How do I do it in Java? Thanks.
> > >>>>>
> > >>>>> Since this is a question about Hive, you should be asking on
> > >>>>> hive-u...@hadoop.apache.org.
> > >>>>>
> > >>>>> -- Owen
> > >
> > > --
> > > Best Regards, Edward J. Yoon @ NHN, corp.
> > > edwardy...@apache.org
> > > http://blog.udanax.org
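To make Aaron's "current"/"frozen" view indirection concrete, here is a
minimal sketch; the table, view, and column names are invented for
illustration:

  CREATE TABLE events_a (
    id      BIGINT AUTO_INCREMENT PRIMARY KEY,
    payload TEXT
  );
  CREATE TABLE events_b LIKE events_a;

  -- Writers only ever touch the view; a single-table SELECT * view is
  -- insertable and updateable in MySQL.
  CREATE VIEW events AS SELECT * FROM events_a;

  -- At import time, repoint the view so new writes land in the other table:
  ALTER VIEW events AS SELECT * FROM events_b;

  -- events_a is now the frozen table: import from it, then drop or
  -- truncate it before the next swap.

Since the view is the only name the writers know, repointing it switches
every writer at once.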
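A similarly rough sketch of the MySQL 5.1 partitioning approach Todd
mentions; the schema is hypothetical, and the point is that DROP PARTITION
discards a whole cycle of rows without a row-by-row DELETE:

  CREATE TABLE events (
    id      BIGINT NOT NULL,
    batch   INT NOT NULL,     -- which import cycle the row belongs to
    payload TEXT
  )
  PARTITION BY RANGE (batch) (
    PARTITION p0 VALUES LESS THAN (1),
    PARTITION p1 VALUES LESS THAN (2),
    PARTITION p2 VALUES LESS THAN (3)
  );

  -- Export one cycle's rows, then discard them almost instantly:
  SELECT * FROM events WHERE batch = 0;
  ALTER TABLE events DROP PARTITION p0;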
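Edward's dump-then-put flow, spelled out with illustrative file paths
(INTO OUTFILE writes a tab-separated file on the database host):

  -- Dump the table server-side to a flat file:
  SELECT * INTO OUTFILE '/tmp/tbl_name.txt' FROM tbl_name;

  -- Then, from a shell on the database host, copy it into HDFS:
  --   hadoop dfs -put /tmp/tbl_name.txt /user/hadoop/tbl_name.txt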
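And a sketch of the WHERE/LIMIT deletion Edward suggests, assuming the rows
carry a column you can filter on (the column name here is made up); deleting
in bounded batches keeps each statement's lock time and I/O short:

  -- Repeat until it affects zero rows:
  DELETE FROM tbl_name
   WHERE created_at < '2009-05-20 00:00:00'
   LIMIT 10000;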