Hi,

I configured hadoop-1.0.3, hbase-0.92.1 and hive-0.10.0. I created a table in HBase, inserted records, and am processing the data using Hive. I have to show a graph with a few points (7 for seven days, or 12 for one year). A single day may contain anywhere from 1,000 to several lakh records, and I need to show the average of those records. Is there any built-in Hadoop mechanism to process these records fast?

Also, I need to run a Hive query (running a Hive query actually submits a job) every hour. Is there a scheduling mechanism in Hadoop to handle this?

Please reply.

Balamurali

On Tue, Jul 23, 2013 at 12:24 PM, Mohammad Tariq <donta...@gmail.com> wrote:

> Hello Sandeep,
>
> You don't have to convert the data in order to copy it into the HDFS. But
> you might have to think about the MR processing of these files because of
> the format of these files.
>
> You could probably make use of Sqoop <http://sqoop.apache.org/>.
>
> I also came across DMX-H a few days ago while browsing. I don't know
> anything about the licensing and how good it is. Just thought of sharing it
> with you. You can visit their
> page <http://www.syncsort.com/en/Data-Integration/Home> to see more. They also
> provide a VM (includes CDH) to get started quickly.
>
> Warm Regards,
> Tariq
> cloudfront.blogspot.com
>
>
> On Tue, Jul 23, 2013 at 11:54 AM, Sandeep Nemuri <nhsande...@gmail.com> wrote:
>
>> Hi,
>>
>> "How to copy datasets from Mainframe to HDFS directly? I know that we
>> can NDM files to Linux box and then we can use simple put command to copy
>> data to HDFS. But, how to copy data directly from mainframe to HDFS? I
>> have PS, PDS and VSAM datasets to copy to HDFS for analysis using
>> MapReduce.
>>
>> Also, Do we need to convert data from EBCDIC to ASCII before copy? "
>>
>> --
>> --Regards
>> Sandeep Nemuri