Prabhu, I believe it's common practice to store exported data files in HDFS as CSV, TSV (tab-separated), or something similar with another type of delimiter. When you create a table in Hive or Impala you can specify the delimiter character, so no, the format doesn't have to be Avro or JSON. Here's an example:

    CREATE EXTERNAL TABLE webpages (pageid SMALLINT, name STRING, assoc_files STRING)
    ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t'
    LOCATION '/data/myfiles'
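If you'd rather issue that DDL from code than from the Hive shell, here's a minimal sketch against the HiveServer2 JDBC driver (the hive-jdbc dependency); the URL, credentials and paths below are placeholders, not anything specific to your cluster:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.Statement;

    public class CreateWebpagesTable {
        public static void main(String[] args) throws Exception {
            // HiveServer2 JDBC URL; host, port and database are placeholders
            String url = "jdbc:hive2://localhost:10000/default";
            try (Connection conn = DriverManager.getConnection(url, "hive", "");
                 Statement stmt = conn.createStatement()) {
                // Same DDL as above: an external table over tab-delimited files
                stmt.execute("CREATE EXTERNAL TABLE IF NOT EXISTS webpages "
                        + "(pageid SMALLINT, name STRING, assoc_files STRING) "
                        + "ROW FORMAT DELIMITED FIELDS TERMINATED BY '\\t' "
                        + "LOCATION '/data/myfiles'");
            }
        }
    }

Once the table exists, a plain SELECT over it will confirm the delimiter is being parsed the way you expect.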
However, you may want to think about how these files will be consumed, and about issues such as compression. Here's a link to one article <http://www.inquidia.com/news-and-info/hadoop-file-formats-its-not-just-csv-anymore> that ponders these issues. I think it's a great summary that also helps you decide which route to take.

- Dmitry

On Fri, Apr 1, 2016 at 2:53 AM, prabhu Mahendran <prabhuu161...@gmail.com> wrote:

> Is there any way to store the exact data in HDFS from databases (Oracle,
> MySQL, SQL Server) without converting the data into Avro or JSON?
>
> On Wed, Mar 30, 2016 at 2:51 PM, Simon Ball <sb...@hortonworks.com> wrote:
>
>> Are you planning to use something like Hive or Spark to query the data?
>> Both will work fine with Avro-formatted data under a table. I'm not sure
>> what you mean by "Table Structure", or whether you have a particular
>> format in mind, but I believe there is talk of adding processors that
>> will write directly to ORC format, i.e. convert the Avro data to ORC
>> within NiFi.
>>
>> Simon
>>
>> On 30 Mar 2016, at 07:06, prabhu Mahendran <prabhuu161...@gmail.com>
>> wrote:
>>
>> These are the reasons I chose a Sqoop-based NiFi processor as the best
>> method for moving data in table structure.
>>
>> Once a table is moved from Oracle or SQL Server into HDFS, the moved
>> data must stay in table format, not Avro, JSON, etc.
>>
>> For example: table data from Oracle is in tabular form, but using
>> ExecuteSQL to move that data into HDFS yields Avro or JSON, and I need
>> the data in table structure.
>>
>> I have also tried the QueryDatabaseTable processor in nifi-0.6.0. It
>> returns the table records in Avro format, but again I need them in
>> table structure.
>>
>> Could anyone please help me solve this?
>>
>> On Tue, Mar 29, 2016 at 3:02 PM, Simon Ball <sb...@hortonworks.com>
>> wrote:
>>
>>> Another processor that may be of interest to you is the
>>> QueryDatabaseTable processor, which has just been released in 0.6.0.
>>> This provides incremental load capabilities similar to Sqoop's.
>>>
>>> If you're looking for the schema-type functionality, bear in mind that
>>> ExecuteSQL (and the new query processor) preserve the schema with Avro.
>>>
>>> Sqoop also allows import to HBase, which you can do with PutHBaseJson
>>> (use the ConvertAvroToJson processor to feed it).
>>>
>>> Distributed partitioned queries aren't in there yet, but I believe they
>>> are on the way, so Sqoop may have the edge for that use case today.
>>>
>>> Granted, NiFi doesn't have much by way of HCatalog integration at the
>>> moment, but most of the functionality you'll find in Sqoop is in NiFi.
>>> Unless you are looking to move terabytes at a time, NiFi should be
>>> able to handle most of what you would use Sqoop for, so it would be
>>> very interesting to hear more detail on your use case, and why you
>>> needed Sqoop on top of NiFi.
>>>
>>> Simon
>>>
>>> On 29 Mar 2016, at 09:06, prabhu Mahendran <prabhuu161...@gmail.com>
>>> wrote:
>>>
>>> Hi,
>>>
>>> Yes, in my case I have created a custom processor with the Sqoop API,
>>> which accommodates the complete functionality of Sqoop.
>>> As you note, we can move the data only from HDFS to SQL or vice versa
>>> with the standard processors, but Sqoop has more functionality, which
>>> we can reach through Sqoop.runTool() in org.apache.sqoop.Sqoop. The
>>> Sqoop Java client works well on its own, but embedding that API in a
>>> new Sqoop NiFi processor doesn't work!
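>>> For what it's worth, the standalone invocation that does work is just
>>> a thin wrapper around the CLI arguments; a minimal sketch (the
>>> connection details and paths are illustrative, not from my actual
>>> setup):
>>>
>>>     import org.apache.sqoop.Sqoop;
>>>
>>>     public class SqoopImportExample {
>>>         public static void main(String[] args) {
>>>             // Same shape as the sqoop CLI: tool name first, then its
>>>             // options (all values below are placeholders)
>>>             String[] sqoopArgs = {
>>>                 "import",
>>>                 "--connect", "jdbc:mysql://dbhost:3306/sales",
>>>                 "--username", "etl",
>>>                 "--password", "secret",
>>>                 "--table", "ORDERS",
>>>                 "--target-dir", "/data/orders",
>>>                 "--fields-terminated-by", "\t"
>>>             };
>>>             // Returns the exit code the command-line tool would give
>>>             int exitCode = Sqoop.runTool(sqoopArgs);
>>>             System.out.println("Sqoop exit code: " + exitCode);
>>>         }
>>>     }
>>>
>>> Note --fields-terminated-by: that is what writes plain delimited text
>>> (the "table structure") into HDFS rather than Avro.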
>>> On Tue, Mar 29, 2016 at 12:49 PM, Conrad Crampton <
>>> conrad.cramp...@secdata.com> wrote:
>>>
>>>> Hi,
>>>> If you could explain exactly what you are trying to achieve, i.e.
>>>> which part of the data pipeline you are looking to use NiFi for and
>>>> where you wish to retain Sqoop, I could perhaps have a more informed
>>>> input (although I have only been using NiFi myself for a few weeks).
>>>> Sqoop obviously can move data from RDBMS systems through to HDFS (and
>>>> vice versa), as can NiFi, so I'm not sure why you would want the mix
>>>> (or at least I can't see it from the description you have provided
>>>> thus far).
>>>> I have limited knowledge of Sqoop, but either way, I am sure you
>>>> could 'drive' Sqoop from a custom NiFi processor if you so choose,
>>>> and you can 'drive' NiFi externally (using the REST API) - if Sqoop
>>>> can consume it.
>>>> Regards
>>>> Conrad
>>>>
>>>> From: prabhu Mahendran <prabhuu161...@gmail.com>
>>>> Reply-To: "users@nifi.apache.org" <users@nifi.apache.org>
>>>> Date: Tuesday, 29 March 2016 at 07:55
>>>> To: "users@nifi.apache.org" <users@nifi.apache.org>
>>>> Subject: Re: Sqoop Support in NIFI
>>>>
>>>> Hi Conrad,
>>>>
>>>> Thanks for the quick response.
>>>>
>>>> Yes, the combination of ExecuteSQL and PutHDFS works well instead of
>>>> Sqoop. But is it possible to use the Sqoop client to do this?
>>>>
>>>> Prabhu Mahendran
>>>>
>>>> On Tue, Mar 29, 2016 at 12:04 PM, Conrad Crampton <
>>>> conrad.cramp...@secdata.com> wrote:
>>>>
>>>>> Hi,
>>>>> Why use Sqoop at all? Use a combination of ExecuteSQL [1] and
>>>>> PutHDFS [2]. I have just replaced the use of Flume with a
>>>>> combination of ListenSyslog and PutHDFS, which I guess is a similar
>>>>> architectural pattern.
>>>>> HTH
>>>>> Conrad
>>>>>
>>>>> [1] http://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.ExecuteSQL/index.html
>>>>> [2] http://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.hadoop.PutHDFS/index.html
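>>>>> (Incidentally, if the Avro that ExecuteSQL emits gets in your way
>>>>> downstream, unpacking it back into delimited rows is a few lines
>>>>> with the Avro Java API; a minimal sketch, with the input file name
>>>>> illustrative:)
>>>>>
>>>>>     import java.io.File;
>>>>>     import java.util.List;
>>>>>     import org.apache.avro.Schema;
>>>>>     import org.apache.avro.file.DataFileReader;
>>>>>     import org.apache.avro.generic.GenericDatumReader;
>>>>>     import org.apache.avro.generic.GenericRecord;
>>>>>
>>>>>     public class AvroToTsv {
>>>>>         public static void main(String[] args) throws Exception {
>>>>>             // Read the Avro file ExecuteSQL produced and print
>>>>>             // one tab-delimited line per record
>>>>>             try (DataFileReader<GenericRecord> reader =
>>>>>                     new DataFileReader<>(
>>>>>                         new File("executesql-output.avro"),
>>>>>                         new GenericDatumReader<>())) {
>>>>>                 List<Schema.Field> fields =
>>>>>                     reader.getSchema().getFields();
>>>>>                 for (GenericRecord record : reader) {
>>>>>                     StringBuilder row = new StringBuilder();
>>>>>                     for (int i = 0; i < fields.size(); i++) {
>>>>>                         if (i > 0) row.append('\t');
>>>>>                         row.append(record.get(i));
>>>>>                     }
>>>>>                     System.out.println(row);
>>>>>                 }
>>>>>             }
>>>>>         }
>>>>>     }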
>>>>>
>>>>> From: prabhu Mahendran <prabhuu161...@gmail.com>
>>>>> Reply-To: "users@nifi.apache.org" <users@nifi.apache.org>
>>>>> Date: Tuesday, 29 March 2016 at 07:27
>>>>> To: "users@nifi.apache.org" <users@nifi.apache.org>
>>>>> Subject: Sqoop Support in NIFI
>>>>>
>>>>> Hi,
>>>>>
>>>>> I am new to NiFi.
>>>>>
>>>>> I would like to know whether there is any support for Sqoop through
>>>>> NiFi processors, and how to handle the following case with Sqoop:
>>>>>
>>>>> Move data from Oracle, SQL Server, and MySQL into HDFS and vice
>>>>> versa.
>>>>>
>>>>> Thanks,
>>>>> Prabhu Mahendran