Prabhu,

I believe it's common practice to store exported data files in HDFS as CSV,
TSV (tab-separated), or a similar delimited format with some other delimiter.

When you create a table in Hive or Impala you can specify what the
delimiter character is.  So no, the format doesn't have to be Avro or JSON.

Here's an example:

CREATE EXTERNAL TABLE webpages
(pageid SMALLINT, name STRING, assoc_files STRING)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
LOCATION '/data/myfiles';
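
If you later need to read that table from a JVM application, here's a minimal,
untested sketch using Hive JDBC (it assumes HiveServer2 is listening on
localhost:10000 with no authentication, and that the hive-jdbc driver is on the
classpath; adjust host, port and database for your cluster):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class WebpagesQuery {
    public static void main(String[] args) throws Exception {
        // Explicit driver registration; newer hive-jdbc jars auto-register.
        Class.forName("org.apache.hive.jdbc.HiveDriver");
        // Assumed HiveServer2 location.
        String url = "jdbc:hive2://localhost:10000/default";
        try (Connection conn = DriverManager.getConnection(url);
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery(
                     "SELECT pageid, name, assoc_files FROM webpages LIMIT 10")) {
            while (rs.next()) {
                // Rows come back typed, even though the files are plain tab-delimited text.
                System.out.println(rs.getInt(1) + "\t" + rs.getString(2) + "\t" + rs.getString(3));
            }
        }
    }
}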

However, you may want to think about how these files will be consumed, and
about issues such as compression. Here's a link to one article
<http://www.inquidia.com/news-and-info/hadoop-file-formats-its-not-just-csv-anymore>
that discusses these trade-offs. I think it's a great summary that also helps
you decide which route to take.

- Dmitry


On Fri, Apr 1, 2016 at 2:53 AM, prabhu Mahendran <prabhuu161...@gmail.com>
wrote:

> Is there any way to store the exact data in HDFS from databases (Oracle,
> MySQL, SQL Server) without converting the data into Avro or JSON?
>
> On Wed, Mar 30, 2016 at 2:51 PM, Simon Ball <sb...@hortonworks.com> wrote:
>
>> Are you planning to use something like Hive or Spark to query the data?
>> Both will work fine with Avro-formatted data under a table. I’m not sure
>> what you mean by “Table Structure” or whether you have a particular format
>> in mind, but there is, I believe, talk of adding processors that will write
>> directly to ORC format, i.e. convert the Avro data to ORC within NiFi.
>>
>> Simon
>>
>> On 30 Mar 2016, at 07:06, prabhu Mahendran <prabhuu161...@gmail.com>
>> wrote:
>>
>> For the reasons below, I chose Sqoop in a NiFi processor as the best
>> method to move data in table structure.
>>
>>     Once a table is moved from Oracle or SQL Server into HDFS, the whole
>> moved data must be in table format, not in Avro, JSON, etc.
>>
>>     For example: table data from Oracle is in the form of a table
>> structure, and using ExecuteSQL to move that data into HDFS produces
>> Avro or JSON format, but I need that data in table structure.
>>
>> And I have tried the QueryDatabaseTable processor in nifi-0.6.0. It can
>> return the table records in Avro format, but I need that data in table
>> structure.
>>
>> So could anyone please help me solve this?
>>
>>
>>
>>
>>
>> On Tue, Mar 29, 2016 at 3:02 PM, Simon Ball <sb...@hortonworks.com>
>> wrote:
>>
>>> Another processor that may be of interest to you is the
>>> QueryDatabaseTable processor, which has just been released in 0.6.0. This
>>> provides incremental load capabilities similar to sqoop.
>>>
>>> If you’re looking for the schema-type functionality, bear in mind that
>>> ExecuteSQL (and the new QueryDatabaseTable processor) preserve the schema
>>> with Avro.
>>>
>>> Sqoop also allows import to HBase, which you can do with PutHBaseJson
>>> (use the ConvertAvroToJson processor to feed this).
>>>
>>> Distributed partitioned queries aren’t in there yet, but I believe they are
>>> on the way, so Sqoop may have the edge for that use case today.
>>>
>>> Granted, NiFi doesn’t have much by way of HCatalog integration at the
>>> moment, but most of the functionality you’ll find in Sqoop is in NiFi.
>>> Unless you are looking to move terabytes at a time, then NiFi should be
>>> able to handle most of what you would use sqoop for, so it would be very
>>> interesting to hear more detail on your use case, and why you needed sqoop
>>> on top of NiFi.
>>>
>>> Simon
>>>
>>>
>>> On 29 Mar 2016, at 09:06, prabhu Mahendran <prabhuu161...@gmail.com>
>>> wrote:
>>>
>>> Hi,
>>>
>>> Yes, in my case I have created a custom processor with the Sqoop API, which
>>> accommodates the complete functionality of Sqoop.
>>> As per your concern, we are only able to move the data from HDFS to SQL or
>>> vice versa, but Sqoop has more functionality, which we can achieve through
>>> Sqoop.runTool() in org.apache.sqoop.Sqoop. The Sqoop Java client works well
>>> on its own, but implementing that API inside a new Sqoop NiFi processor
>>> doesn't work!
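>>>
>>> For reference, here is a rough, untested sketch of what I mean by the Java
>>> client (the connection string, credentials, table name and target directory
>>> are just placeholders):
>>>
>>> import org.apache.sqoop.Sqoop;
>>>
>>> public class SqoopImportExample {
>>>     public static void main(String[] args) {
>>>         // Same arguments as the sqoop command line; all values are placeholders.
>>>         String[] sqoopArgs = {
>>>                 "import",
>>>                 "--connect", "jdbc:mysql://dbhost:3306/mydb",
>>>                 "--username", "user",
>>>                 "--password", "secret",
>>>                 "--table", "MY_TABLE",
>>>                 "--target-dir", "/data/myfiles",
>>>                 "--fields-terminated-by", "\t"
>>>         };
>>>         // runTool parses the arguments just like the CLI and returns its exit code.
>>>         int exitCode = Sqoop.runTool(sqoopArgs);
>>>         System.out.println("Sqoop finished with exit code " + exitCode);
>>>     }
>>> }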
>>>
>>> On Tue, Mar 29, 2016 at 12:49 PM, Conrad Crampton <
>>> conrad.cramp...@secdata.com> wrote:
>>>
>>>> Hi,
>>>> If you could explain exactly what you are trying to achieve, i.e. what
>>>> part of the data pipeline you are looking to use NiFi for and where you
>>>> wish to retain Sqoop, I could perhaps have a more informed input (although
>>>> I have only been using NiFi myself for a few weeks). Sqoop obviously can
>>>> move the data from RDBMS systems through to HDFS (and vice versa), as can
>>>> NiFi, so I'm not sure why you would want the mix (or at least I can’t see
>>>> it from the description you have provided thus far).
>>>> I have limited knowledge of Sqoop, but either way, I am sure you could
>>>> ‘drive’ Sqoop from a custom NiFi processor if you so choose, and you can
>>>> ‘drive’ NiFi externally (using the REST API) - if Sqoop can consume it.
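>>>>
>>>> As a very rough illustration of the first option (completely untested, and
>>>> the class name and Sqoop arguments below are just placeholders), a minimal
>>>> custom processor skeleton that drives Sqoop might look something like this:
>>>>
>>>> import java.util.Collections;
>>>> import java.util.Set;
>>>>
>>>> import org.apache.nifi.processor.AbstractProcessor;
>>>> import org.apache.nifi.processor.ProcessContext;
>>>> import org.apache.nifi.processor.ProcessSession;
>>>> import org.apache.nifi.processor.Relationship;
>>>> import org.apache.nifi.processor.exception.ProcessException;
>>>> import org.apache.sqoop.Sqoop;
>>>>
>>>> public class InvokeSqoopImport extends AbstractProcessor {
>>>>
>>>>     static final Relationship REL_SUCCESS = new Relationship.Builder()
>>>>             .name("success")
>>>>             .description("Sqoop import completed")
>>>>             .build();
>>>>
>>>>     @Override
>>>>     public Set<Relationship> getRelationships() {
>>>>         return Collections.singleton(REL_SUCCESS);
>>>>     }
>>>>
>>>>     @Override
>>>>     public void onTrigger(ProcessContext context, ProcessSession session)
>>>>             throws ProcessException {
>>>>         // In a real processor these would come from PropertyDescriptors;
>>>>         // the values here are placeholders only.
>>>>         String[] sqoopArgs = {
>>>>                 "import",
>>>>                 "--connect", "jdbc:oracle:thin:@dbhost:1521:orcl",
>>>>                 "--table", "MY_TABLE",
>>>>                 "--target-dir", "/data/myfiles"
>>>>         };
>>>>         int rc = Sqoop.runTool(sqoopArgs);
>>>>         if (rc != 0) {
>>>>             throw new ProcessException("Sqoop import failed, exit code " + rc);
>>>>         }
>>>>         // A real processor would also route a FlowFile to REL_SUCCESS here.
>>>>     }
>>>> }
>>>>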
>>>> Regards
>>>> Conrad
>>>>
>>>>
>>>> From: prabhu Mahendran <prabhuu161...@gmail.com>
>>>> Reply-To: "users@nifi.apache.org" <users@nifi.apache.org>
>>>> Date: Tuesday, 29 March 2016 at 07:55
>>>> To: "users@nifi.apache.org" <users@nifi.apache.org>
>>>> Subject: Re: Sqoop Support in NIFI
>>>>
>>>> Hi Conrad,
>>>>
>>>> Thanks for Quick Response.
>>>>
>>>> Yeah, the combination of ExecuteSQL and PutHDFS works well instead of
>>>> Sqoop. But is it possible to use the Sqoop client to do something like this?
>>>>
>>>> Prabhu Mahendran
>>>>
>>>> On Tue, Mar 29, 2016 at 12:04 PM, Conrad Crampton <
>>>> conrad.cramp...@secdata.com> wrote:
>>>>
>>>>> Hi,
>>>>> Why use Sqoop at all? Use a combination of ExecuteSQL [1] and PutHDFS
>>>>> [2].
>>>>> I have just replaced the use of Flume with a combination of ListenSyslog
>>>>> and PutHDFS, which I guess is a similar architectural pattern.
>>>>> HTH
>>>>> Conrad
>>>>>
>>>>>
>>>>>
>>>>> [1]
>>>>> http://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.standard.ExecuteSQL/index.html
>>>>>
>>>>> [2]
>>>>> http://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi.processors.hadoop.PutHDFS/index.html
>>>>>
>>>>> From: prabhu Mahendran <prabhuu161...@gmail.com>
>>>>> Reply-To: "users@nifi.apache.org" <users@nifi.apache.org>
>>>>> Date: Tuesday, 29 March 2016 at 07:27
>>>>> To: "users@nifi.apache.org" <users@nifi.apache.org>
>>>>> Subject: Sqoop Support in NIFI
>>>>>
>>>>> Hi,
>>>>>
>>>>> I am new to NiFi.
>>>>>
>>>>>        I would like to know whether there is any support for Sqoop with
>>>>> the help of NiFi processors.
>>>>>
>>>>> And in which way can the following be done with the help of Sqoop:
>>>>>
>>>>>     Move data from Oracle, SQL Server, MySQL into HDFS and vice versa.
>>>>>
>>>>>
>>>>> Thanks,
>>>>> Prabhu Mahendran
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>
