Shweta, if you need responses within a certain timeframe, you may want to 
investigate commercial vendor support. The Apache NiFi open source community 
tries to answer whatever questions it can, but makes no guarantees about 
accuracy, availability, or response time. 


Andy LoPresto
alopre...@apache.org
alopresto.apa...@gmail.com
He/Him
PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69

> On Jun 29, 2020, at 4:24 AM, shweta soni <sssmb...@gmail.com> wrote:
> 
> Hello Team,
> 
> Could you please help me with the issues mentioned in the mail below?
> Thank you in advance.
> 
> Thanks & Regards
> Shweta Soni
> 
> On Thu, Jun 25, 2020 at 11:36 PM shweta soni <sssmb...@gmail.com> wrote:
> 
>> Hello Team,
>> 
>> We are using NiFi in our data ingestion process. The version details are: *NiFi
>> 1.11.4, Cloudera Enterprise 5.16.2, and Hive 1.1*. I posted my issues on the
>> NiFi Slack channel but did not get answers to some of my questions, so I am
>> posting all my queries here in the hope of getting solutions or workarounds.
>> We are facing the issues below:
>> 
>> 1. *SCENARIO*: Our RDBMS source has Date/Timestamp columns, and the Hive
>> destination has Date/Timestamp columns as well, but when we ingest from
>> source to destination we get "LongWritable/IntWritable cannot be cast to
>> Timestamp/Date" errors in Hue. We are using the following processors:
>> QueryDatabaseTable -> UpdateRecord (column mapping and output schema) ->
>> PutHDFS -> ReplaceText -> PutHiveQL. Below is the Avro output schema; since
>> Avro has no native Date or Timestamp datatype, we use logical types:
>> 
>> {"name":"dob","type":["null",{"type":"long","logicalType":"timestamp-millis"}]}
>> 
>> {"name":"doA","type":["null",{"type":"int","logicalType":"date"}]}
>> 
>> 
>> *Q. Please let me know how we can load date/timestamp source columns into
>> date/timestamp destination columns.*
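>> 
>> For anyone looking at the same problem, here is a small sketch (outside
>> NiFi, using the fastavro library) that I use to check whether the
>> logical-type annotations actually survive into the Avro files; the file
>> name is hypothetical:
>> 
>>     import fastavro
>> 
>>     # Hypothetical local copy of one Avro file produced by the flow
>>     # (e.g. pulled down with hdfs dfs -get).
>>     with open("part-0001.avro", "rb") as fo:
>>         avro_reader = fastavro.reader(fo)
>>         # The embedded writer schema should still carry the logicalType
>>         # annotations; without them Hive only sees plain long/int.
>>         print(avro_reader.writer_schema)
>>         for record in avro_reader:
>>             # fastavro decodes logical types, so "dob" should print as
>>             # a datetime and "doA" as a date if the annotations are intact.
>>             print(record["dob"], record["doA"])
>>             break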
>> 
>> 2. *SCENARIO*: Decimal data is not being inserted into the ORC table.
>> 
>> Workaround: I load the data into an Avro table first and then do an INSERT
>> INTO the ORC table from it. I found this workaround in the Cloudera
>> community.
>> 
>> 
>> *Q. Is there any other solution for loading decimal data into an ORC table?*
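>> 
>> For reference, the staging step looks roughly like the sketch below when
>> run from Python with PyHive; the host, table, and column names are made up
>> for illustration:
>> 
>>     from pyhive import hive
>> 
>>     # Hypothetical HiveServer2 host and tables; this only illustrates
>>     # the Avro -> ORC staging workaround described above.
>>     conn = hive.connect(host="hive-server2.example.com", port=10000)
>>     cur = conn.cursor()
>>     # The NiFi flow lands rows in the Avro staging table; an explicit
>>     # cast then moves them into the ORC table with the decimal intact.
>>     cur.execute("""
>>         INSERT INTO TABLE sales_orc
>>         SELECT id, CAST(amount AS DECIMAL(18,2)) AS amount
>>         FROM sales_avro_staging
>>     """)
>>     cur.close()
>>     conn.close()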
>> 
>> 3. *SCENARIO*: We have a one-time full-load flow in NiFi:
>> QueryDatabaseTable -> PutHiveQL -> LogAttribute. This acts as a pipeline in
>> our custom UI and runs only once. In the NiFi UI we can manually start the
>> processors, and once all the flowfiles are processed and the success queue
>> of PutHiveQL is empty, we can stop the processors. But we want to know
>> programmatically that this flow ended at a particular time, so that we can
>> show the pipeline status as completed in our custom UI. How can we achieve
>> this?
>> 
>> *Q.* Since NiFi is designed for continuous data transfer, how can we know
>> that a particular flow has ended?
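>> 
>> What I have in mind is something like the sketch below, which polls the
>> NiFi REST API until the queue is empty. The base URL and connection UUID
>> are placeholders, and the exact JSON field names are my assumption from
>> reading the REST API docs:
>> 
>>     import time
>>     import requests
>> 
>>     NIFI_API = "http://nifi-host:8080/nifi-api"  # hypothetical host
>>     # UUID of the connection feeding PutHiveQL (hypothetical value)
>>     CONNECTION_ID = "0a1b2c3d-0000-1000-ffff-000000000000"
>> 
>>     while True:
>>         resp = requests.get(f"{NIFI_API}/flow/connections/{CONNECTION_ID}/status")
>>         resp.raise_for_status()
>>         snapshot = resp.json()["connectionStatus"]["aggregateSnapshot"]
>>         # flowFilesQueued == 0 should mean the queue has fully drained.
>>         if snapshot["flowFilesQueued"] == 0:
>>             break
>>         time.sleep(10)
>> 
>>     # Record the completion time and mark the pipeline as completed in
>>     # the custom UI (the processors could also be stopped via the API).
>>     print("flow drained at", time.strftime("%Y-%m-%d %H:%M:%S"))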
>> 
>> 4. *SCENARIO*: I have a Hive table with complex datatypes, i.e. Array and
>> Map. When I try to get this data via the SelectHiveQL processor, it gives
>> the output in String format for those columns. The next processor,
>> UpdateRecord, then gives an error that the String datatype cannot be
>> converted to Array or Map.
>> 
>> Avro Output Schema:
>> 
>> {"type": "array", "items": "double"}
>> 
>> {"type": "map", "values": "int"}
>> 
>> 
>> *Q. How do we handle complex datatypes in Hive via NiFi, with one Hive
>> table as the source and another Hive table as the destination?*
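>> 
>> In case it helps to show the shape of the problem: in my tests the
>> stringified values look JSON-like, so one workaround I am considering is
>> re-parsing them downstream (e.g. in a scripted processor) before writing
>> to the target table. A toy sketch with made-up sample values:
>> 
>>     import json
>> 
>>     # Hypothetical stringified values as SelectHiveQL returned them for
>>     # array<double> and map<string,int> columns in my tests.
>>     array_col = "[1.5,2.5,3.0]"
>>     map_col = '{"a":1,"b":2}'
>> 
>>     # If the strings are valid JSON they can be turned back into real
>>     # complex values before they reach the destination table.
>>     parsed_array = json.loads(array_col)  # -> [1.5, 2.5, 3.0]
>>     parsed_map = json.loads(map_col)      # -> {'a': 1, 'b': 2}
>>     print(parsed_array, parsed_map)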
>> 
>> 5. *SCENARIO*: The QueryDatabaseTable processor has a Maximum-value
>> Columns property which enables incremental loads, but there is no such
>> functionality for Hive tables (i.e. SelectHiveQL). I tried the
>> GenerateTableFetch and QueryDatabaseTable processors with the Hive1_1
>> connection service, but that does not work. On the NiFi Slack channel I
>> was told to raise a JIRA for a new GenerateHiveTableFetch/QueryHiveDatabase
>> processor.
>> 
>> *Q. Is there any alternative for handling Hive table incremental loads, or
>> should I go ahead and raise a JIRA for this?*
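>> 
>> The stopgap I am experimenting with looks like the sketch below: persist
>> the last maximum value myself and build the incremental query from it,
>> mimicking what Maximum-value Columns does. The table, column, and file
>> names are made up:
>> 
>>     import json
>>     from pyhive import hive
>> 
>>     # Hypothetical local state, playing the role of QueryDatabaseTable's
>>     # stored maximum value.
>>     STATE_FILE = "last_max.json"
>> 
>>     def load_last_max():
>>         try:
>>             with open(STATE_FILE) as f:
>>                 return json.load(f)["last_max"]
>>         except FileNotFoundError:
>>             return "1970-01-01 00:00:00"
>> 
>>     conn = hive.connect(host="hive-server2.example.com", port=10000)
>>     cur = conn.cursor()
>>     last_max = load_last_max()
>>     # Incremental fetch driven by a monotonically increasing column
>>     # (hypothetical name: updated_at). Real code should not build SQL
>>     # by string formatting from untrusted input.
>>     cur.execute(
>>         "SELECT id, amount, updated_at FROM source_table "
>>         f"WHERE updated_at > '{last_max}' ORDER BY updated_at"
>>     )
>>     rows = cur.fetchall()
>>     if rows:
>>         # updated_at is the last selected column in this sketch.
>>         with open(STATE_FILE, "w") as f:
>>             json.dump({"last_max": str(rows[-1][-1])}, f)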
>> 
>> Could you please help us with these? Thank you in advance.
>> 
>> Thanks & Regards,
>> Shweta Soni
>> 
