MapReduce Job State: PREP over 8 hours, state no change

2016-08-05 Thread Ascot Moss
Hi,

I have submitted a MapReduce job and can find it in the job list; however, its
STATE has stayed at PREP for the last 8 hours. Any idea why it is stuck in
"PREP"?

regards



(mapred job -list)

  JobId State StartTime UserName   Queue
Priority UsedContainers RsvdContainers UsedMem RsvdMem NeededMem   AM info

 job_1470075140254_0003   PREP 1470402873895
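
As a quick sanity check, the StartTime column from `mapred job -list` is epoch milliseconds. A small sketch (plain Python, nothing cluster-specific) converts it to a UTC timestamp to confirm when the job entered PREP:

```python
from datetime import datetime, timezone

# StartTime from `mapred job -list` is in epoch milliseconds
start_ms = 1470402873895
started = datetime.fromtimestamp(start_ms / 1000, tz=timezone.utc)
print(started.isoformat())  # the job has been sitting in PREP since this instant
```

If the timestamp is roughly 8 hours old and the job still shows PREP with no UsedContainers, the ApplicationMaster has likely never been scheduled; checking queue capacity and available resources in the ResourceManager UI is a common next step.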


Re: Teradata into hadoop Migration

2016-08-05 Thread Arun Natva
Bhagaban,
The first step is to ingest the data into Hadoop using Sqoop.
Teradata has powerful connectors for Hadoop; the connectors are installed on
all data nodes, and imports then run using FastExport and similar utilities.
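
A minimal sketch of such an import using plain Sqoop with the Teradata JDBC driver (host, database, user, table, and column names below are placeholders, not from the thread):

```shell
# Hypothetical host/database/table names; flags are standard Sqoop 1.x options.
sqoop import \
  --connect jdbc:teradata://td-host/DATABASE=sales_db \
  --driver com.teradata.jdbc.TeraDriver \
  --username etl_user -P \
  --table ORDERS \
  --target-dir /data/raw/orders \
  --num-mappers 8 \
  --split-by order_id
```

The vendor connectors mentioned above would typically replace the generic JDBC path here and use FastExport under the hood for better throughput.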

The challenge will be recreating in Hadoop the same workflows you had in
Teradata.

Teradata is richer in features than Hive and Impala.

Data in Teradata is usually encrypted, so please make sure you have HDFS
encryption at rest enabled.

You can use Oozie to chain SQL scripts together to mimic your ETL jobs written
in DataStage, Informatica, or TD itself.
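
A sketch of what such an Oozie chain might look like as a workflow definition (action names, script names, and the two-step structure are hypothetical, shown only to illustrate the chaining):

```xml
<!-- Sketch only: action and script names are placeholders. -->
<workflow-app name="td-etl-mimic" xmlns="uri:oozie:workflow:0.4">
    <start to="stage-load"/>
    <action name="stage-load">
        <hive xmlns="uri:oozie:hive-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <script>stage_load.hql</script>
        </hive>
        <ok to="transform"/>
        <error to="fail"/>
    </action>
    <action name="transform">
        <hive xmlns="uri:oozie:hive-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <script>transform.hql</script>
        </hive>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>ETL step failed: ${wf:errorMessage(wf:lastErrorNode())}</message>
    </kill>
    <end name="end"/>
</workflow-app>
```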

Please note that TD may perform better than Hadoop, since its proprietary
hardware and software are highly efficient; Hadoop, however, can save you money.


Sent from my iPhone

> On Aug 5, 2016, at 12:02 PM, praveenesh kumar  wrote:
> 
> From a TD perspective, have a look at this - https://youtu.be/NTTQdAfZMJA. They 
> are planning to open-source it. Perhaps you can get in touch with the team. 
> Let me know if you are interested. If you have TD contacts, ask them about this; 
> they should be able to point you to the right people. 
> 
> Again, this is not a sales pitch. This tool looks like what you are looking for 
> and will be open source soon. Let me know if you want to get in touch with 
> the folks who are working on this. 
> 
> Regards
> Prav
> 
>> On Fri, Aug 5, 2016 at 4:29 PM, Wei-Chiu Chuang  wrote:
>> Hi,
>> 
>> I think Cloudera Navigator Optimizer is the tool you are looking for. It 
>> allows you to transform SQL queries (TD) into Impala and Hive.
>> http://blog.cloudera.com/blog/2015/11/introducing-cloudera-navigator-optimizer-for-optimal-sql-workload-efficiency-on-apache-hadoop/
>> Hope this doesn’t sound like a sales pitch. If you’re a Cloudera paid 
>> customer you should reach out to the account/support team for more 
>> information.
>> 
>> *disclaimer: I work for Cloudera
>> 
>> Wei-Chiu Chuang
>> A very happy Clouderan
>> 
>>> On Aug 4, 2016, at 10:50 PM, Rakesh Radhakrishnan  
>>> wrote:
>>> 
>>> Sorry, I don't have much insight into this beyond basic Sqoop. AFAIK, 
>>> it is more vendor-specific; you may need to dig further along that line.
>>> 
>>> Thanks,
>>> Rakesh
>>> 
 On Mon, Aug 1, 2016 at 11:38 PM, Bhagaban Khatai 
  wrote:
Thanks, Rakesh, for the useful information. We are using Sqoop for data 
transfer, but all the TD logic we are implementing through Hive. 
It's taking time, as we are implementing the same logic using the mapping 
provided by the TD team. 
 
What I want is some tool or ready-made framework so that the development 
effort would be less. 
 
 Thanks in advance for your help.
 
 Bhagaban 
 
> On Mon, Aug 1, 2016 at 6:07 PM, Rakesh Radhakrishnan  
> wrote:
> Hi Bhagaban,
> 
> Perhaps you can try "Apache Sqoop" to transfer data to Hadoop from 
> Teradata. Apache Sqoop provides an efficient approach for transferring 
> large data between Hadoop-related systems and structured data stores. It 
> allows support for a data store to be added as a so-called connector, and 
> it can connect to various databases, including Oracle.
> 
> I hope the links below will be helpful to you:
> http://sqoop.apache.org/
> http://blog.cloudera.com/blog/2012/01/cloudera-connector-for-teradata-1-0-0/
> http://hortonworks.com/blog/round-trip-data-enrichment-teradata-hadoop/
> http://dataconomy.com/wp-content/uploads/2014/06/Syncsort-A-123ApproachtoTeradataOffloadwithHadoop.pdf
> 
> Below are a few data ingestion tools; you can probably dig more into them:
> https://www.datatorrent.com/product/datatorrent-ingestion/
> https://www.datatorrent.com/dtingest-unified-streaming-batch-data-ingestion-hadoop/
> 
> Thanks,
> Rakesh
> 
>> On Mon, Aug 1, 2016 at 4:54 PM, Bhagaban Khatai 
>>  wrote:
>> Hi Guys-
>> 
>> I need quick help: has anybody done a migration project from TD into 
>> Hadoop?
>> We have a very tight deadline, and I am trying to find a tool (online or 
>> paid) for quick development.
>> 
>> Please help us here, and guide me if any other way is available to speed 
>> up the development.
>> 
>> Bhagaban
> 


Re: Teradata into hadoop Migration

2016-08-05 Thread praveenesh kumar
From a TD perspective, have a look at this - https://youtu.be/NTTQdAfZMJA. They
are planning to open-source it. Perhaps you can get in touch with the team.
Let me know if you are interested. If you have TD contacts, ask them about this;
they should be able to point you to the right people.

Again, this is not a sales pitch. This tool looks like what you are looking for
and will be open source soon. Let me know if you want to get in touch with
the folks who are working on this.

Regards
Prav



Re: Teradata into hadoop Migration

2016-08-05 Thread Wei-Chiu Chuang
Hi,

I think Cloudera Navigator Optimizer is the tool you are looking for. It allows 
you to transform SQL queries (TD) into Impala and Hive.
http://blog.cloudera.com/blog/2015/11/introducing-cloudera-navigator-optimizer-for-optimal-sql-workload-efficiency-on-apache-hadoop/

Hope this doesn’t sound like a sales pitch. If you’re a Cloudera paid customer 
you should reach out to the account/support team for more information.

*disclaimer: I work for Cloudera

Wei-Chiu Chuang
A very happy Clouderan
