*1)*
*Job Title: Big Data & ETL, looking for Columbus, OH locals only*



*Location: Columbus, OH*
*Duration: Long Term*

*Responsibilities:*
• Participate in team activities, design discussions, stand-up meetings, and
planning reviews with the team.
• Perform data analysis, data profiling, data quality checks, and data
ingestion across the various layers using Hadoop/Hive/Impala queries,
PySpark programs, and UNIX shell scripts.
• Follow the organization's coding standards document; create mappings,
sessions, and workflows per the mapping specification document.
• Perform gap and impact analysis of ETL and IOP jobs for new requirements
and enhancements.
• Create jobs in Hadoop using Sqoop, PySpark, and StreamSets to meet
business user needs.
• Create mock-up data, perform unit testing, and capture result sets for
jobs developed in the lower environments.
• Update the production support run book and Control-M schedule document
with each production release.
• Create and update design documents, providing detailed descriptions of
workflows after every production release.
• Continuously monitor production data loads, fix issues, log them in the
tracker document, and identify performance issues.
• Tune long-running ETL/ELT jobs by creating partitions, enabling full
loads, and other standard approaches.
• Perform quality assurance checks and reconciliation after data loads, and
communicate with the vendor to obtain corrected data.
• Participate in ETL/ELT code reviews and design reusable frameworks.
• Create Remedy/ServiceNow tickets to fix production issues, and create
support requests to deploy database, Hadoop, Hive, Impala, UNIX, ETL/ELT,
and SAS code to the UAT environment.
• Create Remedy/ServiceNow tickets and/or incidents to trigger Control-M
jobs for FTP and ETL/ELT jobs on an ad hoc, daily, weekly, monthly, and
quarterly basis as needed.
• Model and create STAGE/ODS/data warehouse Hive and Impala tables as
needed.
• Create change requests, work plans, test results, and BCAB checklist
documents for code deployment to the production environment, and perform
code validation post-deployment.
• Work with the Hadoop, ETL, and SAS admin teams on code deployments and
health checks.
• Create reusable UNIX shell scripts for file archival, file validation,
and Hadoop workflow looping.
• Create a reusable Audit Balance Control framework to capture
reconciliation results and mapping parameters and variables, serving as a
single point of reference for workflows.
• Create PySpark programs to ingest historical and incremental data (see
the sketch after this list).
• Create Sqoop scripts to ingest historical data from the EDW Oracle
database into Hadoop IOP, along with Hive table and Impala view creation
scripts for dimension tables.
• Participate in meetings to continuously upgrade functional and technical
expertise.
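
For illustration, here is a minimal PySpark ingestion sketch along the
lines of the bullet above. It is a hedged example, not this team's actual
code: the paths, database, and table names (stage_db.customer,
/data/landing/...) are hypothetical placeholders.

    # Minimal PySpark ingestion sketch; all paths and table names are
    # hypothetical placeholders, not the team's actual objects.
    from pyspark.sql import SparkSession, functions as F

    spark = (SparkSession.builder
             .appName("stage_ingest")
             .enableHiveSupport()
             .getOrCreate())

    def ingest(src_path, target, mode):
        """Load a landing-zone extract into a partitioned Hive STAGE table.

        mode="overwrite" replays a full historical load;
        mode="append" adds one incremental daily slice.
        """
        df = (spark.read.parquet(src_path)
              .withColumn("load_dt", F.current_date()))  # partition column for daily runs
        (df.write
           .mode(mode)
           .partitionBy("load_dt")
           .format("parquet")
           .saveAsTable(target))

    # One-time historical load, then daily incremental appends.
    ingest("/data/landing/customer/history", "stage_db.customer", "overwrite")
    ingest("/data/landing/customer/2024-01-15", "stage_db.customer", "append")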

*REQUIRED Skill Sets:*

• 6+ years of experience with Big Data/Hadoop on data warehousing or data
integration projects.
• Analysis, design, development, support, and enhancement of ETL/ELT in a
data warehouse environment with Cloudera big data technologies (Hadoop,
MapReduce, Sqoop, PySpark, Spark, HDFS, Hive, Impala, StreamSets, Kudu,
Oozie, Hue, Kafka, YARN, Python, Flume, ZooKeeper, Sentry, Cloudera
Navigator), along with Oracle SQL/PL-SQL, UNIX commands, and shell
scripting.
• Strong development experience creating Sqoop scripts, PySpark programs,
HDFS commands, HDFS file formats (Parquet, Avro, ORC, etc.), StreamSets
pipelines, job schedules, Hive/Impala queries, and UNIX shell scripts.
• Experience writing Hadoop/Hive/Impala scripts to gather table statistics
after data loads (see the sketch after this list).
• Strong SQL experience (Oracle and Hadoop Hive/Impala).
• Experience writing complex SQL queries and tuning them based on
Hadoop/Hive/Impala explain plan results.
• Proven ability to write high-quality code.
• 6+ years of experience building data sets, and familiarity with PHI and
PII data.
• Expertise implementing complex ETL/ELT logic.
• Ability to develop and enforce a strong reconciliation process.
• Accountability for ETL/ELT design documentation.
• Good knowledge of Big Data, Hadoop, Hive, and Impala databases, data
security, and dimensional model design.
• Basic knowledge of UNIX/Linux shell scripting.
• Ability to apply ETL/ELT standards and practices toward establishing and
maintaining a centralized metadata repository.
• Good experience working with Visio, Excel, PowerPoint, Word, etc.
• Effective communication, presentation, and organizational skills.
• Familiarity with project management methodologies such as Waterfall and
Agile.
• Ability to establish priorities and follow through on projects, paying
close attention to detail with minimal supervision.
• Required education: BS/BA degree or a combination of education and
experience.
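
As a rough illustration of the stats-gathering scripts called out above,
here is a hedged PySpark sketch. The table name is a placeholder, and
Impala's COMPUTE STATS would normally be issued through impala-shell or
impyla rather than through Spark.

    # Hedged sketch: refresh Hive table and column statistics after a
    # load so subsequent explain plans reflect current data volumes.
    # Table name is a hypothetical placeholder; the FOR ALL COLUMNS
    # form requires Spark 3.0+.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.enableHiveSupport().getOrCreate()

    def gather_stats(table):
        spark.sql(f"ANALYZE TABLE {table} COMPUTE STATISTICS")
        spark.sql(f"ANALYZE TABLE {table} COMPUTE STATISTICS FOR ALL COLUMNS")

    gather_stats("stage_db.customer")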

*DESIRED Skill Sets:*

• Effective leadership, analytical, and problem-solving skills.
• Excellent written and oral communication skills with technical and
business teams.
• Ability to work independently as well as part of a team.
• Commitment to staying abreast of current technologies in the assigned IT
area.
• Ability to establish facts and draw valid conclusions.
• Ability to recognize patterns and opportunities for improvement
throughout the entire organization.
• Ability to discern critical from minor problems and innovate new
solutions.

*2)*




*Job Title: Informatica (Big Data), looking for Columbus, OH locals only*

*Location: Columbus, OH*
*Duration: Long Term*

*Responsibilities:*

• Participate in team activities, design discussions, stand-up meetings, and
planning reviews with the team.
• Perform data analysis, data profiling, data quality checks, and data
ingestion across the various layers using database queries, Informatica
PowerCenter, Informatica Analyst scorecards, PySpark programs, and UNIX
shell scripts.
• Follow the organization's coding standards document; create mappings,
sessions, and workflows per the mapping specification document.
• Perform gap and impact analysis of ETL and IOP jobs for new requirements
and enhancements.
• Create jobs in Informatica PowerCenter, Informatica Developer (IDQ), and
Hadoop using Sqoop, PySpark, and StreamSets to meet business user needs.
• Create mock-up data, perform unit testing, and capture result sets for
jobs developed in the lower environments.
• Update the production support run book and Control-M schedule document
with each production release.
• Create and update design documents, providing detailed descriptions of
workflows after every production release.
• Continuously monitor production data loads, fix issues, log them in the
tracker document, and identify performance issues.
• Tune long-running ETL jobs by creating partitions, enabling bulk loads,
increasing commit intervals, and other standard approaches.
• Perform quality assurance checks and reconciliation after data loads, and
communicate with the vendor to obtain corrected data.
• Participate in ETL code reviews and design reusable frameworks.
• Create Remedy incidents to fix production issues, and create support
requests to deploy database, UNIX, ETL, and SAS code to the UAT
environment.
• Create Remedy incidents to trigger Control-M jobs for FTP and ETL jobs on
an ad hoc, weekly, monthly, and quarterly basis as needed.
• Model and create STAGE/ODS/data warehouse dimension tables as needed.
• Create change requests, work plans, test results, and BCAB checklist
documents for code deployment to the production environment, and perform
code validation post-deployment.
• Work with the DBA, ETL, and SAS admin teams on code deployments and
health checks.
• Create reusable UNIX shell scripts for file archival, file validation,
and Informatica workflow looping.
• Create a reusable Audit Balance Control framework to capture
reconciliation results and mapping parameters and variables, serving as a
single point of reference for workflows.
• Create PySpark programs to ingest historical and incremental data.
• Create Sqoop scripts to ingest historical data from the EDW Oracle
database into Hadoop IOP, along with Hive table and Impala view creation
scripts for dimension tables.
• Write database stored procedures to gather table statistics after data
loads and to enable and disable constraints and indexes (see the sketch
after this list).
• Write complex SQL queries and perform tuning based on explain plan
results.
• Extract unstructured and semi-structured data using the Data Processor
transformation in IDQ.
• Participate in meetings to continuously upgrade functional and technical
expertise.
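
The stored-procedure bullet above would normally be written in PL/SQL; to
keep one language for the examples here, this hedged sketch drives the
equivalent post-load housekeeping from Python through the python-oracledb
client. The connection details, constraint, schema, and table names are
hypothetical placeholders.

    # Hedged sketch of post-load housekeeping: disable a constraint for
    # the bulk load, re-enable it, then refresh optimizer statistics.
    # Credentials, DSN, schema, and table names are placeholders.
    import oracledb

    with oracledb.connect(user="etl_user", password="change_me",
                          dsn="edw-host/EDWPDB") as conn:
        cur = conn.cursor()
        cur.execute("ALTER TABLE stage.customer "
                    "MODIFY CONSTRAINT customer_pk DISABLE")
        # ... bulk load runs here ...
        cur.execute("ALTER TABLE stage.customer "
                    "MODIFY CONSTRAINT customer_pk ENABLE")
        # Refresh statistics so subsequent explain plans are accurate.
        cur.callproc("DBMS_STATS.GATHER_TABLE_STATS",
                     keyword_parameters={"ownname": "STAGE",
                                         "tabname": "CUSTOMER"})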

*REQUIRED Skill Sets:*

• 8+ years of experience with Informatica PowerCenter on data warehousing
or data integration projects.
• Proven ability to write high-quality code.
• 7+ years of experience implementing complex ETL logic.
• 3+ years of experience developing and enforcing a strong reconciliation
process.
• Accountability for ETL design documentation.
• 5+ years of strong SQL experience (Oracle preferred).
• 5+ years of good knowledge of relational database, data vault, and
dimensional model design.
• 3+ years of basic UNIX/Linux shell scripting knowledge.
• Ability to apply ETL standards and practices toward establishing and
maintaining a centralized metadata repository.
• Computer literacy with Excel, PowerPoint, Word, etc.
• Effective communication, presentation, and organizational skills.
• Ability to establish priorities and follow through on projects, paying
close attention to detail with minimal supervision.
• Required education: BS/BA degree or a combination of education and
experience.
• Familiarity with project management methodologies such as Waterfall and
Agile.
• Willingness to perform other duties as assigned.
• Analysis, design, development, support, and enhancement of ETL/ELT in a
data warehouse environment with Cloudera big data technologies (Hadoop,
MapReduce, Sqoop, PySpark, Spark, HDFS, Hive, Impala, StreamSets, Kudu,
Oozie, Hue, Kafka, YARN, Python, Flume, ZooKeeper, Sentry, Cloudera
Navigator), along with Informatica, Oracle SQL/PL-SQL, UNIX commands, and
shell scripting.
• 2+ years of strong development experience creating Sqoop scripts, PySpark
programs, HDFS commands, HDFS file formats (Parquet, Avro, ORC, etc.),
StreamSets pipelines, job schedules, Hive/Impala queries, and UNIX shell
scripts (see the Sqoop sketch after this list).
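
As a hedged illustration of the Sqoop scripting named in the last bullet,
here is a Python wrapper that shells out to sqoop import. The JDBC URL,
credentials path, and table names are hypothetical placeholders.

    # Hedged sketch: drive a Sqoop import of a historical Oracle table
    # from Python. All connection details and names are placeholders.
    import subprocess

    sqoop_cmd = [
        "sqoop", "import",
        "--connect", "jdbc:oracle:thin:@edw-host:1521/EDWPDB",
        "--username", "etl_user",
        "--password-file", "/user/etl/.oracle_pwd",  # keep the secret off the command line
        "--table", "CUSTOMER_DIM",
        "--target-dir", "/data/landing/customer_dim",
        "--as-parquetfile",   # matches the Parquet HDFS format listed above
        "--num-mappers", "4",
        "--split-by", "CUSTOMER_ID",
    ]
    subprocess.run(sqoop_cmd, check=True)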


*DESIRED Skill Sets:*
• Effective leadership, analytical, and problem-solving skills.
• Excellent written and oral communication skills with technical and
business teams.
• Ability to work independently as well as part of a team.
• Commitment to staying abreast of current technologies in the assigned IT
area.
• Ability to establish facts and draw valid conclusions.
• Ability to recognize patterns and opportunities for improvement
throughout the entire organization.
• Ability to discern critical from minor problems and innovate new
solutions.

Thanks
Ramesh
