Spark Developer, Columbus, Indiana


Skills: Spark, Hive



Profile Summary
• Leads projects for the design, development and maintenance of a data and
analytics platform. Effectively and efficiently processes, stores and makes
data available to analysts and other consumers. Works with key business
stakeholders, IT experts and subject-matter experts to plan, design and
deliver optimal analytics and data science solutions. Works on one or many
product teams at a time.
Key Responsibilities
• Designs and automates deployment of our distributed system for ingesting
and transforming data from various types of sources (relational,
event-based, unstructured).
• Designs and implements a framework to continuously monitor and
troubleshoot data quality and data integrity issues.
• Implements data governance processes and methods for managing metadata,
access to data, and data retention for internal and external users.
• Designs and provides guidance on building reliable, efficient, scalable
and quality data pipelines with monitoring and alert mechanisms that
combine a variety of sources using ETL/ELT tools or scripting languages.
• Designs and implements physical data models to define the database
structure. Optimizes database performance through efficient indexing and
table relationships.
• Participates in optimizing, testing, and troubleshooting of data
pipelines.
• Designs, develops and operates large-scale data storage and processing
solutions using different distributed and cloud-based platforms for storing
data (e.g. Data Lakes, Hadoop, HBase, Cassandra, MongoDB, Accumulo,
DynamoDB, others).
• Uses innovative and modern tools, techniques and architectures to
partially or completely automate the most-common, repeatable and tedious
data preparation and integration tasks in order to minimize manual and
error-prone processes and improve productivity. Assists with renovating the
data management infrastructure to drive automation in data integration and
management.
• Ensures the timeliness and success of critical analytics initiatives by
using agile development practices such as DevOps, Scrum and Kanban.
• Coaches and develops less experienced team members.
Experience
REQUIRED - Intermediate experience resulting in the following skills and
knowledge:
- Familiarity analyzing complex business systems, industry requirements,
and/or data regulations
- Background in processing and managing large data sets
- Design and development for a Big Data platform using open source and
third-party tools
- Spark, Scala/Java, MapReduce, Hive, HBase, and Kafka, or equivalent
college coursework
- SQL query language
- Clustered compute cloud-based implementation experience
- Experience developing applications that require large file movement in a
cloud-based environment, and with other data extraction tools and methods
for a variety of sources
- Experience in building analytical solutions
PREFERRED - Intermediate experience resulting in the following skills and
knowledge:
- Experience with IoT technology
- Experience in Agile software development



*Regards,*

*ANKIT MENDIRATTA*

*Lead Recruiter*

*Net2Source Inc.*

*Global HQ Address – 7250 Dallas Pkwy, Suite 825 Plano, Texas 75024*

*Office: (201) 340-8700 x 459 | Fax: (201) 221-8131 | Email:
anki...@net2source.com*

-- 
You received this message because you are subscribed to "rtc-linux".
Membership options at http://groups.google.com/group/rtc-linux .
Please read http://groups.google.com/group/rtc-linux/web/checklist
before submitting a driver.