Big Data - Data Lake Developer | Milpitas, CA | Contract: 1+ yr (renewed quarterly)
Interview: Phone and Skype. Candidates are required to take a 60-90 minute Java programming test with the prime vendor over Skype.

The Data Lake engineer should come from a senior Hadoop developer background. The position will be in support of data management, ingestion, and client consumption.

Requirements:
-Must be well versed in Big Data fundamentals such as HDFS and YARN.
-More than a working knowledge of Hive is required, including an understanding of partitioning, reducer sizing, block sizing, etc.
-Strong knowledge of Spark in either Python or Scala is preferred; basic Spark knowledge is required.
-Knowledge of other industry ETL tools (including NoSQL) such as Cassandra, Drill, Impala, etc. is a plus.
-Comfortable with Unix and standard enterprise environment tools/tech such as ftp, scp, ssh, Java, Python, SQL, etc.
-Our enterprise environment also contains a number of other utilities such as Tableau, Informatica, Pentaho, Mule, Talend, and others. Proficiency in these is a plus.

Must have: Java + Big Data (Hive, HBase).

Big Data - Data Lake Developer:
-Very strong in core Java implementation. All the client's Big Data applications are written in core Java. Must be able to code algorithms (e.g., sorting and searching) and reduce their Big O complexity in Java (O(n), O(n log n), O(n^2), etc.). The client will ask the candidate to implement code in core Java on a WebEx (audio, video, and screen sharing).
-Sqoop is used heavily: 90% of all data imports are done using Sqoop. Should know the various ways data can be imported, the parameters used, how to distribute jobs, how to optimize the parameters, etc.
-Very good understanding and implementation experience with Hive and HBase (NoSQL).
-Writing Bash scripts (shell scripting) and working in a Unix environment is mandatory: most Unix commands, grepping logs, writing Bash scripts and scheduling them, etc.
-Excellent in RDBMS SQL. The client has access to many data sources: Teradata, SQL Server, MySQL, Oracle, etc.
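To illustrate the kind of exercise the coding test above describes (implement sorting/searching and reason about Big O in core Java), here is a minimal sketch; the class and method names are made up for this example, not part of the client's codebase:

```java
import java.util.Arrays;

public class AlgoDrill {
    // Classic O(log n) binary search over a sorted int array;
    // returns the index of target, or -1 if it is absent.
    static int binarySearch(int[] a, int target) {
        int lo = 0, hi = a.length - 1;
        while (lo <= hi) {
            int mid = lo + (hi - lo) / 2; // avoids int overflow from (lo + hi) / 2
            if (a[mid] == target) return mid;
            if (a[mid] < target) lo = mid + 1;
            else hi = mid - 1;
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] data = {4, 1, 9, 7, 3};
        Arrays.sort(data); // O(n log n) library sort for primitives
        System.out.println(Arrays.toString(data)); // [1, 3, 4, 7, 9]
        System.out.println(binarySearch(data, 7)); // 3
        System.out.println(binarySearch(data, 5)); // -1
    }
}
```

A typical follow-up in such a screen is explaining why a linear scan is O(n) while the search above is O(log n), and why sorting first (O(n log n)) pays off when many lookups follow.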
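The posting also lists Java REST API implementation as a plus. A minimal sketch of a REST-style endpoint using only the JDK's built-in `com.sun.net.httpserver` package (the class name, route, and port are illustrative assumptions, not the client's actual API):

```java
import com.sun.net.httpserver.HttpServer;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;

public class HealthApi {
    // Starts an HTTP server exposing GET /health, which returns a JSON status body.
    static HttpServer start(int port) throws Exception {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/health", exchange -> {
            byte[] body = "{\"status\":\"ok\"}".getBytes(StandardCharsets.UTF_8);
            exchange.getResponseHeaders().set("Content-Type", "application/json");
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream os = exchange.getResponseBody()) {
                os.write(body);
            }
        });
        server.start();
        return server;
    }

    public static void main(String[] args) throws Exception {
        HttpServer server = start(8080);
        System.out.println("Serving GET /health on port 8080");
        server.stop(0);
    }
}
```

In practice a production service would use a framework such as Spring Boot or JAX-RS, but the JDK-only version keeps the example self-contained.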
-The candidate must be able to easily connect to these sources and run complex queries.
-Python and Kafka are a plus.
-Java REST API implementation is a plus.

*Garima Gupta | Technical Recruiter | Apetan Consulting LLC*
*Tel: 201-620-9700 x133 | gar...@apetan.com | garimaapetan...@gmail.com*