Affy Informatics offers Hadoop training aimed at improving students' job prospects. The institute's experienced trainers cover the course in depth, helping students build the skills needed for a successful position in the IT industry. Hadoop is an open-source, Java-based programming framework that supports the processing and storage of extremely large data sets in a distributed computing environment.
Hadoop Training Syllabus
- What is big data?
- Big data challenges
- Big data processing and storage problems
- Platforms for big data processing and storage
- What is Hadoop?
- Benefits of Hadoop
- Hadoop core components: HDFS and MapReduce (MR)
- Overview of Hadoop EcoSystem
- Vendor comparison (Cloudera, Hortonworks, MapR)
HDFS (Hadoop Distributed File System)
- What is HDFS?
- Why do we need HDFS?
- HDFS Architecture
- Concepts of HDFS: Block, NameNode, DataNode, Secondary NameNode
- Replication and Rack-awareness
- HDFS federation
- High availability
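To make the block and replication topics above concrete, here is a minimal arithmetic sketch in Python. It assumes a 128 MB block size and a replication factor of 3, which are common defaults in recent Hadoop releases but are configurable per cluster. The function name `hdfs_footprint` is hypothetical and is not part of any Hadoop API.

```python
import math

BLOCK_SIZE_MB = 128  # assumption: common default block size, configurable per cluster
REPLICATION = 3      # assumption: common default replication factor


def hdfs_footprint(file_size_mb, block_size_mb=BLOCK_SIZE_MB, replication=REPLICATION):
    """Return (block count, size of the last block in MB, raw MB stored cluster-wide)."""
    if file_size_mb <= 0:
        return 0, 0, 0
    # A file is cut into fixed-size blocks; only the last one may be smaller.
    blocks = math.ceil(file_size_mb / block_size_mb)
    last_block_mb = file_size_mb - (blocks - 1) * block_size_mb
    # Every block is copied `replication` times, so raw usage scales linearly.
    raw_storage_mb = file_size_mb * replication
    return blocks, last_block_mb, raw_storage_mb


print(hdfs_footprint(1000))  # (8, 104, 3000)
```

Under these assumptions, a 1000 MB file occupies eight blocks (seven full blocks and one of 104 MB) and consumes 3000 MB of raw disk across the DataNodes.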
MapReduce
- What is MapReduce?
- MapReduce Architecture
- JobTracker and TaskTracker setup
- Steps to develop MapReduce Jobs
- Internal execution of MapReduce Jobs
- Shuffle, Sort & Partitioning
- Speculative Execution
- Input/Output formats
- Writing and debugging MR programs in Java
- MapReduce API
- Combiner in MapReduce
- Partitioner in MapReduce
- Compression techniques in MapReduce
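The map, combine, partition, shuffle/sort, and reduce steps listed above can be sketched as a small in-memory word count. This is a plain-Python simulation for illustration only, not the Hadoop Java API: every function name is hypothetical, and the CRC32-based partitioner is an assumed stand-in for Hadoop's hash-based default.

```python
import zlib
from collections import defaultdict


def map_phase(line):
    """Mapper: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def combine(pairs):
    """Combiner: pre-sum counts on the map side to reduce shuffle traffic."""
    local = defaultdict(int)
    for key, value in pairs:
        local[key] += value
    return list(local.items())


def partition(key, num_reducers):
    """Partitioner: deterministic key hash modulo the reducer count (assumed stand-in)."""
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def reduce_phase(key, values):
    """Reducer: sum all partial counts that arrived for one key."""
    return key, sum(values)


def run_job(splits, num_reducers=2):
    # One bucket per simulated reducer, mapping key -> list of partial counts.
    buckets = [defaultdict(list) for _ in range(num_reducers)]
    for split in splits:  # each split is handled by one simulated map task
        mapped = [pair for line in split for pair in map_phase(line)]
        for key, value in combine(mapped):
            # Shuffle: route each key to the reducer chosen by the partitioner.
            buckets[partition(key, num_reducers)][key].append(value)
    result = {}
    for bucket in buckets:
        for key in sorted(bucket):  # sort: each reducer sees its keys in order
            word, total = reduce_phase(key, bucket[key])
            result[word] = total
    return result


splits = [["big data big cluster"], ["data node", "big data"]]
counts = run_job(splits)
print(counts["big"], counts["data"], counts["cluster"], counts["node"])  # 3 3 1 1
```

Because the combiner only pre-aggregates values that the reducer would sum anyway, the totals are the same for any number of reducers. Only the grouping of keys per reducer changes.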
Pig
- What is Pig? Why do we need Pig?
- Pig installation and running Pig
- Pig Latin scripts
- Data types
- Pig interaction via the Grunt shell (local and Hadoop mode)
- Different modes of execution
- Relational operators in Pig
  - COGROUP, CROSS, DISTINCT, FILTER, FOREACH, GROUP, JOIN (inner), JOIN (outer), LIMIT, LOAD, ORDER, SAMPLE, SPLIT, STORE, UNION
- Diagnostic operators in Pig
  - DESCRIBE, DUMP, EXPLAIN, ILLUSTRATE
- Eval functions in Pig
  - AVG, CONCAT, COUNT, DIFF, IsEmpty, MAX, MIN, SIZE, SUM, TOKENIZE
- MapReduce vs. Pig
- Pig user-defined functions (UDFs)