Securing Hadoop with Kerberos – GTBD6

Course Description

This course covers Kerberos concepts, components, installation, configuration, and troubleshooting. Realms are tested by Kerberizing NFS and SSH services. The core components of Hadoop (HDFS and MapReduce) are reviewed with emphasis on the security model, and a simple Hadoop cluster is installed and configured. The cluster is then integrated with the Kerberos realm and configured to run in secure mode.

Objectives:

  • Understand Kerberos operation.
  • Install and configure a Kerberos realm.
  • Secure local SSH and NFS services.
  • Understand core Hadoop services.
  • Install and configure a Hadoop cluster.
  • Configure a cluster to operate in secure mode.

Supported Distributions:

Red Hat Enterprise Linux 6

 

Duration

4 Days

 

Course Prerequisites

This course assumes familiarity with Linux and core system administration skills. Participants should be comfortable working from the shell, editing files, managing local services, and using SSH. Familiarity with Hadoop is beneficial but not essential.

 

Suggested Follow-on Courses

There are a number of options. Please contact us for further information.

 

Course Content

  1. Hadoop: The Big Picture
    1. Data Analysis
    2. Big Data
    3. Hadoop Core Architecture
    4. Hadoop Ecosystem
    5. Hadoop Ecosystem (cont.)
    6. Running Commands on Multiple Systems

    Lab Tasks

    1. Running Commands on Multiple Hosts
    2. Preparing to Install Hadoop
  2. HDFS
    1. Design Goals
    2. Design
    3. Blocks
    4. Block Replication
    5. Namenode Daemon
    6. Secondary Namenode Daemon
    7. Datanode Daemon
    8. Accessing HDFS
    9. Permissions and Users
    10. Adding and Removing Datanodes
    11. Balancing

    Lab Tasks

    1. Single Node HDFS
    2. Multi-node HDFS
    3. Files and HDFS
    4. Managing and Maintaining HDFS
  3. Kerberos Concepts and Components
    1. Common Security Problems
    2. Account Proliferation
    3. The Kerberos Solution
    4. Kerberos History
    5. Kerberos Implementations
    6. Kerberos Concepts
    7. Kerberos Principals
    8. Kerberos Safeguards
    9. Kerberos Components
    10. Authentication Process
    11. Identification Types
    12. Logging In
    13. Gaining Privileges
    14. Using Privileges
    15. Kerberos Components and the KDC
    16. Kerberized Services Review
    17. Kerberized Clients
    18. KDC Server Daemons
    19. Configuration Files
    20. Utilities Overview
  4. Implementing Kerberos
    1. Plan Topology and Implementation
    2. Kerberos 5 Client Software
    3. Kerberos 5 Server Software
    4. Synchronize Clocks
    5. Create Master KDC
    6. Configuring the Master KDC
    7. KDC Logging
    8. Kerberos Realm Defaults
    9. Specifying [realms]
    10. Specifying [domain_realm]
    11. Allow Administrative Access
    12. Create KDC Databases
    13. Create Administrators
    14. Install Keys for Services
    15. Start Services
    16. Add Host Principals
    17. Add Common Service Principals
    18. Configure Slave KDCs
    19. Create Principals for Slaves
    20. Define Slaves as KDCs
    21. Copy Configuration to Slaves
    22. Install Principals on Slaves
    23. Synchronization of Database
    24. Create Stash on Slaves
    25. Start Slave Daemons
    26. Client Configuration
    27. Install krb5.conf on Clients
    28. Client PAM Configuration
    29. Install Client Host Keys

    Lab Tasks

    1. Implementing Kerberos
  5. Administering and Using Kerberos
    1. Administrative Tasks
    2. Key Tables
    3. Managing Keytabs
    4. Managing Principals
    5. Viewing Principals
    6. Adding, Deleting, and Modifying Principals
    7. Principal Policy
    8. Overall Goals for Users
    9. Signing In to Kerberos
    10. Ticket Types
    11. Viewing Tickets
    12. Removing Tickets
    13. Passwords
    14. Changing Passwords
    15. Giving Others Access
    16. Using Kerberized Services
    17. Kerberized FTP
    18. Enabling Kerberized Services
    19. OpenSSH and Kerberos

    Lab Tasks

    1. Using Kerberized Clients
    2. Forwarding Kerberos Tickets
    3. OpenSSH with Kerberos
    4. Wireshark and Kerberos
  6. Securing the Local Filesystem
    1. NFS Properties
    2. NFS Export Option
    3. NFSv4 and GSSAPI Auth
    4. Implementing NFSv4
    5. Implementing Kerberos with NFS

    Lab Tasks

    1. Implementing NFSv4
  7. Kerberizing Apache
    1. httpd.conf – Server Settings
    2. HTTP User Authentication
    3. Authentication via Kerberos
    4. tcpdump and Wireshark

    Lab Tasks

    1. Enabling SSO in Apache with mod_auth_kerb
  8. Kerberizing HDFS
    1. Kerberizing HDFS
    2. Kerberizing HDFS (cont.)
    3. Kerberizing HDFS (cont.)

    Lab Tasks

    1. Kerberized HDFS
    2. Disable Kerberized HDFS
  9. MapReduce
    1. MapReduce
    2. Terminology and Data Flow
    3. MapReduce Daemons
    4. MapReduce Essential Configuration
    5. Failure and Recovery
    6. YARN

    Lab Tasks

    1. MapReduce
  10. Kerberizing MapReduce
    1. Kerberizing MapReduce
    2. Kerberizing MapReduce (cont.)
    3. Kerberizing MapReduce (cont.)

    Lab Tasks

    1. Re-Enable Kerberized HDFS
    2. Kerberized MapReduce v1

A. Installing Hadoop with Ambari Lab Tasks

  1. Installing Hadoop with Ambari
