Hadoop in Practice: Environment Setup

Posted by admin on 2018-6-7 10:55. Category: System Architecture. Tags: Hadoop

Hadoop in Practice (1): Environment Setup

 

1. Required software
CentOS 7
SSH
Java 1.8.0_171
Hadoop 2.9.1

2. Install SSH
#yum install openssh-server openssh-clients

3. Install Java 1.8.0_171
3.0. Create the software installation directory
#mkdir -p /usr/local/soft/
#cd /usr/local/soft/
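3.1. Download
#wget -O jdk-8u171-linux-x64.tar.gz "http://download.oracle.com/otn-pub/java/jdk/8u171-b11/512cd62ec5174c3487ac17c61aaa89e8/jdk-8u171-linux-x64.tar.gz?AuthParam=1528334282_eed030b012c430a6a5d6cebcfd2ff96f"
(-O saves the file under the name used in step 3.2; without it wget keeps the ?AuthParam=... suffix in the file name. The AuthParam token is time-limited, so if the download fails, get a fresh link from Oracle's download page.)
3.2. Extract
#tar -zxvf jdk-8u171-linux-x64.tar.gz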
3.3. Set environment variables
#vi /etc/profile
export JAVA_HOME=/usr/local/soft/jdk1.8.0_171
export JRE_HOME=$JAVA_HOME/jre
export CLASSPATH=.:$JAVA_HOME/lib:$JRE_HOME/lib:$CLASSPATH
export PATH=$JAVA_HOME/bin:$JRE_HOME/bin:$PATH
3.4. Apply the environment variables
#source /etc/profile


4. Install Hadoop 2.9.1
4.1. Download
#wget https://mirrors.tuna.tsinghua.edu.cn/apache/hadoop/common/hadoop-2.9.1/hadoop-2.9.1.tar.gz
4.2. Extract
#tar -zxvf hadoop-2.9.1.tar.gz
4.3. Rename the directory to hadoop
#mv hadoop-2.9.1 hadoop

4.4. Set environment variables
#vi /etc/profile
export HADOOP_HOME=/usr/local/soft/hadoop
export PATH=${HADOOP_HOME}/bin:${HADOOP_HOME}/sbin:$PATH
export CLASSPATH=$($HADOOP_HOME/bin/hadoop classpath):$CLASSPATH

4.5. Apply the environment variables
#source /etc/profile
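
As a quick sanity check after sourcing /etc/profile, the sketch below confirms that a *_HOME variable is set and points at a real directory. Shell variable names are case-sensitive, so the name passed in must match the exported name exactly. The function name check_home is hypothetical, not part of Hadoop or the JDK.

```shell
# check_home NAME: succeed only if the variable called NAME is set,
# non-empty, and points at an existing directory.
# (check_home is a hypothetical helper name used for illustration.)
check_home() {
    name="$1"
    # Look up the variable whose name is stored in $name (POSIX-compatible).
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
        echo "$name is not set" >&2
        return 1
    fi
    if [ ! -d "$value" ]; then
        echo "$name=$value is not a directory" >&2
        return 1
    fi
    echo "$name=$value"
}
```

For example: check_home JAVA_HOME && check_home HADOOP_HOME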


5. Hadoop standalone mode
5.1 Standalone mode configuration
#cd /usr/local/soft/hadoop/
#vi etc/hadoop/hadoop-env.sh
export JAVA_HOME=/usr/local/soft/jdk1.8.0_171

5.2 Standalone mode in practice
#mkdir LocalInput
#cp etc/hadoop/*.xml LocalInput
#bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.9.1.jar grep LocalInput LocalOutput 'dfs[a-z.]+'
#cat LocalOutput/*
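
To make clear what the grep example computes: it finds every match of the regular expression in the input files, counts how often each distinct match occurs, and writes "count, match" lines with the most frequent match first. The pipeline below is a rough local equivalent built from standard tools. It is a sketch for cross-checking LocalOutput, not how Hadoop implements the job, and the function name local_grep_count is hypothetical.

```shell
# local_grep_count REGEX FILE...: print "count<TAB>match" for every distinct
# match of REGEX in the given files, most frequent first.
local_grep_count() {
    regex="$1"
    shift
    # -o prints only the matched text, -h omits file names,
    # -E enables extended regular expressions (needed for '+').
    grep -ohE "$regex" "$@" |
        sort |
        uniq -c |
        sort -k1,1nr -k2 |
        awk '{ printf "%s\t%s\n", $1, $2 }'
}
```

For example: local_grep_count 'dfs[a-z.]+' LocalInput/*.xml. Java and grep regular expressions differ in some details, so results may not agree for more complex patterns.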

6. Passwordless SSH login
# ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
# cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
# chmod 0600 ~/.ssh/authorized_keys
# ssh localhost
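
If ssh localhost still asks for a password, the file mode set by the chmod step is one of the first things to check. The sketch below verifies that a file has mode 600. It assumes GNU stat (as shipped with CentOS), and the function name check_key_mode is hypothetical.

```shell
# check_key_mode FILE: succeed only if FILE exists with mode 600.
# Assumes GNU coreutils stat, where -c '%a' prints the octal mode.
check_key_mode() {
    file="$1"
    [ -f "$file" ] || { echo "$file: no such file" >&2; return 1; }
    mode=$(stat -c '%a' "$file")
    if [ "$mode" != "600" ]; then
        echo "$file has mode $mode, expected 600" >&2
        return 1
    fi
    echo "$file: mode OK"
}
```

For example: check_key_mode ~/.ssh/authorized_keys. To test the login itself without an interactive prompt, ssh -o BatchMode=yes localhost true fails instead of asking for a password.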

 

7. Hadoop pseudo-distributed mode
7.1 Pseudo-distributed mode configuration
7.1.1 #vi etc/hadoop/core-site.xml

<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://localhost:9000</value>
    </property>
</configuration>

7.1.2 #vi etc/hadoop/hdfs-site.xml
<configuration>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
</configuration>

7.1.3 #vi etc/hadoop/mapred-site.xml
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>

7.1.4 #vi etc/hadoop/yarn-site.xml
<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
</configuration>
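
The four edits in 7.1 can also be scripted. The sketch below writes the same property values with here-documents into a configuration directory passed as an argument. The function names write_site_file and write_pseudo_conf are hypothetical, and the script overwrites any existing files of those names, so back them up first.

```shell
# write_site_file FILE NAME VALUE: write a one-property Hadoop config file.
# (Hypothetical helper; overwrites FILE if it exists.)
write_site_file() {
    cat > "$1" <<EOF
<?xml version="1.0"?>
<configuration>
    <property>
        <name>$2</name>
        <value>$3</value>
    </property>
</configuration>
EOF
}

# write_pseudo_conf DIR: write the four files from section 7.1 into DIR.
write_pseudo_conf() {
    conf_dir="$1"
    write_site_file "$conf_dir/core-site.xml"   fs.defaultFS hdfs://localhost:9000
    write_site_file "$conf_dir/hdfs-site.xml"   dfs.replication 1
    write_site_file "$conf_dir/mapred-site.xml" mapreduce.framework.name yarn
    write_site_file "$conf_dir/yarn-site.xml"   yarn.nodemanager.aux-services mapreduce_shuffle
}
```

For example, from /usr/local/soft/hadoop: write_pseudo_conf etc/hadoop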


7.2 Starting and inspecting pseudo-distributed mode
7.2.1 Format the file system
# bin/hdfs namenode -format

7.2.2 Start NameNode, DataNode, ResourceManager, and NodeManager
#sbin/start-all.sh
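
To confirm that the four daemons actually came up, the JDK's jps tool lists running Java processes as "<pid> <class name>". The sketch below reads that listing and prints any of the four daemons that is missing. The function name missing_daemons is hypothetical.

```shell
# missing_daemons: read `jps` output on stdin and print the name of each
# expected daemon that is not listed. Prints nothing if all four are running.
missing_daemons() {
    listing=$(cat)
    for daemon in NameNode DataNode ResourceManager NodeManager; do
        # Match the second column exactly so that SecondaryNameNode
        # is not mistaken for NameNode.
        if ! printf '%s\n' "$listing" |
            awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
            echo "$daemon"
        fi
    done
}
```

For example: jps | missing_daemons. If a name is printed, the logs under the Hadoop logs directory are the next place to look.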

7.2.3 View the monitoring pages
http://localhost:50070/  (NameNode)
http://localhost:8088/   (ResourceManager)


7.3 Uploading and downloading files in pseudo-distributed mode
7.3.1 Create the directories needed to run MapReduce jobs
#bin/hdfs dfs -mkdir /user
#bin/hdfs dfs -mkdir /user/root

7.3.2 Upload local files to HDFS and download them back
Create the local files
#cd LocalInput
#vi F1.txt
Hello World
Hello Hadoop
#vi F2.txt
Hello JAVA
JAVA 是 一门 面向对象 编程 语言

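Return to the Hadoop directory and upload the local files to HDFS (the target directory must exist before several files are put into it)
#cd ..
#bin/hdfs dfs -mkdir HdfsInput
#bin/hdfs dfs -put LocalInput/*.txt HdfsInput

List the files in HDFS
#bin/hdfs dfs -ls HdfsInput

Download the HDFS files to the local LocalOutput directory
#bin/hdfs dfs -get HdfsInput LocalOutput

List the downloaded files
#ls -l LocalOutput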
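
To check that the round trip through HDFS left the files unchanged, the downloaded copies can be compared byte for byte with the originals. A minimal sketch, with the hypothetical function name same_txt_files:

```shell
# same_txt_files SRC DST: succeed only if every *.txt file in SRC has a
# byte-identical copy with the same name in DST.
same_txt_files() {
    src="$1"
    dst="$2"
    status=0
    for f in "$src"/*.txt; do
        [ -e "$f" ] || continue          # the glob matched nothing
        name=$(basename "$f")
        # cmp -s is silent and fails if the files differ or one is missing.
        if ! cmp -s "$f" "$dst/$name"; then
            echo "differs or missing: $name" >&2
            status=1
        fi
    done
    return "$status"
}
```

For example: same_txt_files LocalInput LocalOutput/HdfsInput. If LocalOutput already existed before the -get (for instance from step 5.2), the files are placed in LocalOutput/HdfsInput; otherwise they are in LocalOutput itself.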


7.4 Stopping pseudo-distributed mode
#sbin/stop-all.sh

 

 

Hadoop in Practice (2): MapReduce

http://www.wangfeilong.cn/server/115.html

 

 

Hadoop in Practice (3): PHP-MapReduce

 http://www.wangfeilong.cn/server/116.html