Installing and Deploying Hadoop
Download the installation package, copy it to the install directory on the server, and unpack it.
It is best to unpack the package under /opt, because the data directory is later created under /opt by default. /opt must be writable by the bxwl user; granting that requires switching to root, e.g. chmod -R 777 /opt (chown -R bxwl:bxwl /opt is a tighter alternative). Otherwise sbin/start-dfs.sh fails with "Cannot set priority of datanode process XX", and the same permission problem surfaces when running hdfs namenode -format.
1. Configure the HADOOP_HOME environment variable
[bxwl@snode028 bin]$ vim /etc/profile.d/bxwl.sh
export JAVA_HOME=/opt/jdk1.8.0_291
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export HADOOP_HOME=/opt/hadoop-3.3.2
#export HADOOP_CONF_DIR=/opt/hadoop-3.3.2/etc
export PATH=${JAVA_HOME}/bin:${HADOOP_HOME}/bin:$PATH
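After editing the profile script, run `source /etc/profile.d/bxwl.sh` (or log in again) so the variables take effect. A minimal sanity check, as a sketch — the helper name `check_env` is ours, not a Hadoop command:

```shell
# Sketch: verify the variables the later steps rely on are actually set.
# check_env is a hypothetical helper, not part of Hadoop.
check_env() {
  local v
  for v in JAVA_HOME HADOOP_HOME; do
    if [ -z "${!v:-}" ]; then      # ${!v} = bash indirect expansion
      echo "$v is unset" >&2
      return 1
    fi
  done
  echo "environment OK"
}
```

`check_env && hadoop version` then confirms both the variables and the PATH entry in one go.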
2. Configure the cluster worker nodes (workers)
[bxwl@snode028 ~]$ cd /opt/hadoop-3.3.2/etc/hadoop
[bxwl@snode028 hadoop]$ vim workers
snode028
snode029
snode030
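The workers file — and the rest of etc/hadoop — must be identical on every node. A sketch for pushing the directory out with rsync, using the host names from the workers file above (`sync_hadoop_conf` is our own helper, not a Hadoop tool; it assumes the same passwordless ssh that start-dfs.sh needs):

```shell
# Sketch: copy the Hadoop config directory to the other cluster nodes.
sync_hadoop_conf() {
  local conf_dir="${1:-/opt/hadoop-3.3.2/etc/hadoop}"
  local host
  for host in snode029 snode030; do
    rsync -a "$conf_dir/" "$host:$conf_dir/"
  done
}
```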
3. Configure core-site.xml
Set the NameNode address, the Hadoop data storage directory, and the static user for the HDFS web UI to bxwl (it must not be root).
[bxwl@snode028 hadoop]$ vim core-site.xml
…
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://snode028:8020</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/opt/hadoop-3.3.2/data</value>
    </property>
    <property>
        <name>hadoop.http.staticuser.user</name>
        <value>bxwl</value>
    </property>
</configuration>
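Once the configuration is in place, values can be read back with Hadoop's own `hdfs getconf` command instead of grepping the XML. A thin wrapper as a sketch (`get_conf` is our name for it; `hdfs getconf -confKey` is a standard Hadoop command and needs HADOOP_HOME/bin on PATH):

```shell
# Sketch: read one effective configuration key from the Hadoop config.
get_conf() {
  hdfs getconf -confKey "${1:?usage: get_conf <key>}"
}
# With the core-site.xml above, `get_conf fs.defaultFS` should print
# hdfs://snode028:8020
```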
4. Configure hdfs-site.xml
Set the NameNode HTTP address and the secondary NameNode HTTP address, and enable viewing file contents through the WebHDFS web UI.
[bxwl@snode028 hadoop]$ vim hdfs-site.xml
…
<configuration>
    <property>
        <name>dfs.namenode.http-address</name>
        <value>snode028:9870</value>
    </property>
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>snode029:9868</value>
    </property>
    <property>
        <name>dfs.webhdfs.enabled</name>
        <value>true</value>
    </property>
</configuration>
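With dfs.webhdfs.enabled set, files can also be listed and read over the WebHDFS REST API on the NameNode HTTP port (9870 above). A sketch that only builds the request URL — `webhdfs_url` is our own helper, while the URL scheme and the LISTSTATUS/OPEN operations are standard WebHDFS:

```shell
# Sketch: build a WebHDFS REST URL for this cluster's NameNode.
webhdfs_url() {
  local path="${1:-/}" op="${2:-LISTSTATUS}"
  echo "http://snode028:9870/webhdfs/v1${path}?op=${op}&user.name=bxwl"
}
# Once HDFS is up:  curl -s "$(webhdfs_url /tmp)"
```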
5. Configure mapred-site.xml
Run MapReduce on Yarn, and set the JobHistory server address and its web UI address.
[bxwl@snode028 hadoop]$ vim mapred-site.xml
…
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>snode028:10020</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>snode028:19888</value>
    </property>
</configuration>
6. Configure yarn-site.xml
Route MR through shuffle, set the ResourceManager address, whitelist the environment variables containers inherit, enable log aggregation, set the log server URL, and keep aggregated logs for 7 days.
[bxwl@snode028 hadoop]$ vim yarn-site.xml
…
<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>snode030</value>
    </property>
    <property>
        <name>yarn.nodemanager.env-whitelist</name>
        <value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
    </property>
    <property>
        <name>yarn.log-aggregation-enable</name>
        <value>true</value>
    </property>
    <property>
        <name>yarn.log.server.url</name>
        <value>http://snode028:19888/jobhistory/logs</value>
    </property>
    <property>
        <name>yarn.log-aggregation.retain-seconds</name>
        <value>604800</value>
    </property>
</configuration>
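The property blocks in these files all follow one shape, and the retention value is simply 7 days expressed in seconds. A small sketch — `prop` is our own helper for emitting one property element, not a Hadoop tool:

```shell
# Sketch: emit one <property> element for assembling *-site.xml files.
prop() {
  printf '  <property>\n    <name>%s</name>\n    <value>%s</value>\n  </property>\n' "$1" "$2"
}
# 7 days in seconds matches the retain-seconds value above:
prop yarn.log-aggregation.retain-seconds $(( 7 * 24 * 60 * 60 ))   # 604800
```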
7. Format the NameNode (needed only on first startup)
[bxwl@snode028 hadoop-3.3.2]$ hdfs namenode -format
WARNING: /opt/hadoop-3.3.2/logs does not exist. Creating.
2022-04-15 18:44:40,635 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = snode028/192.168.100.28
STARTUP_MSG: args = [-format]
STARTUP_MSG: version = 3.3.2
STARTUP_MSG: …
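Re-running the format command on a cluster that already holds data wipes the HDFS metadata. A defensive sketch: the guard function `format_namenode_once` and its VERSION-file check are our own convention, and the default path assumes hadoop.tmp.dir=/opt/hadoop-3.3.2/data from core-site.xml (the NameNode keeps its metadata under dfs/name there by default):

```shell
# Sketch: format the NameNode only if it has never been formatted.
format_namenode_once() {
  local data_dir="${1:-/opt/hadoop-3.3.2/data}"
  if [ -f "$data_dir/dfs/name/current/VERSION" ]; then
    echo "NameNode already formatted; skipping"
  else
    hdfs namenode -format
  fi
}
# Usage: format_namenode_once
```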
8. Start HDFS
[bxwl@snode028 hadoop-3.3.2]$ sbin/start-dfs.sh
Starting namenodes on [snode028]
Starting datanodes
snode030: WARNING: /opt/hadoop-3.3.2/logs does not exist. Creating.
snode029: WARNING: /opt/hadoop-3.3.2/logs does not exist. Creating.
Starting secondary namenodes [snode029]
Accessing HDFS on the web
Open in a browser: http://192.168.100.28:9870
9. Start Yarn
// Yarn is configured on snode030, so it must be started from snode030
[bxwl@snode028 hadoop-3.3.2]$ ssh snode030
[bxwl@snode030 ~]$ cd /opt/hadoop-3.3.2
[bxwl@snode030 hadoop-3.3.2]$ sbin/start-yarn.sh
Starting resourcemanager
Starting nodemanagers
[bxwl@snode030 hadoop-3.3.2]$ ~/bin/jpsall
---------- snode028 jps ----------
6672 DataNode
6521 NameNode
7003 NodeManager
6029 QuorumPeerMain
7101 Jps
---------- snode029 jps ----------
6146 SecondaryNameNode
6306 NodeManager
6036 DataNode
5750 QuorumPeerMain
6406 Jps
---------- snode030 jps ----------
6195 NodeManager
6070 ResourceManager
5595 QuorumPeerMain
5837 DataNode
6527 Jps
[bxwl@snode030 hadoop-3.3.2]$
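The `~/bin/jpsall` used above is a personal helper, not a Hadoop tool. A minimal sketch of what it likely does, with the host list hard-coded to this cluster and passwordless ssh assumed:

```shell
#!/bin/bash
# Sketch of the jpsall helper: run jps on every node over ssh.
jpsall() {
  local host
  for host in snode028 snode029 snode030; do
    echo "---------- $host jps ----------"
    ssh "$host" jps
  done
}
# Usage: jpsall
```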
Accessing Yarn on the web
Open in a browser: http://192.168.100.30:8088
10. Write a script to start and stop Hadoop
[bxwl@snode028 bin]$ vim hadoop.sh
#!/bin/bash
case $1 in
"start"){
echo "---------- Starting HDFS ----------"
ssh snode028 "/opt/hadoop-3.3.2/sbin/start-dfs.sh"
echo "---------- Starting Yarn ----------"
ssh snode030 "/opt/hadoop-3.3.2/sbin/start-yarn.sh"
echo "---------- Starting the JobHistory server ----------"
ssh snode028 "/opt/hadoop-3.3.2/bin/mapred --daemon start historyserver"
};;
"stop"){
echo "---------- Stopping the JobHistory server ----------"
ssh snode028 "/opt/hadoop-3.3.2/bin/mapred --daemon stop historyserver"
echo "---------- Stopping Yarn ----------"
ssh snode030 "/opt/hadoop-3.3.2/sbin/stop-yarn.sh"
echo "---------- Stopping HDFS ----------"
ssh snode028 "/opt/hadoop-3.3.2/sbin/stop-dfs.sh"
};;
esac
// Make the script executable
[bxwl@snode028 bin]$ chmod +x hadoop.sh
// Stop
[bxwl@snode028 bin]$ hadoop.sh stop
---------- Stopping the JobHistory server ----------
---------- Stopping Yarn ----------
Stopping nodemanagers
Stopping resourcemanager
---------- Stopping HDFS ----------
Stopping namenodes on [snode028]
Stopping datanodes
Stopping secondary namenodes [snode029]
// Start
[bxwl@snode028 bin]$ hadoop.sh start
---------- Starting HDFS ----------
Starting namenodes on [snode028]
Starting datanodes
Starting secondary namenodes [snode029]
---------- Starting Yarn ----------
Starting resourcemanager
Starting nodemanagers
---------- Starting the JobHistory server ----------