1. Basic Purpose
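Concept: Sqoop is usually described as a coordination framework, that is, an auxiliary framework in the Hadoop 2.x ecosystem. Put simply, it is a data transfer tool. Similar auxiliary frameworks include Flume (file and log collection), Oozie (job coordination), and Hue (a web UI for big data).

Process: data source (RDBMS) <---> data cleaning / data analysis <---> HDFS/HBase

Purpose: Sqoop stands for SQL-to-Hadoop. It is the bridge between relational databases and Hadoop. It is built on MapReduce: Sqoop combines the command-line parameters with a MapReduce template, packages the result as a jar, and submits it to YARN. MapReduce is used to speed up the transfer, and data is moved in batch mode.

Versions: 1.4.x is Sqoop 1; 1.99.x is Sqoop 2.

Binary packages can be downloaded from: http://archive.cloudera.com/cdh5/cdh/5/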
2. Basic Configuration
Edit sqoop-env.sh in sqoop-1.4.5-cdh5.3.6/conf:

export HADOOP_COMMON_HOME=/opt/cdh-5.6.3/hadoop-2.5.0-cdh5.3.6
export HADOOP_MAPRED_HOME=/opt/cdh-5.6.3/hadoop-2.5.0-cdh5.3.6
export HIVE_HOME=/opt/cdh-5.6.3/hive-0.13.1-cdh5.3.6
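Before running Sqoop, it can help to confirm that the directories named in sqoop-env.sh actually exist. The following is a minimal sketch under stated assumptions, not something shipped with Sqoop: the function name check_sqoop_env is hypothetical, and it only inspects the three variables exported above.

```shell
#!/bin/sh
# Hypothetical helper (not part of Sqoop): check that each variable
# exported in sqoop-env.sh is set and points at an existing directory.
check_sqoop_env() {
  status=0
  for name in HADOOP_COMMON_HOME HADOOP_MAPRED_HOME HIVE_HOME; do
    # Look up the value of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "$name is not set" >&2
      status=1
    elif [ ! -d "$value" ]; then
      echo "$name points at a missing directory: $value" >&2
      status=1
    fi
  done
  return "$status"
}
```

One possible use is to source conf/sqoop-env.sh in a shell and then call check_sqoop_env; a non-zero exit status means at least one path needs fixing.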
3. Basic Usage
# When connecting to a MySQL database, first place the MySQL JDBC driver jar in Sqoop's lib directory
$ bin/sqoop help
Available commands:
  codegen            Generate code to interact with database records
  create-hive-table  Import a table definition into Hive
  eval               Evaluate a SQL statement and display the results
  export             Export an HDFS directory to a database table
  help               List available commands
  import             Import a table from a database to HDFS
  import-all-tables  Import tables from a database to HDFS
  import-mainframe   Import datasets from a mainframe server to HDFS
  job                Work with saved jobs
  list-databases     List available databases on a server
  list-tables        List available tables in a database
  merge              Merge results of incremental imports
  metastore          Run a standalone Sqoop metastore
  version            Display version information

$ bin/sqoop list-databases --connect jdbc:mysql://10.0.0.108:3306 --username root --password root
$ bin/sqoop list-tables --connect jdbc:mysql://10.0.0.108:3306/mysql --username root --password root
$ bin/sqoop import --help
$ bin/sqoop export --help
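To make the import command listed above more concrete, here is a rough sketch of importing a single MySQL table into HDFS. The database name test, the table name my_user, and the HDFS target directory are assumptions made for illustration and do not come from the original setup. The flags used (--connect, --username, --password, --table, --target-dir, --num-mappers) are standard Sqoop 1.4.x import arguments. The script only prints the command, so it can be reviewed before anything is run.

```shell
#!/bin/sh
# Assumed values for illustration only; adjust them to your environment.
JDBC_URL="jdbc:mysql://10.0.0.108:3306/test"   # database "test" is hypothetical
TABLE="my_user"                                # table name is hypothetical
TARGET_DIR="/user/sqoop/my_user"               # HDFS output directory is hypothetical

# Print the import command instead of running it (a dry run).
# --num-mappers 1 runs a single map task, so no split column is needed.
import_cmd() {
  echo bin/sqoop import \
    --connect "$JDBC_URL" \
    --username root \
    --password root \
    --table "$TABLE" \
    --target-dir "$TARGET_DIR" \
    --num-mappers 1
}

import_cmd
```

On a host where Sqoop and the MySQL driver are set up, the printed line could be run from the Sqoop installation directory. Raising --num-mappers lets several map tasks copy the table in parallel, which is where the MapReduce-based speedup comes from.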