1. show functions
  2. describe function concat
  3. describe function extended concat

标准函数(UDF, 单行)

聚合函数(UDAF, 集合)

接受 从0行到多行 的 0个或多个列,然后返回单一值

通常配合Group使用

表生成函数(UDTF, )

接受 0或多个输入,然后产生多列或多行输出。

  • array
  • explode
  • lateral view

注册

  1. add jar /home/hadoop/hive/hive-udf-1.0-SNAPSHOT.jar;
  2. create temporary function latest_valid_string as 'com.haima.sage.bigdata.hive.LatestValidStringUDAF';
  3. create temporary function latest_valid_long as 'com.haima.sage.bigdata.hive.LatestValidLongUDAF';
  4. create temporary function latest_valid_int as 'com.haima.sage.bigdata.hive.LatestValidIntUDAF';
  5. select car_id, latest_valid_long(unix_timestamp(substr(occur_timestamp, 2, 19), 'yyyy-MM-dd HH:mm:ss') * 1000, cast(price as BIGINT)) from car_info_origin group by car_id;
  6. select car_id, latest_valid_string(unix_timestamp(substr(occur_timestamp, 2, 19), 'yyyy-MM-dd HH:mm:ss') * 1000, car_color) from car_info_origin group by car_id;
  7. select letters(occur_timestamp) from car_info_origin;
  8. select g_concat(price) from car_info_origin;

建表

  1. CREATE EXTERNAL TABLE
  2. IF NOT EXISTS `car_info_origin` (
  3. `occur_timestamp` string COMMENT '爬取日期',
  4. `car_id` string COMMENT '车辆ID',
  5. `price` INT COMMENT '车辆报价',
  6. `car_color` string COMMENT '颜色'
  7. )
  8. ROW FORMAT
  9. DELIMITED FIELDS TERMINATED BY '\t'
  10. LINES TERMINATED BY '\n'
  11. STORED AS TEXTFILE
  12. LOCATION '/data/uxin/test/data';

数据

  1. '2019-03-17 20:00:40.225 +0800' 1 90 白色
  2. '2019-03-17 20:00:40.225 +0800' 1 89

测试

  1. select car_id, unix_timestamp(substr(occur_timestamp, 2, 19), 'yyyy-MM-dd HH:mm:ss') * 1000 from car_info_origin;
  2. select latest_valid_string(occur_timestamp , car_color) from car_info_origin;
文档更新时间: 2019-06-26 18:53