丐帮帮主 IT Tech Blog

Removing NUL characters from Hive data

Problem: Field values in a Hive table contain NUL characters, which prevents the Hive data from being exported.

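Solution: use regexp_replace(CFWH, '\\x00', ''). The regular expression \x00 matches the NUL character; the backslash is doubled because it sits inside a HiveQL string literal.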

Method:

Clean the data as it moves from the ODS layer into the DWD layer.

-- Query run directly in Hive
select sxh, regexp_replace(replace(CFWH, decode(unhex(hex(127)), "US-ASCII"), ""), "\\x00", "")
from ods_xscfjl where sxh in (1020,1107);

# Statement as called from the shell script
dwd_cfjl="
set hive.input.format=org.apache.hadoop.hive.ql.io.HiveInputFormat;
insert overwrite table ${APP}.dwd_cfjl partition(dt='$do_date')
select
    SXH, 
    XH, 
    regexp_replace(replace(CFWH, decode(unhex(hex(127)), 'US-ASCII'), ''), '\\\\x00', '') CFWH, 
    CJSJ,
    CJZ
from ${APP}.ods_xscfjl
where dt='$do_date';
"

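Note: the backslash in \x00 must itself be escaped. The regular expression is \x00; in a HiveQL string literal it is written '\\x00', and inside the double-quoted shell string it becomes '\\\\x00' because the shell reduces each \\ to a single \. decode(unhex(hex(127)), 'US-ASCII') produces the DEL character (0x7F), which replace() then strips from the field.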
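
The shell layer of this escaping can be checked without running Hive. The following is a minimal sketch, assuming a POSIX shell; the variable names fragment and backslashes are illustrative only. It prints the expression exactly as Hive would receive it from the double-quoted script string.

```shell
#!/bin/sh
# Inside double quotes the shell turns each \\ into a single \,
# so four backslashes in the script become two in the SQL text.
fragment="regexp_replace(CFWH, '\\\\x00', '')"

# Print the text exactly as it would be handed to Hive.
printf '%s\n' "$fragment"

# Count the backslashes that survive the shell layer.
backslashes=$(printf '%s' "$fragment" | tr -cd '\\' | wc -c | tr -d ' ')
echo "backslashes seen by Hive: $backslashes"
```

Two backslashes reach Hive; its string-literal parsing reduces them to one, so the regex engine finally sees \x00.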

Database,Bigdata
2023-11-04

FAILED: Execution Error, return code 3 from org.apache.hadoop.hive.ql.exec.spark.SparkTask

Problem: Running ods_to_dwd_db_init.sh, which processes data from the ODS layer into the DWD layer with dynamic partitioning, fails with: FAILED: Execution Error, return code 3 from org.apache.hadoop.hive.ql.exec.spark.SparkTask. Spark job failed during runtime. Please check stacktrace for the root cause.

Solution: add the following settings to raise the dynamic partition limits:

set hive.exec.max.dynamic.partitions=2000;
set hive.exec.max.dynamic.partitions.pernode=2000;
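
One way to apply these settings is to put them at the top of the SQL string in the script, ahead of the dynamic-partition insert. The sketch below is an assumed layout, not the original ods_to_dwd_db_init.sh: the database name gmall, the tables dwd_order_info and ods_order_info, and the column list are all hypothetical. hive.exec.dynamic.partition and hive.exec.dynamic.partition.mode are standard Hive settings that a dynamic-partition insert normally needs as well.

```shell
#!/bin/sh
# Hypothetical database name; the real script supplies its own APP.
APP=${APP:-gmall}

# Hypothetical tables and columns, shown only to place the settings.
dwd_order_info="
set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.exec.max.dynamic.partitions=2000;
set hive.exec.max.dynamic.partitions.pernode=2000;
insert overwrite table ${APP}.dwd_order_info partition(dt)
select id, total_amount, create_time,
       date_format(create_time, 'yyyy-MM-dd') dt
from ${APP}.ods_order_info;
"

# Dry run: print the statement instead of executing it.
printf '%s\n' "$dwd_order_info"
# To execute: hive -e "$dwd_order_info"
```

Hive's defaults are 1000 dynamic partitions per statement and 100 per node, so an initial load that spans many dates can exceed them. Choose a limit above the number of distinct partition values in the source data.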

Bigdata
2023-11-01

Sqoop JDBC syntax for the Dameng (DM) database

Problem: How can Sqoop be used to import data into a Dameng database?

Method:

./sqoop list-tables --driver dm.jdbc.driver.DmDriver  --connect jdbc:dm://192.168.142.235:5247 --username SYSDBA --password SYSDBA
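
list-tables only confirms that the driver class and URL work. To write data into DM, the same --driver and --connect options can be passed to sqoop export. This is a minimal sketch, assuming the DM JDBC driver jar is already on Sqoop's classpath; the target table ADS_CFJL, the HDFS path, and the tab delimiter are hypothetical and must match your own data.

```shell
#!/bin/sh
DM_URL="jdbc:dm://192.168.142.235:5247"
# Hypothetical HDFS directory holding the data to export.
EXPORT_DIR="/warehouse/ads/ads_cfjl"

# ADS_CFJL is a hypothetical target table that must already exist in DM.
export_cmd="sqoop export \
--driver dm.jdbc.driver.DmDriver \
--connect $DM_URL \
--username SYSDBA --password SYSDBA \
--table ADS_CFJL \
--export-dir $EXPORT_DIR \
--input-fields-terminated-by '\t' \
-m 1"

# Dry run: print the command instead of executing it.
printf '%s\n' "$export_cmd"
# To execute: eval "$export_cmd"
```

Printing the command first makes it easy to check the URL and paths before any rows are written.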

Database,Bigdata
2023-10-28