InfluxDB: An Open-Source Time Series Database

Introduction

InfluxDB™ is a time series database designed to handle high write and query loads. It is an integral component of the TICK stack. InfluxDB is meant to be used as a backing store for any use case involving large amounts of timestamped data, including DevOps monitoring, application metrics, IoT sensor data, and real-time analytics.

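Basic Concepts

Database

 Similar to the database concept in traditional databases

Measurement

 Corresponds broadly to the measure concept in OLAP; some OLAP databases call it a Metric. In practice it plays a role similar to a table name

Tag

 Corresponds broadly to the dimension concept in OLAP; some OLAP databases call it TagKV. Tags are indexed key-value pairs

Field

 The actual values, stored as key-value pairs. Fields are not indexed

Timestamp

 The time associated with a data point

Points

 Data points

Series

 A sequence of data points that share the same measurement and tag set

Retention Policy

 Data expiration policy, i.e. TTL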
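
To make the relationship between points and series concrete, here is a minimal sketch that groups points into series by measurement and tag set. The sample points and the helper names (`series_key`, `group_into_series`) are hypothetical, and the sketch ignores the retention policy, which also scopes a series in InfluxDB 1.x.

```python
from collections import defaultdict

# Hypothetical sample points: (measurement, tags, fields, timestamp in ns).
points = [
    ("blog", {"protocol": "https", "name": "yuzhouwan"}, {"value": 666.0}, 1),
    ("blog", {"name": "yuzhouwan", "protocol": "https"}, {"value": 667.0}, 2),
    ("blog", {"protocol": "http", "name": "yuzhouwan"}, {"value": 1.0}, 3),
]


def series_key(measurement, tags):
    """Identify a series by its measurement plus its sorted tag set."""
    return (measurement, tuple(sorted(tags.items())))


def group_into_series(points):
    """Group points that share a measurement and tag set, ordered by time."""
    series = defaultdict(list)
    for measurement, tags, fields, timestamp in points:
        series[series_key(measurement, tags)].append((timestamp, fields))
    for rows in series.values():
        rows.sort(key=lambda row: row[0])
    return dict(series)


grouped = group_into_series(points)
for key, rows in grouped.items():
    print(key, rows)
```

The first two points differ only in the order their tags were written, so they land in the same series; the third has a different tag value and forms a second series.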

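Features

Advantages

  • No external dependencies (such as ZooKeeper or HDFS)
  • Supports SQL-like queries (InfluxQL)
  • Supports multi-tenancy and basic authentication and authorization
  • Active open-source community; it has long held first place in the DB-Engines ranking of time series databases

Disadvantages

  • The clustered version is not open source
  • No lazy loading (startup has to scan all TSM files, which slows recovery after a node failure; Apache Druid, by contrast, supports lazy loading)
  • Cannot join across measurements
  • Cannot store duplicate data points: a point with the same measurement, tag set, and timestamp overwrites the existing one
  • No tiered storage for hot and cold data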
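
The overwrite behavior in the list above can be pictured with a toy model. This is only an illustration under stated assumptions, not InfluxDB's storage engine: the dictionary `store` and the function `write_point` are hypothetical, and a point is identified by its measurement, tag set, and timestamp, with the field sets of duplicates merged so that the newer value wins.

```python
def write_point(store, measurement, tags, fields, timestamp):
    """Toy write path: duplicates merge their fields, newer values win."""
    key = (measurement, tuple(sorted(tags.items())), timestamp)
    merged = dict(store.get(key, {}))
    merged.update(fields)
    store[key] = merged
    return store


store = {}
ts = 1556438552229094000

# Two writes with the same measurement, tag set, and timestamp.
write_point(store, "blog", {"name": "yuzhouwan"}, {"value": 666}, ts)
write_point(store, "blog", {"name": "yuzhouwan"}, {"value": 777}, ts)
print(len(store), list(store.values()))

# Workaround: add a distinguishing tag (hypothetical tag name "uniq").
write_point(store, "blog", {"name": "yuzhouwan", "uniq": "2"}, {"value": 888}, ts)
print(len(store))
```

Adding a tag or shifting the timestamp by a nanosecond are the usual ways to keep both points, at the cost of extra series or slightly altered times.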

Deployment

Because the InfluxDB 2.x ecosystem is not yet mature enough, this article uses InfluxDB 1.x.

From Source

Build

$ go get github.com/influxdata/influxdb
$ cd $GOPATH/src/github.com/influxdata/influxdb
$ go clean ./...
$ go get -t -v ./...
$ go install -ldflags="-X main.version=1.8.2" ./...
$ ll $GOPATH/bin

Run

$ $GOPATH/bin/influxd

Container

# Pull the image
$ docker pull influxdb:1.8.10

# Start the container
$ docker run -idt --name influxdb -p 8086:8086 -v ~/influxdb:/var/lib/influxdb influxdb:1.8.10

Verify

$ docker ps
CONTAINER ID   IMAGE             COMMAND                  CREATED          STATUS          PORTS                    NAMES
48120b502a48   influxdb:1.8.10   "/entrypoint.sh infl…"   32 seconds ago   Up 32 seconds   0.0.0.0:8086->8086/tcp   influxdb

Usage

Enter the container

$ docker exec -it influxdb bash

Connect to the InfluxDB instance

$ influx
Connected to http://localhost:8086 version 1.8.10
InfluxDB shell version: 1.8.10

InfluxQL

# Create a database
> CREATE DATABASE yuzhouwan

# Switch to the database
> USE yuzhouwan
Using database yuzhouwan

# List databases
> SHOW DATABASES
name: databases
name
----
_internal
yuzhouwan

# Create a user
> CREATE USER asdf2014 WITH PASSWORD 'yuzhouwan.com'

# List users
> SHOW USERS
user     admin
----     -----
asdf2014 false

# Grant privileges
> GRANT ALL PRIVILEGES ON yuzhouwan TO asdf2014

Helm (Cloud Native)

$ helm repo add influxdata https://helm.influxdata.com/
$ helm install influxdata/influxdb --version 4.8.2 --generate-name
NAME: influxdb-1598258562
LAST DEPLOYED: Mon Aug 24 16:42:47 2020
NAMESPACE: default
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
InfluxDB can be accessed via port 8086 on the following DNS name from within your cluster:

http://influxdb-1598258562.default:8086

You can connect to the remote instance with the influx CLI. To forward the API port to localhost:8086, run the following:

kubectl port-forward --namespace default $(kubectl get pods --namespace default -l app=influxdb-1598258562 -o jsonpath='{ .items[0].metadata.name }') 8086:8086

You can also connect to the influx CLI from inside the container. To open a shell session in the InfluxDB pod, run the following:

kubectl exec -i -t --namespace default $(kubectl get pods --namespace default -l app=influxdb-1598258562 -o jsonpath='{.items[0].metadata.name}') /bin/sh

To view the logs for the InfluxDB pod, run the following:

kubectl logs -f --namespace default $(kubectl get pods --namespace default -l app=influxdb-1598258562 -o jsonpath='{ .items[0].metadata.name }')

Grafana

# Install
$ wget https://dl.grafana.com/oss/release/grafana-6.0.0-beta2.x86_64.rpm
$ sudo yum localinstall grafana-6.0.0-beta2.x86_64.rpm

# Start
$ sudo systemctl daemon-reload
$ sudo systemctl start grafana-server
$ systemctl status grafana-server

# Enable start on boot
$ sudo systemctl enable grafana-server.service

Telegraf

$ wget https://dl.influxdata.com/telegraf/releases/telegraf-1.12.3-1.x86_64.rpm
$ sudo yum localinstall telegraf-1.12.3-1.x86_64.rpm
$ telegraf config > telegraf.conf
$ telegraf --config telegraf.conf

TICK

Collectively, Telegraf, InfluxDB, Chronograf and Kapacitor are known as the TICK Stack.

The TICK Stack is a loosely coupled yet tightly integrated set of open source projects designed to handle massive amounts of time-stamped information to support your metrics analysis needs.

Figure: InfluxDB TICK Stack

(Image source: the official InfluxDB™ website)

Basic Operations

Version

$ curl -sL -I localhost:8086/ping | grep -i version
X-Influxdb-Version: v1.8.2
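
The same check can be scripted. The sketch below assumes an InfluxDB 1.x instance listening on `http://localhost:8086`; the function names are hypothetical. `extract_version` only inspects a headers mapping, so it can be tried without a server, while `ping_version` performs the actual HEAD request against `/ping`.

```python
import urllib.request


def extract_version(headers):
    """Return the X-Influxdb-Version header value, matching case-insensitively."""
    for name, value in headers.items():
        if name.lower() == "x-influxdb-version":
            return value
    return None


def ping_version(base_url="http://localhost:8086"):
    """Send HEAD /ping and read the version header (needs a running server)."""
    request = urllib.request.Request(base_url + "/ping", method="HEAD")
    with urllib.request.urlopen(request, timeout=5) as response:
        return extract_version(dict(response.headers.items()))


# Works offline: header names may arrive in any letter case.
print(extract_version({"x-influxdb-version": "v1.8.2"}))
```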

Console

# Run the following command to enter the InfluxDB interactive console
$ influx -host 'localhost' -port '8086'

Create a Database

> show databases;
name: databases
name
----
_internal

# CREATE DATABASE <database_name> [WITH [DURATION <duration>] [REPLICATION <n>] [SHARD DURATION <duration>] [NAME <retention-policy-name>]]
> create database "yuzhouwan";

> show databases;
name: databases
name
----
_internal
yuzhouwan

> use yuzhouwan;
Using database yuzhouwan
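
The optional clauses in the CREATE DATABASE syntax above define the database's default retention policy. As a rough sketch, the hypothetical helper below assembles such a statement in the clause order given by that syntax; the durations (`30d`, `1d`) and the policy name `one_month` are illustrative values, not taken from the original setup.

```python
def create_database_statement(name, duration=None, replication=None,
                              shard_duration=None, policy_name=None):
    """Build a CREATE DATABASE statement with optional retention clauses."""

    def quote(identifier):
        # Keep the sketch simple: refuse identifiers that would need escaping.
        if '"' in identifier or "\n" in identifier:
            raise ValueError("unsupported character in identifier")
        return f'"{identifier}"'

    clauses = []
    if duration is not None:
        clauses.append(f"DURATION {duration}")
    if replication is not None:
        clauses.append(f"REPLICATION {int(replication)}")
    if shard_duration is not None:
        clauses.append(f"SHARD DURATION {shard_duration}")
    if policy_name is not None:
        clauses.append(f"NAME {quote(policy_name)}")

    statement = f"CREATE DATABASE {quote(name)}"
    if clauses:
        statement += " WITH " + " ".join(clauses)
    return statement


print(create_database_statement("yuzhouwan"))
print(create_database_statement("yuzhouwan", duration="30d", replication=1,
                                shard_duration="1d", policy_name="one_month"))
```

The generated text can be pasted into the influx console or sent through the query API.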

Write

> insert blog,protocol=https,name=yuzhouwan value=666
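
The text after `insert` is InfluxDB line protocol: a measurement, an optional comma-separated tag set, a field set, and an optional timestamp. The following sketch builds such a line from Python values. The helper names are hypothetical, and the escaping covers only the common cases (commas, equals signs, and spaces in keys and tag values; quotes and backslashes in string fields).

```python
def _escape(text, chars):
    """Backslash-escape each of the given characters in text."""
    for ch in chars:
        text = text.replace(ch, "\\" + ch)
    return text


def format_field(value):
    """Render a field value: booleans, integers (i suffix), floats, strings."""
    if isinstance(value, bool):  # check bool first, it is a subclass of int
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"
    if isinstance(value, float):
        return repr(value)
    text = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{text}"'


def to_line(measurement, tags, fields, timestamp_ns=None):
    """Build one line-protocol line; tags are written sorted by key."""
    if not fields:
        raise ValueError("a point needs at least one field")
    head = _escape(measurement, ", ")
    for key in sorted(tags):
        head += "," + _escape(key, ",= ") + "=" + _escape(str(tags[key]), ",= ")
    body = ",".join(
        _escape(key, ",= ") + "=" + format_field(value)
        for key, value in fields.items()
    )
    line = f"{head} {body}"
    if timestamp_ns is not None:
        line += f" {timestamp_ns}"
    return line


print(to_line("blog", {"protocol": "https", "name": "yuzhouwan"}, {"value": 666.0}))
```

A bare number such as `value=666` in the console is stored as a float, which is why the sketch passes `666.0`; an integer field would be written as `666i`.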

Query

Detail Query

> select * from blog
name: blog
time                name      protocol value
----                ----      -------- -----
1556438552229094000 yuzhouwan https    666
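
The `time` column above is a Unix epoch timestamp in nanoseconds. A small sketch (with a hypothetical helper name) that converts it to an RFC3339 UTC string without losing the nanosecond part:

```python
from datetime import datetime, timezone


def ns_to_rfc3339(timestamp_ns):
    """Convert epoch nanoseconds to an RFC3339 UTC string with 9 fraction digits."""
    seconds, nanos = divmod(timestamp_ns, 1_000_000_000)
    moment = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return f"{moment:%Y-%m-%dT%H:%M:%S}.{nanos:09d}Z"


print(ns_to_rfc3339(1556438552229094000))  # 2019-04-28T08:02:32.229094000Z
```

The influx console can also display readable timestamps directly when started with `-precision rfc3339`.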

Aggregate Query

> select mean(value) from blog
name: blog
time mean
---- ----
0    666

Range Query

> select * from blog WHERE time > '2019-04-01T00:00:00Z' AND time < '2019-05-01T00:00:00Z'
name: blog
time                name      protocol value
----                ----      -------- -----
1556438552229094000 yuzhouwan https    666
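
A bounded time range holds only when the lower and the upper bound are satisfied together, so the two conditions belong in a conjunction. The sketch below (hypothetical helper names, whole-second RFC3339 strings assumed) contrasts that with joining the bounds by OR, which accepts every timestamp whenever the start precedes the end.

```python
from datetime import datetime, timezone

NS_PER_SECOND = 1_000_000_000


def rfc3339_to_ns(text):
    """Convert a whole-second RFC3339 UTC string to epoch nanoseconds."""
    moment = datetime.strptime(text, "%Y-%m-%dT%H:%M:%SZ")
    return int(moment.replace(tzinfo=timezone.utc).timestamp()) * NS_PER_SECOND


def within(ts_ns, start, end):
    """True only when the timestamp lies strictly between both bounds."""
    return rfc3339_to_ns(start) < ts_ns < rfc3339_to_ns(end)


def either(ts_ns, start, end):
    """Bounds joined with OR: true for any timestamp when start < end."""
    return ts_ns > rfc3339_to_ns(start) or ts_ns < rfc3339_to_ns(end)


START, END = "2019-04-01T00:00:00Z", "2019-05-01T00:00:00Z"
sample = 1556438552229094000                      # the point written earlier
outside = rfc3339_to_ns("2019-06-15T00:00:00Z")   # hypothetical later point

print(within(sample, START, END), within(outside, START, END))
print(either(outside, START, END))
```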

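Data Export

Method | Pros and cons
Query directly through the HTTP API | Simple and easy to use
exporter feature of the influx_tools command-line tool | Cannot export raw data points; it only operates on shards
export feature of the influx_inspect command-line tool | Exports raw data points by working directly on the underlying TSM and WAL files

query API

$ curl -G 'http://localhost:8086/query?pretty=true' --data-urlencode "db=yuzhouwan" --data-urlencode 'q=select * from blog'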
{
    "results": [
        {
            "series": [
                {
                    "columns": [
                        "time",
                        "name",
                        "protocol",
                        "value"
                    ],
                    "name": "blog",
                    "values": [
                        [
                            "2019-04-28T08:02:32.229094Z",
                            "yuzhouwan",
                            "https",
                            666
                        ]
                    ]
                }
            ],
            "statement_id": 0
        }
    ]
}
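
Each series in the response carries one `columns` array and a list of `values` rows. A minimal sketch of flattening that shape into per-row dictionaries follows; the function name `rows_from_response` is hypothetical, and the sample payload is a compact copy of the response above.

```python
import json

SAMPLE = """
{"results": [{"series": [{"columns": ["time", "name", "protocol", "value"],
 "name": "blog",
 "values": [["2019-04-28T08:02:32.229094Z", "yuzhouwan", "https", 666]]}],
 "statement_id": 0}]}
"""


def rows_from_response(payload):
    """Return (measurement, row_dict) pairs from a /query JSON response."""
    rows = []
    for result in payload.get("results", []):
        if "error" in result:
            raise RuntimeError(result["error"])
        for series in result.get("series", []):
            columns = series["columns"]
            for values in series.get("values", []):
                rows.append((series["name"], dict(zip(columns, values))))
    return rows


rows = rows_from_response(json.loads(SAMPLE))
for measurement, row in rows:
    print(measurement, row)
```

A statement that matches no data returns a result without a `series` key, which the sketch treats as zero rows.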

influx_inspect Command

$ influx_inspect export -database yuzhouwan -start 2019-01-01T00:00:00+00:00 -end 2019-12-01T00:00:00+00:00 -out yuzhouwan.out
writing out tsm file data for yuzhouwan/autogen...complete.

$ cat yuzhouwan.out
# INFLUXDB EXPORT: 2019-01-01T08:00:00+08:00 - 2019-12-01T08:00:00+08:00
# DDL
CREATE DATABASE yuzhouwan WITH NAME autogen
# DML
# CONTEXT-DATABASE:yuzhouwan
# CONTEXT-RETENTION-POLICY:autogen
# writing tsm data
blog,name=yuzhouwan,protocol=https value=666 1556438552229094000
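
The export file mixes comment lines, a DDL section, and a DML section of line-protocol points. Here is a rough sketch of reading it back. The function names are hypothetical, and the point parser assumes no escaped spaces, commas, or equals signs and always expects a trailing timestamp, which holds for the sample above but not for arbitrary data.

```python
EXPORT = """\
# INFLUXDB EXPORT: 2019-01-01T08:00:00+08:00 - 2019-12-01T08:00:00+08:00
# DDL
CREATE DATABASE yuzhouwan WITH NAME autogen
# DML
# CONTEXT-DATABASE:yuzhouwan
# CONTEXT-RETENTION-POLICY:autogen
# writing tsm data
blog,name=yuzhouwan,protocol=https value=666 1556438552229094000
"""


def parse_point(line):
    """Split a simple line-protocol line; field values stay raw strings."""
    head, field_text, timestamp = line.split(" ")
    measurement, *tag_pairs = head.split(",")
    tags = dict(pair.split("=", 1) for pair in tag_pairs)
    fields = dict(pair.split("=", 1) for pair in field_text.split(","))
    return {"measurement": measurement, "tags": tags,
            "fields": fields, "time": int(timestamp)}


def parse_export(text):
    """Return (ddl_statements, points) from an export file's contents."""
    statements, points, section = [], [], None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line in ("# DDL", "# DML"):
            section = line[2:]        # remember which section we are in
        elif line.startswith("#"):
            continue                  # other comments and context markers
        elif section == "DDL":
            statements.append(line)
        elif section == "DML":
            points.append(parse_point(line))
    return statements, points


statements, points = parse_export(EXPORT)
print(statements)
print(points)
```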

References

Doc

GitHub
