MongoShake usage
Overview:
Mongo-Shake is developed and maintained by the Alibaba Cloud NoSQL team. It is a general-purpose service platform built on MongoDB's oplog: it fetches the oplog from a source MongoDB and either replays it on a target MongoDB or forwards it to other endpoints through different tunnels. When the target is a MongoDB instance, the oplog is replayed directly and MongoShake acts as a synchronization tool, copying data from one MongoDB to another to build redundant or active-active replication.
Besides this direct mode, other tunnel types are available: rpc, file, tcp, and kafka. A user-written receiver must implement the corresponding interface to consume each of these tunnels, and users can also plug in their own tunnel types. When connected to third-party message middleware such as kafka, consumers can fetch subscribed data asynchronously and flexibly through a pub/sub model.
I. Usage workflow
1. Download
cd /data/mongo-shake
# download the latest MongoShake release from https://github.com/alibaba/MongoShake/releases
wget https://github.com/alibaba/MongoShake/releases/download/<tag>/mongo-shake-<version>.tar.gz
tar xf mongo-shake-<version>.tar.gz
2. Update the configuration file
cat collector.conf
conf.version = 4
id = mongoshake
master_quorum = false
full_sync.http_port = 29801 # full sync HTTP monitor port
incr_sync.http_port = 29800 # incremental sync HTTP monitor port
system_profile_port = 29802 # profiling port; must not clash with the two HTTP ports above
log.level = info
log.dir =
log.file = collector.log
log.flush = false
# sync mode: full + incremental
sync_mode = all
# source cluster address
mongo_urls = mongodb://admin:ssssssssssssssss@10.1.0.1:27017,10.1.0.2:27017,10.1.0.3:27017
mongo_cs_url =
mongo_s_url =
tunnel = direct
# target cluster address
tunnel.address = mongodb://admin:ppppppppppppppppp@10.2.0.1:27017,10.2.0.2:27017,10.2.0.3:27017
tunnel.message = raw
# read preference on the source; secondaryPreferred (read from the source's secondaries first) is generally recommended
mongo_connect_mode = secondaryPreferred
filter.namespace.black =
filter.pass.special.db =
filter.ddl_enable = false
checkpoint.storage.url =
checkpoint.storage.db = mongoshake
checkpoint.storage.collection = ckpt_default
checkpoint.start_position = 1970-01-01T00:00:00Z
transform.namespace =
full_sync.reader.collection_parallel = 6
full_sync.reader.write_document_parallel = 8
full_sync.reader.document_batch_size = 128
full_sync.collection_exist_drop = false
# whether to sync indexes to the target
full_sync.create_index = background
full_sync.executor.insert_on_dup_update = false
full_sync.executor.filter.orphan_document = false
full_sync.executor.majority_enable = false
incr_sync.mongo_fetch_method = oplog
incr_sync.oplog.gids =
incr_sync.shard_key = collection
incr_sync.worker = 8
incr_sync.worker.oplog_compressor = none
incr_sync.target_delay = 0
incr_sync.worker.batch_queue_size = 64
incr_sync.adaptive.batching_max_size = 1024
incr_sync.fetcher.buffer_capacity = 256
incr_sync.executor.upsert = false
incr_sync.executor.insert_on_dup_update = false
incr_sync.conflict_write_to = none
incr_sync.executor.majority_enable = false
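Before starting the task it helps to sanity-check the config; if the monitor ports collide, the collector cannot bind them. The snippet below is a minimal sketch that writes a sample excerpt of the config above (so it is self-contained; in practice point CONF at the real collector.conf) and verifies that the three ports are distinct:

```shell
# Minimal port-conflict check for collector.conf. A sample excerpt is written
# first to keep the sketch self-contained; set CONF to the real file in practice.
CONF=sample_collector.conf
cat > "$CONF" <<'EOF'
full_sync.http_port = 29801 # full sync HTTP monitor port
incr_sync.http_port = 29800 # incremental sync HTTP monitor port
system_profile_port = 29802 # profiling port
EOF

get_port() {
    # Extract "key = value", stripping trailing comments and whitespace.
    grep "^$1" "$CONF" | sed 's/#.*//' | awk -F'=' '{gsub(/ /, "", $2); print $2}'
}

FULL=$(get_port full_sync.http_port)
INCR=$(get_port incr_sync.http_port)
PROF=$(get_port system_profile_port)

if [ "$FULL" = "$INCR" ] || [ "$FULL" = "$PROF" ] || [ "$INCR" = "$PROF" ]; then
    echo "port conflict: full=$FULL incr=$INCR profile=$PROF"
else
    echo "ports ok: full=$FULL incr=$INCR profile=$PROF"
fi
```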
3. Start the sync task
# start the sync task
nohup ./collector.linux -conf=./collector.conf &
# check the full sync progress
curl 127.0.0.1:29801/progress | python -m json.tool
{
"collection_metric": {
...
},
"finished_collection_number": 116,
"processing_collection_number": 0,
"progress": "100.00%", # sync progress
"total_collection_number": 116,
"wait_collection_number": 0
}
# check the log to confirm the task has entered the incremental sync (incr) stage
tail -f logs/collector.log
# note: after an interruption, or before starting a second sync run, it is recommended to clean up temporary data and logs first
rm -f diagnostic/* logs/*
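The progress check above can also be scripted, so a shell loop can wait for the full sync to finish before moving on. The sketch below extracts the `progress` field from the JSON with grep/sed; the sample payload mirrors the output shown above, while in practice the input would come from `curl -s 127.0.0.1:29801/progress`:

```shell
# Extract the "progress" field from the /progress JSON read on stdin.
parse_progress() {
    grep -o '"progress": *"[^"]*"' | sed 's/.*"progress": *"\([^"]*\)".*/\1/'
}

# Sample payload mirroring the output shown above; in practice pipe in
# `curl -s 127.0.0.1:29801/progress` instead.
SAMPLE='{"finished_collection_number": 116, "progress": "100.00%", "total_collection_number": 116}'

PROGRESS=$(printf '%s' "$SAMPLE" | parse_progress)
echo "full sync progress: $PROGRESS"

# A polling loop against the live endpoint would then look like:
#   while [ "$(curl -s 127.0.0.1:29801/progress | parse_progress)" != "100.00%" ]; do
#       sleep 10
#   done
```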
II. Common issues and notes
1. Issues
· Required privileges for the sync account on the source database:
{
"_id" : "admin.mongoadmin",
"userId" : UUID("434e906f-2f3d-46b1-8eff-d4c5e7d4ece2"),
"user" : "mongoadmin",
"db" : "admin",
"roles" : [
{
"role" : "dbAdminAnyDatabase",
"db" : "admin"
},
{
"role" : "readWriteAnyDatabase",
"db" : "admin"
},
{
"role" : "userAdminAnyDatabase",
"db" : "admin"
},
{
"role" : "clusterMonitor",
"db" : "admin"
}
],
"mechanisms" : [
"SCRAM-SHA-1",
"SCRAM-SHA-256"
]
}
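A sync account with the roles above can be created with `createUser`. The sketch below only generates the JS command into a file for review before running it; the username and password are placeholders, and whether these four roles are the minimal set should be verified against your MongoDB version:

```shell
# Generate a createUser script matching the role list above. The user name and
# password are placeholders; review the file, then run it with e.g.:
#   mongosh "mongodb://<existing-admin>@10.1.0.1:27017/admin" create_sync_user.js
cat > create_sync_user.js <<'EOF'
// Sync account for MongoShake; roles mirror the privilege document above.
db.getSiblingDB("admin").createUser({
    user: "mongoadmin",
    pwd: "CHANGE_ME",
    roles: [
        { role: "dbAdminAnyDatabase", db: "admin" },
        { role: "readWriteAnyDatabase", db: "admin" },
        { role: "userAdminAnyDatabase", db: "admin" },
        { role: "clusterMonitor", db: "admin" }
    ]
})
EOF
echo "wrote create_sync_user.js"
```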
2. Notes
1. If the source cluster holds a large amount of data, the MongoShake sync will take a long time and put heavy write pressure on the target cluster's primary.
2. If a dedicated line is available on the internal network, it is recommended to sync data via MongoDB's native replication instead, by adding new nodes to the replica set.
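For the native approach in note 2, the new node is usually added as a priority-0, non-voting, hidden member so it completes its initial sync without serving reads or affecting elections, and is promoted afterwards. A hedged sketch (the host is a placeholder; the generated file would be run against the source primary with mongosh):

```shell
# Generate an rs.add script for the native replication approach. The host is a
# placeholder; priority 0 + votes 0 + hidden lets the node finish its initial
# sync without serving reads or affecting elections.
cat > add_member.js <<'EOF'
// Run against the source replica set primary, e.g.:
//   mongosh "mongodb://admin:<password>@10.1.0.1:27017/admin" add_member.js
rs.add({ host: "10.2.0.1:27017", priority: 0, votes: 0, hidden: true })
// After the new node reaches SECONDARY state, restore priority/votes and
// clear hidden via rs.reconfig() to make it a normal member.
EOF
echo "wrote add_member.js"
```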