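A simple Python implementation for finding malicious IPs
I recently had a very simple requirement, so I took the chance to write it up today after work.
The requirement: each log entry contains three elements (ip, url, time), and we need to find the malicious IPs that access more than max different URLs within x seconds.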
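0x01 Approach
The essence of the problem is finding the malicious IPs; the x seconds is only a constraint. So the problem reduces to building a dictionary of the form {ip: [url1, url2]}, and taking the len of each url_list then tells us which IPs are malicious.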
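The simplification above can be sketched in a few lines. This is only a minimal illustration, not the full script: the `records` sample data, the `max_count` value, and the use of a `set` for deduplication are assumptions made for this example.

```python
from collections import defaultdict

# Hypothetical sample records (ip, url); not taken from a real log.
records = [
    ("192.0.2.1", "/a"),
    ("192.0.2.1", "/b"),
    ("192.0.2.1", "/a"),   # duplicate URL, only counted once
    ("198.51.100.7", "/a"),
]
max_count = 1  # assumed threshold for this toy example

# Group the distinct URLs seen for each IP.
urls_by_ip = defaultdict(set)
for ip, url in records:
    urls_by_ip[ip].add(url)

# An IP is flagged when it touched more than max_count distinct URLs.
malicious = sorted(ip for ip, urls in urls_by_ip.items() if len(urls) > max_count)
print(malicious)
```

A set makes the uniqueness check implicit, at the cost of losing the order and the timestamp of each hit.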
0x02 show me the code
Here I use the nginx log as an example for a simple implementation. The default nginx log format looks like this:
67.218.129.173 - - [15/Jul/2019:16:41:53 +0000] "GET /atom.xml HTTP/1.1" 304 0 "-" "Tiny Tiny RSS/19.02 (http://tt-rss.org/)"
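As a rough sketch of how the three fields could be pulled out of a line like the one above, here is a regex-based variant. It is an alternative to splitting on spaces, not the method used in the script; the `LOG_PATTERN` name and the pattern itself are assumptions of this example, it targets Python 3, and `%b` expects English month names (the default C locale).

```python
import re
from datetime import datetime

# Assumed pattern for the log layout shown above (illustrative only).
LOG_PATTERN = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] "(?P<method>\S+) (?P<url>\S+)[^"]*"'
)

line = ('67.218.129.173 - - [15/Jul/2019:16:41:53 +0000] '
        '"GET /atom.xml HTTP/1.1" 304 0 "-" '
        '"Tiny Tiny RSS/19.02 (http://tt-rss.org/)"')

match = LOG_PATTERN.match(line)
if match:
    ip = match.group("ip")
    url = match.group("url")
    # %z consumes the "+0000" offset, so the datetime is timezone-aware.
    when = datetime.strptime(match.group("time"), "%d/%b/%Y:%H:%M:%S %z")
    print(ip, url, when.isoformat())
```

Keeping the UTC offset avoids having to guess which timezone the log was written in when comparing against the current time.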
Python implementation
#!/usr/bin/env python
# coding=utf-8
# date: 20190715 night, about 1.5 hours
# author: thinkycx
# description:
#   try to get malicious ip from nginx log.
#   malicious ip: the number of different urls accessed is larger than MAX within x seconds
# usage:
#   python find_malicious_ip.py <path>/access.log
import time
import json
import sys

def parse_nginx_log(file_path):
    """
    :param file_path: nginx log file path
    :return: list of [url, ip, time]
    """
    result = list()
    try:
        with open(file_path, 'r') as f:
            for line in f:  # read the file line by line
                try:
                    ttime = line.split("[")[1].split(" ")[0]  # 15/Jul/2019:02:47:40
                    ip = line.split(" ")[0]  # 47.254.91.114
                    url = line.split("] \"")[1].split(" ")[1]  # /posts/2018-08-08-CVE-2017-8890-analysis.html
                except IndexError:
                    continue  # skip lines that do not match the expected format
                result.append([url, ip, ttime])
    except IOError as err:
        print("[*] Failed to open file.\n " + str(err))
    return result

def find_malicious_ip(input_list, within_time, max_count):
    """
    :param input_list: list of [url, ip, time]
    :param within_time: size of the time window in seconds, counted back from now
    :param max_count: the max number of different urls
    :return: JSON string of { ip1: [[url1,time1], [url2,time2]], ip2: ...}
    """
    tmp_dict = dict()
    now_time = time.time()
    for url, ip, ttime in input_list:
        # note: the log's UTC offset was dropped while parsing, so mktime treats ttime as local time
        timestamp = time.mktime(time.strptime(ttime, '%d/%b/%Y:%H:%M:%S'))  # 1563130060.0
        if now_time - timestamp > within_time:  # time check
            continue
        url_ttime_list = tmp_dict.setdefault(ip, [])  # one list per unique ip
        if all(url != j[0] for j in url_ttime_list):  # append unique url into url_ttime_list
            url_ttime_list.append([url, ttime])
    result_dict = dict()  # check the number of urls
    for ip in tmp_dict:
        if len(tmp_dict[ip]) > max_count:
            result_dict[ip] = tmp_dict[ip]
    return json.dumps(result_dict)

if __name__ == '__main__':
    if len(sys.argv) < 2:
        print("python find_malicious_ip.py <path>/access.log")
        sys.exit(1)  # no log path given, so stop after printing the usage
    file_path = sys.argv[1]
    input_list = parse_nginx_log(file_path)
    within_time = 60*60*24*30  # 30 days, in seconds
    max_count = 30
    result_json = find_malicious_ip(input_list, within_time, max_count)
    print(result_json)
0x03 result
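At first I wrote a version that only collected IP and URL; the result is shown in the figure:
Later I decided the TIME dimension was worth keeping. After adding TIME, the output, formatted with an online tool, looks like this: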
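If an online formatter is not at hand, the same kind of pretty-printing can be done locally with the standard `json` module. The sketch below uses a made-up `result` value in the same shape as the script's output; the IP, URLs, and timestamps are placeholders, not real results.

```python
import json

# Hypothetical result in the { ip: [[url, time], ...] } shape; not real data.
result = {
    "192.0.2.1": [
        ["/index.html", "15/Jul/2019:16:41:53"],
        ["/atom.xml", "15/Jul/2019:16:42:10"],
    ]
}

# indent=2 spreads the structure over multiple lines for easier reading.
pretty = json.dumps(result, indent=2, sort_keys=True)
print(pretty)
```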
0x04 Summary
The ability to simplify a problem, the ability to think calmly... both matter a great deal.