
Python geoip2 and MaxMind tutorial: example code to geolocate the IPs of Google's secret datacenters

Have you ever wondered how to geolocate an IP address? It's very easy with Python and the MaxMind database. Below is an example function that I wrote in Python to geolocate all the IP addresses of a BGP Autonomous System.

The function relies on the Python geoip2 library. The code is heavily commented, so it should be pretty easy to understand. I ran it with all of Google's autonomous system numbers as of 2014: it fetches the corresponding IP prefixes, and finally it returns the city and country where each prefix is geolocated.

It is a simple way to identify where Google's datacenters are located. It's fun, isn't it? :)
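
The middle step, turning an API response into a list of prefixes, can be tried offline before running the full program. This is only a minimal sketch: the `data` -> `prefixes` -> `prefix` nesting is the one the listing below reads, but `SAMPLE_RESPONSE` is a hand-written stand-in (using documentation-only address ranges) and `extract_prefixes` is a hypothetical helper name, not part of the original code.

```python
import json

# Hand-written sample in the shape the listing expects; not real RIPEstat output.
SAMPLE_RESPONSE = """
{
  "data": {
    "prefixes": [
      {"prefix": "192.0.2.0/24"},
      {"prefix": "2001:db8::/32"}
    ]
  }
}
"""


def extract_prefixes(raw_json):
    """Return the list of prefix strings found under data -> prefixes."""
    payload = json.loads(raw_json)
    return [record["prefix"] for record in payload["data"]["prefixes"]]


print(extract_prefixes(SAMPLE_RESPONSE))
```

Keeping the parsing separate from the network call like this makes it easy to check the logic without hitting the API.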



'''
Created on 13 March 2014

This program relies on GeoLite2 data created by MaxMind, available from
http://www.maxmind.com

'''
import unittest


class Test(unittest.TestCase):
    def setUp(self):
        pass

    def tearDown(self):
        pass

    def testQuery(self):
        get_location('1.0.0.0/24')
        get_location('10.0.0.1')
        get_location('192.168.0.1')
        get_location('2607:f8b0::/32')
        get_location('fe80:f8b0::/32')

import json
import pprint
import re
from urllib.request import urlopen

import geoip2.database
import ipaddr

google_as = [15169, 16591, 19448, 22577, 22859, 24424, 36039, 36384, 36385, 
             36492, 36987, 41264, 45566, 36040, 36561, 43515]


def ripe_get_prefixes_per_asn(asn):
    """Use RIPE's API
    (https://stat.ripe.net/data/announced-prefixes/data.json)
    to list the prefixes announced by a given Autonomous System Number (ASN).

    This API is documented on https://stat.ripe.net/docs/data_api
    """
    url_base = 'https://stat.ripe.net/data/announced-prefixes/data.json?resource=AS'
    url = url_base + str(asn) + '&starttime=' + '2011-12-12T12:00'
    # urlopen returns bytes, not a normal string: we need to decode them;
    # the "with" block closes the connection even if reading fails
    with urlopen(url) as rep:
        data = rep.read().decode(encoding='UTF-8')
    js_data = json.loads(data)
    # each record describes one announced prefix
    return [record['prefix'] for record in js_data['data']['prefixes']]


def maxmind_db_get(address):
    """Use MaxMind's database to locate an IP address. Returns a location string."""
    # check that the input is really an address (raises ValueError otherwise)
    address_obj = ipaddr.IPAddress(address)
    # raw string, so the backslashes of the Windows path are kept literally
    db_path = r'H:\Mes developpements\Map of prefixes\GeoLite2-City.mmdb'
    # the "with" block closes the database file once the lookup is done
    with geoip2.database.Reader(db_path) as reader:
        response = reader.city(str(address_obj))
    # here is the list of interesting elements in the response
    # cf. https://pypi.python.org/pypi/geoip2
    #   response.country.iso_code
    #   response.country.name
    #   response.subdivisions.most_specific.name
    #   response.city.name
    #   response.postal.code
    #   response.location.latitude
    #   response.location.longitude
    return str(response.city.name) + " (" + str(response.country.iso_code) + ')'

def get_location(prefix):
    """Take an IPv4 or IPv6 prefix and return its location."""
    # replace the trailing zeros and the "/length" suffix with a host number;
    # this assumes the network address ends in 0 (IPv4) or "::" (IPv6).
    # {1,3} also covers IPv6 prefix lengths of 100 and above
    addr = re.sub(r'0*/[0-9]{1,3}$', "1", prefix)
    data = maxmind_db_get(addr)
    if ('unknown' in str(data)) or ('None' in str(data)):
        # if the address is unknown, try another address in the prefix
        addr = re.sub(r'0*/[0-9]{1,3}$', "120", prefix)
        data = maxmind_db_get(addr)
        if ('unknown' in str(data)) or ('None' in str(data)):
            # if the address is still unknown, try an online IP geolocation
            # database; it takes longer, so it's the last resort
            with urlopen("http://api.hostip.info/get_html.php?ip=" + addr) as f:
                data = f.read().decode(encoding='UTF-8')
            # if the DB cannot geolocate the IP, return an empty string
            if ('Private' in str(data)) or ('Unknown' in str(data)):
                data = ""
    return str(data)


if __name__ == "__main__":
    #import sys;sys.argv = ['', 'Test.testName']
    #unittest.main()
    # we get all prefixes from Google ASes    
    res={}
    for asn in google_as:
        res["AS"+str(asn)]= set()
        for prefix in ripe_get_prefixes_per_asn(asn):
            try:
                city = get_location(str(prefix))
                if city != "":
                    res["AS"+str(asn)].add(city)
            except Exception:
                # skip prefixes that cannot be geolocated (lookup or network errors)
                pass
    # we print the results
    pp = pprint.PrettyPrinter(indent=4)
    pp.pprint(res) 
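
The regular expression in get_location assumes that the network address ends in zeros, so a prefix such as 10.0.0.128/25 would not give a valid host address. A possible alternative, sketched here under my own assumptions, is to let the standard library's `ipaddress` module (not the `ipaddr` package used in the listing) pick the addresses. The name `candidate_hosts` and its `offsets` parameter are hypothetical; the values 1 and 120 simply mirror the two attempts made in the listing.

```python
import ipaddress


def candidate_hosts(prefix, offsets=(1, 120)):
    """Return host addresses inside `prefix`, as strings, to try in turn.

    `offsets` are positions counted from the network address; offsets that
    fall outside the prefix are skipped.
    """
    # strict=False tolerates prefixes with host bits set, such as '10.0.0.5/24'
    network = ipaddress.ip_network(prefix, strict=False)
    hosts = []
    for offset in offsets:
        if offset < network.num_addresses:
            hosts.append(str(network[offset]))
    if not hosts:
        # single-address input (e.g. '10.0.0.1'): fall back to the address itself
        hosts.append(str(network.network_address))
    return hosts


for example in ('1.0.0.0/24', '10.0.0.128/25', '2607:f8b0::/32'):
    print(example, candidate_hosts(example))
```

Each returned string could then be passed to maxmind_db_get in turn until one of them yields a city. Note that for IPv6 the offset 120 is rendered in hexadecimal (`::78`), which differs from the `::120` the regular expression would produce.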


If you liked this post about Python, have a look at the Python books by O'Reilly. I learned all I know about Python from them!



About Gilles
