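Fetching Boss Zhipin Job Listings with Python

Note: this article is for personal study and research only and involves no commercial or profit-making activity. Please comply with applicable laws and regulations while learning from and using it, and take care not to infringe anyone's rights.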

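Background:

I have been job hunting recently, so I wanted to see whether the approach I used earlier for Baidu hot search could be reused to fetch job listings from Boss Zhipin.

It turned out it could not. Hot search is real-time news and places no limits on access or requests, whereas Boss Zhipin has anti-crawling measures, so the hot-search approach cannot fetch the job data; more precisely, it cannot fetch every page. I searched Baidu and tried an example from CSDN, but that did not get past the anti-crawling measures either. The problem is that a paginated search URL requires a cookie in the request headers, and that cookie only works a few times and is subject to a minimum interval between requests; requesting too frequently also triggers the anti-crawling measures. After more searching I found the approach I use now. (At the moment my IP is temporarily banned because of "abnormal IP" activity.)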

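Simulating browser behavior with Selenium:

In other words, the Selenium library calls a browser driver, the driver locates the installed browser and launches it, the browser loads and renders the given URL, and the resulting page is returned. The page content can then be worked with.

This approach worked for me, but it has a few problems:

Selenium's webdriver module offers drivers for several browsers: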

    driver = webdriver.Chrome()
    driver = webdriver.Firefox()
    driver = webdriver.Safari()
    driver = webdriver.Edge()
    driver = webdriver.Ie()

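Note, however, that these drivers do not ship with the computer: you have to download them yourself, add them to the environment variables, and match them to the browser version. This is where I ran into a problem. First, here are the download URLs for each browser's driver; download the one that matches your browser and its version.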

Browser                  Download URL
Chrome (Taobao mirror)   https://registry.npmmirror.com/binary.html?path=chromedriver/
Edge                     https://developer.microsoft.com/en-us/microsoft-edge/tools/webdriver/
Firefox                  https://github.com/mozilla/geckodriver/releases
Safari                   https://webkit.org/blog/6900/webdriver-support-in-safari-10/

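I use Chrome. To check the browser version installed on your machine, open the browser and go to Help > About Chrome.

Because my browser is always kept up to date, I could not find a matching version on the download page: the newest driver listed there was 114, and a version mismatch causes an error.

After some digging I found that ChromeDriver releases after version 114 are no longer published at that location and have moved to a different URL: https://googlechromelabs.github.io/chrome-for-testing/known-good-versions-with-downloads.json

This is a JSON file; if it is hard to read, install a JSON formatter extension. I could not find exactly the same version as my browser, but I found one with the same major version. It produces an error message, but that does not affect usage, so this is the driver I used.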
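Instead of reading the JSON by eye, the lookup can be scripted. The following is a minimal sketch, not an official tool. It assumes the file has a top-level "versions" list whose entries carry a "version" string and a "downloads" mapping from binary name to a list of platform/url pairs. The function name find_chromedriver_url and the sample data (including its versions and URLs) are made up for illustration.

```python
def find_chromedriver_url(data, major, platform):
    """Return (version, url) of the newest listed ChromeDriver for a major version.

    `data` is the parsed JSON document; returns None when nothing matches.
    """
    best = None
    for entry in data.get("versions", []):
        version = entry["version"]
        if version.split(".")[0] != str(major):
            continue
        # Some entries may list only the browser and no "chromedriver" downloads
        for download in entry.get("downloads", {}).get("chromedriver", []):
            if download["platform"] == platform:
                # Compare versions numerically so that 120.0.10.0 beats 120.0.2.0
                key = tuple(int(part) for part in version.split("."))
                if best is None or key > best[0]:
                    best = (key, version, download["url"])
    return None if best is None else (best[1], best[2])


# Tiny hand-written sample in the assumed shape; versions and URLs are made up
sample = {
    "versions": [
        {"version": "120.0.1.0", "downloads": {"chrome": [
            {"platform": "win64", "url": "https://example.invalid/chrome-120.0.1.0.zip"}]}},
        {"version": "120.0.2.0", "downloads": {"chromedriver": [
            {"platform": "win64", "url": "https://example.invalid/driver-120.0.2.0.zip"}]}},
        {"version": "120.0.10.0", "downloads": {"chromedriver": [
            {"platform": "win64", "url": "https://example.invalid/driver-120.0.10.0.zip"},
            {"platform": "linux64", "url": "https://example.invalid/driver-120.0.10.0-linux.zip"}]}},
        {"version": "121.0.1.0", "downloads": {"chromedriver": [
            {"platform": "win64", "url": "https://example.invalid/driver-121.0.1.0.zip"}]}},
    ]
}

print(find_chromedriver_url(sample, 120, "win64"))
```

To run it against the real file, load the URL above with json.load and pass the result in place of sample; the platform string ("win64" here) should be whichever value the file uses for your system.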

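After downloading, add the driver's location to the environment variables so the program can find it at run time. Put the extracted chromedriver.exe somewhere suitable (any location works).

Copy the path of that folder, then (on Windows 10) open My Computer, right-click This PC, and choose Properties > Advanced system settings.

Click Environment Variables.

Under User variables at the top, find Path and double-click it.

Add a new entry, paste the path you copied, and confirm.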
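A quick way to confirm the setting took effect is to ask Python where it finds the driver. This is only a small check using the standard library; locate_driver is a name chosen for this sketch. Note that a terminal or IDE opened before the change will not see the new value until it is restarted.

```python
import shutil


def locate_driver(name="chromedriver"):
    """Return the full path of the driver executable found on PATH, or None."""
    return shutil.which(name)


path = locate_driver()
if path is None:
    print("chromedriver is not on PATH; re-check the environment variable")
else:
    print(f"chromedriver found at {path}")
```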

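With the environment variable configured, the next question is how to proceed. First, simulate browser actions to fetch each page; then parse each page to extract the information we need; finally, gather the results and store them in the database. With the plan clear, what remains is the implementation.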

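First, the libraries needed. Two local files are also imported: a city information file I previously used for pushing WeChat official account messages, and a configuration file whose contents are described later.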

import pymysql
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import config
import cityinfo

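The code has three parts: 1. simulate browser actions to collect the page data; 2. parse the page data to extract job information; 3. save the data.

# Fetch the Boss Zhipin pages
bossInfo = get_boss_info(max_pages=config.page_num)
# List that collects the parsed job records
bossInfoList = []
# Parse the job records out of the pages
get_corporate_info(bossInfo, bossInfoList)
# Save the data
save_data(bossInfoList)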

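Start with the first part. It calls build_job_search_url with the province and city stored in the configuration file to build the URL (that function also reads the search term from the configuration file, as described below; the page count passed in is page_num from the configuration file). The driver's get method then opens the URL. A wait step is added here: without it the page contains only a "please wait" placeholder and none of the content we need. Once loading finishes, the page contains an element with the class search-job-result, so its presence signals that the page is ready. There is also exception handling; for now it just prints the error and skips that iteration. Each page's source is appended to a list, which is returned once the loop finishes. Before returning, the browser window is closed to release resources.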

def get_boss_info(max_pages=10):
    driver = webdriver.Chrome()
    base_url = build_job_search_url(config.province, config.city)
    all_html = []

    try:
        for page in range(1, max_pages + 1):
            url = base_url + str(page)
            driver.get(url)

            # Wait for the page to finish loading: once the results are rendered,
            # an element with the class search-job-result is present
            try:
                WebDriverWait(driver, 10).until(
                    EC.presence_of_element_located((By.CLASS_NAME, "search-job-result"))
                )
            except Exception as e:
                print(f"Error waiting for page {page} to load: {e}")
                continue  # alternatively, break out of the loop

            # Grab the page source
            html = driver.page_source
            all_html.append(html)
    finally:
        # Close the browser, even if a page load raises
        driver.quit()
    return all_html
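For readers wondering what the WebDriverWait(...).until(...) call does, it is essentially a polling loop: it keeps re-checking a condition until the condition holds or the time limit passes. The sketch below is a simplified stand-in written with the standard library only, not Selenium's own code. wait_until and page_ready are hypothetical names, and the real call raises Selenium's TimeoutException rather than the built-in TimeoutError used here.

```python
import time


def wait_until(condition, timeout=10.0, poll=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once `timeout` seconds have passed.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll)


# Stand-in for a page that only becomes ready on the third check
checks = {"count": 0}


def page_ready():
    checks["count"] += 1
    return "search-job-result" if checks["count"] >= 3 else None


print(wait_until(page_ready, timeout=2.0, poll=0.01))
```

The 10 passed to WebDriverWait in the function above plays the role of timeout here.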

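The function above uses build_job_search_url to build the URL. This code is also fairly simple: it looks up the city id in the city information file from the given province and city and puts it in the URL, adds the search term from search_position in the configuration file, and returns the assembled URL.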

def build_job_search_url(province, city):
    # Look up the city id
    city_id = cityinfo.cityInfo[province][city]["AREAID"]
    url = "https://www.zhipin.com/web/geek/job?query={}&city={}&page=".format(config.search_position, city_id)
    return url
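The function above pastes the search term into the URL as-is, which relies on the browser to encode spaces and non-ASCII characters. A possible variant, sketched here under my own assumptions, encodes the parameters explicitly with urllib.parse. build_page_url is a hypothetical name, and unlike the original it takes the page number directly instead of leaving it to be appended.

```python
from urllib.parse import quote, urlencode

BASE = "https://www.zhipin.com/web/geek/job"


def build_page_url(query, city_id, page):
    """Build the search URL for one page, percent-encoding every parameter."""
    params = {"query": query, "city": city_id, "page": page}
    # quote_via=quote encodes a space as %20 rather than "+"
    return f"{BASE}?{urlencode(params, quote_via=quote)}"


print(build_page_url("data engineer", "101010100", 2))
# https://www.zhipin.com/web/geek/job?query=data%20engineer&city=101010100&page=2
```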

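The second step creates a list to hold the data parsed from the pages and calls get_corporate_info to do the parsing, passing in that list and the list of page sources. Because the input is a collection of paginated result pages, the data cannot be read from it directly, so get_boss_message is called to extract the records from each page.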

def get_corporate_info(bossInfo, bossInfoList):
    for html in bossInfo:
        get_boss_message(html, bossInfoList)

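get_boss_message first parses the HTML and then extracts the fields of each job from it. Besides the data on the listing card, I also wrote code to fetch the inner detail page, but driving the browser to load a detail page inside the loop is very slow and also risks an IP ban. It is better to add that part some other way: store the listing data first, then use the detail-page URL to fetch the details separately and fill them into the database. Finally, each record is appended to bossInfoList for the database step.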

def get_boss_message(html, bossInfoList):
    # Parse the HTML; find_all below returns a list of job cards
    soup = BeautifulSoup(html, 'html.parser')
    # Find the job list
    job_list = soup.find_all('li', class_='job-card-wrapper')
    if job_list:
        for job in job_list:
            # Job title
            job_name = job.find('span', class_='job-name').text.strip()
            # Company name
            company_name = job.find('h3', class_='company-name').a.text.strip()
            # Salary
            salary = job.find('span', class_='salary').text.strip()
            # Education requirements
            edu_tags_list = job.find('div', class_='job-info clearfix').find_all('li')
            edu_tags_str = ', '.join([tag.text for tag in edu_tags_list])
            # Work location
            address = job.find('span', class_='job-area').text.strip()
            # Benefits
            job_welfare = job.find('div', class_='info-desc').text.strip()
            # Skill tags
            skill_tags_list = job.find('div', class_='job-card-footer clearfix').find_all('li')
            skill_tags_str = ', '.join([tag.text for tag in skill_tags_list])
            # Company tags
            company_tags_list = job.find('ul', class_='company-tag-list').find_all('li')
            company_tags_str = ', '.join([tag.text for tag in company_tags_list])
            # Company logo
            company_logo = job.find('div', class_='company-logo').img.get('src')
            # Relative link to the job detail page
            job_detail = job.find('div', class_='job-card-body clearfix').a.get('href')
            # Build the absolute detail-page URL
            job_detail_url = f"https://www.zhipin.com{job_detail}"
            # ================ Fetch job detail info ================
            # # Could be done asynchronously
            # job_detail_html = get_job_detail_info(job_detail_url)
            # # Parse the job detail page
            # job_detail_soup = BeautifulSoup(job_detail_html, 'html.parser')
            # # Job description
            # job_info = job_detail_soup.find('div', class_='job-sec-text').text
            # # How recently the recruiter was active
            # active_info = job_detail_soup.find('span', class_='boss-active-time').text
            # # Hiring status of the job
            # job_status = job_detail_soup.find('div', class_='job-status').text

            # Print the data
            print(f"Job title: {job_name}")
            print(f"Company name: {company_name}")
            print(f"Salary: {salary}")
            print(f"Education: {edu_tags_str}")
            print(f"Address: {address}")
            print(f"Skill tags: {skill_tags_str}")
            print(f"Benefits: {job_welfare}")
            print(f"Job detail link: {job_detail_url}")
            print(f"Company tags: {company_tags_str}")
            print(f"Company logo: {company_logo}")
            # print(f"Job description: {job_info}")
            # print(f"Recruiter activity: {active_info}")
            # print(f"Hiring status: {job_status}")
            print("-" * 20)
            bossInfoList.append({
                "job_name": job_name,
                "company_name": company_name,
                "salary": salary,
                "edu_tags_str": edu_tags_str,
                "address": address,
                "skill_tags_str": skill_tags_str,
                "job_welfare": job_welfare,
                "job_detail_url": job_detail_url,
                "company_tags_str": company_tags_str,
                "company_logo": company_logo})
            # "job_info": job_info,
            # "active_info": active_info,
            # "job_status": job_status})

    else:
        print("Cookie expired, please log in again")
        return 0
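One fragile spot in the function above: BeautifulSoup's find returns None when no element matches, so a card that lacks one of these elements makes the chained .text access raise AttributeError and aborts the whole page. A small guard, sketched below, is one way around that. text_of is a hypothetical helper, and SimpleNamespace merely stands in for a parsed tag so the example runs without any HTML.

```python
from types import SimpleNamespace


def text_of(node, default=""):
    """Return node.text stripped of whitespace, or `default` when node is None."""
    if node is None:
        return default
    return node.text.strip()


# SimpleNamespace stands in for a BeautifulSoup tag, which also exposes .text
found = SimpleNamespace(text="  15-25K  ")
print(repr(text_of(found)))  # '15-25K'
print(repr(text_of(None)))   # ''
```

In the parsing loop this would be used as, for example, salary = text_of(job.find('span', class_='salary')).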

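The last step saves the parsed data. This is much the same as in the Baidu hot search project. First a database connection is opened; the connection settings all live in the config file. Then, for each record, the database is queried for an existing match. A record counts as the same job when the job title, company name, salary, education, address, and skills are all identical. I deduplicate this way because the detail link and other fields may be edited while the job is open. Some of these conditions could be dropped; these are simply the ones I treat as identifying a job. If a match exists, the record is updated; otherwise a new one is inserted, which amounts to an incremental load. If an insert or update fails, it is rolled back. Finally the database connection is closed to release resources.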

def save_data(bossInfoList):
    conn = pymysql.connect(host=config.datasource_ip, port=config.datasource_port, user=config.datasource_user,
                           passwd=config.datasource_password, db=config.database)
    cursor = conn.cursor()
    # Query SQL: check whether a record with the same job title, company name, salary, education, address, and skills exists
    query_sql = "SELECT COUNT(*) FROM boss_info WHERE job_name = %s AND company_name = %s AND salary = %s AND edu_tags = %s AND address = %s AND skill_tags = %s"
    # Insert SQL
    insert_sql = "INSERT INTO boss_info(job_name, company_name, salary, edu_tags, address, skill_tags, job_welfare, job_detail_url, company_tags, company_logo) VALUES (%s,%s,%s,%s,%s,%s,%s,%s,%s,%s)"
    for item in bossInfoList:
        # Check whether the record already exists
        cursor.execute(query_sql, (item['job_name'], item['company_name'], item['salary'], item['edu_tags_str'], item['address'], item['skill_tags_str']))
        exists = cursor.fetchone()[0]
        if exists == 0:
            # Not present: insert it
            try:
                cursor.execute(insert_sql, (
                    item['job_name'], item['company_name'], item['salary'], item['edu_tags_str'], item['address'],
                    item['skill_tags_str'], item['job_welfare'], item['job_detail_url'], item['company_tags_str'],
                    item['company_logo']))
                conn.commit()
                print("Insert succeeded")
            except Exception as e:
                print(e)
                # Roll back if the insert fails
                conn.rollback()
        else:
            # Already present: update it
            print("Record already exists")
            # Update SQL
            update_sql = "UPDATE boss_info SET job_name=%s, company_name=%s, salary=%s, edu_tags=%s, address=%s, skill_tags=%s, job_welfare=%s, company_tags=%s, company_logo=%s WHERE job_name = %s AND company_name = %s AND salary = %s AND edu_tags = %s AND address = %s AND skill_tags = %s"
            try:
                cursor.execute(update_sql, (item['job_name'], item['company_name'], item['salary'], item['edu_tags_str'],
                                            item['address'], item['skill_tags_str'], item['job_welfare'], item['company_tags_str'],
                                            item['company_logo'], item['job_name'], item['company_name'], item['salary'],
                                            item['edu_tags_str'], item['address'], item['skill_tags_str']))
                conn.commit()
                print("Update succeeded")
            except Exception as e:
                print(e)
                # Roll back if the update fails
                conn.rollback()
    cursor.close()
    conn.close()
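To try the check-then-insert-or-update logic without a MySQL server, here is a reduced sketch that swaps in Python's built-in sqlite3 module, so it uses ? placeholders where pymysql uses %s. The table definition is my own guess at a minimal schema (the real boss_info table is not shown in this article), it keeps only eight of the columns, and save_job is a hypothetical name. Unlike the function above, the update here also refreshes job_detail_url.

```python
import sqlite3

# Columns that together identify one job, as in the query above
KEY_FIELDS = ("job_name", "company_name", "salary", "edu_tags", "address", "skill_tags")


def save_job(conn, job):
    """Insert the job, or update it when a row with the same key fields exists."""
    # Column names come from this script, never from user input
    where = " AND ".join(f"{field} = ?" for field in KEY_FIELDS)
    key = tuple(job[field] for field in KEY_FIELDS)
    count = conn.execute(f"SELECT COUNT(*) FROM boss_info WHERE {where}", key).fetchone()[0]
    if count == 0:
        columns = ", ".join(job)
        marks = ", ".join("?" for _ in job)
        conn.execute(f"INSERT INTO boss_info ({columns}) VALUES ({marks})", tuple(job.values()))
        action = "inserted"
    else:
        conn.execute(
            f"UPDATE boss_info SET job_welfare = ?, job_detail_url = ? WHERE {where}",
            (job["job_welfare"], job["job_detail_url"]) + key,
        )
        action = "updated"
    conn.commit()
    return action


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE boss_info (job_name TEXT, company_name TEXT, salary TEXT, edu_tags TEXT,"
    " address TEXT, skill_tags TEXT, job_welfare TEXT, job_detail_url TEXT)"
)
# Made-up record for the demonstration
job = {
    "job_name": "Python Developer", "company_name": "Example Co", "salary": "15-25K",
    "edu_tags": "3-5 years, Bachelor", "address": "Beijing", "skill_tags": "Python, MySQL",
    "job_welfare": "Flexible hours", "job_detail_url": "https://example.invalid/job/1",
}
print(save_job(conn, job))                                     # inserted
print(save_job(conn, {**job, "job_welfare": "Remote days"}))   # updated
```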

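Then there is the configuration file, which the walkthrough above has mostly covered already. Fill in your search term, database settings, number of pages to query, and city information; the functions themselves need no changes. Note that the province and city must match entries in the city information file, otherwise the lookup fails (the city is not found).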

# Job to search for
search_position = "xxxxxx"
# Database IP address
datasource_ip = 'xxxxxx'
# Database port (an integer)
datasource_port = xxxxxx
# Database user name
datasource_user = 'xxxxxx'
# Database password
datasource_password = 'xxxxxx'
# Database name
database = 'xxxxxx'
# Crawler settings
# Number of pages to crawl
page_num = 10
# Location settings
# Province
province = "xx"
# City
city = "xx"
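Since a province or city that does not match the city information file only surfaces as a bare KeyError, it can help to check the pair up front and fail with a readable message. A minimal sketch follows: resolve_city_id is a hypothetical helper, and sample_city_info copies two entries from cityinfo.py so the example runs on its own.

```python
def resolve_city_id(city_info, province, city):
    """Return the AREAID for province/city, or raise ValueError with a clear message."""
    try:
        return city_info[province][city]["AREAID"]
    except KeyError:
        raise ValueError(f"no city id for province={province!r}, city={city!r}") from None


# Two entries copied from cityinfo.py, enough for a demonstration
sample_city_info = {
    "北京": {"北京": {"AREAID": "101010100"}},
    "黑龙江": {"哈尔滨": {"AREAID": "101050101"}},
}

print(resolve_city_id(sample_city_info, "黑龙江", "哈尔滨"))  # 101050101
```

With the real files this would be called as resolve_city_id(cityinfo.cityInfo, config.province, config.city) before the browser is started.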

Finally, the complete code:

BossInfo.py
import pymysql
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import config
import cityinfo

def build_job_search_url(province, city):
    # Look up the city id
    city_id = cityinfo.cityInfo[province][city]["AREAID"]
    url = "https://www.zhipin.com/web/geek/job?query={}&city={}&page=".format(config.search_position, city_id)
    return url


def get_boss_info(max_pages=10):
    driver = webdriver.Chrome()
    base_url = build_job_search_url(config.province, config.city)
    all_html = []

    try:
        for page in range(1, max_pages + 1):
            url = base_url + str(page)
            driver.get(url)

            # Wait for the page to finish loading: once the results are rendered,
            # an element with the class search-job-result is present
            try:
                WebDriverWait(driver, 10).until(
                    EC.presence_of_element_located((By.CLASS_NAME, "search-job-result"))
                )
            except Exception as e:
                print(f"Error waiting for page {page} to load: {e}")
                continue  # alternatively, break out of the loop

            # Grab the page source
            html = driver.page_source
            all_html.append(html)
    finally:
        # Close the browser, even if a page load raises
        driver.quit()
    return all_html


def get_job_detail_info(url):
    driver = webdriver.Chrome()
    try:
        # Open the job detail page
        driver.get(url)
        try:
            WebDriverWait(driver, 10).until(
                EC.presence_of_element_located((By.CLASS_NAME, "job-detail-section"))
            )
        except Exception as e:
            print(f"Error waiting to load: {e}")
            return 0
        return driver.page_source
    finally:
        # Always close the browser, including on the error path
        driver.quit()


def get_boss_message(html, bossInfoList):
    # Parse the HTML; find_all below returns a list of job cards
    soup = BeautifulSoup(html, 'html.parser')
    # Find the job list
    job_list = soup.find_all('li', class_='job-card-wrapper')
    if job_list:
        for job in job_list:
            # Job title
            job_name = job.find('span', class_='job-name').text.strip()
            # Company name
            company_name = job.find('h3', class_='company-name').a.text.strip()
            # Salary
            salary = job.find('span', class_='salary').text.strip()
            # Education requirements
            edu_tags_list = job.find('div', class_='job-info clearfix').find_all('li')
            edu_tags_str = ', '.join([tag.text for tag in edu_tags_list])
            # Work location
            address = job.find('span', class_='job-area').text.strip()
            # Benefits
            job_welfare = job.find('div', class_='info-desc').text.strip()
            # Skill tags
            skill_tags_list = job.find('div', class_='job-card-footer clearfix').find_all('li')
            skill_tags_str = ', '.join([tag.text for tag in skill_tags_list])
            # Company tags
            company_tags_list = job.find('ul', class_='company-tag-list').find_all('li')
            company_tags_str = ', '.join([tag.text for tag in company_tags_list])
            # Company logo
            company_logo = job.find('div', class_='company-logo').img.get('src')
            # Relative link to the job detail page
            job_detail = job.find('div', class_='job-card-body clearfix').a.get('href')
            # Build the absolute detail-page URL
            job_detail_url = f"https://www.zhipin.com{job_detail}"
            # ================ Fetch job detail info ================
            # # Could be done asynchronously
            # job_detail_html = get_job_detail_info(job_detail_url)
            # # Parse the job detail page
            # job_detail_soup = BeautifulSoup(job_detail_html, 'html.parser')
            # # Job description
            # job_info = job_detail_soup.find('div', class_='job-sec-text').text
            # # How recently the recruiter was active
            # active_info = job_detail_soup.find('span', class_='boss-active-time').text
            # # Hiring status of the job
            # job_status = job_detail_soup.find('div', class_='job-status').text

            # Print the data
            print(f"Job title: {job_name}")
            print(f"Company name: {company_name}")
            print(f"Salary: {salary}")
            print(f"Education: {edu_tags_str}")
            print(f"Address: {address}")
            print(f"Skill tags: {skill_tags_str}")
            print(f"Benefits: {job_welfare}")
            print(f"Job detail link: {job_detail_url}")
            print(f"Company tags: {company_tags_str}")
            print(f"Company logo: {company_logo}")
            # print(f"Job description: {job_info}")
            # print(f"Recruiter activity: {active_info}")
            # print(f"Hiring status: {job_status}")
            print("-" * 20)
            bossInfoList.append({
                "job_name": job_name,
                "company_name": company_name,
                "salary": salary,
                "edu_tags_str": edu_tags_str,
                "address": address,
                "skill_tags_str": skill_tags_str,
                "job_welfare": job_welfare,
                "job_detail_url": job_detail_url,
                "company_tags_str": company_tags_str,
                "company_logo": company_logo})
            # "job_info": job_info,
            # "active_info": active_info,
            # "job_status": job_status})

    else:
        print("Cookie expired, please log in again")
        return 0


def get_corporate_info(bossInfo, bossInfoList):
    for html in bossInfo:
        get_boss_message(html, bossInfoList)


def save_data(bossInfoList):
    conn = pymysql.connect(host=config.datasource_ip, port=config.datasource_port, user=config.datasource_user,
                           passwd=config.datasource_password, db=config.database)
    cursor = conn.cursor()
    # Query SQL: check whether a record with the same job title, company name, salary, education, address, and skills exists
    query_sql = "SELECT COUNT(*) FROM boss_info WHERE job_name = %s AND company_name = %s AND salary = %s AND edu_tags = %s AND address = %s AND skill_tags = %s"
    # Insert SQL
    insert_sql = "INSERT INTO boss_info(job_name, company_name, salary, edu_tags, address, skill_tags, job_welfare, job_detail_url, company_tags, company_logo) VALUES (%s,%s,%s,%s,%s,%s,%s,%s,%s,%s)"
    for item in bossInfoList:
        # Check whether the record already exists
        cursor.execute(query_sql, (item['job_name'], item['company_name'], item['salary'], item['edu_tags_str'], item['address'], item['skill_tags_str']))
        exists = cursor.fetchone()[0]
        if exists == 0:
            # Not present: insert it
            try:
                cursor.execute(insert_sql, (
                    item['job_name'], item['company_name'], item['salary'], item['edu_tags_str'], item['address'],
                    item['skill_tags_str'], item['job_welfare'], item['job_detail_url'], item['company_tags_str'],
                    item['company_logo']))
                conn.commit()
                print("Insert succeeded")
            except Exception as e:
                print(e)
                # Roll back if the insert fails
                conn.rollback()
        else:
            # Already present: update it
            print("Record already exists")
            # Update SQL
            update_sql = "UPDATE boss_info SET job_name=%s, company_name=%s, salary=%s, edu_tags=%s, address=%s, skill_tags=%s, job_welfare=%s, company_tags=%s, company_logo=%s WHERE job_name = %s AND company_name = %s AND salary = %s AND edu_tags = %s AND address = %s AND skill_tags = %s"
            try:
                cursor.execute(update_sql, (item['job_name'], item['company_name'], item['salary'], item['edu_tags_str'],
                                            item['address'], item['skill_tags_str'], item['job_welfare'], item['company_tags_str'],
                                            item['company_logo'], item['job_name'], item['company_name'], item['salary'],
                                            item['edu_tags_str'], item['address'], item['skill_tags_str']))
                conn.commit()
                print("Update succeeded")
            except Exception as e:
                print(e)
                # Roll back if the update fails
                conn.rollback()
    cursor.close()
    conn.close()


if __name__ == '__main__':
    # Fetch the Boss Zhipin pages
    bossInfo = get_boss_info(max_pages=config.page_num)
    # List that collects the parsed job records
    bossInfoList = []
    # Parse the job records out of the pages
    get_corporate_info(bossInfo, bossInfoList)
    # print(corporateInfo)
    # Save the data
    save_data(bossInfoList)

config.py
# Job to search for
search_position = "xxxxxx"
# Database IP address
datasource_ip = 'xxxxxx'
# Database port (an integer)
datasource_port = xxxxxx
# Database user name
datasource_user = 'xxxxxx'
# Database password
datasource_password = 'xxxxxx'
# Database name
database = 'xxxxxx'
# Crawler settings
# Number of pages to crawl
page_num = 10
# Location settings
# Province
province = "xx"
# City
city = "xx"

cityinfo.py
#coding=UTF-8
cityInfo = {
  "北京": {
    "北京": {
      "AREAID": "101010100"
    }
  },
  "上海": {
    "上海": {
      "AREAID": "101020100"
    }
  },
  "天津": {
    "天津": {
      "AREAID": "101030100"
    }
  },
  "重庆": {
    "重庆": {
      "AREAID": "101040100"
    }
  },
  "黑龙江": {
    "哈尔滨": {
      "AREAID": "101050101"
    },
    "齐齐哈尔": {
      "AREAID": "101050201"
    },
    "牡丹江": {
      "AREAID": "101050301"
    },
    "佳木斯": {
      "AREAID": "101050401"
    },
    "绥化": {
      "AREAID": "101050501"
    },
    "黑河": {
      "AREAID": "101050601"
    },
    "大兴安岭": {
      "AREAID": "101050701"
    },
    "伊春": {
      "AREAID": "101050801"
    },
    "大庆": {
      "AREAID": "101050901"
    },
    "七台河": {
      "AREAID": "101051002"
    },
    "鸡西": {
      "AREAID": "101051101"
    },
    "鹤岗": {
      "AREAID": "101051201"
    },
    "双鸭山": {
      "AREAID": "101051301"
    }
  },
  "吉林": {
    "长春": {
      "AREAID": "101060101"
    },
    "吉林": {
      "AREAID": "101060201"
    },
    "延边": {
      "AREAID": "101060306"
    },
    "四平": {
      "AREAID": "101060401"
    },
    "通化": {
      "AREAID": "101060501"
    },
    "白城": {
      "AREAID": "101060601"
    },
    "辽源": {
      "AREAID": "101060701"
    },
    "松原": {
      "AREAID": "101060801"
    },
    "白山": {
      "AREAID": "101060901"
    }
  },
  "辽宁": {
    "沈阳": {
      "AREAID": "101070101"
    },
    "大连": {
      "AREAID": "101070201"
    },
    "鞍山": {
      "AREAID": "101070301"
    },
    "抚顺": {
      "AREAID": "101070401"
    },
    "本溪": {
      "AREAID": "101070501"
    },
    "丹东": {
      "AREAID": "101070601"
    },
    "锦州": {
      "AREAID": "101070701"
    },
    "营口": {
      "AREAID": "101070801"
    },
    "阜新": {
      "AREAID": "101070901"
    },
    "辽阳": {
      "AREAID": "101071001"
    },
    "铁岭": {
      "AREAID": "101071101"
    },
    "朝阳": {
      "AREAID": "101071201"
    },
    "盘锦": {
      "AREAID": "101071301"
    },
    "葫芦岛": {
      "AREAID": "101071401"
    }
  },
  "内蒙古": {
    "呼和浩特": {
      "AREAID": "101080101"
    },
    "包头": {
      "AREAID": "101080201"
    },
    "乌海": {
      "AREAID": "101080301"
    },
    "乌兰察布": {
      "AREAID": "101080405"
    },
    "通辽": {
      "AREAID": "101080501"
    },
    "赤峰": {
      "AREAID": "101080601"
    },
    "鄂尔多斯": {
      "AREAID": "101080701"
    },
    "巴彦淖尔": {
      "AREAID": "101080811"
    },
    "锡林郭勒": {
      "AREAID": "101080902"
    },
    "呼伦贝尔": {
      "AREAID": "101081013"
    },
    "兴安盟": {
      "AREAID": "101081108"
    },
    "阿拉善盟": {
      "AREAID": "101081213"
    }
  },
  "河北": {
    "石家庄": {
      "AREAID": "101090101"
    },
    "保定": {
      "AREAID": "101090201"
    },
    "张家口": {
      "AREAID": "101090301"
    },
    "承德": {
      "AREAID": "101090402"
    },
    "唐山": {
      "AREAID": "101090501"
    },
    "廊坊": {
      "AREAID": "101090601"
    },
    "沧州": {
      "AREAID": "101090701"
    },
    "衡水": {
      "AREAID": "101090801"
    },
    "邢台": {
      "AREAID": "101090901"
    },
    "邯郸": {
      "AREAID": "101091001"
    },
    "秦皇岛": {
      "AREAID": "101091101"
    },
    "雄安新区": {
      "AREAID": "101091201"
    }
  },
  "山西": {
    "太原": {
      "AREAID": "101100101"
    },
    "大同": {
      "AREAID": "101100201"
    },
    "阳泉": {
      "AREAID": "101100301"
    },
    "晋中": {
      "AREAID": "101100401"
    },
    "长治": {
      "AREAID": "101100501"
    },
    "晋城": {
      "AREAID": "101100601"
    },
    "临汾": {
      "AREAID": "101100701"
    },
    "运城": {
      "AREAID": "101100801"
    },
    "朔州": {
      "AREAID": "101100901"
    },
    "忻州": {
      "AREAID": "101101001"
    },
    "吕梁": {
      "AREAID": "101101100"
    }
  },
  "陕西": {
    "西安": {
      "AREAID": "101110101"
    },
    "咸阳": {
      "AREAID": "101110200"
    },
    "延安": {
      "AREAID": "101110300"
    },
    "榆林": {
      "AREAID": "101110401"
    },
    "渭南": {
      "AREAID": "101110501"
    },
    "商洛": {
      "AREAID": "101110601"
    },
    "安康": {
      "AREAID": "101110701"
    },
    "汉中": {
      "AREAID": "101110801"
    },
    "宝鸡": {
      "AREAID": "101110901"
    },
    "铜川": {
      "AREAID": "101111001"
    },
    "杨凌": {
      "AREAID": "101111101"
    }
  },
  "山东": {
    "济南": {
      "AREAID": "101120101"
    },
    "青岛": {
      "AREAID": "101120201"
    },
    "淄博": {
      "AREAID": "101120301"
    },
    "德州": {
      "AREAID": "101120401"
    },
    "烟台": {
      "AREAID": "101120501"
    },
    "潍坊": {
      "AREAID": "101120601"
    },
    "济宁": {
      "AREAID": "101120701"
    },
    "泰安": {
      "AREAID": "101120801"
    },
    "临沂": {
      "AREAID": "101120901"
    },
    "菏泽": {
      "AREAID": "101121001"
    },
    "滨州": {
      "AREAID": "101121101"
    },
    "东营": {
      "AREAID": "101121201"
    },
    "威海": {
      "AREAID": "101121301"
    },
    "枣庄": {
      "AREAID": "101121401"
    },
    "日照": {
      "AREAID": "101121501"
    },
    "莱芜": {
      "AREAID": "101121601"
    },
    "聊城": {
      "AREAID": "101121701"
    }
  },
  "新疆": {
    "乌鲁木齐": {
      "AREAID": "101130101"
    },
    "克拉玛依": {
      "AREAID": "101130201"
    },
    "石河子": {
      "AREAID": "101130301"
    },
    "昌吉": {
      "AREAID": "101130401"
    },
    "吐鲁番": {
      "AREAID": "101130501"
    },
    "巴音郭楞": {
      "AREAID": "101130609"
    },
    "阿拉尔": {
      "AREAID": "101130701"
    },
    "阿克苏": {
      "AREAID": "101130801"
    },
    "喀什": {
      "AREAID": "101130901"
    },
    "伊犁": {
      "AREAID": "101131012"
    },
    "塔城": {
      "AREAID": "101131101"
    },
    "哈密": {
      "AREAID": "101131201"
    },
    "和田": {
      "AREAID": "101131301"
    },
    "阿勒泰": {
      "AREAID": "101131401"
    },
    "克州": {
      "AREAID": "101131505"
    },
    "博尔塔拉": {
      "AREAID": "101131604"
    },
    "图木舒克": {
      "AREAID": "101131701"
    },
    "五家渠": {
      "AREAID": "101131801"
    },
    "铁门关": {
      "AREAID": "101131901"
    },
    "昆玉": {
      "AREAID": "101131920"
    },
    "北屯": {
      "AREAID": "101132101"
    },
    "双河": {
      "AREAID": "101132201"
    },
    "可克达拉": {
      "AREAID": "101132301"
    }
  },
  "西藏": {
    "拉萨": {
      "AREAID": "101140101"
    },
    "日喀则": {
      "AREAID": "101140201"
    },
    "山南": {
      "AREAID": "101140301"
    },
    "林芝": {
      "AREAID": "101140401"
    },
    "昌都": {
      "AREAID": "101140501"
    },
    "那曲": {
      "AREAID": "101140601"
    },
    "阿里": {
      "AREAID": "101140701"
    }
  },
  "青海": {
    "西宁": {
      "AREAID": "101150101"
    },
    "海东": {
      "AREAID": "101150207"
    },
    "黄南": {
      "AREAID": "101150305"
    },
    "海南": {
      "AREAID": "101150402"
    },
    "果洛": {
      "AREAID": "101150507"
    },
    "玉树": {
      "AREAID": "101150601"
    },
    "海西": {
      "AREAID": "101150702"
    },
    "海北": {
      "AREAID": "101150804"
    }
  },
  "甘肃": {
    "兰州": {
      "AREAID": "101160101"
    },
    "定西": {
      "AREAID": "101160201"
    },
    "平凉": {
      "AREAID": "101160301"
    },
    "庆阳": {
      "AREAID": "101160401"
    },
    "武威": {
      "AREAID": "101160501"
    },
    "金昌": {
      "AREAID": "101160601"
    },
    "张掖": {
      "AREAID": "101160701"
    },
    "酒泉": {
      "AREAID": "101160801"
    },
    "天水": {
      "AREAID": "101160901"
    },
    "陇南": {
      "AREAID": "101161010"
    },
    "临夏": {
      "AREAID": "101161101"
    },
    "甘南": {
      "AREAID": "101161209"
    },
    "白银": {
      "AREAID": "101161301"
    },
    "嘉峪关": {
      "AREAID": "101161401"
    }
  },
  "宁夏": {
    "银川": {
      "AREAID": "101170101"
    },
    "石嘴山": {
      "AREAID": "101170201"
    },
    "吴忠": {
      "AREAID": "101170301"
    },
    "固原": {
      "AREAID": "101170401"
    },
    "中卫": {
      "AREAID": "101170501"
    }
  },
  "河南": {
    "郑州": {
      "AREAID": "101180101"
    },
    "安阳": {
      "AREAID": "101180201"
    },
    "新乡": {
      "AREAID": "101180301"
    },
    "许昌": {
      "AREAID": "101180401"
    },
    "平顶山": {
      "AREAID": "101180501"
    },
    "信阳": {
      "AREAID": "101180601"
    },
    "南阳": {
      "AREAID": "101180701"
    },
    "开封": {
      "AREAID": "101180801"
    },
    "洛阳": {
      "AREAID": "101180901"
    },
    "商丘": {
      "AREAID": "101181001"
    },
    "焦作": {
      "AREAID": "101181101"
    },
    "鹤壁": {
      "AREAID": "101181201"
    },
    "濮阳": {
      "AREAID": "101181301"
    },
    "周口": {
      "AREAID": "101181401"
    },
    "漯河": {
      "AREAID": "101181501"
    },
    "驻马店": {
      "AREAID": "101181601"
    },
    "三门峡": {
      "AREAID": "101181701"
    },
    "济源": {
      "AREAID": "101181801"
    }
  },
  "江苏": {
    "南京": {
      "AREAID": "101190101"
    },
    "无锡": {
      "AREAID": "101190201"
    },
    "镇江": {
      "AREAID": "101190301"
    },
    "苏州": {
      "AREAID": "101190401"
    },
    "南通": {
      "AREAID": "101190501"
    },
    "扬州": {
      "AREAID": "101190601"
    },
    "盐城": {
      "AREAID": "101190701"
    },
    "徐州": {
      "AREAID": "101190801"
    },
    "淮安": {
      "AREAID": "101190901"
    },
    "连云港": {
      "AREAID": "101191001"
    },
    "常州": {
      "AREAID": "101191101"
    },
    "泰州": {
      "AREAID": "101191201"
    },
    "宿迁": {
      "AREAID": "101191301"
    }
  },
  "湖北": {
    "武汉": {
      "AREAID": "101200101"
    },
    "襄阳": {
      "AREAID": "101200201"
    },
    "鄂州": {
      "AREAID": "101200301"
    },
    "孝感": {
      "AREAID": "101200401"
    },
    "黄冈": {
      "AREAID": "101200501"
    },
    "黄石": {
      "AREAID": "101200601"
    },
    "咸宁": {
      "AREAID": "101200701"
    },
    "荆州": {
      "AREAID": "101200801"
    },
    "宜昌": {
      "AREAID": "101200901"
    },
    "恩施": {
      "AREAID": "101201001"
    },
    "十堰": {
      "AREAID": "101201101"
    },
    "神农架": {
      "AREAID": "101201201"
    },
    "随州": {
      "AREAID": "101201301"
    },
    "荆门": {
      "AREAID": "101201401"
    },
    "天门": {
      "AREAID": "101201501"
    },
    "仙桃": {
      "AREAID": "101201601"
    },
    "潜江": {
      "AREAID": "101201701"
    }
  },
  "浙江": {
    "杭州": {
      "AREAID": "101210101"
    },
    "湖州": {
      "AREAID": "101210201"
    },
    "嘉兴": {
      "AREAID": "101210301"
    },
    "宁波": {
      "AREAID": "101210401"
    },
    "绍兴": {
      "AREAID": "101210507"
    },
    "台州": {
      "AREAID": "101210601"
    },
    "温州": {
      "AREAID": "101210701"
    },
    "丽水": {
      "AREAID": "101210801"
    },
    "金华": {
      "AREAID": "101210901"
    },
    "衢州": {
      "AREAID": "101211001"
    },
    "舟山": {
      "AREAID": "101211101"
    }
  },
  "安徽": {
    "合肥": {
      "AREAID": "101220101"
    },
    "蚌埠": {
      "AREAID": "101220201"
    },
    "芜湖": {
      "AREAID": "101220301"
    },
    "淮南": {
      "AREAID": "101220401"
    },
    "马鞍山": {
      "AREAID": "101220501"
    },
    "安庆": {
      "AREAID": "101220601"
    },
    "宿州": {
      "AREAID": "101220701"
    },
    "阜阳": {
      "AREAID": "101220801"
    },
    "亳州": {
      "AREAID": "101220901"
    },
    "黄山": {
      "AREAID": "101221001"
    },
    "滁州": {
      "AREAID": "101221101"
    },
    "淮北": {
      "AREAID": "101221201"
    },
    "铜陵": {
      "AREAID": "101221301"
    },
    "宣城": {
      "AREAID": "101221401"
    },
    "六安": {
      "AREAID": "101221501"
    },
    "池州": {
      "AREAID": "101221701"
    }
  },
  "福建": {
    "福州": {
      "AREAID": "101230101"
    },
    "厦门": {
      "AREAID": "101230201"
    },
    "宁德": {
      "AREAID": "101230301"
    },
    "莆田": {
      "AREAID": "101230401"
    },
    "泉州": {
      "AREAID": "101230501"
    },
    "漳州": {
      "AREAID": "101230601"
    },
    "龙岩": {
      "AREAID": "101230701"
    },
    "三明": {
      "AREAID": "101230801"
    },
    "南平": {
      "AREAID": "101230901"
    },
    "钓鱼岛": {
      "AREAID": "101231001"
    }
  },
  "江西": {
    "南昌": {
      "AREAID": "101240101"
    },
    "九江": {
      "AREAID": "101240201"
    },
    "上饶": {
      "AREAID": "101240301"
    },
    "抚州": {
      "AREAID": "101240401"
    },
    "宜春": {
      "AREAID": "101240501"
    },
    "吉安": {
      "AREAID": "101240601"
    },
    "赣州": {
      "AREAID": "101240701"
    },
    "景德镇": {
      "AREAID": "101240801"
    },
    "萍乡": {
      "AREAID": "101240901"
    },
    "新余": {
      "AREAID": "101241001"
    },
    "鹰潭": {
      "AREAID": "101241101"
    }
  },
  "湖南": {
    "长沙": {
      "AREAID": "101250101"
    },
    "湘潭": {
      "AREAID": "101250201"
    },
    "株洲": {
      "AREAID": "101250301"
    },
    "衡阳": {
      "AREAID": "101250401"
    },
    "郴州": {
      "AREAID": "101250501"
    },
    "常德": {
      "AREAID": "101250601"
    },
    "益阳": {
      "AREAID": "101250700"
    },
    "娄底": {
      "AREAID": "101250801"
    },
    "邵阳": {
      "AREAID": "101250901"
    },
    "岳阳": {
      "AREAID": "101251001"
    },
    "张家界": {
      "AREAID": "101251101"
    },
    "怀化": {
      "AREAID": "101251201"
    },
    "永州": {
      "AREAID": "101251401"
    },
    "湘西": {
      "AREAID": "101251509"
    }
  },
  "贵州": {
    "贵阳": {
      "AREAID": "101260101"
    },
    "遵义": {
      "AREAID": "101260201"
    },
    "安顺": {
      "AREAID": "101260301"
    },
    "黔南": {
      "AREAID": "101260413"
    },
    "黔东南": {
      "AREAID": "101260506"
    },
    "铜仁": {
      "AREAID": "101260601"
    },
    "毕节": {
      "AREAID": "101260701"
    },
    "六盘水": {
      "AREAID": "101260803"
    },
    "黔西南": {
      "AREAID": "101260906"
    }
  },
  "四川": {
    "成都": {
      "AREAID": "101270101"
    },
    "攀枝花": {
      "AREAID": "101270201"
    },
    "自贡": {
      "AREAID": "101270301"
    },
    "绵阳": {
      "AREAID": "101270401"
    },
    "南充": {
      "AREAID": "101270501"
    },
    "达州": {
      "AREAID": "101270601"
    },
    "遂宁": {
      "AREAID": "101270701"
    },
    "广安": {
      "AREAID": "101270801"
    },
    "巴中": {
      "AREAID": "101270901"
    },
    "泸州": {
      "AREAID": "101271001"
    },
    "宜宾": {
      "AREAID": "101271101"
    },
    "内江": {
      "AREAID": "101271201"
    },
    "资阳": {
      "AREAID": "101271301"
    },
    "乐山": {
      "AREAID": "101271401"
    },
    "眉山": {
      "AREAID": "101271501"
    },
    "凉山": {
      "AREAID": "101271601"
    },
    "雅安": {
      "AREAID": "101271701"
    },
    "甘孜": {
      "AREAID": "101271801"
    },
    "阿坝": {
      "AREAID": "101271901"
    },
    "德阳": {
      "AREAID": "101272001"
    },
    "广元": {
      "AREAID": "101272101"
    }
  },
  "广东": {
    "广州": {
      "AREAID": "101280101"
    },
    "韶关": {
      "AREAID": "101280201"
    },
    "惠州": {
      "AREAID": "101280301"
    },
    "梅州": {
      "AREAID": "101280401"
    },
    "汕头": {
      "AREAID": "101280501"
    },
    "深圳": {
      "AREAID": "101280601"
    },
    "珠海": {
      "AREAID": "101280701"
    },
    "佛山": {
      "AREAID": "101280800"
    },
    "肇庆": {
      "AREAID": "101280901"
    },
    "湛江": {
      "AREAID": "101281001"
    },
    "江门": {
      "AREAID": "101281101"
    },
    "河源": {
      "AREAID": "101281201"
    },
    "清远": {
      "AREAID": "101281301"
    },
    "云浮": {
      "AREAID": "101281401"
    },
    "潮州": {
      "AREAID": "101281501"
    },
    "东莞": {
      "AREAID": "101281601"
    },
    "中山": {
      "AREAID": "101281701"
    },
    "阳江": {
      "AREAID": "101281801"
    },
    "揭阳": {
      "AREAID": "101281901"
    },
    "茂名": {
      "AREAID": "101282001"
    },
    "汕尾": {
      "AREAID": "101282101"
    }
  },
  "云南": {
    "昆明": {
      "AREAID": "101290101"
    },
    "大理": {
      "AREAID": "101290201"
    },
    "红河": {
      "AREAID": "101290301"
    },
    "曲靖": {
      "AREAID": "101290401"
    },
    "保山": {
      "AREAID": "101290501"
    },
    "文山": {
      "AREAID": "101290601"
    },
    "玉溪": {
      "AREAID": "101290701"
    },
    "楚雄": {
      "AREAID": "101290801"
    },
    "普洱": {
      "AREAID": "101290901"
    },
    "昭通": {
      "AREAID": "101291001"
    },
    "临沧": {
      "AREAID": "101291101"
    },
    "怒江": {
      "AREAID": "101291201"
    },
    "迪庆": {
      "AREAID": "101291305"
    },
    "丽江": {
      "AREAID": "101291401"
    },
    "德宏": {
      "AREAID": "101291501"
    },
    "西双版纳": {
      "AREAID": "101291602"
    }
  },
  "广西": {
    "南宁": {
      "AREAID": "101300101"
    },
    "崇左": {
      "AREAID": "101300201"
    },
    "柳州": {
      "AREAID": "101300301"
    },
    "来宾": {
      "AREAID": "101300401"
    },
    "桂林": {
      "AREAID": "101300501"
    },
    "梧州": {
      "AREAID": "101300601"
    },
    "贺州": {
      "AREAID": "101300701"
    },
    "贵港": {
      "AREAID": "101300801"
    },
    "玉林": {
      "AREAID": "101300901"
    },
    "百色": {
      "AREAID": "101301001"
    },
    "钦州": {
      "AREAID": "101301101"
    },
    "河池": {
      "AREAID": "101301201"
    },
    "北海": {
      "AREAID": "101301301"
    },
    "防城港": {
      "AREAID": "101301401"
    }
  },
  "海南": {
    "海口": {
      "AREAID": "101310101"
    },
    "三亚": {
      "AREAID": "101310201"
    },
    "东方": {
      "AREAID": "101310202"
    },
    "临高": {
      "AREAID": "101310203"
    },
    "澄迈": {
      "AREAID": "101310204"
    },
    "儋州": {
      "AREAID": "101310205"
    },
    "昌江": {
      "AREAID": "101310206"
    },
    "白沙": {
      "AREAID": "101310207"
    },
    "琼中": {
      "AREAID": "101310208"
    },
    "定安": {
      "AREAID": "101310209"
    },
    "屯昌": {
      "AREAID": "101310210"
    },
    "琼海": {
      "AREAID": "101310211"
    },
    "文昌": {
      "AREAID": "101310212"
    },
    "保亭": {
      "AREAID": "101310214"
    },
    "万宁": {
      "AREAID": "101310215"
    },
    "陵水": {
      "AREAID": "101310216"
    },
    "乐东": {
      "AREAID": "101310221"
    },
    "五指山": {
      "AREAID": "101310222"
    },
    "三沙": {
      "AREAID": "101310301"
    }
  },
  "香港": {
    "香港": {
      "AREAID": "101320101"
    }
  },
  "澳门": {
    "澳门": {
      "AREAID": "101330101"
    }
  },
  "台湾": {
    "台北": {
      "AREAID": "101340101"
    },
    "高雄": {
      "AREAID": "101340201"
    },
    "台中": {
      "AREAID": "101340401"
    }
  }
}
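
The city file above is nested as province, then city, then `AREAID`, so finding a city's code means scanning every province. Below is a minimal sketch of such a lookup. It assumes the JSON has been loaded into a Python dict. The helper name `find_area_id` and the inline excerpt `CITY_JSON` are made up for illustration and are not taken from the original code.

```python
import json
from typing import Optional

# A small excerpt with the same shape as the city file above (illustration only);
# in practice the whole file would be loaded with json.load().
CITY_JSON = """
{
  "福建": {"福州": {"AREAID": "101230101"}, "厦门": {"AREAID": "101230201"}},
  "广东": {"广州": {"AREAID": "101280101"}, "深圳": {"AREAID": "101280601"}}
}
"""


def find_area_id(cities: dict, city_name: str) -> Optional[str]:
    """Return the AREAID for city_name, or None if the city is not listed."""
    # Walk each province and check whether it contains the requested city.
    for city_map in cities.values():
        entry = city_map.get(city_name)
        if entry is not None:
            return entry["AREAID"]
    return None


cities = json.loads(CITY_JSON)
print(find_area_id(cities, "深圳"))
print(find_area_id(cities, "不存在"))
```

Returning `None` for an unknown city lets the caller decide whether to skip that city or stop, rather than raising a `KeyError` in the middle of a crawl.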

Once the call succeeds, the scraped job data can be queried in the database.

That completes the whole process.
