Selenium / Python Notes June 2016

For the past 24 hours I’ve been jumping into some Selenium fun in Python.


  • Firefox 47 doesn’t work with Selenium; Firefox 47.0.1 does.

Selenium Setup

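from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.proxy import *
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import TimeoutException

def init_driver():
    driver = webdriver.Firefox()
    # Attach an explicit wait (5 second timeout) to the driver for later use
    driver.wait = WebDriverWait(driver, 5)
    return driver

def lookup(driver):
    pass  # page interactions go here

if __name__ == "__main__":
    driver = init_driver()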

Two ways to interact with an element on the page:

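# 1. Wait (up to the 5 seconds set in init_driver) for the element to appear
try:
    textbox = driver.wait.until(EC.presence_of_element_located((By.ID, "formTxtBox")))
except TimeoutException:
    print("Login Form Not Found!")

# 2. Look the element up immediately, with no waiting
textbox = driver.find_element_by_id("formTxtBox")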
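
The difference between the two ways comes down to polling. As a rough illustration (a simplified sketch of the idea, not Selenium's actual code), an explicit wait keeps calling a condition until it returns something truthy or the timeout runs out. The names `wait_until` and `WaitTimeout` are made up for this example, and the fake lookup stands in for a real page.

```python
import time


class WaitTimeout(Exception):
    """Hypothetical stand-in for Selenium's TimeoutException."""


def wait_until(condition, timeout=5.0, poll_interval=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Raises WaitTimeout if nothing truthy comes back within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise WaitTimeout("condition not met within %.1f seconds" % timeout)
        time.sleep(poll_interval)


# Simulate an element that only "appears" on the third check.
attempts = {"count": 0}


def fake_lookup():
    attempts["count"] += 1
    return "textbox" if attempts["count"] >= 3 else None


found = wait_until(fake_lookup, timeout=2.0, poll_interval=0.01)
```

A direct lookup is the same thing with a single attempt: if the element isn't there on the first check, it fails straight away.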

Taking HTML and feeding it into BeautifulSoup

from bs4 import BeautifulSoup

div = driver.find_element_by_id("id_of_div_here")
# Hand the element's inner HTML to BeautifulSoup for parsing
soup = BeautifulSoup(div.get_attribute('innerHTML'), 'html.parser')
# Do something with the soup....
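
To see what the soup step gives you without starting a browser, here is a small sketch that feeds a hard-coded HTML string (standing in for the `innerHTML` of the div) into BeautifulSoup. The dropdown markup and the option values are invented for the example.

```python
from bs4 import BeautifulSoup

# Stand-in for div.get_attribute('innerHTML'); the markup is made up.
inner_html = """
<select id="property">
  <option value="101">Main Street</option>
  <option value="102">Oak Avenue</option>
</select>
"""

soup = BeautifulSoup(inner_html, 'html.parser')

# Map each option's value attribute to its visible text.
options = {opt['value']: opt.get_text(strip=True) for opt in soup.find_all('option')}
```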

Creating a list of dates (requires pandas)

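import pandas as pd

def get_dates():
    dates = []
    # 52 consecutive Wednesdays, starting with the first one on or after 2015-07-01
    index = pd.date_range('2015-7-1', periods=52, freq='W-WED')
    for i in index:
        dates.append(i)  # each i is a pandas Timestamp; format it here if strings are needed
    return dates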
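
If pulling in pandas just for this feels heavy, roughly the same list can be built with the standard library. This is a sketch of an alternative, not what the project used; the function name `weekly_dates` and the `%m/%d/%Y` output format are assumptions.

```python
from datetime import date, timedelta


def weekly_dates(start, periods, weekday):
    """Return `periods` dates, one week apart, all falling on `weekday`.

    weekday follows date.weekday(): Monday is 0, Wednesday is 2.
    The first date is the first matching weekday on or after `start`.
    """
    offset = (weekday - start.weekday()) % 7
    first = start + timedelta(days=offset)
    return [first + timedelta(weeks=n) for n in range(periods)]


dates = weekly_dates(date(2015, 7, 1), periods=52, weekday=2)
# Turn the date objects into strings (the format here is an assumption).
formatted = [d.strftime('%m/%d/%Y') for d in dates]
```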

Convert Table into CSV with BeautifulSoup

(there are probably 100 better ways to do this…)

import csv
from bs4 import BeautifulSoup

# Get the table body using a CSS selector
table = driver.wait.until(EC.presence_of_element_located((By.CSS_SELECTOR, "#table > tbody")))
# Pass the table's HTML into BeautifulSoup so we can easily traverse it
soup = BeautifulSoup(table.get_attribute('innerHTML'), 'html.parser')
# Make an empty list
rows = []
# Loop over the rows, stripping newlines and non-breaking spaces from each cell
for row in soup.find_all('tr'):
    rows.append([val.text.strip('\n').strip('\xa0') for val in row.find_all('td')])
# Append to the CSV, skipping rows with no <td> cells (Python 3: text mode, newline='')
with open('output_file.csv', 'a', newline='', encoding='utf-8') as f:
    writer = csv.writer(f)
    writer.writerows(row for row in rows if row)
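
The cell cleanup and the `if row` filter are easier to see on plain data. The sketch below skips Selenium and BeautifulSoup entirely and starts from a hard-coded list that mimics what the loop would collect (a header row made of `th` cells yields an empty list, because only `td` cells are gathered). The sample values and the helper name `clean_cell` are invented, and the CSV goes to an in-memory buffer rather than a file.

```python
import csv
import io


def clean_cell(text):
    """Trim newlines, spaces, tabs and non-breaking spaces from both ends."""
    return text.strip(' \n\t\xa0')


# Made-up data shaped like the output of the row loop.
raw_rows = [
    [],                                  # header row: no <td> cells
    ['\n2016-06-01\n', '\xa042\xa0'],
    ['\n2016-06-08\n', '\xa017\xa0'],
    [],                                  # spacer row
]

rows = [[clean_cell(cell) for cell in row] for row in raw_rows]

buffer = io.StringIO(newline='')
writer = csv.writer(buffer)
writer.writerows(row for row in rows if row)  # drop the empty rows

csv_text = buffer.getvalue()
```

Only the two data rows end up in `csv_text`; the empty lists are filtered out before writing.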

Disclaimer: I’ve been programming in Python for about 5 minutes and using Selenium for about 2. While this code worked for my project, it’s most likely littered with errors and bad practices. This is mostly a reference for myself, in case I need to refer back to this stuff in 6-12 months’ time. I am definitely not the person to ask for help when it comes to this stuff.
