Database Connectivity: SQL Databases

Many times programs will share data with each other using databases on remote computers over the Internet or local area network. The vast majority of these databases use a Structured Query Language called SQL.

Python 3 actually has a built in “simple SQL” module called sqlite3 that allows you to keep a database of information as a file on the local filesystem. This allows your program to use an SQL database to store data without actually connecting over the Internet to a remote database server. Any program that can access the local database file can use or modify the data in it.

When you want to access a database stored on a remote server, you will need to install a 3rd party database access module. Typically different modules all follow the same Python Database API Specification (PEP249), but they each connect to a different type of SQL database (Oracle, MySQL, PostgreSQL, etc). In this book we will use a branch of the pymysql module that has been adapted to work with Python3. This module connects to the open source MySQL database back-end.

SQL Fundamentals

RDBMS (Relational Database Management Systems) hold data in tables, much like a spreadsheet. A particular database may have one or more tables of data. The columns of the table are named, and each column may have a specific data type, such as “text” or “integer”. All data in that column must be of the specified type. In addition, SQL allows some columns to be “special” and are automatically filled in (for example, with a unique, auto-incrementing id number) by the database engine.

Designing a databases well is a complex procedure, and you will typically be asked to work with pre-existing databases. Should you need to create a table, here is an example SQL statement that will create a table holding information about a person:

CREATE TABLE people
       (id INTEGER PRIMARY KEY AUTO_INCREMENT UNIQUE NOT NULL,
       name TEXT NOT NULL,
       address TEXT,
       age  INTEGER)

This statement creates a table that has four columns. The first column is a unique ID that the database engine will automatically create as you add data. It will automatically add one to the last id and use that for the next piece of data you create. Later, you can use these ID’s to uniquely select a particular row of data from the table. Each row (data item) also has a name, address, and age for a person. The name is required (NOT NULL). The name and address are TEXT, while the age is an INTEGER. NOTE: The AUTOINCREMENT directive works for sqlite3, but if you are connecting to a MYSQL database, you must use AUTO_INCREMENT instead!

To display a list of databases, and the tables that comprise them you can use the SHOW DATABASES and SHOW TABLES commands. You can show the columns in each table with a SHOW COLUMNS in TABLE command. Note that you can append “IN database_name” to the end of the TABLES and COLUMNS commands if you have more than one database.

mysql> SHOW TABLES;
+---------------------+
| Tables_in_cs2316    |
+---------------------+
| people              |
+---------------------+
1 row in set (0.00 sec)

mysql> SHOW COLUMNS IN people;
+---------+---------+------+-----+---------+----------------+
| Field   | Type    | Null | Key | Default | Extra          |
+---------+---------+------+-----+---------+----------------+
| id      | int(11) | NO   | PRI | NULL    | auto_increment |
| name    | text    | NO   |     | NULL    |                |
| address | text    | YES  |     | NULL    |                |
| age     | int(11) | YES  |     | NULL    |                |
+---------+---------+------+-----+---------+----------------+
4 rows in set (0.00 sec)

To add a database record for a person (Named Jay, who Lives at 801 Atlantic Drive and is 70 years old) to the database, you would issue the following SQL statement:

INSERT INTO people (name, address, age) VALUES ("Jay", "801 Atlantic Drive", 70)

To retrieve a specific piece of data (such as the ID for a record, or all data about a record) we use a SELECT statement. SQL statements are generally referred to as a query, in the sense that you are asking the database server to give you some information. Some SQL statements add information or update pre-existing information, but SQL statements sent to the server are still referred to as queries.

mysql> SELECT id FROM people WHERE name="Jay";
+----+
| id |
+----+
|  1 |
+----+
1 row in set (0.00 sec)

mysql> SELECT * FROM people WHERE id=1;
+----+------+--------------------+------+
| id | name | address            | age  |
+----+------+--------------------+------+
|  1 | Jay  | 801 Atlantic Drive |   70 |
+----+------+--------------------+------+
1 row in set (0.00 sec)

To update data in a pre-existing record, we use the UPDATE statement. (This example assumes the ID returned by the previous SELECT statement was a 1):

mysql> UPDATE people SET age=12, address="123 Any Street" WHERE id=1;
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0

mysql> SELECT * FROM people WHERE id=1;
+----+------+----------------+------+
| id | name | address        | age  |
+----+------+----------------+------+
|  1 | Jay  | 123 Any Street |   12 |
+----+------+----------------+------+
1 row in set (0.00 sec)

Finally, to delete a record, we use the DELETE statement, being careful to specify a single data element uniquely (unless you want to delete more than one record):

DELETE FROM people WHERE id = 1

Typically unless you are running a mysql terminal, you do not type these commands yourself. Instead, you execute them using Python code. You can learn more about SQL at the W3Schools SQL Tutorial.

Using a Local SQL database: sqlite3

We will use the sqlite3 module (built into Python 3) which allows you to create a database stored in a local file to demonstrate how to issue SQL commands using PYTHON and retrieve the data results. Later on, we will install and use the pymysql module (a special version for Python3) that will allow us to connect to a remote MySQL database running on a server.

If you are creating a stand-alone program that needs an easy way to save data in a structured format to make it easily accessible and searchable, the sqlite3 module is a great way to easily implement a local database.

Creating a Table

The following code snippet will create a database in a local file called “myDBfile.sqlite3”.

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
import sqlite3

db = sqlite3.connect("myDBfile.sqlite3")
cursor = db.cursor()
cursor.execute("""CREATE TABLE people
     (id INTEGER PRIMARY KEY AUTOINCREMENT UNIQUE NOT NULL,
     name TEXT NOT NULL,
     address TEXT,
     age  INTEGER)""" )     # AUTOINCREMENT is correct for sqlite3
db.commit()                 # Does nothing for a database without
                            # Transactions.

db.close()                  # Close the connection to the database

Typically you won’t just close the database immediately after creating it. Instead, you would use the db variable to add some data, perhaps select and remove some data, etc....

Adding Data

Assuming you have already created your database file and your table, the following code will add some data to your database:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
import sqlite3

def addPerson(db, Name, Address, Age):
   cursor = db.cursor()
   cursor.execute("""INSERT INTO people (name, address, age) VALUES
                  (?, ?, ?)""", (Name, Address, Age) )
   db.commit()



db = sqlite3.connect("myDBfile.sqlite3")
addPerson(db, "Jay", "801 Atlantic Drive", 70)

db.close()

Note that our addPerson function accepts a reference to an open database, and simply adds the data to that database. Also note how we are using question marks to indicate to the cursor.execute() method where we want to insert data from python variables. Then, as the second parameter to the execute method, we pass in a tuple of the python variables. Although you could use string formatting to insert data directly into the SQL command string yourself, it is recommended to use place-holders and allow the cursor object to do the substitution for you. (Different databases may escape data in different ways, and by allowing the cursor object to do it for you, you know it will be done correctly.)

Reading Data

Once you have data in the database, you can read it back into python. The following code will print out all data about each person in the database:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
import sqlite3

def displayPeople(db):
   cursor = db.cursor()
   sql = "SELECT * from people"
   cursor.execute(sql)

   for record in cursor:
       s = "ID: {}, Name: {}\n Address:{}\n Age: {}\n\n".format(
               record[0],record[1],record[2],record[3] )
       print(s)



db = sqlite3.connect("myDBfile.sqlite3")
displayPeople(db)
db.close()

Notice how we are using a for loop to iterate through the items of the cursor. In this case, the cursor is acting as a generator, giving us each row one at a time. (The for loop is essentially calling the .fetchone() method of the cursor and assigning each row to the record variable.)

Installing pymysql for Python3

To connect to a remote MySQL database, you will need to download and install a pure python (written entirely in python code) module called pymysql.

Unfortunately, the current verson of pymysql has some problems with it’s Python3 suport. I have placed the last known good version in a zip file here. (December 24th, 2011)

The latest version of PyMySQL3 (version 0.5) supports Python 3.x. You can find it here . Note that you will need an archive manager that can extract tar.gz files.

However, if you are using python 3.2, the 0.5 version released on November 8th has some bugs that are fixed in the main git repository. To get the latest version for python3, you have to do the following steps:

  1. Download the current working version of PyMySql using git: git clone https://github.com/PyMySQL/PyMySQL.git
  2. Change into the PyMySQL directory that git crated: cd PyMySQL
  3. Run the build-py3k.sh tool to convert the python 2.x code into 3.x code: ./build-py3k.sh
  4. Change into the py3k directory that was created: cd py3k
  5. Run the Python3 installer: sudo python3 setup.py install

You can avoid the first 3 steps by downloading the zip file I have placed here. (December 24th, 2011)

The zip file above contains a directory named py3k. Inside this directory is a setup.py file. To install the module for your python3 system, you need to go to a command line, navigate to that directory, and run the command:

python3 setup.py install

Note that instead of python3 you may need to use the specific name or path of your python executable, such as python3.2 or c:\\python32\\bin\\python.exe. Note that your python3 executable must have permissions to edit files in the system’s dist-packages directory. You may have to open the command line or terminal with administrative or root privileges. On a Linux system, you can do this with the sudo program, and it will produce output as follows:

summetj@constantine:~/temp/py3k$ sudo python3 setup.py install
[sudo] password for summetj:
running install
running build
running build_py
running install_lib
running install_egg_info
Writing /usr/local/lib/python3.1/dist-packages/PyMySQL3-0.5.egg-info

Once you have successfully installed the PyMySQL module into your dist-packages directory, test it by opening Python and trying to import the pymysql module. If it does not give you an error, you have successfully installed the module!

Python 3.1.1+ (r311:74480, Nov  2 2009, 14:49:22)
[GCC 4.4.1] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import pymysql
>>>

MySQL Bindings with pymysql

Because both the sqlite3 and pymysql modules conform to the PEP 249 - Python Database API Specification v 2.0 using them is quite similar. The two main differences involve how you connect to the remote MySQL database, and how values are substituted into the SQL queries.

You must give the pymysql connect function a hostname, username, password, and database name instead of a file when you want to create a database connection object. In addition, the pymysql module uses a ANSI C “format” paramstyle, not the question mark (“qmark”) paramstyle that sqlite3 uses. As an example, here is how you would connect to a remote MySQL database and add some data using pymysql:

1
2
3
4
5
6
7
8
9
import pymysql
db = pymysql.connect( host = 'sqlserver.example.com',
      passwd = 'SECRET12345', user = 'dbuser', db='myDatabase')
cursor = db.cursor()
sql = "INSERT INTO people (name, email, age) VALUES (%s, %s, %s)"
cursor.execute( sql, ( "Jay", "jay@example.com", '77') )
cursor.close()
db.commit()   #Makes sure the DB saves your changes!
db.close()

Note how we pass the various required parameters to the pymysql.connect function that identify the remote host running the MySQL server, the name of the database we want to connect to, and our user authorization credentials. Also note how we are passing in a tuple of strings to the cursor.execute() method which are substituted for the %s patterns in the actual SQL statement.