The SQLite3 C++ API

I was looking for some practice with DBs, and my choice fell on the SQLite project, maybe only because it looked lighter than PostgreSQL, which I also plan to play with. My first general impression about the API is that it looks a little too much C-oriented, even though I am not sure whether this is a good or a bad thing, and I am ignorant about the latitude of choice an engineer has when designing an API that has to interact with applications written in either C or C++.

In this post, I am going to dissect a basic utilization of the API, from setting up a ludicrously small database to running a query on the data.

Creating the database

Running sqlite3 from the console, we are prompted by the SQLite interpreter. I have created a database of actors and movies, of which the following is the piece that I used to create the Movies table, plus an example insert statement:

1
2
3
4
create table if not exists Movies(movieId primary key,
movieTitle, movieYear);
insert or ignore into Movies values('NIR97', 'Nirvana',
'1997');

With this saved in a file movies.sql, from within the sqlite3 prompt you can run the command .read movies.sql to execute the statements of table creation and insertion of the data. I have saved then the DB with the command .backup Gergelim.db. At this point, I have started coding some C++.

Querying

The first thing we need in order to access a SQLite database is a handle, which is a pointer to an opaque structure called sqlite3. The way to do it is quite intuitive:

1
2
3
4
5
6
sqlite3* handle = nullptr;
int retval = sqlite3_open(argv[1], &handle);
if (retval != SQLITE_OK) {
cerr << "open: " << sqlite3_errstr(retval) << '\n';
return retval;
}

in which I assume to pass the name of the DB as the first and only parameter to the program. Most of the functions of the API return an integer that can be interpreted as an error code. The function sqlite3_errstr(code) provides a human-readable version of the error code.

Next, we must prepare the statement that we want to execute. The preparation is apparently a sort of “compilation” of the statement into something that is interpretable in later function calls (particularly the sqlite3_step, see below). There are currently two main versions of the prepare statement, and the docs encourage the use of version 2:

1
2
3
4
5
6
7
8
9
10
11
12
string zSql(
"select movieTitle, movieYear from Movies where movieYear > 2000;");
const char* zTail = nullptr;
const char** pzTail = &zTail;
sqlite3_stmt* statement = nullptr;
retval = sqlite3_prepare_v2(handle, zSql.c_str(), -1,
&statement,
pzTail);
if (retval != SQLITE_OK) {
cerr << "prepare: " << sqlite3_errstr(retval) << '\n';
return retval;
}

The syntax of the prepare statement feels a little awkward. It requires:

  1. a handle to the DB
  2. the statement, as a const char*
  3. the maximum length of the abovementioned const char*
  4. a statement handle, as an output parameter
  5. an output pointer to the unused portion of zSql

There are a couple of concepts here that need explaining. The following is the declaration of the sqlite3_prepare_v2 function from the SQLite API documentation:

1
2
3
4
5
6
7
int sqlite3_prepare_v2(
sqlite3 *db, /* Database handle */
const char *zSql, /* SQL statement, UTF-8 encoded */
int nByte, /* Maximum length of zSql in bytes. */
sqlite3_stmt **ppStmt, /* OUT: Statement handle */
const char **pzTail /* OUT: Pointer to unused portion of zSql */
);

The parameter nByte, if set to a negative number, has the API implementation read zSql until the zero terminator; otherwise, if we had, say, a text file with many statements, we could pass the length of the statement to interpret, starting from what is pointed by zSql. If pzTail is not null, then it is filled with the pointer to the position of the zSql string that has not been interpreted: only one SQL statement at a time is interpreted by this function.

After we have prepared the statement, we can execute it. This is done via calling sqlite3_step(). This function returns a row of the query, if successful; otherwise, it may return either the code SQLITE_DONE, to indicate that the set of rows has finished; or another type of error. This means that to read the result set, we need to call sqlite3_step() until we get a SQLITE_DONE, or we get an error. Reading the result set can be done this way:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
const int num_columns = sqlite3_column_count(statement);
while (true) {
retval = sqlite3_step(statement);
if (retval == SQLITE_ROW) {
// There is another row after this one: read this row
// for now
for (int i = 0; i < num_columns; ++i) {
const unsigned char* row_element =
sqlite3_column_text(statement, i);
cout << row_element << ' ';
}
cout << '\n';
} else if (retval == SQLITE_DONE) {
// no more rows: exit
cout << "Done\n";
break;
} else {
cerr << "step: " << sqlite3_errstr(retval) << '\n';
return cleanup(handle, statements);
}
}

Keeping the house clean

Some of the mentioned API calls allocate memory, that has to be freed when the execution aborts for some error (e.g., an empty result set). In particular, we have to free the memory for each statement we have prepared with sqlite3_prepare_v2(), by calling sqlite3_finalize(); and we have to close the connection to the DB, with a sqlite3_close(). Since we might need to free the resources in more than one place, I thought it would be a good idea to group the statements in a function:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
int cleanup(sqlite3* handle, vector<sqlite3_stmt*> statements) {
int retval = -1;
// Destroy statements
for (auto s : statements) {
retval = sqlite3_finalize(s);
if (retval != SQLITE_OK) {
cerr << "finalize: " <<
sqlite3_errstr(retval) << '\n';
} else {
cerr << "finalize: OK\n";
}
}

// Close connection
retval = sqlite3_close(handle);
if (retval != SQLITE_OK) {
cerr << "close: " << sqlite3_errstr(retval) << '\n';
} else {
cerr << "close: OK\n";
}
return retval;
}

Memory management considerations

With all the talking about RAII and being able to be in control exactly of the life cycle of objects and resources in general, it feels a bit scary to manage memory in this way; but I am sure there is some better way of doing this with the C++11 smart pointers. For example, we could enclose the pointer to the prepared statement returned by sqlite3_prepare_v2() in a smart pointer, if we could provide the sqlite3_finalize() function as callback to be used when destroying the memory allocated. The same can be said of sqlite3_open()/sqlite3_close().

I haven’t found a way to use smart pointers to achieve this. I will keep looking, and I’d be glad if anyone could point me to a solution.

Summary

I have created a minimal SQLite3 database, and executed a query on it using the C/C++ API documented on sqlite.org. As I have said, things feel a little too C, but I might be using it naively. If I discover anything that can be done nicer, I’ll get back to this post and update it, as usual.