Driver

The driver module defines sync and async driver adapters, transaction helpers, and the shared data dictionary mixins.

Example

driver usage
from sqlspec import SQLSpec
from sqlspec.adapters.sqlite import SqliteConfig

db_path = tmp_path / "driver_api.db"
spec = SQLSpec()
config = spec.add_config(SqliteConfig(connection_config={"database": str(db_path)}))

with spec.provide_session(config) as session:
    session.execute("create table if not exists users (id integer primary key, name text)")
    session.execute("insert into users (name) values ('Ada')")
    row = session.select_one_or_none("select name from users where id = ?", (1,))
    print(row)

Base Driver Classes

Synchronous Driver

class sqlspec.driver.SyncDriverAdapterBase[source]

Bases: CommonDriverAttributesMixin

Base class for synchronous database drivers.

This class includes flattened storage and SQL translation methods that were previously in StorageDriverMixin and SQLTranslatorMixin. The flattening eliminates cross-trait attribute access that caused mypyc segmentation faults.

Method Organization:
  1. Core dispatch methods (the execution engine)

  2. Transaction management (abstract methods)

  3. Public API - execution methods

  4. Public API - query methods (select/fetch variants)

  5. Arrow API methods

  6. Stack execution

  7. Storage API methods

  8. Utility methods

  9. Private/internal methods

dialect: DialectType | None = None
abstract property data_dictionary: SyncDataDictionaryBase

Get the data dictionary for this driver.

Returns:

Data dictionary instance for metadata queries

final dispatch_statement_execution(statement, connection)[source]

Central execution dispatcher using the Template Method Pattern.

Parameters:
  • statement (SQL) – The SQL statement to execute

  • connection (typing.Any) – The database connection to use

Return type:

SQLResult

Returns:

The result of the SQL execution

abstractmethod dispatch_execute(cursor, statement)[source]

Execute a single SQL statement.

Must be implemented by each driver for database-specific execution logic.

Parameters:
  • cursor (Any) – Database cursor/connection object

  • statement (SQL) – SQL statement object with all necessary data and configuration

Return type:

ExecutionResult

Returns:

ExecutionResult with execution data

abstractmethod dispatch_execute_many(cursor, statement)[source]

Execute SQL with multiple parameter sets (executemany).

Must be implemented by each driver for database-specific executemany logic.

Parameters:
  • cursor (Any) – Database cursor/connection object

  • statement (SQL) – SQL statement object with all necessary data and configuration

Return type:

ExecutionResult

Returns:

ExecutionResult with execution data for the many operation

dispatch_execute_script(cursor, statement)[source]

Execute a SQL script containing multiple statements.

Default implementation splits the script and executes statements individually. Drivers can override for database-specific script execution methods.

Parameters:
  • cursor (Any) – Database cursor/connection object

  • statement (SQL) – SQL statement object with all necessary data and configuration

Return type:

ExecutionResult

Returns:

ExecutionResult with script execution data including statement counts

dispatch_special_handling(cursor, statement)[source]

Hook for database-specific special operations (e.g., PostgreSQL COPY, bulk operations).

This method is called first in dispatch_statement_execution() to allow drivers to handle special operations that don’t follow the standard SQL execution pattern.

Parameters:
  • cursor (Any) – Database cursor/connection object

  • statement (SQL) – SQL statement to analyze

Return type:

SQLResult | None

Returns:

SQLResult if the special operation was handled and completed, None if standard execution should proceed

abstractmethod begin()[source]

Begin a database transaction on the current connection.

Return type:

None

abstractmethod commit()[source]

Commit the current transaction on the current connection.

Return type:

None

abstractmethod rollback()[source]

Rollback the current transaction on the current connection.

Return type:

None

abstractmethod with_cursor(connection)[source]

Create and return a context manager for cursor acquisition and cleanup.

Returns a context manager that yields a cursor for database operations. Concrete implementations handle database-specific cursor creation and cleanup.

Return type:

Any

abstractmethod handle_database_exceptions()[source]

Handle database-specific exceptions and wrap them appropriately.

Return type:

SyncExceptionHandler

Returns:

Exception handler with deferred exception pattern for mypyc compatibility. The handler stores mapped exceptions in pending_exception rather than raising from __exit__ to avoid ABI boundary violations.

execute(statement, /, *parameters, statement_config=None, **kwargs)[source]

Execute a statement with parameter handling.

execute_many(statement, /, parameters, *filters, statement_config=None, **kwargs)[source]

Execute statement multiple times with different parameters.

Parameters passed will be used as the batch execution sequence.

execute_script(statement, /, *parameters, statement_config=None, **kwargs)[source]

Execute a multi-statement script.

By default, validates each statement and logs warnings for dangerous operations. Use suppress_warnings=True for migrations and admin scripts.

select(statement, /, *parameters, schema_type=None, statement_config=None, **kwargs)[source]

Execute a select statement and return all rows.

fetch(statement, /, *parameters, schema_type=None, statement_config=None, **kwargs)[source]

Execute a select statement and return all rows.

This is an alias for select() provided for users familiar with asyncpg’s fetch() naming convention.

See also

select(): Primary method with identical behavior

select_one(statement, /, *parameters, schema_type=None, statement_config=None, **kwargs)[source]

Execute a select statement and return exactly one row.

Raises an exception if no rows or more than one row is returned.

fetch_one(statement, /, *parameters, schema_type=None, statement_config=None, **kwargs)[source]

Execute a select statement and return exactly one row.

This is an alias for select_one() provided for users familiar with asyncpg’s fetch_one() naming convention.

Raises an exception if no rows or more than one row is returned.

See also

select_one(): Primary method with identical behavior

select_one_or_none(statement, /, *parameters, schema_type=None, statement_config=None, **kwargs)[source]

Execute a select statement and return at most one row.

Returns None if no rows are found. Raises ValueError if more than one row is returned. Any database or SQL execution errors raised by the driver are propagated unchanged.

fetch_one_or_none(statement, /, *parameters, schema_type=None, statement_config=None, **kwargs)[source]

Execute a select statement and return at most one row.

This is an alias for select_one_or_none() provided for users familiar with asyncpg’s fetch_one_or_none() naming convention.

Returns None if no rows are found. Raises an exception if more than one row is returned.

See also

select_one_or_none(): Primary method with identical behavior

select_value(statement, /, *parameters, value_type=None, statement_config=None, **kwargs)[source]

Execute a select statement and return a single scalar value.

Expects exactly one row with one column. Raises an exception if no rows or more than one row/column is returned.

Parameters:
  • statement – SQL statement or query builder to execute.

  • *parameters – Positional parameters for the statement.

  • value_type – Optional type to convert the result to. When provided, the return value is converted to this type and the return type is narrowed for type checkers. Supports int, float, str, bool, datetime, date, time, Decimal, UUID, Path, dict, and list.

  • statement_config – Optional statement configuration.

  • **kwargs – Additional keyword arguments.

Returns:

The scalar value, optionally converted to the specified type.

Raises:
  • ValueError – If no rows or more than one row/column is returned.

  • TypeError – If value_type is provided and conversion fails.

Examples

Basic usage (returns Any):

>>> count = driver.select_value("SELECT COUNT(*) FROM users")

With type hint (returns int):

>>> count = driver.select_value(
...     "SELECT COUNT(*) FROM users", value_type=int
... )

With UUID conversion:

>>> user_id = driver.select_value(
...     "SELECT id FROM users WHERE name = :name",
...     {"name": "alice"},
...     value_type=UUID,
... )
fetch_value(statement, /, *parameters, value_type=None, statement_config=None, **kwargs)[source]

Execute a select statement and return a single scalar value.

This is an alias for select_value() provided for users familiar with asyncpg’s fetch_value() naming convention.

Expects exactly one row with one column. Raises an exception if no rows or more than one row/column is returned.

See also

select_value(): Primary method with identical behavior

select_value_or_none(statement, /, *parameters, value_type=None, statement_config=None, **kwargs)[source]

Execute a select statement and return a single scalar value or None.

Returns None if no rows are found. Expects at most one row with one column. Raises an exception if more than one row is returned.

Parameters:
  • statement – SQL statement or query builder to execute.

  • *parameters – Positional parameters for the statement.

  • value_type – Optional type to convert the result to. When provided, the return value is converted to this type and the return type is narrowed to T | None for type checkers. Supports int, float, str, bool, datetime, date, time, Decimal, UUID, Path, dict, and list.

  • statement_config – Optional statement configuration.

  • **kwargs – Additional keyword arguments.

Returns:

The scalar value (optionally converted), or None if no rows found.

Raises:
  • ValueError – If more than one row is returned.

  • TypeError – If value_type is provided and conversion fails.

Examples

Basic usage:

>>> email = driver.select_value_or_none(
...     "SELECT email FROM users WHERE id = :id", {"id": 123}
... )
>>> if email is not None:
...     print(email)

With type hint:

>>> count = driver.select_value_or_none(
...     "SELECT COUNT(*) FROM users WHERE active = true",
...     value_type=int,
... )
>>> if count is not None and count > 0:
...     print(f"Found {count} active users")
fetch_value_or_none(statement, /, *parameters, value_type=None, statement_config=None, **kwargs)[source]

Execute a select statement and return a single scalar value or None.

This is an alias for select_value_or_none() provided for users familiar with asyncpg’s fetch_value_or_none() naming convention.

Returns None if no rows are found. Expects at most one row with one column. Raises an exception if more than one row is returned.

See also

select_value_or_none(): Primary method with identical behavior

select_with_total(statement, /, *parameters, schema_type=None, statement_config=None, count_with_window=False, **kwargs)[source]

Execute a select statement and return both the data and total count.

This method is designed for pagination scenarios where you need both the current page of data and the total number of rows that match the query.

Parameters:
  • statement – The SQL statement, QueryBuilder, or raw SQL string

  • *parameters – Parameters for the SQL statement

  • schema_type – Optional schema type for data transformation

  • statement_config – Optional SQL configuration

  • count_with_window – If True, use a single query with COUNT(*) OVER() window function instead of two separate queries. This can be more efficient for some databases but adds a column to each row. Default False.

  • **kwargs – Additional keyword arguments

Returns:

  • List of data rows (transformed by schema_type if provided)

  • Total count of rows matching the query (ignoring LIMIT/OFFSET)

Return type:

A tuple containing

Example

>>> # Two-query approach (default):
>>> data, total = driver.select_with_total(
...     sql
...     .select("*")
...     .from_("users")
...     .where_eq("status", "active")
...     .limit(10)
... )
>>> # Single-query with window function:
>>> data, total = driver.select_with_total(
...     sql
...     .select("*")
...     .from_("users")
...     .where_eq("status", "active")
...     .limit(10),
...     count_with_window=True,
... )
fetch_with_total(statement, /, *parameters, schema_type=None, statement_config=None, count_with_window=False, **kwargs)[source]

Execute a select statement and return both the data and total count.

This is an alias for select_with_total() provided for users familiar with asyncpg’s fetch() naming convention.

This method is designed for pagination scenarios where you need both the current page of data and the total number of rows that match the query.

See also

select_with_total(): Primary method with identical behavior and full documentation

select_to_arrow(statement, /, *parameters, statement_config=None, return_format='table', native_only=False, batch_size=None, arrow_schema=None, **kwargs)[source]

Execute query and return results as Apache Arrow format.

This base implementation uses the conversion path: execute() → dict → Arrow. Adapters with native Arrow support (ADBC, DuckDB, BigQuery) override this method to use zero-copy native paths for 5-10x performance improvement.

Parameters:
  • statement – SQL query string, Statement, or QueryBuilder

  • *parameters – Query parameters (same format as execute()/select())

  • statement_config – Optional statement configuration override

  • return_format – “table” for pyarrow.Table (default), “batch” for single RecordBatch, “batches” for iterator of RecordBatches, “reader” for RecordBatchReader

  • native_only – If True, raise error if native Arrow unavailable (default: False)

  • batch_size – Rows per batch for “batch”/”batches” format (default: None = all rows)

  • arrow_schema – Optional pyarrow.Schema for type casting

  • **kwargs – Additional keyword arguments

Returns:

ArrowResult containing pyarrow.Table, RecordBatchReader, or RecordBatches

Raises:

ImproperConfigurationError – If native_only=True and adapter doesn’t support native Arrow

Examples

>>> result = driver.select_to_arrow(
...     "SELECT * FROM users WHERE age > ?", 18
... )
>>> df = result.to_pandas()
>>> print(df.head())
>>> # Force native Arrow path (raises error if unavailable)
>>> result = driver.select_to_arrow(
...     "SELECT * FROM users", native_only=True
... )
fetch_to_arrow(statement, /, *parameters, statement_config=None, return_format='table', native_only=False, batch_size=None, arrow_schema=None, **kwargs)[source]

Execute query and return results as Apache Arrow format.

This is an alias for select_to_arrow() provided for users familiar with asyncpg’s fetch() naming convention.

See also

select_to_arrow(): Primary method with identical behavior and full documentation

execute_stack(stack, *, continue_on_error=False)[source]

Execute a StatementStack sequentially using the adapter’s primitives.

Return type:

tuple[StackResult, ...]

select_to_storage(statement, destination, /, *parameters, statement_config=None, partitioner=None, format_hint=None, telemetry=None)[source]

Stream a SELECT statement directly into storage.

Parameters:
  • statement – SQL statement to execute.

  • destination – Storage destination path.

  • parameters – Query parameters.

  • statement_config – Optional statement configuration.

  • partitioner – Optional partitioner configuration.

  • format_hint – Optional format hint for storage.

  • telemetry – Optional telemetry dict to merge.

Returns:

StorageBridgeJob with execution telemetry.

Raises:

StorageCapabilityError – If not implemented.

load_from_arrow(table, source, *, partitioner=None, overwrite=False)[source]

Load Arrow data into the target table.

Parameters:
  • table – Target table name.

  • source – Arrow data source.

  • partitioner – Optional partitioner configuration.

  • overwrite – Whether to overwrite existing data.

Returns:

StorageBridgeJob with execution telemetry.

Raises:

StorageCapabilityError – If not implemented.

load_from_storage(table, source, *, file_format, partitioner=None, overwrite=False)[source]

Load artifacts from storage into the target table.

Parameters:
  • table (str) – Target table name.

  • source (str | Path) – Storage source path.

  • file_format (Literal['jsonl', 'json', 'parquet', 'arrow-ipc']) – File format of source.

  • partitioner (dict[str, object] | None) – Optional partitioner configuration.

  • overwrite (bool) – Whether to overwrite existing data.

Return type:

StorageBridgeJob

Returns:

StorageBridgeJob with execution telemetry.

Raises:

StorageCapabilityError – If not implemented.

stage_artifact(request)[source]

Provision staging metadata for adapters that require remote URIs.

Parameters:

request (dict[str, typing.Any]) – Staging request configuration.

Return type:

dict[str, typing.Any]

Returns:

Staging metadata dict.

Raises:

StorageCapabilityError – If not implemented.

flush_staging_artifacts(artifacts, *, error=None)[source]

Clean up staged artifacts after a job completes.

Parameters:
  • artifacts (list[dict[str, typing.Any]]) – List of staging artifacts to clean up.

  • error (Exception | None) – Optional error that triggered cleanup.

Return type:

None

get_storage_job(job_id)[source]

Fetch a previously created job handle.

Parameters:

job_id (str) – Job identifier.

Return type:

StorageBridgeJob | None

Returns:

StorageBridgeJob if found, None otherwise.

convert_to_dialect(statement, to_dialect=None, pretty=True)[source]

Convert a statement to a target SQL dialect.

Parameters:
  • statement (str | Expression | SQL) – SQL statement to convert.

  • to_dialect (Union[str, Dialect, Type[Dialect], None]) – Target dialect (defaults to current dialect).

  • pretty (bool) – Whether to format the output SQL.

Return type:

str

Returns:

SQL string in target dialect.

Asynchronous Driver

class sqlspec.driver.AsyncDriverAdapterBase[source]

Bases: CommonDriverAttributesMixin

Base class for asynchronous database drivers.

This class includes flattened storage and SQL translation methods that were previously in StorageDriverMixin and SQLTranslatorMixin. The flattening eliminates cross-trait attribute access that caused mypyc segmentation faults.

Method Organization:
  1. Core dispatch methods (the execution engine)

  2. Transaction management (abstract methods)

  3. Public API - execution methods

  4. Public API - query methods (select/fetch variants)

  5. Arrow API methods

  6. Stack execution

  7. Storage API methods

  8. Utility methods

  9. Private/internal methods

dialect: DialectType | None = None
property is_async: bool

Return whether the driver executes asynchronously.

Returns:

True for async drivers.

abstract property data_dictionary: AsyncDataDictionaryBase

Get the data dictionary for this driver.

Returns:

Data dictionary instance for metadata queries

final async dispatch_statement_execution(statement, connection)[source]

Central execution dispatcher using the Template Method Pattern.

Parameters:
  • statement (SQL) – The SQL statement to execute

  • connection (typing.Any) – The database connection to use

Return type:

SQLResult

Returns:

The result of the SQL execution

abstractmethod async dispatch_execute(cursor, statement)[source]

Execute a single SQL statement.

Must be implemented by each driver for database-specific execution logic.

Parameters:
  • cursor (Any) – Database cursor/connection object

  • statement (SQL) – SQL statement object with all necessary data and configuration

Return type:

ExecutionResult

Returns:

ExecutionResult with execution data

abstractmethod async dispatch_execute_many(cursor, statement)[source]

Execute SQL with multiple parameter sets (executemany).

Must be implemented by each driver for database-specific executemany logic.

Parameters:
  • cursor (Any) – Database cursor/connection object

  • statement (SQL) – SQL statement object with all necessary data and configuration

Return type:

ExecutionResult

Returns:

ExecutionResult with execution data for the many operation

async dispatch_execute_script(cursor, statement)[source]

Execute a SQL script containing multiple statements.

Default implementation splits the script and executes statements individually. Drivers can override for database-specific script execution methods.

Parameters:
  • cursor (Any) – Database cursor/connection object

  • statement (SQL) – SQL statement object with all necessary data and configuration

Return type:

ExecutionResult

Returns:

ExecutionResult with script execution data including statement counts

async dispatch_special_handling(cursor, statement)[source]

Hook for database-specific special operations (e.g., PostgreSQL COPY, bulk operations).

This method is called first in dispatch_statement_execution() to allow drivers to handle special operations that don’t follow the standard SQL execution pattern.

Parameters:
  • cursor (Any) – Database cursor/connection object

  • statement (SQL) – SQL statement to analyze

Return type:

SQLResult | None

Returns:

SQLResult if the special operation was handled and completed, None if standard execution should proceed

abstractmethod async begin()[source]

Begin a database transaction on the current connection.

Return type:

None

abstractmethod async commit()[source]

Commit the current transaction on the current connection.

Return type:

None

abstractmethod async rollback()[source]

Rollback the current transaction on the current connection.

Return type:

None

abstractmethod with_cursor(connection)[source]

Create and return an async context manager for cursor acquisition and cleanup.

Returns an async context manager that yields a cursor for database operations. Concrete implementations handle database-specific cursor creation and cleanup.

Return type:

Any

abstractmethod handle_database_exceptions()[source]

Handle database-specific exceptions and wrap them appropriately.

Return type:

AsyncExceptionHandler

Returns:

Exception handler with deferred exception pattern for mypyc compatibility. The handler stores mapped exceptions in pending_exception rather than raising from __aexit__ to avoid ABI boundary violations.

async execute(statement, /, *parameters, statement_config=None, **kwargs)[source]

Execute a statement with parameter handling.

async execute_many(statement, /, parameters, *filters, statement_config=None, **kwargs)[source]

Execute statement multiple times with different parameters.

Parameters passed will be used as the batch execution sequence.

async execute_script(statement, /, *parameters, statement_config=None, **kwargs)[source]

Execute a multi-statement script.

By default, validates each statement and logs warnings for dangerous operations. Use suppress_warnings=True for migrations and admin scripts.

async select(statement, /, *parameters, schema_type=None, statement_config=None, **kwargs)[source]

Execute a select statement and return all rows.

async fetch(statement, /, *parameters, schema_type=None, statement_config=None, **kwargs)[source]

Execute a select statement and return all rows.

This is an alias for select() provided for users familiar with asyncpg’s fetch() naming convention.

See also

select(): Primary method with identical behavior

async select_one(statement, /, *parameters, schema_type=None, statement_config=None, **kwargs)[source]

Execute a select statement and return exactly one row.

Raises an exception if no rows or more than one row is returned.

async fetch_one(statement, /, *parameters, schema_type=None, statement_config=None, **kwargs)[source]

Execute a select statement and return exactly one row.

This is an alias for select_one() provided for users familiar with asyncpg’s fetch_one() naming convention.

Raises an exception if no rows or more than one row is returned.

See also

select_one(): Primary method with identical behavior

async select_one_or_none(statement, /, *parameters, schema_type=None, statement_config=None, **kwargs)[source]

Execute a select statement and return at most one row.

Returns None if no rows are found. Raises ValueError if more than one row is returned. Any database or SQL execution errors raised by the driver are propagated unchanged.

async fetch_one_or_none(statement, /, *parameters, schema_type=None, statement_config=None, **kwargs)[source]

Execute a select statement and return at most one row.

This is an alias for select_one_or_none() provided for users familiar with asyncpg’s fetch_one_or_none() naming convention.

Returns None if no rows are found. Raises an exception if more than one row is returned.

See also

select_one_or_none(): Primary method with identical behavior

async select_value(statement, /, *parameters, value_type=None, statement_config=None, **kwargs)[source]

Execute a select statement and return a single scalar value.

Expects exactly one row with one column. Raises an exception if no rows or more than one row/column is returned.

Parameters:
  • statement – SQL statement or query builder to execute.

  • *parameters – Positional parameters for the statement.

  • value_type – Optional type to convert the result to. When provided, the return value is converted to this type and the return type is narrowed for type checkers. Supports int, float, str, bool, datetime, date, time, Decimal, UUID, Path, dict, and list.

  • statement_config – Optional statement configuration.

  • **kwargs – Additional keyword arguments.

Returns:

The scalar value, optionally converted to the specified type.

Raises:
  • ValueError – If no rows or more than one row/column is returned.

  • TypeError – If value_type is provided and conversion fails.

Examples

Basic usage (returns Any):

>>> count = await driver.select_value(
...     "SELECT COUNT(*) FROM users"
... )

With type hint (returns int):

>>> count = await driver.select_value(
...     "SELECT COUNT(*) FROM users", value_type=int
... )

With UUID conversion:

>>> user_id = await driver.select_value(
...     "SELECT id FROM users WHERE name = :name",
...     {"name": "alice"},
...     value_type=UUID,
... )
async fetch_value(statement, /, *parameters, value_type=None, statement_config=None, **kwargs)[source]

Execute a select statement and return a single scalar value.

This is an alias for select_value() provided for users familiar with asyncpg’s fetch_value() naming convention.

Expects exactly one row with one column. Raises an exception if no rows or more than one row/column is returned.

See also

select_value(): Primary method with identical behavior

async select_value_or_none(statement, /, *parameters, value_type=None, statement_config=None, **kwargs)[source]

Execute a select statement and return a single scalar value or None.

Returns None if no rows are found. Expects at most one row with one column. Raises an exception if more than one row is returned.

Parameters:
  • statement – SQL statement or query builder to execute.

  • *parameters – Positional parameters for the statement.

  • value_type – Optional type to convert the result to. When provided, the return value is converted to this type and the return type is narrowed to T | None for type checkers. Supports int, float, str, bool, datetime, date, time, Decimal, UUID, Path, dict, and list.

  • statement_config – Optional statement configuration.

  • **kwargs – Additional keyword arguments.

Returns:

The scalar value (optionally converted), or None if no rows found.

Raises:
  • ValueError – If more than one row is returned.

  • TypeError – If value_type is provided and conversion fails.

Examples

Basic usage:

>>> email = await driver.select_value_or_none(
...     "SELECT email FROM users WHERE id = :id", {"id": 123}
... )
>>> if email is not None:
...     print(email)

With type hint:

>>> count = await driver.select_value_or_none(
...     "SELECT COUNT(*) FROM users WHERE active = true",
...     value_type=int,
... )
>>> if count is not None and count > 0:
...     print(f"Found {count} active users")
async fetch_value_or_none(statement, /, *parameters, value_type=None, statement_config=None, **kwargs)[source]

Execute a select statement and return a single scalar value or None.

This is an alias for select_value_or_none() provided for users familiar with asyncpg’s fetch_value_or_none() naming convention.

Returns None if no rows are found. Expects at most one row with one column. Raises an exception if more than one row is returned.

See also

select_value_or_none(): Primary method with identical behavior

async select_with_total(statement, /, *parameters, schema_type=None, statement_config=None, count_with_window=False, **kwargs)[source]

Execute a select statement and return both the data and total count.

This method is designed for pagination scenarios where you need both the current page of data and the total number of rows that match the query.

Parameters:
  • statement – The SQL statement, QueryBuilder, or raw SQL string

  • *parameters – Parameters for the SQL statement

  • schema_type – Optional schema type for data transformation

  • statement_config – Optional SQL configuration

  • count_with_window – If True, use a single query with COUNT(*) OVER() window function instead of two separate queries. This can be more efficient for some databases but adds a column to each row. Default False.

  • **kwargs – Additional keyword arguments

Returns:

  • List of data rows (transformed by schema_type if provided)

  • Total count of rows matching the query (ignoring LIMIT/OFFSET)

Return type:

A tuple containing

Example

>>> # Two-query approach (default):
>>> data, total = await driver.select_with_total(
...     sql
...     .select("*")
...     .from_("users")
...     .where_eq("status", "active")
...     .limit(10)
... )
>>> # Single-query with window function:
>>> data, total = await driver.select_with_total(
...     sql
...     .select("*")
...     .from_("users")
...     .where_eq("status", "active")
...     .limit(10),
...     count_with_window=True,
... )
async fetch_with_total(statement, /, *parameters, schema_type=None, statement_config=None, count_with_window=False, **kwargs)[source]

Execute a select statement and return both the data and total count.

This is an alias for select_with_total() provided for users familiar with asyncpg’s fetch() naming convention.

This method is designed for pagination scenarios where you need both the current page of data and the total number of rows that match the query.

See also

select_with_total(): Primary method with identical behavior and full documentation

async select_to_arrow(statement, /, *parameters, statement_config=None, return_format='table', native_only=False, batch_size=None, arrow_schema=None, **kwargs)[source]

Execute query and return results as Apache Arrow format (async).

This base implementation uses the conversion path: execute() → dict → Arrow. Adapters with native Arrow support (ADBC, DuckDB, BigQuery) override this method to use zero-copy native paths for 5-10x performance improvement.

Parameters:
  • statement – SQL query string, Statement, or QueryBuilder

  • *parameters – Query parameters (same format as execute()/select())

  • statement_config – Optional statement configuration override

  • return_format – “table” for pyarrow.Table (default), “batch” for single RecordBatch, “batches” for iterator of RecordBatches, “reader” for RecordBatchReader

  • native_only – If True, raise error if native Arrow unavailable (default: False)

  • batch_size – Rows per batch for “batch”/”batches” format (default: None = all rows)

  • arrow_schema – Optional pyarrow.Schema for type casting

  • **kwargs – Additional keyword arguments

Returns:

ArrowResult containing pyarrow.Table, RecordBatchReader, or RecordBatches

Raises:

ImproperConfigurationError – If native_only=True and adapter doesn’t support native Arrow

Examples

>>> result = await driver.select_to_arrow(
...     "SELECT * FROM users WHERE age > ?", 18
... )
>>> df = result.to_pandas()
>>> print(df.head())
>>> # Force native Arrow path (raises error if unavailable)
>>> result = await driver.select_to_arrow(
...     "SELECT * FROM users", native_only=True
... )
async fetch_to_arrow(statement, /, *parameters, statement_config=None, return_format='table', native_only=False, batch_size=None, arrow_schema=None, **kwargs)[source]

Execute query and return results as Apache Arrow format (async).

This is an alias for select_to_arrow() provided for users familiar with asyncpg’s fetch() naming convention.

See also

select_to_arrow(): Primary method with identical behavior and full documentation

async execute_stack(stack, *, continue_on_error=False)[source]

Execute a StatementStack sequentially using the adapter’s primitives.

Return type:

tuple[StackResult, ...]

async select_to_storage(statement, destination, /, *parameters, statement_config=None, partitioner=None, format_hint=None, telemetry=None)[source]

Stream a SELECT statement directly into storage.

Parameters:
  • statement – SQL statement to execute.

  • destination – Storage destination path.

  • parameters – Query parameters.

  • statement_config – Optional statement configuration.

  • partitioner – Optional partitioner configuration.

  • format_hint – Optional format hint for storage.

  • telemetry – Optional telemetry dict to merge.

Returns:

StorageBridgeJob with execution telemetry.

async load_from_arrow(table, source, *, partitioner=None, overwrite=False)[source]

Load Arrow data into the target table.

Parameters:
  • table – Target table name.

  • source – Arrow data source.

  • partitioner – Optional partitioner configuration.

  • overwrite – Whether to overwrite existing data.

Returns:

StorageBridgeJob with execution telemetry.

Raises:

NotImplementedError – If not implemented.

async load_from_storage(table, source, *, file_format, partitioner=None, overwrite=False)[source]

Load artifacts from storage into the target table.

Parameters:
  • table (str) – Target table name.

  • source (str | Path) – Storage source path.

  • file_format (Literal['jsonl', 'json', 'parquet', 'arrow-ipc']) – File format of source.

  • partitioner (dict[str, object] | None) – Optional partitioner configuration.

  • overwrite (bool) – Whether to overwrite existing data.

Return type:

StorageBridgeJob

Returns:

StorageBridgeJob with execution telemetry.

stage_artifact(request)[source]

Provision staging metadata for adapters that require remote URIs.

Parameters:

request (dict[str, typing.Any]) – Staging request configuration.

Return type:

dict[str, typing.Any]

Returns:

Staging metadata dict.

flush_staging_artifacts(artifacts, *, error=None)[source]

Clean up staged artifacts after a job completes.

Parameters:
  • artifacts (list[dict[str, typing.Any]]) – List of staging artifacts to clean up.

  • error (Exception | None) – Optional error that triggered cleanup.

Return type:

None

get_storage_job(job_id)[source]

Fetch a previously created job handle.

Parameters:

job_id (str) – Job identifier.

Return type:

StorageBridgeJob | None

Returns:

StorageBridgeJob if found, None otherwise.

convert_to_dialect(statement, to_dialect=None, pretty=True)[source]

Convert a statement to a target SQL dialect.

Parameters:
  • statement (str | Expression | SQL) – SQL statement to convert.

  • to_dialect (Union[str, Dialect, Type[Dialect], None]) – Target dialect (defaults to current dialect).

  • pretty (bool) – Whether to format the output SQL.

Return type:

str

Returns:

SQL string in target dialect.

Data Dictionary

class sqlspec.driver.DataDictionaryMixin[source]

Bases: object

Mixin providing common data dictionary functionality.

Includes version caching to avoid repeated database queries when checking feature flags or optimal types.

__init__()[source]
get_cached_version(driver_id)[source]

Get cached version info for a driver.

Parameters:

driver_id (int) – The id() of the driver instance.

Return type:

tuple[bool, VersionInfo | None]

Returns:

Tuple of (was_cached, version_info). If was_cached is False, the caller should fetch the version and call cache_version().

cache_version(driver_id, version)[source]

Cache version info for a driver.

Parameters:
  • driver_id (int) – The id() of the driver instance.

  • version (VersionInfo | None) – The version info to cache (can be None if detection failed).

Return type:

None

get_cached_version_for_driver(driver)[source]

Get cached version info for a driver instance.

Parameters:

driver (Any) – Database driver instance.

Return type:

tuple[bool, VersionInfo | None]

Returns:

Tuple of (was_cached, version_info).

cache_version_for_driver(driver, version)[source]

Cache version info for a driver instance.

Parameters:
  • driver (Any) – Database driver instance.

  • version (VersionInfo | None) – Parsed version info or None.

Return type:

None

parse_version_string(version_str)[source]

Parse version string into VersionInfo.

Parameters:

version_str (str) – Raw version string from database

Return type:

VersionInfo | None

Returns:

VersionInfo instance or None if parsing fails

parse_version_with_pattern(pattern, version_str)[source]

Parse version string using a specific regex pattern.

Parameters:
  • pattern (Pattern[str]) – Compiled regex pattern for the version format

  • version_str (str) – Raw version string from database

Return type:

VersionInfo | None

Returns:

VersionInfo instance or None if parsing fails

detect_version_with_queries(driver, queries)[source]

Try multiple version queries to detect database version.

Parameters:
  • driver (HasExecuteProtocol) – Database driver with execute support

  • queries (list[str]) – List of SQL queries to try

Return type:

VersionInfo | None

Returns:

Version information or None if detection fails

get_default_type_mapping()[source]

Get default type mappings for common categories.

Return type:

dict[str, str]

Returns:

Dictionary mapping type categories to generic SQL types

get_default_features()[source]

Get default feature flags supported by most databases.

Return type:

list[str]

Returns:

List of commonly supported feature names

sort_tables_topologically(tables, foreign_keys)[source]

Sort tables topologically based on foreign key dependencies using Python.

Parameters:
  • tables (list[str]) – List of table names.

  • foreign_keys (list[ForeignKeyMetadata]) – List of foreign key metadata.

Return type:

list[str]

Returns:

List of table names in topological order (dependencies first).

Notes

Self-referencing foreign keys are ignored to avoid simple cycles, and every dependency is added with the referencing table depending on its referenced table.

class sqlspec.driver.AsyncDataDictionaryBase[source]

Bases: object

Base class for asynchronous data dictionary implementations.

Uses Python-compatible class layouts for cross-module inheritance. Child classes define dialect as a class attribute.

dialect: ClassVar[str]

Dialect identifier. Must be defined by subclasses as a class attribute.

__init__()[source]
get_dialect_config()[source]

Return the dialect configuration for this data dictionary.

Return type:

DialectConfig

get_query(name)[source]

Return a named SQL query for this dialect.

Return type:

SQL

get_query_text(name)[source]

Return raw SQL text for a named query for this dialect.

Return type:

str

get_query_text_or_none(name)[source]

Return raw SQL text for a named query or None if missing.

Return type:

str | None

resolve_schema(schema)[source]

Return a schema name using dialect defaults when missing.

Return type:

str | None

resolve_feature_flag(feature, version)[source]

Resolve a feature flag using dialect config and version info.

Return type:

bool

get_cached_version(driver_id)[source]

Get cached version info for a driver.

Parameters:

driver_id (int) – The id() of the driver instance.

Return type:

object

Returns:

Tuple of (was_cached, version_info). If was_cached is False, the caller should fetch the version and call cache_version().

cache_version(driver_id, version)[source]

Cache version info for a driver.

Parameters:
  • driver_id (int) – The id() of the driver instance.

  • version (VersionInfo | None) – The version info to cache (can be None if detection failed).

Return type:

None

parse_version_string(version_str)[source]

Parse version string into VersionInfo.

Parameters:

version_str (str) – Raw version string from database

Return type:

VersionInfo | None

Returns:

VersionInfo instance or None if parsing fails

parse_version_with_pattern(pattern, version_str)[source]

Parse version string using a specific regex pattern.

Parameters:
  • pattern (Pattern[str]) – Compiled regex pattern for the version format

  • version_str (str) – Raw version string from database

Return type:

VersionInfo | None

Returns:

VersionInfo instance or None if parsing fails

detect_version_with_queries(driver, queries)[source]

Try multiple version queries to detect database version.

Parameters:
  • driver (HasExecuteProtocol) – Database driver with execute support

  • queries (list[str]) – List of SQL queries to try

Return type:

VersionInfo | None

Returns:

Version information or None if detection fails

get_default_type_mapping()[source]

Get default type mappings for common categories.

Return type:

dict[str, str]

Returns:

Dictionary mapping type categories to generic SQL types

get_default_features()[source]

Get default feature flags supported by most databases.

Return type:

list[str]

Returns:

List of commonly supported feature names

sort_tables_topologically(tables, foreign_keys)[source]

Sort tables topologically based on foreign key dependencies.

Parameters:
  • tables (list[str]) – List of table names.

  • foreign_keys (list[ForeignKeyMetadata]) – List of foreign key metadata.

Return type:

list[str]

Returns:

List of table names in topological order (dependencies first).

get_cached_version_for_driver(driver)[source]

Get cached version info for a driver instance.

Parameters:

driver (Any) – Async database driver instance.

Return type:

object

Returns:

Tuple of (was_cached, version_info).

cache_version_for_driver(driver, version)[source]

Cache version info for a driver instance.

Parameters:
  • driver (Any) – Async database driver instance.

  • version (VersionInfo | None) – Parsed version info or None.

Return type:

None

abstractmethod async get_version(driver)[source]

Get database version information.

Parameters:

driver (Any) – Async database driver instance

Return type:

VersionInfo | None

Returns:

Version information or None if detection fails

abstractmethod async get_feature_flag(driver, feature)[source]

Check if database supports a specific feature.

Parameters:
  • driver (Any) – Async database driver instance

  • feature (str) – Feature name to check

Return type:

bool

Returns:

True if feature is supported, False otherwise

abstractmethod async get_optimal_type(driver, type_category)[source]

Get optimal database type for a category.

Parameters:
  • driver (Any) – Async database driver instance

  • type_category (str) – Type category (e.g., ‘json’, ‘uuid’, ‘boolean’)

Return type:

str

Returns:

Database-specific type name

abstractmethod async get_tables(driver, schema=None)[source]

Get list of tables in schema.

Parameters:
  • driver (Any) – Async database driver instance

  • schema (str | None) – Schema name (None for default)

Return type:

list[TableMetadata]

Returns:

List of table metadata dictionaries

abstractmethod async get_columns(driver, table=None, schema=None)[source]

Get column information for a table or schema.

Parameters:
  • driver (Any) – Async database driver instance

  • table (str | None) – Table name (None to fetch columns for all tables in schema)

  • schema (str | None) – Schema name (None for default)

Return type:

list[ColumnMetadata]

Returns:

List of column metadata dictionaries

abstractmethod async get_indexes(driver, table=None, schema=None)[source]

Get index information for a table or schema.

Parameters:
  • driver (Any) – Async database driver instance

  • table (str | None) – Table name (None to fetch indexes for all tables in schema)

  • schema (str | None) – Schema name (None for default)

Return type:

list[IndexMetadata]

Returns:

List of index metadata dictionaries

abstractmethod async get_foreign_keys(driver, table=None, schema=None)[source]

Get foreign key metadata.

Parameters:
  • driver (Any) – Async database driver instance

  • table (str | None) – Optional table name filter

  • schema (str | None) – Optional schema name filter

Return type:

list[ForeignKeyMetadata]

Returns:

List of foreign key metadata

list_available_features()[source]

List all features that can be checked via get_feature_flag.

Return type:

list[str]

Returns:

List of feature names this data dictionary supports

class sqlspec.driver.SyncDataDictionaryBase[source]

Bases: object

Base class for synchronous data dictionary implementations.

Uses Python-compatible class layouts for cross-module inheritance. Child classes define dialect as a class attribute.

dialect: ClassVar[str]

Dialect identifier. Must be defined by subclasses as a class attribute.

__init__()[source]
get_dialect_config()[source]

Return the dialect configuration for this data dictionary.

Return type:

DialectConfig

get_query(name)[source]

Return a named SQL query for this dialect.

Return type:

SQL

get_query_text(name)[source]

Return raw SQL text for a named query for this dialect.

Return type:

str

get_query_text_or_none(name)[source]

Return raw SQL text for a named query or None if missing.

Return type:

str | None

resolve_schema(schema)[source]

Return a schema name using dialect defaults when missing.

Return type:

str | None

resolve_feature_flag(feature, version)[source]

Resolve a feature flag using dialect config and version info.

Return type:

bool

get_cached_version(driver_id)[source]

Get cached version info for a driver.

Parameters:

driver_id (int) – The id() of the driver instance.

Return type:

object

Returns:

Tuple of (was_cached, version_info). If was_cached is False, the caller should fetch the version and call cache_version().

cache_version(driver_id, version)[source]

Cache version info for a driver.

Parameters:
  • driver_id (int) – The id() of the driver instance.

  • version (VersionInfo | None) – The version info to cache (can be None if detection failed).

Return type:

None

parse_version_string(version_str)[source]

Parse version string into VersionInfo.

Parameters:

version_str (str) – Raw version string from database

Return type:

VersionInfo | None

Returns:

VersionInfo instance or None if parsing fails

parse_version_with_pattern(pattern, version_str)[source]

Parse version string using a specific regex pattern.

Parameters:
  • pattern (Pattern[str]) – Compiled regex pattern for the version format

  • version_str (str) – Raw version string from database

Return type:

VersionInfo | None

Returns:

VersionInfo instance or None if parsing fails

detect_version_with_queries(driver, queries)[source]

Try multiple version queries to detect database version.

Parameters:
  • driver (HasExecuteProtocol) – Database driver with execute support

  • queries (list[str]) – List of SQL queries to try

Return type:

VersionInfo | None

Returns:

Version information or None if detection fails

get_default_type_mapping()[source]

Get default type mappings for common categories.

Return type:

dict[str, str]

Returns:

Dictionary mapping type categories to generic SQL types

get_default_features()[source]

Get default feature flags supported by most databases.

Return type:

list[str]

Returns:

List of commonly supported feature names

sort_tables_topologically(tables, foreign_keys)[source]

Sort tables topologically based on foreign key dependencies.

Parameters:
  • tables (list[str]) – List of table names.

  • foreign_keys (list[ForeignKeyMetadata]) – List of foreign key metadata.

Return type:

list[str]

Returns:

List of table names in topological order (dependencies first).

get_cached_version_for_driver(driver)[source]

Get cached version info for a driver instance.

Parameters:

driver (Any) – Sync database driver instance.

Return type:

object

Returns:

Tuple of (was_cached, version_info).

cache_version_for_driver(driver, version)[source]

Cache version info for a driver instance.

Parameters:
  • driver (Any) – Sync database driver instance.

  • version (VersionInfo | None) – Parsed version info or None.

Return type:

None

abstractmethod get_version(driver)[source]

Get database version information.

Parameters:

driver (Any) – Sync database driver instance

Return type:

VersionInfo | None

Returns:

Version information or None if detection fails

abstractmethod get_feature_flag(driver, feature)[source]

Check if database supports a specific feature.

Parameters:
  • driver (Any) – Sync database driver instance

  • feature (str) – Feature name to check

Return type:

bool

Returns:

True if feature is supported, False otherwise

abstractmethod get_optimal_type(driver, type_category)[source]

Get optimal database type for a category.

Parameters:
  • driver (Any) – Sync database driver instance

  • type_category (str) – Type category (e.g., ‘json’, ‘uuid’, ‘boolean’)

Return type:

str

Returns:

Database-specific type name

abstractmethod get_tables(driver, schema=None)[source]

Get list of tables in schema.

Parameters:
  • driver (Any) – Sync database driver instance

  • schema (str | None) – Schema name (None for default)

Return type:

list[TableMetadata]

Returns:

List of table metadata dictionaries

abstractmethod get_columns(driver, table=None, schema=None)[source]

Get column information for a table or schema.

Parameters:
  • driver (Any) – Sync database driver instance

  • table (str | None) – Table name (None to fetch columns for all tables in schema)

  • schema (str | None) – Schema name (None for default)

Return type:

list[ColumnMetadata]

Returns:

List of column metadata dictionaries

abstractmethod get_indexes(driver, table=None, schema=None)[source]

Get index information for a table or schema.

Parameters:
  • driver (Any) – Sync database driver instance

  • table (str | None) – Table name (None to fetch indexes for all tables in schema)

  • schema (str | None) – Schema name (None for default)

Return type:

list[IndexMetadata]

Returns:

List of index metadata dictionaries

abstractmethod get_foreign_keys(driver, table=None, schema=None)[source]

Get foreign key metadata.

Parameters:
  • driver (Any) – Sync database driver instance

  • table (str | None) – Optional table name filter

  • schema (str | None) – Optional schema name filter

Return type:

list[ForeignKeyMetadata]

Returns:

List of foreign key metadata

list_available_features()[source]

List all features that can be checked via get_feature_flag.

Return type:

list[str]

Returns:

List of feature names this data dictionary supports

Feature Flag Types

class sqlspec.data_dictionary.FeatureFlags[source]

Bases: TypedDict

Typed feature flags for data dictionary dialects.

supports_arrays: bool
supports_clustering: bool
supports_cte: bool
supports_generators: bool
supports_geography: bool
supports_in_memory: bool
supports_index_clustering: bool
supports_interleaved_tables: bool
supports_json: bool
supports_maps: bool
supports_partitioning: bool
supports_prepared_statements: bool
supports_returning: bool
supports_schemas: bool
supports_structs: bool
supports_transactions: bool
supports_upsert: bool
supports_uuid: bool
supports_window_functions: bool
class sqlspec.data_dictionary.FeatureVersions[source]

Bases: TypedDict

Typed feature version requirements for data dictionary dialects.

supports_cte: VersionInfo
supports_json: VersionInfo
supports_jsonb: VersionInfo
supports_partitioning: VersionInfo
supports_returning: VersionInfo
supports_upsert: VersionInfo
supports_window_functions: VersionInfo

Driver Protocols