Model Definition Language

The models generated by this package form a Model Definition Language for writing JSON Schema in Python. Models in this syntax can be written by hand, or generated on the fly from JSON Schema documents.

Primitives

Each JSON Schema is described by an Element object. The simplest possible schema (equivalent to {} or true) is expressed as

>>> from statham.schema.elements import Element
>>> element = Element()

This element will accept any value:

>>> element(1)
1
>>> element("a string!")
'a string!'

Validation can be added by using typed elements and keyword arguments:

>>> from statham.schema.elements import Boolean, String
>>> Element(minimum=3)
>>> String(maxLength=20)
>>> Boolean(default=True)

The following primitive elements are available:

String Format Validation

Element and String both support the "format" validation keyword. statham validates two formats out-of-the-box: "date-time" and "uuid".

Custom string formats may be added by registering them. The following example shows how to register format validation for an RFC 3986 URI, as well as a completely custom format:

from rfc3986_validator import validate_rfc3986
from statham.schema.validation import format_checker

format_checker.register("uri")(validate_rfc3986)

@format_checker.register("no_bad_words")
def _validate_custom_format(value: str) -> bool:
    """Make sure there are no bad words in the string."""
    for bad_word in ("bad", "words"):
        if bad_word in value:
            return False
    return True

statham will not fail validation if it finds an unknown format, but it will raise a warning.
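
The register-then-warn behaviour described above can be pictured with a small stand-alone sketch. It uses only the standard library and is not statham's implementation: FormatRegistry and its check() method are hypothetical names chosen for illustration.

```python
import warnings
from typing import Callable, Dict

Validator = Callable[[str], bool]


class FormatRegistry:
    """Hypothetical stand-in for a format checker; not statham's own class."""

    def __init__(self) -> None:
        self._validators: Dict[str, Validator] = {}

    def register(self, name: str) -> Callable[[Validator], Validator]:
        """Return a decorator that stores a validator under `name`."""

        def decorator(func: Validator) -> Validator:
            self._validators[name] = func
            return func

        return decorator

    def check(self, name: str, value: str) -> bool:
        """Validate `value`; an unknown format warns and then passes."""
        validator = self._validators.get(name)
        if validator is None:
            warnings.warn(f"No validator registered for format {name!r}.")
            return True
        return validator(value)


registry = FormatRegistry()


@registry.register("no_bad_words")
def no_bad_words(value: str) -> bool:
    """Make sure there are no bad words in the string."""
    return not any(bad_word in value for bad_word in ("bad", "words"))
```

In this sketch, registry.check("some_unknown_format", "x") emits a warning and returns True, which mirrors the "warn but do not fail" rule stated above.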

Containers

Elements accepting list and dict values include schemas for validating their contained items. When called, these elements will recursively validate both the container and its contained items.

Array

Array accepts an Element as its only positional argument. This corresponds to the "items" JSON Schema keyword.

>>> from statham.schema.elements import Array, String
>>> array = Array(String())
>>> array(["a", "string"])
['a', 'string']
>>> array([1, 2])
ValidationError: Failed validating `1`. Must be of type (str).

Array will also accept a list of elements as its "items". In this case, each list item will be validated against the Element at the corresponding index:

>>> from statham.schema.elements import Array, Integer, Number, String
>>> array = Array([Integer(), String()])
>>> array([1, "a string"])
[1, 'a string']
>>> array(["two", "strings"])
ValidationError: Failed validating `'two'`. Must be of type (int).

When items schemas are declared in this way, any values beyond the declared positions are validated against the additionalItems keyword, which by default allows anything.

>>> array([1, "a string", 23.0])  # Accepts any additional items
[1, 'a string', 23.0]
>>> array = Array([Integer(), String()], additionalItems=False)
>>> array([1, "a string", 23.0])
ValidationError: Failed validating `[1, 'a string', 23.0]`. Must not contain additional items. Accepts: [Integer(), String()]
>>> array = Array([Integer(), String()], additionalItems=Number())
>>> array([1, "a string", 23.0])
[1, 'a string', 23.0]
>>> array([1, "a string", "an unexpected string"])
ValidationError: Failed validating `'an unexpected string'`. Must be of type (float,int).
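
The interplay between positional "items" and "additionalItems" can be summarised in a few lines of plain Python. This is a minimal sketch of the JSON Schema rule only, not statham's code: validate_items and the is_int, is_str and is_number helpers are hypothetical, and they return booleans rather than raising ValidationError.

```python
from typing import Any, Callable, List, Sequence, Union

Check = Callable[[Any], bool]


def validate_items(
    values: Sequence[Any],
    items: List[Check],
    additional_items: Union[bool, Check] = True,
) -> bool:
    """Check `values` position by position, then apply `additional_items`."""
    # Values with a declared position are checked against that position's schema.
    for value, check in zip(values, items):
        if not check(value):
            return False
    extra = values[len(items):]
    if additional_items is True:
        return True  # Default: anything is allowed.
    if additional_items is False:
        return not extra  # No values beyond the declared positions.
    # Otherwise `additional_items` is itself a schema for the extra values.
    return all(additional_items(value) for value in extra)


def is_int(value: Any) -> bool:
    # JSON Schema does not treat booleans as integers.
    return isinstance(value, int) and not isinstance(value, bool)


def is_str(value: Any) -> bool:
    return isinstance(value, str)


def is_number(value: Any) -> bool:
    return isinstance(value, (int, float)) and not isinstance(value, bool)
```

Note that a list shorter than the declared positions still passes in this sketch, since positional "items" never requires the values to be present.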

Object

Object is a special case, and key to leveraging type-checking with statham models. Object-typed schemas are declared as sub-classes of Object.

>>> from statham.schema.constants import Maybe
>>> from statham.schema.elements import Object, String
>>> from statham.schema.property import Property
>>>
>>> class StringWrapper(Object):
...     value: Maybe[str] = Property(String())
>>>
>>> StringWrapper({"value": "a string"})
StringWrapper(value='a string')

The Property descriptor is used to declare which properties are required, and to rename properties whose names are not valid Python attribute names:

>>> class CustomObject(Object):
...     class_: str = Property(String(), required=True, source="class")
>>>
>>> CustomObject({"class": "ABC"})
CustomObject(class_='ABC')

By default, properties are not required, and do not need to be present when instantiating the class. The statham.schema.constants.Maybe generic type is used to annotate such optional properties, as in the StringWrapper example above.

Additional keywords may be set on the schema via class arguments:

>>> class StringWrapper(Object, additionalProperties=False):
...     value: Maybe[str] = Property(String())
>>>
>>> StringWrapper({"other": "a string"})
ValidationError: Failed validating `{'other': 'a string'}`. Must not contain unspecified properties. Accepts: {'value'}

Properties which are accepted via additionalProperties or patternProperties are accessible via __getitem__():

>>> class StringWrapper(Object):
...     value = Property(String())
>>>
>>> value = StringWrapper({"value": "a string", "other": "another string"})
>>> value["other"]
'another string'

Object elements may also be declared via an inline constructor as follows:

>>> StringWrapper = Object.inline("StringWrapper", properties={"value": Property(String())})
>>> StringWrapper({"value": "a string"})
StringWrapper(value='a string')

However, elements declared this way will not have the same type hinting support as those declared using class notation.

Note

It is possible to pass "object" values to Element. Assuming all validation passes, the return value will be an instance of a dict subclass allowing attribute access to its keys. This allows a consistent interface with Object instances.

>>> element = Element()
>>> instance = element({"value": "foo"})
>>> instance.value
'foo'
>>> instance.value == instance["value"]
True
>>> instance
{'value': 'foo'}
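
A dict subclass with attribute access can be written in a few lines. The following is a minimal sketch of the idea, not the class statham actually returns: AttrDict is a hypothetical name.

```python
from typing import Any


class AttrDict(dict):
    """Hypothetical dict subclass exposing its keys as attributes."""

    def __getattr__(self, name: str) -> Any:
        # Only called when normal attribute lookup fails, so dict methods
        # such as .keys() and .items() keep working as usual.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None


instance = AttrDict({"value": "foo"})
```

Because the class adds no state of its own, the instance still compares, prints and serialises like an ordinary dict.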

Composition

Elements for composition keywords (e.g. "not", "anyOf", "oneOf", "allOf") break from the standard JSON Schema structure. statham does not allow outer keywords when a composition keyword is present, with the exception of the "default" keyword. This reduces the number of possible ways to write the same schema, without making any schema impossible.

For example, consider the following schema which allows any string, provided it is not a UUID.

{
    "type": "string",
    "not": {"format": "uuid"}
}

The equivalent form is achieved in statham with AllOf:

from statham.schema.elements import (
    AllOf,
    Element,
    Not,
    String,
)

element = AllOf(String(), Not(Element(format="uuid")))

Similarly, schemas with multiple types are achieved with AnyOf:

{
    "type": ["string", "integer"]
}

may be expressed as

from statham.schema.elements import AnyOf, Integer, String

element = AnyOf(String(), Integer())
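
To see why the composed forms accept the same values as the original schemas, it can help to model each schema as a predicate. The sketch below is plain Python and makes no claim about how statham implements composition: all_of, any_of, not_ and the is_* helpers are hypothetical, and uuid.UUID is used as a lenient stand-in for "uuid" format validation.

```python
import uuid
from typing import Any, Callable

Check = Callable[[Any], bool]


def all_of(*checks: Check) -> Check:
    """Accept a value only if every check accepts it."""
    return lambda value: all(check(value) for check in checks)


def any_of(*checks: Check) -> Check:
    """Accept a value if at least one check accepts it."""
    return lambda value: any(check(value) for check in checks)


def not_(check: Check) -> Check:
    """Accept a value only if the wrapped check rejects it."""
    return lambda value: not check(value)


def is_string(value: Any) -> bool:
    return isinstance(value, str)


def is_integer(value: Any) -> bool:
    return isinstance(value, int) and not isinstance(value, bool)


def has_uuid_format(value: Any) -> bool:
    """Mirror JSON Schema, where "format" only constrains strings."""
    if not isinstance(value, str):
        return True
    try:
        uuid.UUID(value)  # Assumption: more lenient than a strict UUID check.
    except ValueError:
        return False
    return True


# {"type": "string", "not": {"format": "uuid"}}
non_uuid_string = all_of(is_string, not_(has_uuid_format))
# {"type": ["string", "integer"]}
string_or_integer = any_of(is_string, is_integer)
```

Moving the outer "type" keyword into its own branch of all_of changes nothing about which values pass, which is the sense in which no schema becomes impossible to express.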

There are four composition elements available, one for each of the keywords above: Not, AnyOf, OneOf and AllOf.

Parsing JSON Schema Documents

JSON Schema documents can be directly parsed to statham elements, without generating any code. This reduces the benefit gained by type hints, but can still be useful for inspecting JSON Schemas in Python, and using functionality like "default".

For simple schemas, with no definitions, parse_element() can be used.

>>> from statham.schema.parser import parse_element
>>> parse_element({"type": "string", "maxLength": 20})
String(maxLength=20)

If your schema contains multiple definitions and you would like to parse all of them, use parse(). This returns a list of elements, starting with the top-level schema, followed by the schemas found in definitions. Be aware that an empty top-level schema is parsed (correctly) as a blank schema, i.e. Element().

Note

These parsing tools make the following assumptions:

  1. The schema has already been dereferenced

  2. Any "object" schemas have a "title" annotation

statham uses another library to do this automatically when performing code generation. You can do it yourself like so:

>>> from json_ref_dict import materialize, RefDict
>>> from statham.titles import title_labeller
>>>
>>> schema = materialize(
...     RefDict.from_uri(<uri>), context_labeller=title_labeller()
... )

For more information about what this is doing, look at json-ref-dict.
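
Before handing a hand-written schema to the parser, it may be worth checking the second assumption yourself. The helper below is a rough sketch using only the standard library; untitled_objects is a hypothetical name and is not part of statham or json-ref-dict.

```python
from typing import Any, Iterator


def untitled_objects(schema: Any, path: str = "#") -> Iterator[str]:
    """Yield pointer-like paths of "object" schemas that lack a "title".

    Simplifications: every nested dict is treated as a potential schema
    (so "default" or "enum" values may produce false positives), and
    list-valued "type" keywords are not inspected.
    """
    if isinstance(schema, dict):
        if schema.get("type") == "object" and "title" not in schema:
            yield path
        for key, value in schema.items():
            yield from untitled_objects(value, f"{path}/{key}")
    elif isinstance(schema, list):
        for index, value in enumerate(schema):
            yield from untitled_objects(value, f"{path}/{index}")
```

An empty result suggests that every "object" schema carries a title; any path that is yielded points at a schema that would need one added.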