Collections: namedtuple¶
Namedtuple: Introduction¶
collections.namedtuple
is a simple factory function for building more advanced tuples. Just
like tuples they permit indexing, are iterable and the main benefit they offer over a standard
tuple is that attributes can be accessed by name. Here is a basic introduction to named tuples:
from collections import namedtuple Foo = namedtuple("Foo", "bar, baz", defaults=(100, 200)) print(Foo()) # Foo(bar=100, baz=200)
As you can see from this small snippet, namedtuple also offer a nice dunder __repr__ implementation right off the bat. We will discuss more throughout this article, especially the factory function arguments in great detail, but for now just know that namedtuples create tuple subclasses with attribute access.
Namedtuple: Factory Function¶
The namedtuple(...)
factory function offers a lot of additional arguments, often overlooked and
to be honest, rarely used, however for a full overview, we will discuss each argument and what it does
with an example.
collections.namedtuple( typename: str, field_names: Iterable[str], *, rename: bool = False, defaults: Optional[Any] = None, module: Optional[Any] = None )
Namedtuple: typename¶
typename is the new tuple subclass name. In order for pickling to be natively supported the typename should match the name of the variable assigned to the tuple subclass. This is briefly shown below:
import pickle from collections import namedtuple Foo = namedtuple("Bar", "a,b,c", defaults=(200,300)) f = Foo(a=25) print(f) # Bar(a=25, b=200, c=300) pickle.dumps(f) # PicklingError: Can't pickle <class '__main__.Bar'> Foo2 = namedtuple("Foo2", "a,b,c", defaults=(200, 300)) f = Foo2(25) fbytes = pickle.dumps(f) # b'\x80\x04\x95 \x00\x00\x00\x00\x00\x00\x00\x8c\x08__main__\x94\x8c\x04Foo2\x94\x93\x94K\x19K\xc8M,\x01\x87\x94\x81\x94...."
Namedtuple: field_names¶
field_names
is a sequence of strings or an individual string of the attribute names to be
assigned to the underlying tuple subclass. In the latter, attribute names are automatically
resolved by splitting on either a comma, or whitespace, named tuples do not have an
underlying __dict__
instance (think __slots__
) which is what allows them to compete
with standard tuples on memory.
from collections import namedtuple f = namedtuple("f", ["one", "two", "Three"]) f2 = namedtuple("f2", "one two three") f3 = namedtuple("f3", "one, two, three")
field names can be any valid python identifier except for anything starting with an underscore. named tuple has an extra param we will discuss later rename= which is used for rewriting illegal field names automatically with a prefixed underscore.
Namedtuple: rename¶
As previously outlined, rename works in tandem with the field_names argument in order to automatically rewrite name violations with a prefixed _ positional names, where each violation is incremented += 1. For example:
from collections import namedtuple One = namedtuple("One", "one, def, two, class, three, return", rename=True) one = One(10, 20, 30, 40, 50, 60) # One(one=10, _1=20, two=30, _3=40, three=50, _5=60)
As you can see in the example, field names def, class and return are also python core builtin reserved keywords, these have automatically been rewritten with _<n> for each violation in the sequence passed to field_names.
Namedtuple: defaults¶
namedtuple defaults is an iterable
of names to unpack into the fields when a value is omitted.
By default, the values are unpacked from <- right to left, so if there are three field names
defined a,b,c and two defaults defaults=(100, 200) then b == 100 and c == 200, a
is a required field in this instance. defaults=
can also be None
in which case, all
field_name attributes are required.
from collections import namedtuple Foo = namedtuple("Foo", "a,b,c", defaults=(10, 20)) f = Foo() # __new__() missing 1 required positional argument: 'a' f = Foo(2000) print(f) # Foo(a=2000, b=10, c=20)
namedtuple: module¶
namedtuple allows you to customise the module of the tuple subclass, if module= is assigned
then the dunder __module__
of the namedtuple will be set to that. __module__
is a writable
field defining the name of the module the function was defined in. This is shown below (using an
interactive ipython shell where by default the module would be __main__
.
from collections import namedtuple T1 = namedtuple("T1", "a,b,c", defaults=(1,2,3), module="foomod") T2 = namedtuple("T2", "a,b,c", defaults=(3,2,1)) # T1 has a custom module name assigned; let's inspect its instances: T1().__module__ # foomod # T2 omits the module attribute from the sig T2().__module__ # __main__
Namedtuple: misc¶
In order for namedtuple instances to be a core part of the python language, they need to retain some of the benefits of standard tuple types. Namedtuples do not have a per instance dictionary (only a class one) this is how they are able to retain the same memory footprint as normal tuples. They are of course also immutable and in order to support pickling by default, the variable named assigned to the namedtuple instance should match that of the defined typename. These are outlined below:
# -- Memory Footprint from sys import getsizeof t = (100, 200, 300) nt = namedtuple("Foo", "a b c", defaults=(100,200,300))() # Create the instance! getsizeof(t) # 64 bytes getsizeof(nt) # 64 bytes # -- Immutability Immutability = namedtuple("Immutability", "one, two", defaults=(500, 600)) immutable = Immutability() immutable.__dict__ # `Immutability object has no attribute __dict__` immutable.one, immutable.two # (500, 600) immutable.one = 2 # AttributeError: Cannot set attribute # -- Pickle capabilities import pickle Works = namedtuple("Works", "a") w = Works(10) pickle.dumps(w) # bytes no problem. DoesntWork = namedtuple("Different", "a") d = DoesntWork(20) pickle.dumps(d) # PicklingError: Can't pickle <class '__main__.Different'>: attribute lookup Different on __main__ failed
Namedtuple: _make¶
The first of the three main methods that are bolted onto namedtuple instances. _make is a @classmethod. that uses tuple.__new__ under the hood to create a new namedtuple instance from an iterable.
from collections import namedtuple T = namedtuple("T", "a b c", defaults=(100,150, 200)) t = T() # T(a=100, b=150, c=200) t2 = t._make((5,15,25)) # T(a=5, b=15, c=25)
Namedtuple: _asdict¶
The second of the three main methods that are bolted onto namedtuple instances. _asdict returns a dictionary of the namedtuple instance attributes and corresponding values. As of python 3.8 the _asdict function returns a normal dictionary, if you need the benefits of an OrderedDict consider instantiating one directly using this _asdict function:
from collections import namedtuple from collections import OrderedDict T = namedtuple("T", "a,b", defaults=(50, 100)) t1 = T() mapping = t1._asdict() # {"a": 50, "b": 100} order = OrderDict(t1._asdict()) # OrderedDict([('a', 50), ('b', 100)])
Namedtuple: _replace¶
The third of the three main methods bolted onto namedtuple instances is _replace. This allows you to create a new instance of the tuple subclass, replacing fields of the existing instance with keys and respective values from the **kwargs mapping:
from collections import namedtuple T = namedtuple("T", "a,b,c", defaults=(12, 24, 36)) t = T() # T(a=12, b=24, c=36) mapping = {"c": 4000} t2 = t._replace(**mapping) t2 # T(a=12, b=24, c=4000)
Namedtuple: _fields¶
The _fields instance attribute is used for a simple tuple of the namedtuple instance field names. This is useful for introspection and new namedtuple instances containing a subset of an existing instances fields, this recipe is outlined below:
from collections import namedtuple T = namedtuple("T", "a") t = T(100) # T(a=100) T2 = namedtuple("T2", t._fields + ("b", "c"), defaults=(50, 60,70)) t2 = T2() t2 # T2(a=50, b=60, c=70)
Namedtuple: _field_defaults¶
Namedtuple _field_defaults returns a mapping of fields to their respective default values:
from collections import namedtuple T = namedtuple("T", "one, two, three", defaults=("three", "two")) t = T(500) t._field_defaults # {"two": "three", "three": "two"}