matrix_io

data.matrix_io

Low-level matrix read/write for HDF5 groups and Zarr groups.

Supports dense, CSR, and COO formats. Both HDF5 (h5py.Group) and Zarr (zarr.Group) implement the same array-store interface, so a single pair of read/write functions handles both backends.

See §12.1 of the tvbo HDF5 format proposal v0.7.

Classes

Name Description
LazyArrayStore Lazy-loading wrapper for companion binary files (HDF5/Zarr/CSV).

LazyArrayStore

data.matrix_io.LazyArrayStore(companion_path, meta_dict)

Lazy-loading wrapper for companion binary files (HDF5/Zarr/CSV).

Stores a reference to the companion path and sidecar metadata. Arrays are loaded on first access and cached thereafter.

Parameters

companion_path : Path
    Path to the companion binary file (.h5, .zarr, .csv).
meta_dict : dict
    Raw sidecar dict (from yaml_loader.load_as_dict) for edge traversal.

Attributes

Name Description
arrays Edge matrices, loaded lazily on first access.
edge_params Edge parameters, loaded lazily on first access.
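
The load-on-first-access behaviour described above can be illustrated with a minimal sketch. This is not the actual LazyArrayStore implementation: the class name LazyStoreSketch, the loader callable, and the load_count counter are hypothetical and exist only to make the caching visible.

```python
from functools import cached_property


class LazyStoreSketch:
    """Hypothetical stand-in, not the real LazyArrayStore."""

    def __init__(self, loader):
        # `loader` is an assumed zero-argument callable that performs
        # the expensive read from the companion file.
        self._loader = loader
        self.load_count = 0

    @cached_property
    def arrays(self):
        # Runs only on first access; cached_property then stores the
        # result on the instance, so later accesses skip the loader.
        self.load_count += 1
        return self._loader()


store = LazyStoreSketch(lambda: {"weights": [[0.0, 1.0], [1.0, 0.0]]})
loads_before_access = store.load_count   # nothing is read at construction
first = store.arrays                     # triggers the load
second = store.arrays                    # served from the cache
```

With this pattern, constructing the store stays cheap even when the companion file is large, and repeated attribute access returns the same object.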

Methods

Name Description
read_dataset Read an arbitrary dataset by path (e.g. "nodes/parent_index").

read_dataset

data.matrix_io.LazyArrayStore.read_dataset(key)

Read an arbitrary dataset by path (e.g. "nodes/parent_index").
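
As a rough illustration of what a slash-separated key such as "nodes/parent_index" means in a hierarchical store, the sketch below resolves the key one component at a time. A nested dict stands in for the HDF5/Zarr hierarchy, and read_by_path is a hypothetical helper, not part of data.matrix_io.

```python
def read_by_path(root, key):
    """Resolve a slash-separated key against a nested mapping.

    `root` is an assumed stand-in for a group hierarchy; a missing
    component raises KeyError, as a plain dict lookup would.
    """
    node = root
    for part in key.split("/"):
        node = node[part]
    return node


# Hypothetical hierarchy with one group ("nodes") and one dataset.
tree = {"nodes": {"parent_index": [-1, 0, 0, 1]}}
parent_index = read_by_path(tree, "nodes/parent_index")
```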

Functions

Name Description
auto_format Select optimal storage format based on empirical analysis (§11).
read_matrix Read a matrix from an HDF5/Zarr group, returning a dense NumPy array.
write_matrix Write a matrix to an HDF5/Zarr group in the specified format.

auto_format

data.matrix_io.auto_format(matrix)

Select optimal storage format based on empirical analysis (§11).

Rules (data-driven from tvbo corpus measurements):

- N < 500 or fill > 30%: dense + gzip wins
- Otherwise: CSR

Handles both dense arrays and scipy sparse matrices without densifying the input.

Parameters

matrix : array-like or scipy.sparse matrix
    Matrix to analyze.

Returns

str
    "dense" or "csr"
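
The selection rule can be sketched as a small pure-Python function. Several things here are assumptions: choose_format is a hypothetical name, N is taken to be the number of rows, and fill is taken to be the fraction of nonzero entries (nnz divided by rows times columns). Because only the shape and the nonzero count are needed, the same rule can be applied to a scipy sparse matrix through its shape and nnz attributes without densifying it.

```python
def choose_format(n_rows, n_cols, nnz):
    """Apply the documented rule to a matrix summary (hypothetical helper).

    Assumes N means the row count and fill means nnz / (n_rows * n_cols).
    """
    total = n_rows * n_cols
    fill = nnz / total if total else 0.0
    # Small or well-filled matrices: dense storage (with gzip) wins.
    if n_rows < 500 or fill > 0.30:
        return "dense"
    # Large and sparse: CSR.
    return "csr"


small = choose_format(100, 100, 10)                  # N < 500
large_filled = choose_format(1000, 1000, 400_000)    # fill = 40%
large_sparse = choose_format(1000, 1000, 5_000)      # fill = 0.5%
```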

read_matrix

data.matrix_io.read_matrix(grp)

Read a matrix from an HDF5/Zarr group, returning a dense NumPy array.

Parameters

grp : h5py.Group or zarr.Group
    Source group containing format/shape attrs and data datasets.

Returns

np.ndarray
    Dense NumPy array.
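
To show what returning a dense array from a CSR-encoded group involves, here is a minimal sketch of the standard CSR expansion in plain Python. The component names data, indices, and indptr follow the usual CSR convention and are an assumption; this page does not state the dataset names that read_matrix actually uses, and csr_to_dense is a hypothetical helper.

```python
def csr_to_dense(data, indices, indptr, shape):
    """Expand CSR components into a dense list-of-lists (hypothetical helper)."""
    n_rows, n_cols = shape
    dense = [[0.0] * n_cols for _ in range(n_rows)]
    for row in range(n_rows):
        # indptr[row]:indptr[row + 1] delimits this row's stored entries.
        for k in range(indptr[row], indptr[row + 1]):
            dense[row][indices[k]] = data[k]
    return dense


# A 2x3 matrix with three stored entries.
dense = csr_to_dense(
    data=[1.0, 2.0, 3.0],
    indices=[1, 0, 2],
    indptr=[0, 1, 3],
    shape=(2, 3),
)
```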

write_matrix

data.matrix_io.write_matrix(grp, matrix, fmt='dense')

Write a matrix to an HDF5/Zarr group in the specified format.

Parameters

grp : h5py.Group
 or zarr.Group
    Target group.
matrix : array-like
    Matrix data to write.
fmt : str
    Storage format: "dense", "csr", or "coo".
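
For orientation, the sketch below shows how the same dense matrix decomposes into the components typically stored for the "coo" and "csr" formats. It is an illustration in plain Python under the usual conventions for those formats, not the layout write_matrix is confirmed to use; dense_to_coo and dense_to_csr are hypothetical helpers.

```python
def dense_to_coo(matrix):
    """Return (rows, cols, data) for the nonzero entries (hypothetical helper)."""
    rows, cols, data = [], [], []
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            if value != 0:
                rows.append(i)
                cols.append(j)
                data.append(value)
    return rows, cols, data


def dense_to_csr(matrix):
    """Return (data, indices, indptr) in row-major order (hypothetical helper)."""
    data, indices, indptr = [], [], [0]
    for row in matrix:
        for j, value in enumerate(row):
            if value != 0:
                data.append(value)
                indices.append(j)
        # Running count of stored entries marks where the next row starts.
        indptr.append(len(data))
    return data, indices, indptr


matrix = [[0.0, 1.0, 0.0], [2.0, 0.0, 3.0]]
coo = dense_to_coo(matrix)
csr = dense_to_csr(matrix)
```

COO stores one (row, column, value) triple per nonzero entry, while CSR replaces the per-entry row index with a row-pointer array of length rows + 1, which is why CSR tends to be the more compact of the two for larger matrices.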