Standard Name Table

A “standard name table” defines “standard names”, a concept used by the CF Conventions.

These standard names define the meaning of numerical variables in data files (typically netCDF4 files).
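To illustrate the idea without any file I/O, here is a minimal sketch in plain Python. The dictionary `variables` is a hypothetical stand-in for the variables of a file and their attributes, and `unresolved` is a helper invented for this illustration; neither is part of ssnolib.

```python
# Hypothetical in-memory stand-in for the variables of a data file.
# In CF-style files, the meaning is carried by the "standard_name" attribute.
variables = {
    "rho": {"standard_name": "air_density", "units": "kg/m^3"},
    "u": {"standard_name": "velocity", "units": "m/s"},
    "tmp": {"units": "K"},  # no standard_name: the meaning is left open
    "w": {"standard_name": "vertical_speed", "units": "m/s"},
}

# The set of names that a (hypothetical) standard name table allows.
table = {"air_density", "velocity"}


def unresolved(variables, table):
    """Return the variables whose standard_name is missing or not in the table."""
    return sorted(
        name
        for name, attrs in variables.items()
        if attrs.get("standard_name") not in table
    )


print(unresolved(variables, table))  # → ['tmp', 'w']
```

A variable without a standard name, or with one that the table does not list, cannot be interpreted unambiguously by other users or tools.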

With this library, we can describe a standard name table using JSON-LD. Note that only a simplified version of the original CF Conventions is modelled.

This notebook walks you through the main steps of building such a table yourself using Python:

import ssnolib
from ssnolib.namespace import SSNO
from ssnolib.prov import Person, Organization, Attribution
from ontolutils.namespacelib.m4i import M4I

Create a new table

Let’s start by instantiating a table. We add a title and one or more associated “agents”, which can be persons or organizations. More details on how to work with agents can be found here.

# Create two agents, which are persons in this case:
agent1 = ssnolib.Person(
    id="https://orcid.org/0000-0001-8729-0482",
    firstName="Matthias",
    lastName="Probst",
    orcidId="https://orcid.org/0000-0001-8729-0482"
)
# Agent 2 is affiliated with an organization:
orga1 = ssnolib.Organization(name="Awesome Institute")
agent2 = ssnolib.Person(
    firstName="John",
    lastName="Doe",
    mbox="john@doe.com",
    affiliation=orga1
)

# instantiate the table:
snt = ssnolib.StandardNameTable(
    title='SNT from scratch',
    description="A table defined as part of a tutorial",
    version='v1',
    qualifiedAttribution=[
        Attribution(agent=agent1, hadRole=M4I.ContactPerson),
        Attribution(agent=agent2, hadRole=M4I.Supervisor),
        Attribution(agent=orga1)
    ]
)
snt.to_html(folder="tmp")

Add Standard Names

Let’s add some standard names to the table:

snt.standardNames = [
    ssnolib.StandardName(
        standard_name="air_density",
        description="The density of air",
        unit="kg/m^3"
    ),
    ssnolib.StandardName(
        standard_name="coordinate",
        description="The spatial coordinate vector.",
        unit="m"
    ),
    ssnolib.StandardName(
        standard_name="velocity",
        description="The velocity vector of an object or fluid.",
        unit="m/s"
    )
]

So far we have only three standard names. We can define modification rules to build new, verified standard names. For example, “x_velocity” would be a reasonable new standard name for the table.

So let’s define such a modification rule. We call it a Qualification. The one we define here, a vector component, is placed directly in front of any existing standard name (“SSNO:AnyStandardName”):

component = ssnolib.VectorQualification(
    name="component",
    hasValidValues=["x", "y", "z"],
    description="The component of a vector",
    before=SSNO.AnyStandardName
)

transformation = ssnolib.Transformation(
    name="C_derivative_of_X",
    description="derivative of X with respect to distance in the component direction, which may be x, y or z.",
    altersUnit="[X]/[C]",
    hasCharacter=[
        ssnolib.Character(character="X", associatedWith=SSNO.AnyStandardName),
        ssnolib.Character(character="C", associatedWith=component.id),
    ]
)

Add both modifiers to the SNT:

snt.hasModifier = [component, transformation]
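The transformation above combines a name pattern ("C_derivative_of_X") with a unit template ("[X]/[C]"). The following standalone sketch shows how such a rule could be resolved. It is not how ssnolib implements it: `UNITS`, `COMPONENTS`, `COMPONENT_UNIT` and `derive` are hypothetical names, and it assumes that the component direction is a distance measured in metres.

```python
import re

# Hypothetical stand-ins for the table content defined above.
UNITS = {"air_density": "kg/m^3", "coordinate": "m", "velocity": "m/s"}
COMPONENTS = ("x", "y", "z")
COMPONENT_UNIT = "m"  # assumption: the derivative is taken along a distance in metres


def derive(name):
    """Resolve a name that follows the pattern 'C_derivative_of_X'.

    Returns (component, base name, unit string), or None if the name
    does not match the pattern or uses unknown parts.
    """
    match = re.fullmatch(r"([a-z]+)_derivative_of_([a-z_]+)", name)
    if match is None:
        return None
    component, base = match.groups()
    if component not in COMPONENTS or base not in UNITS:
        return None
    # Fill the unit template "[X]/[C]" with the units of X and C.
    unit = "[X]/[C]".replace("[X]", f"({UNITS[base]})").replace("[C]", f"({COMPONENT_UNIT})")
    return component, base, unit


print(derive("x_derivative_of_velocity"))  # → ('x', 'velocity', '(m/s)/(m)')
print(derive("t_derivative_of_velocity"))  # → None
```

The sketch only substitutes strings into the template; simplifying the resulting unit expression would require a unit library.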

We can check whether a standard name string conforms to the modification rules:

snt.verify_name("vertical_velocity")
snt.verify_name("x_velocity")
snt.verify_name("x_component")
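The logic behind such a check can be sketched in plain Python. This is a simplified reading of the component rule only, not the library's implementation; `CORE_NAMES`, `COMPONENT_VALUES` and `is_valid_name` are hypothetical names introduced for this illustration.

```python
# Hypothetical stand-ins for the table defined above; not ssnolib's internals.
CORE_NAMES = {"air_density", "coordinate", "velocity"}
COMPONENT_VALUES = ("x", "y", "z")


def is_valid_name(name):
    """Return True if `name` is a core name or '<component>_<core name>'."""
    if name in CORE_NAMES:
        return True
    # Split off the first token and treat it as a candidate component.
    prefix, sep, rest = name.partition("_")
    return bool(sep) and prefix in COMPONENT_VALUES and rest in CORE_NAMES


for candidate in ("vertical_velocity", "x_velocity", "x_component"):
    print(candidate, is_valid_name(candidate))
```

Under this simplified rule, only "x_velocity" passes: "vertical" is not a valid component value, and "component" is not a standard name in the table.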

New standard names can also be verified when they are added:

#snt.add_new_standard_name("x_coordinate", verify=True) # verify=False will just add the standard name and interpret it as a core standard name

Export standard name tables

We can export to various formats such as JSON-LD or TTL. We can also generate an HTML file:

Serialize TTL:

print(snt.serialize(format="ttl"))

Write an HTML file and a JSON-LD file, then load the table back from the JSON-LD file:

snt.to_html(folder="tmp")
with open(f"tmp/{snt.title}.jsonld", "w", encoding="utf-8") as f:
    f.write(snt.model_dump_jsonld())
snt.title
snt_loaded = ssnolib.StandardNameTable.parse(f"tmp/{snt.title}.jsonld", context={"ssno": "https://example.org/"})
snt_loaded.qualifiedAttribution[0].agent.model_dump(exclude_none=True)
snt_loaded.hasModifier
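Because JSON-LD is plain JSON, an exported file can also be inspected with the standard library alone. The sketch below collects all "@type" values of a document, which gives a quick overview of what was written. The helper `collect_types` and the example document with its `ex:` prefix are hypothetical; a real export will use other keys and prefixes.

```python
import json


def collect_types(node, found=None):
    """Recursively collect every '@type' value in a parsed JSON-LD document."""
    if found is None:
        found = set()
    if isinstance(node, dict):
        types = node.get("@type", [])
        # '@type' may be a single string or a list of strings.
        found.update([types] if isinstance(types, str) else types)
        for value in node.values():
            collect_types(value, found)
    elif isinstance(node, list):
        for item in node:
            collect_types(item, found)
    return found


# Hypothetical document, parsed from a string instead of a file.
example = json.loads("""
{
  "@context": {"ex": "https://example.org/"},
  "@type": "ex:Table",
  "ex:member": [
    {"@type": "ex:Name", "ex:label": "velocity"},
    {"@type": ["ex:Name", "ex:Vector"], "ex:label": "coordinate"}
  ]
}
""")

print(sorted(collect_types(example)))  # → ['ex:Name', 'ex:Table', 'ex:Vector']
```

For a file on disk, `json.load` on the opened file would replace the `json.loads` call.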

Parse a table from an online resource

Let’s parse the CF Convention standard name table, which serves as the role model for this library.

Describing the online resource does not require the SSNO ontology; DCAT is sufficient:

distribution = ssnolib.dcat.Distribution(
    title='XML Table',
    download_URL='https://cfconventions.org/Data/cf-standard-names/current/src/cf-standard-name-table.xml',
    media_type='application/xml'
)
dataset = ssnolib.dcat.Dataset(
    distribution=distribution
)
print(dataset.model_dump_ttl())

But let’s associate our schema:ResearchProject with it:

from ssnolib.schema import Project
proj = Project(name="My Project", usesStandardnameTable=dataset)

Maybe we would like to get all the standard names. We can do this by calling fetch() or by instantiating the standard name table using parse():

from ontolutils import QUDT_UNIT

additional_qudts = {
    # other:
    'kg m-1 s-1': QUDT_UNIT.KiloGM_PER_M_SEC,
    'm-2 s-1': QUDT_UNIT.M2_PER_SEC,
    'K s': QUDT_UNIT.K_SEC,
    'W s m-2': QUDT_UNIT.W_SEC_PER_M2,
    'N m-1': QUDT_UNIT.N_PER_M,
    'mol mol-1': QUDT_UNIT.MOL_PER_MOL,
    'mol/mol': QUDT_UNIT.MOL_PER_MOL,
    'm4 s-1': QUDT_UNIT.M4_PER_SEC,
    'K Pa s-1': QUDT_UNIT.K_PA_PER_SEC,
    'Pa m s-1': QUDT_UNIT.PA_M_PER_SEC,
    'radian': QUDT_UNIT.RAD,
    'degree s-1': QUDT_UNIT.DEG_PER_SEC,
    'Pa m s-2': QUDT_UNIT.PA_M_PER_SEC2,
    'sr': QUDT_UNIT.SR,
    'sr-1': QUDT_UNIT.PER_SR,
    'm year-1': QUDT_UNIT.M_PER_YR,
    'mol m-2 s-1 sr-1': QUDT_UNIT.MOL_PER_M2_SEC_SR,
    'mol m-2 s-1 m-1 sr-1': QUDT_UNIT.MOL_PER_M2_SEC_M_SR,
    'Pa-1 s-1': QUDT_UNIT.PA_PER_SEC,
    'm-1 s-1': QUDT_UNIT.PER_M_SEC,
    'm2 s rad-1': QUDT_UNIT.M2_SEC_PER_RAD,
    'W/m2': QUDT_UNIT.W_PER_M2,
    'dbar': QUDT_UNIT.DeciBAR
}
snt = ssnolib.StandardNameTable.parse(dataset.distribution[0], make_standard_names_lowercase=True, qudt_lookup=additional_qudts)
snt.to_html(folder="tmp")
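To get a feeling for what parsing such an XML table involves, here is a standalone sketch using only the standard library. It is not ssnolib's parser. The XML string is a shortened, hand-written excerpt in the style of the CF table (an assumption about its layout), and `UNIT_LOOKUP`, its identifier strings and `read_entries` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Shortened, hand-written excerpt in the style of the CF XML table (assumption).
XML_EXCERPT = """
<standard_name_table>
  <entry id="air_density">
    <canonical_units>kg m-3</canonical_units>
    <description>Density of air.</description>
  </entry>
  <entry id="Air_Pressure">
    <canonical_units>Pa</canonical_units>
    <description>Pressure of air.</description>
  </entry>
</standard_name_table>
"""

# Hypothetical lookup from unit strings to unit identifiers.
UNIT_LOOKUP = {"kg m-3": "example:kg_per_m3"}


def read_entries(xml_text, lowercase=True):
    """Read all <entry> elements into a list of dictionaries."""
    root = ET.fromstring(xml_text.strip())
    entries = []
    for entry in root.iter("entry"):
        name = entry.get("id")
        units = entry.findtext("canonical_units", default="").strip()
        entries.append({
            "standard_name": name.lower() if lowercase else name,
            "units": units,
            # None signals that the unit string is missing from the lookup.
            "unit_id": UNIT_LOOKUP.get(units),
        })
    return entries


for item in read_entries(XML_EXCERPT):
    print(item)
```

Entries whose `unit_id` is None show which unit strings would need an additional lookup entry, which is the purpose of a dictionary like `additional_qudts` above.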

Write to JSON-LD file:

with open(f"tmp/{snt.title}.jsonld", "w", encoding="utf-8") as f:
    f.write(snt.model_dump_jsonld())

Instantiate a standard name table from a JSON-LD file:

snt = ssnolib.parse_table(f"tmp/{snt.title}.jsonld")