Resource Validation

New validation functionality has recently been added to the BabelFSH app. This has not yet been documented.

Validation Architecture

BabelFSH is quite defensive in what it accepts from users. First, the FSH code is checked for high-level compliance with the FSH Grammar that comes from SUSHI. Next, during parsing, some semantic rules are enforced in the FSH code. The terminology comments are checked for high-level compliance with the syntax of command line arguments. When this is done, the rules in the FSH code are applied and validated for correctness against the FHIR resource definitions. The overall process is shown below:

FSH-Parsing Validation

Besides validating the compliance of the provided FSH code with the FSH grammar, the parser (listener) also validates a number of high-level rules.

  1. Unsupported states are caught here. The FSH grammar supports all resource types defined in FSH, but those that BabelFSH does not support are rejected in the listener methods. The same applies to ConceptMap, which is implemented using the Resource FSH item. Only the resource type ConceptMap is supported for Resource (using InstanceOf).

  2. Alias: All alias names must start with a $. This is stricter than FSH itself, but aliases conventionally start with a dollar sign anyway, so the rule avoids surprises.

  3. Name: Warning if the name doesn't comply with the FHIR-provided regex, hard error if longer than 255 characters.

  4. CodeSystem must have a Name declaration (not a rule), same for ValueSet.

  5. IDs must be unique across all FSH files.

  6. RuleSet names must be unique across all FSH files.

  7. CodeSystem and ValueSet declarations must either have a terminology plugin comment or reference a RuleSet (which in turn should have a terminology plugin comment, but that's checked when the RuleSet inserts are resolved). The same applies to ConceptMap.

  8. Id must comply with the FHIR regex.

  9. In parametrized rule sets, any insertion of a rule set may not reference a parameter from the outer rule set (see below)

  10. A ConceptMap must have its Usage set to #definition.

Regarding number 9, the following insertion is illegal:

The provided FSH is not semantically correct. InsertRule parameters must not reference parameters from the ruleset (at file: ./foo.babel.fsh, at line: 6, )
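Rules 2, 3, 5 and 8 from the list above can be sketched in a few lines. This is a minimal illustration, not BabelFSH's implementation: the function names and message formats are hypothetical, and the name pattern is an assumption modelled on the FHIR cnl-0 constraint, since the exact regex is not stated here. The id pattern is the one FHIR defines for the id datatype.

```python
import re

# FHIR `id` datatype pattern: letters, digits, '-' and '.', 1 to 64 characters.
FHIR_ID = re.compile(r"[A-Za-z0-9\-\.]{1,64}")
# Assumed name pattern (modelled on FHIR's cnl-0 constraint); the regex that
# BabelFSH actually applies may differ.
FHIR_NAME = re.compile(r"[A-Z]([A-Za-z0-9_]){0,254}")
MAX_NAME_LENGTH = 255


def check_alias(alias: str) -> list[str]:
    """Rule 2: alias names must start with '$'."""
    if alias.startswith("$"):
        return []
    return [f"ERROR: alias '{alias}' must start with '$'"]


def check_name(name: str) -> list[str]:
    """Rule 3: hard error above 255 characters, warning on a regex mismatch."""
    if len(name) > MAX_NAME_LENGTH:
        return [f"ERROR: name is longer than {MAX_NAME_LENGTH} characters"]
    if not FHIR_NAME.fullmatch(name):
        return [f"WARNING: name '{name}' does not match the FHIR name pattern"]
    return []


def check_id(resource_id: str) -> list[str]:
    """Rule 8: ids must match the FHIR id pattern."""
    if FHIR_ID.fullmatch(resource_id):
        return []
    return [f"ERROR: id '{resource_id}' does not match the FHIR id pattern"]


def check_unique_ids(ids_by_file: dict[str, list[str]]) -> list[str]:
    """Rule 5: ids must be unique across all FSH files."""
    first_seen: dict[str, str] = {}
    messages = []
    for file_name, ids in ids_by_file.items():
        for resource_id in ids:
            if resource_id in first_seen:
                messages.append(
                    f"ERROR: id '{resource_id}' in {file_name} "
                    f"is already declared in {first_seen[resource_id]}"
                )
            else:
                first_seen[resource_id] = file_name
    return messages
```

The sketch returns lists of messages instead of raising, so that a warning (rule 3) and a hard error can share one return type.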

FSH Generation Validation

BabelFSH has no concept of FHIR resource definitions. It does not know, for example, that a CodeSystem has a url and version parameter, and does not have a sourceUri. Neither does it know the data types of these elements. However, through the integration of the HAPI FHIR library, it can validate the compliance of generated resources against the definitions for R4(B) and R5.

To ensure that users are pointed towards errors, the resources aren't validated in a single block, but rather iteratively. For that, the validator creates an Abstract Syntax Tree-ish representation of the FSH rules in the source set to group related rules appropriately. Consider this example:

This CodeSystem is parsed into the following AST representation:

Using this AST, the resource is validated iteratively, and since every rule "knows" where it was declared, precise error messages can be generated. This is done by reducing the parse tree to a minimum and evaluating the remaining rules to serialize a FHIR resource, which is then passed to HAPI FHIR. For top-level simple rules, the reduced parse tree looks very simple (Graphviz DOT format):

For array elements, this is more complicated, since arrays can be nested:

In this example, there is only a single contact array element. For properties, however, there are multiple property declarations, which are all validated one after the other:

In these cases, it's hard to pin a HAPI message to a specific line. While the above examples all validate fine, consider a typo in a property declaration, with the description element being called desciption:

Due to the batched validation, BabelFSH will tell the user that the error must have occurred in file X.babel.fsh, lines 2 through 4, in FSH Item RuleSet properties, but it cannot tell the user that the error is exactly in line 4. This is a worthwhile tradeoff for not having to re-implement the HAPI FHIR wheel, and most HAPI errors are quite human-readable anyway.
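The line-range attribution described above can be illustrated with a small sketch. Everything in it is hypothetical: the Rule class, the message format, and in particular stand_in_validator, which merely imitates an unknown-element complaint and is not the HAPI FHIR API. The sketch also assumes that all rules of a batch come from the same file.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Rule:
    file: str   # file in which the rule was declared
    line: int   # line in which the rule was declared
    path: str   # element path, e.g. "property[0].code"
    value: str


def locate(batch: list[Rule]) -> str:
    """Describe where a batch was declared: one line, or a range of lines."""
    first = min(rule.line for rule in batch)
    last = max(rule.line for rule in batch)
    lines = f"line {first}" if first == last else f"lines {first} through {last}"
    return f"{batch[0].file}, {lines}"


def validate_batch(
    batch: list[Rule], validator: Callable[[list[Rule]], list[str]]
) -> list[str]:
    """Validate the whole batch at once and pin each message to the batch's span."""
    return [f"{message} (at file: {locate(batch)})" for message in validator(batch)]


# Stand-in for the real validator: it only knows the element names of
# CodeSystem.property and reports anything else.
KNOWN_PROPERTY_ELEMENTS = {"code", "uri", "description", "type"}


def stand_in_validator(batch: list[Rule]) -> list[str]:
    names = [rule.path.split(".")[-1] for rule in batch]
    return [f"Unknown element '{n}'" for n in names if n not in KNOWN_PROPERTY_ELEMENTS]


batch = [
    Rule("X.babel.fsh", 2, "property[0].code", "#parent"),
    Rule("X.babel.fsh", 3, "property[0].type", "#code"),
    Rule("X.babel.fsh", 4, "property[0].desciption", "Parent concept"),
]
messages = validate_batch(batch, stand_in_validator)
```

Because the validator sees only the serialized batch, the message can name the misspelled element but is pinned to lines 2 through 4 rather than to line 4.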

Parse Tree Reduction Algorithm

To reduce the parse tree (as shown above), the "root nodes" in the AST have to be identified. Those are the nodes that are terminal (e.g. url in the above example), and terminal array elements:

AST for the above CodeSystem

In the above AST, the terminal objects are highlighted in pink. Those are trivial. However, the objects below ARRAY_ELEMENT nodes are not considered "terminal", since validating them correctly might need more than the single element (some invariants defined on the respective data types might result in warnings where appropriate). For this reason, contact.telecom, for example, has to be handled as a single unit.

The "root" nodes of the AST are identified at the beginning of ResourceFactory#validateRules, using RuleParseTree#isTerminalOrTerminalArray. The pink nodes are identified as terminal because they are direct descendants of the RESOURCE node, and the green array elements are identified as terminal arrays. contact[0].telecom[0] isn't identified as "terminal", since the contact[0] node lies on the path from RESOURCE to the second ARRAY_ELEMENT. The objects below ARRAY_ELEMENT nodes are excluded since their parents are ARRAY_ELEMENT nodes.

Based on the list of identified terminal nodes (url, valueSet, contact[0] and identifier[0]), the graph is pruned so that it includes only the RESOURCE node, the terminal nodes, the nodes between RESOURCE and each terminal node, and the nodes below each terminal node.
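The identification and pruning steps can be sketched as follows. This is an illustration under assumptions, not the code behind RuleParseTree#isTerminalOrTerminalArray: the Node class, the four node kinds (in particular the intermediate ARRAY kind), and the function names are hypothetical, and the sketch prunes once per root.

```python
from dataclasses import dataclass, field
from typing import Iterator, Optional


@dataclass(eq=False)
class Node:
    label: str
    kind: str  # assumed kinds: RESOURCE, ELEMENT, ARRAY, ARRAY_ELEMENT
    children: list["Node"] = field(default_factory=list)
    parent: Optional["Node"] = field(default=None, repr=False)

    def add(self, label: str, kind: str) -> "Node":
        child = Node(label, kind, parent=self)
        self.children.append(child)
        return child


def ancestors(node: Node) -> Iterator[Node]:
    """Yield the parent, grandparent, ... up to the RESOURCE node."""
    current = node.parent
    while current is not None:
        yield current
        current = current.parent


def walk(node: Node) -> Iterator[Node]:
    """Pre-order traversal of the subtree rooted at `node`."""
    yield node
    for child in node.children:
        yield from walk(child)


def is_validation_root(node: Node) -> bool:
    if node.kind == "ARRAY_ELEMENT":
        # Terminal array: no other ARRAY_ELEMENT between it and RESOURCE.
        return all(a.kind != "ARRAY_ELEMENT" for a in ancestors(node))
    # Terminal: a leaf element directly below RESOURCE.
    return (
        node.kind == "ELEMENT"
        and not node.children
        and node.parent is not None
        and node.parent.kind == "RESOURCE"
    )


def prune(root: Node) -> list[str]:
    """Labels kept for one root: RESOURCE, the path down to the root, and its subtree."""
    path_down = [a.label for a in ancestors(root)][::-1]
    return path_down + [n.label for n in walk(root)]


resource = Node("CodeSystem", "RESOURCE")
resource.add("url", "ELEMENT")
resource.add("valueSet", "ELEMENT")
contact0 = resource.add("contact", "ARRAY").add("contact[0]", "ARRAY_ELEMENT")
contact0.add("name", "ELEMENT")
contact0.add("telecom", "ARRAY").add("telecom[0]", "ARRAY_ELEMENT").add("system", "ELEMENT")
resource.add("identifier", "ARRAY").add("identifier[0]", "ARRAY_ELEMENT").add("value", "ELEMENT")

roots = [n for n in walk(resource) if is_validation_root(n)]
```

In this model, telecom[0] is rejected because contact[0] is an ARRAY_ELEMENT among its ancestors, while name and system are rejected because their parent is not the RESOURCE node.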

From these pruned graphs, the rules are then evaluated and a JSON representation of the resulting resource is built in memory.
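How path-based rules might turn into a JSON document can be shown with a short sketch. The assign helper is hypothetical and handles only numeric indices such as contact[0]; FSH soft indices like [+] and [=] are deliberately left out, and the URL is a placeholder.

```python
import json
import re

# One path segment: an element name with an optional numeric index.
SEGMENT = re.compile(r"(\w+)(?:\[(\d+)\])?")


def assign(resource: dict, path: str, value) -> None:
    """Set `value` at a dotted path, creating objects and arrays as needed."""
    segments = [SEGMENT.fullmatch(part).groups() for part in path.split(".")]
    target = resource
    for position, (name, index) in enumerate(segments):
        is_last = position == len(segments) - 1
        if index is None:
            if is_last:
                target[name] = value
            else:
                target = target.setdefault(name, {})
        else:
            items = target.setdefault(name, [])
            slot = int(index)
            while len(items) <= slot:  # grow the array up to the index
                items.append({})
            if is_last:
                items[slot] = value
            else:
                target = items[slot]


rules = [
    ("url", "http://example.org/cs"),
    ("contact[0].name", "Jane Doe"),
    ("contact[0].telecom[0].system", "email"),
]
resource = {"resourceType": "CodeSystem"}
for path, value in rules:
    assign(resource, path, value)

serialized = json.dumps(resource)
```

The resulting string is what a validator would receive for this reduced tree; nothing is written to disk.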

TODO

CodeSystem Post-Generation Validation

For CodeSystem resources, it is ensured that all properties in the CS are used correctly (CodeSystemFactory#validateResourceSemantics):

  • All properties MUST be declared if they are used.

  • All properties that are declared SHOULD be used.

  • All declared property data types (property[+].type = #string) must match those that are used in the concepts.
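The three checks above can be sketched as a single pass over the concepts. This is not the code of CodeSystemFactory#validateResourceSemantics: the function name and input shapes are hypothetical, and mapping MUST to an error and SHOULD to a warning is an assumption of this sketch.

```python
def validate_properties(declared: dict[str, str], concepts: list[dict]) -> list[str]:
    """Check property usage.

    declared: property code -> declared type, e.g. {"comment": "string"}
    concepts: dicts with a "code" and a list of (property code, value type) pairs
    """
    messages = []
    used = set()
    for concept in concepts:
        for code, value_type in concept["properties"]:
            used.add(code)
            if code not in declared:
                # Used properties MUST be declared.
                messages.append(
                    f"ERROR: concept '{concept['code']}' uses undeclared property '{code}'"
                )
            elif declared[code] != value_type:
                # Declared types must match the types used in the concepts.
                messages.append(
                    f"ERROR: concept '{concept['code']}' uses property '{code}' as "
                    f"'{value_type}', but it is declared as '{declared[code]}'"
                )
    for code in declared:
        if code not in used:
            # Declared properties SHOULD be used.
            messages.append(f"WARNING: property '{code}' is declared but never used")
    return messages


declared = {"parent": "code", "comment": "string", "status": "code"}
concepts = [
    {
        "code": "A",
        "properties": [("parent", "code"), ("comment", "integer"), ("retired", "boolean")],
    }
]
messages = validate_properties(declared, concepts)
```

For this input, the sketch reports the type mismatch on comment, the undeclared retired property, and a warning for the unused status property.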

For ConceptMap, there is currently no additional validation.
