SpringMT-2023: Management Track Assessments Spring 2023

Some features of these PDF and HTML documents were discussed at the TUG 2023 meeting, in Ross Moore’s presentation video.

The method of production follows these steps.

  1. First, the LaTeX source is processed to build a Tagged PDF that is valid for the PDF/UA-1 format.
  2. The resulting PDF is converted into an HTML version using ngPDF.
  3. The HTML page and any images are then downloaded and copied to this location.
  4. Some post-processing with sed commands is done on the HTML to remove some incorrect structures, invalid attributes, and the ‘- ’ remnants of hyphenation within the PDF.
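
The hyphenation clean-up in step 4 can be sketched as follows. This is a Python stand-in for illustration only; the actual sed commands are not reproduced here, and the pattern (a lowercase letter, a hyphen, one space, then a lowercase letter) is an assumption about what a ‘- ’ remnant looks like.

```python
import re

# Assumption: a hyphenation remnant is "<lowercase>- <lowercase>",
# e.g. "valida- tion". The real sed rules may well differ.
HYPHEN_REMNANT = re.compile(r"(?<=[a-z])- (?=[a-z])")


def remove_hyphen_remnants(text: str) -> str:
    """Join word halves that were split by end-of-line hyphenation."""
    return HYPHEN_REMNANT.sub("", text)


sample = "<p>Some valida- tion results for the assess- ment.</p>"
print(remove_hyphen_remnants(sample))
```

A rule this simple would also join phrases such as ‘pre- and post-processing’, so in practice an exception list or a manual review of the changes would probably be needed.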

Validation for PDF/UA and WCAG 2.0 using PAC 2021

PAC 2021 (PDF Accessibility Checker) is very good software for checking the validity of a tagged PDF document for conformance with the PDF/UA-1 standard. Unlike other validators (veraPDF, Adobe Acrobat Pro's Preflight Tool, and BFO-pdf) it also pays attention to the ‘containment rules’ concerning how common structure elements can occur as children of other such elements. Furthermore, it has extra heuristics for checking some aspects of WCAG recommendations, beyond what is built into the PDF/UA format itself. An update, PAC 2024, is due for release in early 2024; thanks to Thomas Schempp for allowing early access to a pre-release version.

The results for SpringMT2023.pdf, passing all applicable tests in PAC 2021, are summarised in the images shown here:

Images: PDF/UA tests by category (validation for PDF/UA); all categories (validation details for PDF/UA); by WCAG 2.1 success criteria (validation for WCAG).

Other outputs from PAC 2021 and Acrobat Pro are as follows:

  • See the Summary Report in PDF format.
  • Screen Reader Preview: an emulation in HTML of how Assistive Technology might read aloud the PDF/UA document. This includes extra ‘spoken tags’ which briefly explain the structure;
    e.g. ‘; Title ;’, ‘; start of section named ... ;’, etc.
  • Export from Acrobat Pro: Text (Accessible) plain-text version, including the ‘spoken tags’.
  • Export from Acrobat Pro: XML 1.0 format includes Metadata, Bookmarks, etc.

Validation for PDF/UA and WCAG 2.0 using PAC 2024

With PAC 2024 the results are similar, passing all tests for the PDF/UA and WCAG categories. There is a new ‘Quality’ category having additional tests. Some of these tests are designed primarily to help authors using particular PDF writing software (not LaTeX) avoid making common errors. However, this results in some constructions being flagged, as seen in the image at right below. Most are designed to give a better result when ‘Read Out Loud’ is used in Adobe's Acrobat and Reader.

The results for SpringMT2023.pdf, passing all applicable tests in PAC 2024, are summarised in the images shown here:

Images: PDF/UA tests by category (validation for PDF/UA); by WCAG 2.1 success criteria (validation for PDF/UA); Quality (tests for Quality).

Click here for the 'Screen Reader Preview'. This includes the ‘spoken tags’ described above. Note however that many of the images have not been created correctly from the PDF; of 43 images, only 9 have captured the correct area on the PDF page. Note also that the test results summarised above are limited to sub-categories 1.1.1, 1.3.1, 1.4.3, 2.4.2, 2.4.3, 3.1.1, 3.1.2 and 4.1.1.

Accessibility: WCAG 2.1 and ARIA Success Criteria

The following images and explanatory notes indicate how WCAG and ARIA recommendations and requirements are handled. We use AInspector for Firefox, an extension to the Firefox browser, to check many aspects of these recommendations algorithmically.

The results are summarised below, listed according to structural categories, and alternatively according to WCAG Success Criteria groupings. The table headings are V: violation, W: warning, MC: manual check, and P: pass.

AInspector: All Rules summary; 0 V, 0 W, 34 MC, 41 P.
AInspector: WCAG; 34 MC, 41 P  

Note that there are no violations and no warnings, with 34 possible issues requiring manual checking, some of which will be dealt with below. As well as the 41 rules which are algorithmically verified as Pass, there are 58 further rules or recommendations deemed to be not applicable, making 133 types of test in all. For instance, since there are no interactive Form fields in the document, most checks under the ‘Forms’ category are deemed inapplicable; similarly for most under ‘Widgets/Scripts’ and most under ‘Audio/Video’, as well as a few in other categories. The total is actually an over-count, as some tests may have some elements Pass while others may need further checking.

2024 updates: WCAG 2.2 and DPUB-ARIA considerations

With a string of updates, and the name changed to ‘AInspector for Firefox’, the validation software has improved. In particular the issue with table rows being treated as ‘widgets’ has been fixed, so that redundant tests are no longer being performed. Below are shown the results for the more recent HTML created without using the DPUB-ARIA roles, since these are yet to be well supported in screen-reading software. Work is ongoing in AInspector, to recognise DPUB-ARIA roles at least as something for Manual Checking; when ready, revised results will be shown here.

AInspector: All Rules summary; 0 V, 0 W, 34 MC, 41 P.
AInspector: WCAG; 35 MC, 41 P  

The totals have not changed much, but there are no longer any tests relevant to ‘Forms’, and there are fewer for ‘ARIA Widgets’ and ‘Timing’. There are more tests under ‘Links’ and ‘Keyboard Support’. Also there are now 64 tests deemed N/A, for a total of 140 overall.
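
As a quick cross-check of the two overall totals quoted in this section, the counts can simply be added up. The sketch below assumes the later total uses the 35 MC figure from the WCAG summary line; the function and parameter names are illustrative only.

```python
def total_rules(manual_checks, passes, not_applicable, violations=0, warnings=0):
    """Add up every rule outcome reported by the checker."""
    return violations + warnings + manual_checks + passes + not_applicable


# Counts quoted in the text for the earlier and later AInspector runs.
earlier = total_rules(manual_checks=34, passes=41, not_applicable=58)
later = total_rules(manual_checks=35, passes=41, not_applicable=64)
print(earlier, later)  # 133 140
```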

WCAG/ARIA Success Criteria details

Click on images below to jump to discussion of the relevant WCAG Success Criteria.

  • SC 1.1 Text Alternatives; 3 MC, 3 P
  • SC 1.3 Adaptable; 6 MC, 17 P
  • SC 1.4 Distinguishable; 7 MC, 1 P
  • SC 2.2 Enough Time; 2 MC, 0 P
  • SC 2.3 Seizures and Physical Reactions; 1 MC, 0 P
  • SC 2.4 Navigable; 7 MC, 13 P
  • SC 3.1 Readable; 1 MC, 1 P
  • SC 3.2 Predictable; 5 MC, 0 P
  • SC 3.3 Input Assistance; 1 MC, 0 P
  • SC 4.1 Compatible; 1 MC, 6 P
  • SC 2.5 Input Modalities; 1 MC, 1 P

‘Accessible Name’ and ‘Accessible Description’

The ‘Accessible Name’ concept typically is used to construct a name for the target of a hyperlink, though it can be used also with other structures, such as lists and tables. This name should convey the purpose of the link; that is, indicate the type of information to be found if the link were to be followed. Assistive Technology (AT) should construct this name, and may present it in place of the anchor-text of the hyperlink.

The ‘Accessible Description’ typically gives a constructed summary of the information to be found at the target of the hyperlink. It is also used to provide a description of non-textual material, such as an img, located in a <caption>, or at some other location in the document. The summary should be unique in the sense that links having the same ‘Accessible Description’ should have identical summary. Assistive Technology (AT) should construct this summary, and have a means to present it to the user, as an alternative to switching focus to elsewhere within the document to learn more specific information about what is to be found at the target of the hyperlink.

Comparison with PAC results

  • Whereas PAC has no tests for 7 of the WCAG Guidelines (also known as Success Criteria, SC), AInspector has Manual Checks under criteria 2.2, 2.3, 3.2 and 3.3. These are for things that cannot be determined programmatically, such as content not automatically changing, images not flashing too rapidly, and methods to avoid errors using form controls, of which there are none in either the PDF or HTML versions. While it is not possible to definitively claim a Pass for relevant structures, this can be inferred from a knowledge of the material being used to construct the PDF. For example, with all text and images known to be static, there can be no flashing or rapid change of content. Some tests under 2.5 can be determined programmatically.
  • Under SC 3.3, AInspector 2.99 treats each table row as if it were an interactive widget; e.g. in a grid of cells taking keyboard input. This is not the case for the static data tables in the PDF; nevertheless an associated Manual Check is stated as being relevant. This was reported as a bug in the AInspector software, and is now fixed in the v3.0 release.
  • Under SC 3.2, AInspector states a need for Manual Checking related to consistency of structures and heading levels, etc. when moving between different pages of a website. With a single web-page produced from a single PDF document, such tests are effectively irrelevant. These tests would be relevant if the PDF had been used to derive a multi-page HTML website, so would correspond to consistent use of styles and structures within the PDF itself.

Instead of displaying just the gray ‘no go’ icon indicating that no tests are applicable, it would be perfectly reasonable for those Success Criteria to be discussed in the User Manual, or other documentation. There could be links to the relevant pages, on which the issues are discussed. And there could be links to the places in the PDF where a Manual Check ought to be done.
* With the release of PAC 2024 in December 2023, the software now does produce links to websites that explain the test conducted.

  • In the 6 categories where automated checking is possible (1.1, 1.3, 1.4, 2.4, 3.1 and 4.1) AInspector recommends further Manual Checks beyond what is determined as Passing. In the following subsections, these are studied in some detail. Refer also to the Guideline images above.

In the following we give an image of the AInspector panel in the Firefox browser, focusing on a structural element to which the rule applies. Sometimes this is accompanied by an image from a PDF browser, showing how relevant structure and attributes are specified. In those cases where the rule is applicable to the document as a whole, there is no structural element to take focus.

WCAG Guideline 1.1: Text Alternatives

  • 1.1.1 ‘Alt text must summarize purpose’

    That all images have an alt attribute can be checked programmatically, as can the requirements that this is not just the filename of the image and that it is no more than 100 characters long. Not so easy to check is that this Alt-text is actually relevant to the image; but one would expect that ensuring this would be part of the editing and proofing for the PDF publication. In some PDF and HTML browsers this is easily checked by hovering the mouse over the image and reading what pops up.

    the alt-text summarises the scene shown in the image
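
The three programmatic checks just described (alt attribute present, not merely the filename, at most 100 characters) might be sketched as below. This is an illustration using Python's standard html.parser, not the code used by AInspector; the class name, function name and sample file names are all hypothetical.

```python
import os
from html.parser import HTMLParser

MAX_ALT_LENGTH = 100  # limit quoted in the text


class AltTextChecker(HTMLParser):
    """Collect <img> elements whose alt text fails a mechanical check."""

    def __init__(self):
        super().__init__()
        self.problems = []  # list of (src, reason) pairs

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        attributes = dict(attrs)
        src = attributes.get("src") or ""
        alt = attributes.get("alt")
        if alt is None:
            self.problems.append((src, "missing alt attribute"))
        elif src and alt.strip() == os.path.basename(src):
            self.problems.append((src, "alt is just the filename"))
        elif len(alt) > MAX_ALT_LENGTH:
            self.problems.append((src, "alt longer than 100 characters"))


def check_alt_text(html):
    checker = AltTextChecker()
    checker.feed(html)
    return checker.problems


# Hypothetical markup; the file names are made up for this example.
sample_html = (
    '<img src="figs/bluefish.jpg" alt="Drawing of a Bluefish">'
    '<img src="figs/plot.png">'
    '<img src="figs/plate.jpg" alt="plate.jpg">'
)
for src, reason in check_alt_text(sample_html):
    print(src, "->", reason)
```

Whether the alt text is actually relevant to the image still has to be judged by a person, as noted above.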
  • 1.1.1 ‘Long description for complex images’

    Accessibility and understanding can be further enhanced when the document contains a ‘long description’, being a reasonably brief summary of what is seen within the image, or explaining its relevance. This could be a caption, or some text appearing elsewhere in the document. If so, then a reference to this is an appropriate value for an aria-describedby attribute.

    a longdesc is available for the Bluefish image

    In the image above, we see the ‘long description’ which appears in the ‘Photo Gallery’ Glossary structure, concerning the Bluefish species. This occurs structurally under the identifier ‘TOCI.203’. The MC still requests to manually ‘Verify’ the relevance of the text found there. As this gallery also includes acknowledgements of the source of the images, since January 2023 it has also been given the DPUB-ARIA role of doc-credits; each listed item now has the accompanying required DPUB-ARIA role of doc-credit.

    When there is no such aria-* attribute, the MC request is to ‘Determine whether … ’ there could be a reference to some relevant part of the document. For this PDF, there are reports for 5 of the 7 fish species. Those 5 have a ‘Photo Gallery’ entry, whereas the other 2 do not — the first and last of the small fish images. To confirm relevance it would be sufficient to understand how the chosen references are determined. In the example PDF the identified Glossary entries are for the same images being shown at a larger size within the Report sections for those fish species’ stocks.

    In practice it is much easier to ‘Verify that … ’ for some specific piece of text, than to ‘Determine whether … ’ something applicable exists within the document. So it is highly desirable to include a link to such a ‘long description’ whenever possible, using an aria-describedby reference. That has been done with this example PDF and its derivation into HTML.

    Here are some more examples for types of images used in the PDF and derived HTML.

    1. With line plots and other charts, there is a caption which follows. This is used as target for aria-describedby.
      the caption serves as a longdesc for the Bluefish Recruits image

      It is part of the usual editorial process to verify that there is a caption, that it is relevant, and that it is accurate. Internal LaTeX coding ensures that it becomes the target text for a PDF attribute.

    2. Illustrative photos have an entry in the ‘Photo Gallery’ section, which is implemented as a Glossary. The text found there, including a page-number link, is used as the target for aria-describedby.
      a longdesc is available for the Seafood Plate image, in the Photo Gallery

      It is part of the usual editorial process to check the entries in any Glossary, particularly the ‘Photo Gallery’. Insertion of the photos themselves and their Glossary entries is done using a special LaTeX environment with coding designed to ensure all tasks are done consistently.

  • 1.1.1 ‘Use MathJax for mathematical expressions’

    Use of Mathematical expressions in this document is limited to stating the names of statistical variables, and specifying a value or range of values. For these the variable name serves as a link to a Glossary entry describing the purpose of the variable. MathML is thus not required to convey the meaning; ordinary text is sufficient to express an equality or range of values. Hence MathJax support is not incorporated in this HTML, though it could have been if there were more involved mathematical expressions.

    The MC is requesting that all 43 images be checked; but certainly images of mathematical expressions are not being used.

WCAG Guideline 1.3: Adaptable

  • 1.3.1 ‘Landmarks must identify content regions’

    The primary ARIA ‘Landmarks’ (BANNER, COMPLEMENTARY, MAIN, CONTENTINFO) correspond to the main top-level PDF document structure elements. For ‘Regions’ we have each section under ‘MAIN’, and also the ‘NAVIGATION’ sections — Table of Contents, List of Tables, List of Figures, and each of the 5 Glossaries (incl. the ‘Photo Gallery’).

    BANNER Landmark is the Cover page; Landmarks as structure in the PDF

    This structure is imposed within the LaTeX processing to build the PDF document. Other tests for ‘Landmarks’ with these stated roles give ‘Pass’, so the MC here is of the kind to ‘Verify … ’. Testing can/should be done in a structure-aware PDF browser (e.g., Acrobat Pro) and with PDF/UA validation software, to ensure that there is no error in other parts of the structure tree which might then have a wider effect.

  • 1.3.1 ‘Use semantic markup for lists’

    Most enumerated or bulleted lists use semantic tagging with <ol> or <ul>, and items using <li> structures, perhaps with an attribute to specify the numbering or bullet style. These come from <L> and <LI> structures in the PDF tagging, or <TOC> and <TOCI> structures for lists of hyperlinks in the Table of Contents, List of Tables, List of Figures and Glossaries. Particularly for these latter, a CSS style class has rules to determine how the bulleting is done or suppressed.

    listing of Glossaries within the Table of Contents

    The Table of Contents is typically a structure of nested lists, which can be rather complicated. When sectioned into sublists, a heading can be provided for a <TOC> as a <Caption> structure, which must occur before the <ol> or <ul> within the HTML version. An aria-labelledby attribute then associates the heading with the list structure it refers to. This is how the word ‘GLOSSARIES’ becomes the ‘Accessible Text’ for a sublist in the image above. For a list item, the ‘Accessible Text’ is generally created from the contents of that item, as in the image below.

    Appendix entry within the Table of Contents
    \contentsline {subsection}{\numberline {Appendix C}Peer Review Meeting Attendees}{27}{subsection.1.C}%

    The code line shown, from the .toc file, generates what is seen visually in the PDF and, with modified macro expansions, also the structures with accompanying attributes. The grouping {Appendix C} is used in more than one way: as a label, and in the ‘Accessible Text’ for a hyperlink anchored on the visible page number. That page number is calculated by subtracting from 27 the number of front-matter pages, namely 13, giving 14.

    The Manual Checking here is asking to ‘Verify’ that an element ‘identifies a list item element in a meaningfully grouped list of items’. This will certainly be the case with list structures that were created using LaTeX environments \begin{enumerate}, \begin{itemize} and \begin{description}, or other intricate constructions such as with \contentsline and \numberline. Thus this check is largely part of normal editorial work in the preparation of the PDF.

    One difference from normal editing, however, is being able to include the purpose of a list, without adding to visual content, using an aria-label attribute on the <L> structure; see image from the PDF, at right below.

    List including responses to TOR questions; PDF attributes for TOR List

    However, the above do not cover all ways to declare a list structure semantically, as the following examples show.

    1. In the image below, Editorial Notes are typeset as separate paragraphs, starting with a bold-faced label. The whole paragraph has ARIA attributes role="listitem" and aria-labelledby="CRDclause.1", the latter being the name of the label. The name ‘/CRDclause’ is used both for the structure, with ‘RoleMap’ to /P, and as a name for a CSS class. In this way we emulate the semantics of a captioned ‘description list’ using just <div>, <p> and <span> HTML tags. Furthermore, the aria-* attributes convey brief text giving the purpose of each element.

      ARIA listitem role for editorial comments
    2. In the images below we see how the collection of AOP panellists are presented inline within the PDF, but enclosed in a Custom structure with name /AOPauthors. Using a ‘RoleMap’ entry of /Div this becomes a <div> within the derived HTML, having associated ARIA attributes of role="list" and aria-label="Assessment Oversight Panel". For the individual panellists the Custom structure is /PRPpanellist, which has a ‘RoleMap’ entry of /Span, with attributes and class name as seen in the right-hand image below.

      ARIA listitem role for an AOP panellist; AOP panellist tagging in the PDF

      Note how each panellist has an accompanying affiliation which is presented as a footnote in the PDF, but which follows immediately afterwards in the structure tree, hence also within the HTML. These affiliations have a CSS class with rules to float the content to the right side of the parent rectangle, at reduced size. The ARIA attribute of role="listitem" allows Assistive Technology to inform a non-visual reader about the list-like aspect of this kind of information; which a fully-sighted person would deduce from layout alone within the PDF.

  • 1.3.1 ‘Data cells must have row/column headers’

    The presence of a ‘Headers’ attribute for data cells is easily checked algorithmically. However, when a data cell is empty, it could be that there is no value to be displayed, or that the cell is for layout purposes only.

    Blank cell at top-left of Table

    In this document, PDF or HTML version, there are 28 such cells which are deliberately blank, without header attributes. This can be checked manually. The remaining 484 data cells have been successfully checked for headers.
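
A tally of this kind could be produced along the following lines. This is only a sketch with Python's html.parser: it ignores nested tables, treats a cell holding only an image as blank, and the sample table values are invented for the example rather than taken from the report.

```python
from html.parser import HTMLParser


class HeaderAudit(HTMLParser):
    """Count <td> cells with headers, blank without, and non-blank without."""

    def __init__(self):
        super().__init__()
        self.in_cell = False
        self.has_headers = False
        self.cell_text = []
        self.with_headers = 0
        self.blank_without_headers = 0
        self.missing_headers = 0

    def handle_starttag(self, tag, attrs):
        if tag == "td":
            self.in_cell = True
            self.has_headers = bool(dict(attrs).get("headers"))
            self.cell_text = []

    def handle_data(self, data):
        if self.in_cell:
            self.cell_text.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self.in_cell:
            self.in_cell = False
            blank = not "".join(self.cell_text).strip()
            if self.has_headers:
                self.with_headers += 1
            elif blank:
                self.blank_without_headers += 1  # candidates for manual review
            else:
                self.missing_headers += 1        # genuine problems


# Invented sample table, for illustration only.
sample_table = """
<table>
<tr><td></td><th id="c1">Year</th><th id="c2">Catch</th></tr>
<tr><th id="r1">Scup</th><td headers="r1 c1">2021</td><td headers="r1 c2">5.8</td></tr>
<tr><th id="r2">Bluefish</th><td headers="r2 c1">2021</td><td>4.1</td></tr>
</table>
"""
audit = HeaderAudit()
audit.feed(sample_table)
print(audit.with_headers, audit.blank_without_headers, audit.missing_headers)
```

Only the blank cells without headers need a human decision; any non-blank cell without headers points to a real tagging error.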

  • 1.3.2 ‘Reading order: CSS positioning’

    This check concerns the use of subscripts in the names of statistical variables. These are all constructed in a similar way within the LaTeX coding, which places content in the correct reading order. So checking for just a few of these should be sufficient to be sure that all are OK.

    subscripted variable name
  • 1.3.3 ‘Not only shape, size and location’

    This check is very wide-ranging, covering aspects of the document design, technical implementation of accessibility features, and author/editor's use of such features. It certainly cannot be checked in any algorithmic way.

    subscripted variable name

    In practice the approach should be that in all situations where a fully-sighted reader gains understanding of the content due to font size and style, visual layout or special placement of textual content, there should be some aria-* (or other) semantic attribute(s) which can be used to explicitly convey what would otherwise be conveyed implicitly.

  • 1.3.4 ‘Do not restrict view or operation’
    subscripted variable name

    Perhaps the best tests here are to view the document, both PDF and HTML versions, on a smartphone. Ensure that, for both horizontal and vertical orientations, there is smooth scrolling that allows access to all parts of the document. This certainly needs Manual Checking. Also do similar tests, Zooming-in and Zooming-out on different-sized monitor screens.

WCAG Guideline 1.4: Distinguishable

Some of the tests in this category apply to the document as a whole, not related to any particular structural element, and thus not checkable algorithmically.

Color not only way to convey information; Stop automatic audio; Text should Reflow; Should fit small screen
  • 1.4.1 ‘Use of color’ — see image at left above.

    Color is used to indicate hyperlinks and Glossary item usage (an Abbreviation, say). These elements have accompanying aria-* attributes which identify their specialised purpose and link target.

  • 1.4.2 ‘Pause, stop or mute audio’ — see image at 2nd from left above.

    There is no audio content, so this kind of situation does not arise.

  • 1.4.4 ‘Resize text content’ — see image at 2nd from right above.

    With the specialised scientific content, especially the wide tables and extensive discussion, we do not feel that ‘Reflow’ is appropriate. Scrolling is quite suitable.

  • 1.4.5 ‘Use CSS to stylize text’
    Use CSS instead of images

    Although an image may contain text, there is no image being used as a picture of text for styling purposes.

  • 1.4.10 ‘Support small screen dimensions’ — see image at right, in the group of 4 above.

    With the specialised scientific content, especially the wide tables and extensive discussion, we do not feel that it is appropriate to force information into such small dimensions as 320 x 256, without scrolling.

  • 1.4.11 ‘Color contrast of user interface controls’

    As with SC 3.3, the reporting of all table rows (see image below) as being an ‘interface control’ is surely an error in the AInspector programming. This issue is fixed with AInspector v3.0. Now, however, it reports an MC being needed as an AA-level check for all internal hyperlinks. Yet under SC 1.4.3, a contrast ratio of 9.4 is calculated for hyperlink text on a white background, well above the 3:1 that SC 1.4.11 requires.

    Table Row treated as a Control

    The intention was for this document to have no interface controls other than those provided by the reader software. Thus this test should be automatically a Pass, or simply deemed ‘not applicable’.
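
For reference, a contrast value like the 9.4 quoted above for hyperlink text can be computed with the WCAG relative-luminance formula, sketched here in Python. The link colour #0000EE is an assumption (a common browser default for unvisited links); the actual colour used in the HTML is not stated in this document.

```python
def _linearize(channel_8bit):
    """Convert an 8-bit sRGB channel to linear light, per the WCAG 2.x definition."""
    c = channel_8bit / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color):
    hex_color = hex_color.lstrip("#")
    r, g, b = (int(hex_color[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linearize(r) + 0.7152 * _linearize(g) + 0.0722 * _linearize(b)


def contrast_ratio(foreground, background):
    """WCAG contrast ratio, from 1.0 (identical) up to 21.0 (black on white)."""
    l1, l2 = relative_luminance(foreground), relative_luminance(background)
    lighter, darker = max(l1, l2), min(l1, l2)
    return (lighter + 0.05) / (darker + 0.05)


# Assumed link colour on a white background.
print(round(contrast_ratio("#0000EE", "#FFFFFF"), 1))
```

WCAG asks for at least 4.5:1 for normal-size text under SC 1.4.3 and at least 3:1 for user-interface components and graphics under SC 1.4.11.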

  • 1.4.11 ‘Color contrast of graphics’
    Drawing of a Bluefish

    The computer-generated graphs and charts generally have high contrast. Photographs and the fish illustrations are primarily for decorative purposes, so there is no specific information that they are required to convey. Besides, those photographs and illustrations mostly appear on other NOAA websites, for which similar accessibility requirements apply. Thus there is no detailed extra checking needed for this rule, beyond assessing the suitability of the chosen images for printed publication.

WCAG Guideline 2.4: Navigable

  • 2.4.1 ‘Skip to main content link’, and multiple Pass tests for Landmarks

    The logo on the cover page has a dual use, being the anchor for the ‘Skip to Main’ hyperlink, bypassing all document front-matter.

    Skip to Main link for the logo image; Skip to Main link in the PDF

    AInspector gives a Pass for many rules concerning the use of ARIA ‘Landmarks’; in particular the top-level ones of ‘BANNER’, ‘COMPLEMENTARY’, ‘MAIN’ and ‘CONTENTINFO’. The following image shows this top-level structure in Firefox's Accessibility panel of its ‘Web Developer Tools’.

    Landmarks as top-level structure in Accessible view

    The ‘Accessibility Tree’ is derived from the Tagged PDF Structure Tree, after Role Mappings have been applied. The result is close to, but not exactly, an isomorphic tree structure.

  • 2.4.2 ‘TITLE must identify website and page’ and be H1

    The document's title is used in multiple places, including identifiers for both the Browser window and tab. Within the main frame it is slightly abridged, within the derivation of the Cover page and Title-page.

    Full Window with Title
  • 2.4.3 ‘Sequential tab order of focusable elements must be meaningful’

    Whereas AInspector checks all 1500+ hyperlinks, and other structures having a specified tabindex attribute, to verify that the tabbing order is meaningful, there is no such check of the PDF with PAC 2021, or PAC 2024.

    meaningful navigation via tab order
  • 2.4.4 ‘Link text must describe the link target’ and be unique

    This is perhaps the most significant rule for Accessibility, especially in a document having 1500+ hyperlinks, mostly internal. In particular, when navigating using AT or via the keyboard skipping from link to link, it is important to know where a link will take you before choosing to follow it. What kind of information will be found there? And having followed a link, did it take you to what was expected?

    To address such questions ARIA uses the concepts of ‘Accessible Name’ and ‘Accessible Description’ which can apply to structural and semantic elements, including <link>s. These concepts were described earlier, and how they are determined is summarised briefly above.

    It is preferable that the ‘Accessible Name’ should be unique, in the sense that links having the same ‘Accessible Name’ should have an identical target (value of the href). Certainly there must be an ‘Accessible Description’ which allows one to differentiate between links which do have the same name but different targets. For example, when navigating via a list of links only, removed from their original context in the document, the Name and Description must be sufficient to determine which link to take, if either. In our example PDF and its derived HTML, there should be no such instances.
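
The uniqueness condition described here lends itself to a mechanical check: group links by name and report any name that leads to more than one target. The sketch below is a simplification, approximating the ‘Accessible Name’ as the aria-label if present and otherwise the link text, whereas the full ARIA name computation has more steps. The href values and the target Sect.07 are hypothetical.

```python
from collections import defaultdict
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (approximate accessible name, href) for each <a href=...>."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._attrs = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._attrs = dict(attrs)
            self._text = []

    def handle_data(self, data):
        if self._attrs is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._attrs is not None:
            href = self._attrs.get("href")
            name = self._attrs.get("aria-label") or "".join(self._text).strip()
            if href is not None:
                self.links.append((name, href))
            self._attrs = None


def ambiguous_link_names(html):
    """Return names that are shared by links with different targets."""
    collector = LinkCollector()
    collector.feed(html)
    targets = defaultdict(set)
    for name, href in collector.links:
        targets[name].add(href)
    return {name: sorted(hrefs) for name, hrefs in targets.items() if len(hrefs) > 1}


# Hypothetical markup for illustration.
sample_links = (
    '<a href="#TOCI.091" aria-label="Glossary:NEFSC">NEFSC</a>'
    '<a href="#TOCI.091" aria-label="Glossary:NEFSC">NEFSC</a>'
    '<a href="#Sect.06" aria-label="starts on page x">x</a>'
    '<a href="#Sect.07" aria-label="starts on page x">x</a>'
)
print(ambiguous_link_names(sample_links))
```

Names reported by such a check are the ones for which an ‘Accessible Description’ would be needed to tell the links apart.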

    No-one is ever going to manually check 1000+ links. With so many, we look here at examples of different usages of internal hyperlinks, with the ‘Accessible Name’ constructed for enhanced accessibility. Images below show how the required pieces of information are stored within the PDF file.

    1. Table of Contents, link to sectioning, starting on a particular page.

      The anchor text is the visible page number, here the roman numeral ‘x’; this has MCID of 31, as seen near the bottom of the right-hand image below. The target destination is named section*.11 whereas the /SD is to the object numbered 2712. The Annotation itself is of type /OBJR stored as object numbered 457, referred to indirectly as a child of the <Link> structure.

      Link in HTML Table of Contents; Link in PDF Table of Contents
      Link attributes in PDF Table of Contents

      In the lower image we see the attributes used for exporting to other formats. For XML (exported from Acrobat Pro), the ‘Named Destination’ section*.11 is used. For HTML the structure name Sect.06 is provided as target, having also an aria-label of ‘starts on page x’, as the ‘Accessible Name’. Within the HTML file the name (‘Locations/regions …’) of the Glossary occurs immediately preceding the link, so need not be repeated.

      At the LaTeX level, the information for the link comes from lines generated automatically during the processing, and written into ‘auxiliary’ files for a subsequent run.

      • from SpringMT2023.toc for lines within the Table of Contents:
        \contentsline {subsection}{Locations/regions: state, country, etc.}{xiii}{section*.11}%
        Note the virtual page number of ‘xiii’ is displayed visually as ‘x’ since there are 3 initial Cover/Title pages coming before roman numbering of front-matter pages commences.
      • from SpringMT2023.tgx having tagging information from the previous run:
        \TPDFmakeSD{section*.11}{2712}{Sect.06}%

      The latter code line, read during the setup for a job run, establishes the relationship between the ‘Named Destination’ section*.11, the PDF object ID 2712 of the target, and its structural name Sect.06. Such 3-way relationships are vital for working with ‘Structure Destinations’, since with just a single processing run the target object is processed well after links to it are needed. Thus we refer to the file SpringMT2023.tgx as being a ‘Tag History’ file, for tagging.

      In this next example we give the LaTeX code first, for easy comparison with above. This time the lines are given in the order in which they are read and processed.

      • from SpringMT2023.tgx:
        \TPDFmakeSD{subsection.6.1}{8830}{H3.19}%
      • from SpringMT2023.toc:
        \contentsline {subsection}{\numberline {6.1}Reviewer Comments: Scup{}}{75}{subsection.6.1}%
        Subtract 13 (for 3 + 10 front-matter pages) from the virtual page number 75, to get the visual page number of 62 displayed as an arabic numeral.
      Subsection Link in HTML Table of Contents; Subsection Link in PDF Table of Contents

      With a numbered sub-section, the label string ‘subsection 6.1’ is deducible from the available information. This is used within the aria-label attribute, for better understanding the target of the link. Now the target is the heading of the sub-section, being the element named H3.19.
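
To make the bookkeeping in the two examples above concrete, here is a rough Python sketch that joins \contentsline entries with \TPDFmakeSD entries and applies the page arithmetic (3 cover pages, then 10 roman-numbered pages). It only mimics what the LaTeX macros are described as doing; the function names are invented for illustration.

```python
COVER_PAGES = 3    # unnumbered Cover/Title pages
ROMAN_PAGES = 10   # roman-numbered front-matter pages

_ROMAN = [(10, "x"), (9, "ix"), (5, "v"), (4, "iv"), (1, "i")]  # enough below 40


def brace_groups(text):
    """Return the contents of each top-level {...} group in a line."""
    groups, depth, start = [], 0, None
    for i, ch in enumerate(text):
        if ch == "{":
            if depth == 0:
                start = i + 1
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                groups.append(text[start:i])
    return groups


def to_roman(n):
    out = ""
    for value, numeral in _ROMAN:
        while n >= value:
            out += numeral
            n -= value
    return out


def from_roman(s):
    values = {"i": 1, "v": 5, "x": 10}
    total = 0
    for ch, nxt in zip(s, s[1:] + " "):
        v = values[ch]
        total += -v if values.get(nxt, 0) > v else v
    return total


def displayed_page(virtual):
    """Map a virtual page number from the .toc file to the visible one."""
    if virtual.isdigit():
        return str(int(virtual) - COVER_PAGES - ROMAN_PAGES)
    return to_roman(from_roman(virtual.lower()) - COVER_PAGES)


def link_targets(toc_lines, tgx_lines):
    """Join each ToC entry with its PDF object ID and structure name."""
    structure = {}
    for line in tgx_lines:
        if line.startswith(r"\TPDFmakeSD"):
            dest, obj, name = brace_groups(line)[:3]
            structure[dest] = (int(obj), name)
    rows = []
    for line in toc_lines:
        if line.startswith(r"\contentsline"):
            _level, _title, page, dest = brace_groups(line)[:4]
            obj, name = structure[dest]
            rows.append((dest, displayed_page(page), obj, name))
    return rows


tgx = [r"\TPDFmakeSD{section*.11}{2712}{Sect.06}%",
       r"\TPDFmakeSD{subsection.6.1}{8830}{H3.19}%"]
toc = [r"\contentsline {subsection}{Locations/regions: state, country, etc.}{xiii}{section*.11}%",
       r"\contentsline {subsection}{\numberline {6.1}Reviewer Comments: Scup{}}{75}{subsection.6.1}%"]
for row in link_targets(toc, tgx):
    print(row)
```

With the lines quoted above, this yields the visible page ‘x’ for section*.11 and ‘62’ for subsection.6.1, each paired with its object ID and structure name.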

    2. List of Tables or Figures, link to the caption of the Table or Figure on a particular page.

      It is usual in LaTeX to specify a \label{...} following the \caption{...} for a figure or table in a floating environment; i.e., \begin{figure} or \begin{table}. This \label provides a user-defined identifier that effectively aliases an internally-assigned named destination for the caption's location within the PDF file. Thus it is the <Caption> structure which is referenced with the ‘Structure Destination’, but named according to the particular Figure or Table.

      Link in HTML List of Tables; Link in PDF List of Tables

      The file SpringMT2023.lot provides data for the List of Tables, as in the line shown below. Data there contributes to the attribute values for the Link structure.

      • from SpringMT2023.lot:
        \contentsline {table}{\numberline {6}{\ignorespaces Catch and status table for deep-sea red crab}}{41}{table.caption.25}%
      • from SpringMT2023.tgx:
        \TPDFmakeSD{table.caption.25}{6146}{Caption.17}%
        \TPDFtaginfo{6146}{6144}{<Caption>}{Caption.17}{921}{41}{41}%

      Here we include also the \TPDFtaginfo for the <Caption> structure, in which the final 3 numbers are used to check consistency of the pagination with successive LaTeX processing runs. On the previous run, the floating table occurred on virtual page 41 (= 28 + 13) having PDF object number 921. The apparent duplication of 41 is for checking that the internally-recorded page number is correct after floating.

    3. Cross-reference to a Table or Figure within the same Section.

      Although within a paragraph, a cross-reference uses a user-supplied name; e.g. , within the LaTeX-generated PDF it is only the internal ‘Named Destination’, and associated ‘Structure Destination’, that is used.

      Cross-reference Link to a table, in HTML Cross-reference Link to a table, in PDF

      The connection between the user-defined \label{...} and the internal destination is handled through the auxiliary file SpringMT2023.aux, using a \newlabel command as shown below. The \@writefile{lot}{...} command causes a line to be written into file SpringMT2023.lot.

      • from SpringMT2023.aux:
        \@writefile{lot}{\contentsline {table}{\numberline {8}{\ignorespaces Catch and status table for longfin inshore squid}}{51}{table.caption.31}\protected@file@percent }
        \newlabel{DORYUNITCatch_Status_Table}{{8}{51}{Catch and status table for longfin inshore squid}{table.caption.31}{}}
      • from SpringMT2023.tgx:
        \TPDFmakeSD{table.caption.31}{6759}{Caption.21}%
        \TPDFtaginfo{6759}{6754}{<Caption>}{Caption.21}{1908}{50}{51}%

      This time the table float was processed before virtual page 50 was completed, but it floats to virtual page 51 having object number 1908.
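      The chain from user-defined label to structure name can be followed mechanically. The sketch below is only an illustration: it assumes the 5-field \newlabel layout shown in the .aux excerpt, with the named destination as the 4th inner field, and the 3-argument \TPDFmakeSD layout shown in the .tgx excerpt. The function name is hypothetical.

```python
import re

# \newlabel{label}{{number}{page}{title}{destination}{extra}}
NEWLABEL = re.compile(
    r"\\newlabel\{([^{}]+)\}\{" + r"\{([^{}]*)\}" * 5 + r"\}"
)
# \TPDFmakeSD{named destination}{object number}{structure name}
MAKESD = re.compile(r"\\TPDFmakeSD\{([^{}]+)\}\{(\d+)\}\{([^{}]+)\}")


def resolve_label(label, aux_lines, tgx_lines):
    """Return (named destination, object number, structure name) for a label."""
    destination_by_label = {}
    for line in aux_lines:
        match = NEWLABEL.search(line)
        if match:
            # group 5 is the 4th inner field: the named destination
            destination_by_label[match.group(1)] = match.group(5)

    structure_by_destination = {}
    for line in tgx_lines:
        match = MAKESD.search(line)
        if match:
            structure_by_destination[match.group(1)] = (
                int(match.group(2)),
                match.group(3),
            )

    destination = destination_by_label[label]
    obj, name = structure_by_destination[destination]
    return destination, obj, name


aux = [
    r"\newlabel{DORYUNITCatch_Status_Table}{{8}{51}"
    r"{Catch and status table for longfin inshore squid}{table.caption.31}{}}"
]
tgx = [r"\TPDFmakeSD{table.caption.31}{6759}{Caption.21}%"]
result = resolve_label("DORYUNITCatch_Status_Table", aux, tgx)
```

      Here result is the triple ('table.caption.31', 6759, 'Caption.21'), so the user-supplied name never needs to appear in the PDF itself.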

    4. Acronym or Abbreviation, anchors a link to the associated Glossary entry.

      An ‘Accessible Name’ of ‘Glossary:NEFSC’ is provided by the aria-label attribute of the hyperlink. The target of the hyperlink is the structure named ‘TOCI.091’, which is derived from a <TOCI> item within a Glossary <TOC>. The ‘Accessible Description’ is the subject of a rule in SC 2.4.6.

      Link to a Glossary entry Uniqueness of Link name

      The same ‘Accessible Name’ is created for all links to the same Glossary item, by using a LaTeX macro (in this case \NEFSC) to construct the link and record the use of an Acronym.

      • from CRDacronymFiles.tex:
        \CRDmakeacronym{NEFSC}{\NEFSC}{NEFSC}{N E F S C }{Northeast Fisheries Science Center}{}

      The 6 arguments to \CRDmakeacronym are used as follows: #1 is the sort key, with #2 being the user-macro to invoke the acronym/abbreviation in the LaTeX source, resulting in #3 being typeset, with #5 to be shown as the description on a Glossary page. The extra parameters #4 and #6 (or #5) are for /Alt and /E (Expansion) text in the PDF page contents. If #5 is unsuitable, by having expandable items (active macros), then #6 is used to pass a clean text-only description of the meaning of the abbreviation. Notice that #4 ensures that a screen-reader spells out the acronym, rather than trying to pronounce a poorly-spelt word.
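      To make the roles of the 6 arguments concrete, here is a small Python model of one declaration. It is a sketch only: the class and field names are invented for illustration, and the real work is done by LaTeX macros. The fallback from #6 to #5 for the expansion text follows the description above.

```python
from dataclasses import dataclass


@dataclass
class AcronymSpec:
    """Hypothetical model of the 6 arguments to \\CRDmakeacronym."""
    sort_key: str              # #1 sort key
    macro: str                 # #2 user macro used in the LaTeX source
    typeset: str               # #3 text that is typeset
    alt_text: str              # #4 /Alt text, spelled out for screen readers
    description: str           # #5 description shown on the Glossary page
    clean_expansion: str = ""  # #6 text-only fallback when #5 holds macros

    def expansion_text(self):
        """Text for /E: use #6 when given, otherwise #5."""
        return self.clean_expansion or self.description


def spell_out(acronym):
    """Separate the letters with spaces, so the acronym is read letter by letter."""
    return " ".join(acronym) + " "


nefsc = AcronymSpec(
    "NEFSC", r"\NEFSC", "NEFSC", "N E F S C ",
    "Northeast Fisheries Science Center",
)
```

      For this entry, spell_out("NEFSC") reproduces the /Alt string given as #4, and expansion_text() falls back to #5 because #6 is empty.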

      content for NEFSC glossary link

      The corresponding portion of the page content stream is shown above, including the colouring, positioning and font-selection commands produced by the pdfTeX processing, as well as the tagging and textual content.

    5. Statistical variable name (perhaps with subscript, name or math symbol) anchors to a Glossary entry.
      Acronym Link with subscript in HTML Acronym Link with subscript in PDF
      • from SpringMT2023.tgx:
        \TPDFmakeSD{Ap21::FMSY}{3885}{Link.0764}%

        \TPDFtaginfo{3885}{3870}{<Acronym>}{Acronym.182}{973}{21}{22}%
        \TPDFtaginfo{3886}{3885}{<Link>}{Link.0764}{973}{21}{22}%
        \TPDFtaginfo{3888}{3886}{<SUBScript>}{SUBScript.040}{973}{21}{22}%

      The link for the <Acronym> is named Link.0764 which has a <SUBScript> child. Its target has ‘Accessible Name’ Glossary:FMSY and is the Glossary item named TOCI.140 which provides the ‘Accessible Description’. Being the 1st instance of this kind, it also has a ‘Structure Destination’ associated with the ‘Named Destination’ of Ap21::FMSY.

      Declaring this kind of Acronym having structured expansion is done as follows.

      • from CRDspecialAcronym.tex:
        \CRDmakespecialacronym
           {FMSY}{\FMSY}{\FMSYmath}{F_{\!\!\mathit{MSY}}^{}}{F[M S Y] }%
           {fishing mortality rate for maximum sustainable yield}{}

      This time there are 7 arguments to the meta-command \CRDmakespecialacronym, with #1 and #2 the same as previously. Now #4 creates the visible text (using math-mode) while #3 gives a macro-name that produces this visible text within a context that doesn't require an Acronym link, should this ever be needed. Then #5 is the /Alt-text, with #6 the printable expansion which also serves for the /E (expansion) text, and with #7 being empty.

    6. Glossary, back-link to where the Glossary entry was used; 1st instance on the stated page.

      In this example we have a ‘Named Destination’ of Ap14::AOP corresponding to a ‘Structure Destination’ named Link.0576, which is the 1st (and only) child of an <Acronym> structure. This latter is a Custom structure having a ‘RoleMap’ entry of /Reference, which could potentially have more sub-structure than a single Link.

      HTML backlink in Glossary PDF backlink in Glossary
      • from SpringMT2023.gls:
        \glossentry{AOP}{\glossaryentrynumbers{\relax
              \setentrycounter[]{page}\glsnumberformat{14}\delimN
              \setentrycounter[]{page}\glsnumberformat{16\delimR 25}\delimN
              \setentrycounter[]{page}\glsnumberformat{35}\delimN
              ...
              \setentrycounter[]{page}\glsnumberformat{75}}}%
      • from SpringMT2023.tgx:
        \TPDFmakeSD{Ap14::AOP}{2920}{Link.0576}%

        \TPDFtaginfo{2920}{2905}{<Acronym>}{Acronym.006}{2861}{14}{14}%
        \TPDFtaginfo{2921}{2920}{<Link>}{Link.0576}{2861}{14}{14}%

      The string Ap14 signifies the 1st Acronym on page 14, with key AOP appended to denote the specific Acronym. This is stated explicitly in the aria-label of the link. Now 14 = 13 + 1, so the visible page-number of ‘1’ appears as the anchor-text, being the 1st main-matter page following the 13 pages of Cover and front-matter. From the 2nd \taginfo line above, we see that the object numbered 2921, named as Link.0576, has the object numbered 2920 as parent. This parent is named Acronym.006, being the 6th usage of an Acronym outside of the Glossary itself. Thus the ‘Structure Destination’ information in the PDF leads to an appropriate link also in the HTML.

      Even the punctuation invoked by \delimN is treated specially in the PDF content stream, getting /Alt text of ‘ and ’ for screen-reading. Similarly \delimR gets /Alt text of ‘ up to ’ within a number range. The whole line, including the explanation, is spoken as:

      ; Glossary-item ; ; AOP ; Assessment Oversight Panel ; occurs on page: ; 1 and 3 up to 12
      and 22 and 33 and 42 and 52 and 62

      Here the nature of the information following is stated first, with each ‘;’ contributing a slight pause. The otherwise ambiguous uses of ‘,’ (comma) and ‘–’ (the endash character) have been replaced by their semantic meaning within this context. Be aware that the PAC 2024 validator, under the ‘Quality’ panel, flags each of these /Alt usages as potentially in error. Yet this is clearly intended as an enhancement of the accessibility for screen-reading.
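      The naming scheme and the spoken form of the page list can both be sketched in a few lines of Python. This is an illustration under stated assumptions: the function names are hypothetical, the ‘Ap<page>::<key>’ pattern is read off the single example above, and only the page entries visible in the .gls excerpt are used (the elided ones are left out).

```python
import re

FRONT_MATTER_PAGES = 13  # Cover plus front-matter pages in this document

DESTINATION = re.compile(r"^Ap(\d+)::(.+)$")


def parse_acronym_destination(name):
    """Split a destination such as 'Ap14::AOP' into (virtual page, key)."""
    match = DESTINATION.match(name)
    if match is None:
        raise ValueError("unexpected destination name: " + name)
    return int(match.group(1)), match.group(2)


def spoken_pages(entries, offset=FRONT_MATTER_PAGES):
    """Render virtual pages as spoken text; a (first, last) tuple is a range.

    ' and ' plays the role of the /Alt text for \\delimN,
    ' up to ' the role of the /Alt text for \\delimR.
    """
    parts = []
    for entry in entries:
        if isinstance(entry, tuple):
            first, last = entry
            parts.append(f"{first - offset} up to {last - offset}")
        else:
            parts.append(str(entry - offset))
    return " and ".join(parts)


page, key = parse_acronym_destination("Ap14::AOP")
spoken = spoken_pages([14, (16, 25), 35, 75])
```

      With these inputs, page is 14 and key is 'AOP', while spoken is ‘1 and 3 up to 12 and 22 and 62’, which agrees with the start and end of the spoken line quoted above.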

    7. Footnote marker, as link to footnote text.

      On a PDF page the footnote text appears at the bottom of the page. Within HTML there is no such (convenient?) location. With the structural tagging of the note coming after that of the in-text marker, a convenient layout can be achieved in the HTML using a CSS rule to float the footnote text (at reduced size) to the right, as seen in the image at left below. It is useful in both PDF and HTML to have a hyperlink connecting the marker to the footnote text.

      Footnote marker link in HTML Footnote marker link in PDF
      • from SpringMT2023.tgx :
        \TPDFmakeSD{Hfootnote.5}{3127}{FNote.5}%

        \TPDFtaginfo{3121}{3118}{<PRPanel>}{PRPanel.1}{3102}{16}{16}%
        \TPDFtaginfo{3122}{3121}{<PRPanelist>}{PRPanelist.5}{3102}{16}{16}%
        \TPDFtaginfo{3123}{3122}{<Reference>}{Reference.082}{3102}{16}{16}%
        \TPDFtaginfo{3124}{3123}{<Link>}{Link.0605}{3102}{16}{16}%
        \TPDFtaginfo{3126}{3124}{<SUPScript>}{SUPScript.5}{3102}{16}{16}%
        \TPDFtaginfo{3127}{3122}{<FNote>}{FNote.5}{3102}{16}{16}%

      The lines above from SpringMT2023.tgx show how the <Link> named Link.0605 fits into the <PRPanel> structure, with its anchor-text inside a <SUPScript>. The target <FNote> follows immediately afterwards in the structure tree. A ‘Named Destination’ of Hfootnote.5 corresponds to a ‘Structure Destination’ for this structure named FNote.5.
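      The parent/child relationships can be recovered from the first two numbers on each line. The following Python sketch, which assumes that those numbers are the object and its parent (as the discussion of Link.0576 above suggests), groups the structures by parent so that the position of the <FNote> can be checked.

```python
import re
from collections import defaultdict

# Only the first four fields are needed: object, parent, tag, structure name.
TAGINFO = re.compile(r"\\TPDFtaginfo\{(\d+)\}\{(\d+)\}\{([^{}]*)\}\{([^{}]*)\}")


def children_by_parent(lines):
    """Return (names by object number, child objects listed per parent)."""
    names = {}
    children = defaultdict(list)
    for line in lines:
        match = TAGINFO.search(line)
        if match:
            obj, parent = int(match.group(1)), int(match.group(2))
            names[obj] = match.group(4)
            children[parent].append(obj)  # keeps document order
    return names, children


tgx = [
    r"\TPDFtaginfo{3121}{3118}{<PRPanel>}{PRPanel.1}{3102}{16}{16}%",
    r"\TPDFtaginfo{3122}{3121}{<PRPanelist>}{PRPanelist.5}{3102}{16}{16}%",
    r"\TPDFtaginfo{3123}{3122}{<Reference>}{Reference.082}{3102}{16}{16}%",
    r"\TPDFtaginfo{3124}{3123}{<Link>}{Link.0605}{3102}{16}{16}%",
    r"\TPDFtaginfo{3126}{3124}{<SUPScript>}{SUPScript.5}{3102}{16}{16}%",
    r"\TPDFtaginfo{3127}{3122}{<FNote>}{FNote.5}{3102}{16}{16}%",
]
names, children = children_by_parent(tgx)
siblings = [names[obj] for obj in children[3122]]
```

      Here siblings is ['Reference.082', 'FNote.5'], confirming that the footnote is the next sibling of the <Reference> that holds the marker link.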

    8. Latin-text of a fish species' name, anchors a link to small image on the fish-stock Glossary page.

      The aim of the next link is to provide access to a picture and name of a fished stock species, given its formal Latin name. Only experts in the field would use that Latin name, so having a link to something more commonly used is natural, and similar to a Glossary link in intent.

      Link to Red Crab species in HTML Link to Red Crab species in PDF
      • from SpringMT2023.tgx:
        \TPDFmakeSD{AtlanticDeepSeaRedCrab}{828}{CRDsmallfishimage.3}%

        \TPDFtaginfo{6561}{6552}{<CRDreference>}{CRDreference.03}{6543}{47}{47}%
        \TPDFtaginfo{6562}{6561}{<Link>}{Link.1174}{6543}{47}{47}%

        \TPDFtaginfo{828}{814}{<CRDsmallfishimage>}{CRDsmallfishimage.3}{809}{7}{7}%
        \TPDFtaginfo{830}{828}{<Fig>}{Fig.05}{809}{7}{7}%
        \TPDFtaginfo{832}{828}{<CRDfishname>}{CRDfishname.3}{809}{7}{7}%

      In the code lines above we see how the ‘Named Destination’ of AtlanticDeepSeaRedCrab is related to the Custom structure named CRDsmallfishimage.3, which includes a <Fig> for an image, along with its labelling caption. This target location can be seen in the ‘pop-up’ in the image below, when the mouse is hovered over the <Link> named Link.1174 when using specific PDF browser software running under a MacOS system.

      Link to Red Crab species in PDF on Mac

      This image above shows how the intent of the ‘Accessible Description’ can be achieved, but in a fully visual setting. Some PDF viewing software under MacOS has the feature that when the mouse is hovered over the anchor for an internal hyperlink, a small-sized view of the PDF around the destination is produced and ‘pops up’. Focus remains, with the pop-up going away upon moving the mouse sufficiently.

    9. 2.4.4 ‘Link text must describe the link target’ with external links

      With external link targets the ‘Accessible Name’ is based primarily on the anchor text. However an ‘Accessible Description’ may be possible using text elsewhere in the document.

      1. Link to a journal article, available online

        Academic journals have commonly used Acronym-like abbreviations, well known to researchers in the field. Here the anchor-text of the external link incorporates this abbreviation along with an identifying string for the article itself at the journal's own website.

        Link to JMS article in HTML Link to JMS article in PDF

        In the above images we see that the target ‘Journal of Marine Science’ is abbreviated to ‘JMS’, and the article identifier used there is ‘fsm107’. So the anchor text has these combined into ‘JMS:fsm107’ for the ‘Accessible Name’. Using an aria-labelledby attribute the full bibliographic entry is picked up as the ‘Accessible Description’. In the PDF this entry is the content of a Custom structure of type <CRDreference>, named CRDreference.04. With PDF 1.7 this custom type has a ‘RoleMap’ entry of /P; this will become /BibItem in future when using PDF 2.0 as the underlying PDF standard.

      2. External link using the full URL

        When the anchor text is a full URL then the ‘Accessible Name’ is this web-address. It is likely to be as meaningful to a visually-impaired person as it is to someone fully-sighted. Any choice of whether to follow this link will be based upon context.

        Link to a full URL in HTML Link to a full URL in PDF

        In this case the text of the URL has an associated /Alt-text replacement, to ensure that it is read properly by AT that detects and uses this attribute.

        ; link to ; http://www.n e f s c.n o a a.gov/n e f s c/publications/
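        One way such replacement text could be generated is sketched below in Python. This is not the method used for the document, which relies on LaTeX macros; the function name is hypothetical, and the un-spaced form of the URL is an assumption obtained by closing up the letters in the spoken line above.

```python
import re


def spell_out_in_url(url, acronyms):
    """Put spaces between the letters of each listed acronym inside a URL.

    Other words (http, www, gov, ...) are left alone, so that a
    screen reader can pronounce them normally.
    """
    wanted = {acronym.lower() for acronym in acronyms}

    def replace(match):
        word = match.group(0)
        return " ".join(word) if word.lower() in wanted else word

    return re.sub(r"[A-Za-z]+", replace, url)


alt_text = spell_out_in_url(
    "http://www.nefsc.noaa.gov/nefsc/publications/",  # assumed original URL
    ["NEFSC", "NOAA"],
)
```

        The resulting alt_text is ‘http://www.n e f s c.n o a a.gov/n e f s c/publications/’, the same string that follows ‘; link to ;’ in the spoken line.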
    10. 2.4.5 ‘At least two ways of finding content’

      Both the PDF and its derived HTML have multiple ways to navigate to specific content. Apart from scrolling, and page-up/down buttons in a PDF browser, there are collections of hyperlinks as follows.

      • Table of Contents, navigating via sections, subsections, etc.
      • List of Tables, with 2 or 3 tables in each main section.
      • List of Figures, navigating to graphs and charts at the end of each fish-stock report section.
      • 4 Glossary sections in the front-matter part of the document, each having back-links to where an acronym or abbreviation has been used. These are organised for specific kinds of information.
        1. Abbreviations for fish stocks reviewed
        2. Abbreviations and Acronyms
        3. Statistical/review concepts, parameters, etc.
        4. Locations/regions: state, country, etc.
      • Photo Gallery, with links to photos and illustrations occurring throughout the whole document. Each main section has at least one such photo or illustration. This is implemented as a 5th Glossary, linked-to from each photo's caption, and with a back-link to where it occurs.
      Table of Contents, Navigation landmark

      Each of the first 4 bulleted structures is in a section with ARIA role='navigation', and so becomes a ‘Landmark’. Verifying that this is appropriate is what the MC is about, in the image above, for the latter 4 instances.

    11. 2.4.6 ‘Provide list labels when appropriate’

      In the following image, the sublist in the ‘Table of Contents’ gets an aria-labelledby attribute. In the PDF this refers to the contents of a <Caption> structure, 1st child of a <TOC>.

      Glossaries in HTML ToC Glossaries in PDF ToC

      In the derived HTML, the <p> resulting from the <Caption> structure must be moved to outside (i.e., before) the <ol>, as only <li> is allowed as a child. This explains why the selection in the HTML image does not include the word ‘GLOSSARIES’, whereas it does in the PDF view. Each of the 38 lists can be checked to have an ‘Accessible Name’ obtained from an aria-labelledby attribute. (Note that strings such as ol#TOC.03 are in error in the AInspector panel. This is a tag name, displayed in place of the intended name, which is given further down.)

    12. 2.4.6 ‘Accessible name is descriptive’

      With Glossary-item links, the ‘Accessible Name’ for the target is constructed as ‘Glossary:’ followed by the database key used for the glossary item. Typically for an acronym or abbreviation this is just the short version itself. Hence we get Glossary:NEFSC in the example shown in the image below.

      Accessible Name for Glossary Item

      Having simply recognisable keys is part of the way the Glossary items are specified, for ease of use by report authors. It follows that the resulting Accessible names will be descriptive as a matter of course. For other kinds of links, check the images provided above, under SC 2.4.4; in all cases the names are built algorithmically using identifiers relevant to the target destination.
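      The construction is simple enough to express directly, together with a check that no ‘Accessible Name’ is shared by links with different targets. The Python below is a sketch with hypothetical function names; the (name, target) pairs would in practice be extracted from the derived HTML.

```python
from collections import defaultdict


def glossary_accessible_name(key):
    """Build the name as 'Glossary:' followed by the glossary database key."""
    return f"Glossary:{key}"


def names_with_conflicting_targets(links):
    """Return accessible names that are used for more than one target.

    links is an iterable of (accessible name, target structure name) pairs.
    """
    targets = defaultdict(set)
    for name, target in links:
        targets[name].add(target)
    return sorted(name for name, found in targets.items() if len(found) > 1)


links = [
    (glossary_accessible_name("NEFSC"), "TOCI.091"),
    (glossary_accessible_name("NEFSC"), "TOCI.091"),
    (glossary_accessible_name("FMSY"), "TOCI.140"),
]
conflicts = names_with_conflicting_targets(links)
```

      For the two Glossary items used as examples in this document, conflicts is an empty list: repeated links to the same item share one name and one target.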

    13. 2.4.7 ‘Focus must be visible’
      Focus style

      Here the issue is about whether the browser software shows a visible border around a link or other selected (interactive) element to indicate focus on that element. With Firefox there is a ‘Preference’ item that needs to be activated; then this does occur.

      Firefox    Chrome    Acrobat
      Focus on link in Firefox Focus on link in Chrome Focus on link in Acrobat

Observe the rectangles around the link for ‘NEFSC’ following ‘Species Names:’, as obtained using tabbing in 3 browsers; 2 for HTML, 1 for PDF. Each browser has ‘Preference’ settings to adjust the style of these rectangles.

WCAG Guideline 3.1: Readable

  • 3.1.2 ‘Identify language changes’
    Language Changes

The default language is set on the <html> tag, and again with <body lang="en-us">. Note that the PDF specifies /Lang (en-US), whereas the Country Code is given in lowercase in the HTML. This may be technically an error in the ngPDF derivation software, though not likely to lead to any misunderstanding.
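Whether the two declarations agree apart from letter case can be checked with a few lines of Python. This is a sketch only: language tags of this kind (BCP 47) are compared without regard to case, and the conventional form writes a 2-letter region subtag in uppercase; the re-casing rule below is deliberately simplified and ignores extension and private-use subtags.

```python
def same_language_tag(first, second):
    """Compare two language tags without regard to letter case."""
    return first.casefold() == second.casefold()


def conventional_case(tag):
    """Simplified re-casing: language lowercase, 4-letter script in
    title case, 2-letter region uppercase, anything else lowercase."""
    subtags = tag.split("-")
    result = [subtags[0].lower()]
    for subtag in subtags[1:]:
        if len(subtag) == 2 and subtag.isalpha():
            result.append(subtag.upper())
        elif len(subtag) == 4 and subtag.isalpha():
            result.append(subtag.title())
        else:
            result.append(subtag.lower())
    return "-".join(result)


pdf_lang = "en-US"   # from /Lang in the PDF
html_lang = "en-us"  # from <body lang="en-us"> in the derived HTML
```

With these values same_language_tag(pdf_lang, html_lang) is true, and conventional_case(html_lang) gives back the form used in the PDF.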

WCAG Guideline 4.1: Compatible

  • 4.1.2 ‘Accessible name is required’, and Pass tests

    The MCs here should arise only when <tr> occur as part of interactive grid structures; but here they occur within static tables. This is due to a mistake in the AInspector software, related to the issue discussed above under SC 3.3.

    Accessible Name required

    There is no sensible name to apply to the top row of the table, as shown. This is the same for the 19 rows reported, occurring in other tabular structures. On the other hand, the lower rows are part of the 121 Passing elements, where the 1st column is a header for cells on the same row. The ‘Accessible Name’ for these rows is (perhaps unnecessarily) built from the contents of that 1st cell.