Managing content begins with organizing information. Taxonomies (using the term broadly to include even shallow taxonomies like subject headings) are critical tools for organizing information. Indeed, whenever content or information is arranged or organized, the process can be seen as taxonomic.
Even the simplest CMS has some means of navigating the content. And we shall see that although not all taxonomy work is for navigation, there is no navigation system that does not have an implied taxonomy.
There are hundreds of thousands, perhaps hundreds of millions, of taxonomies in use in the world. Many of them are online. Every website navigation scheme has an underlying taxonomy.
The great attempts to produce a universal classification system for knowledge, from Aristotle's original categories and Francis Bacon's creation of the modern natural sciences to the great Dewey Decimal and Library of Congress classification schemes, all have a taxonomy.
They are all attempts at a galactic taxonomy.
Although many taxonomy requirements documents and requests for proposals specify that the taxonomy include three (or even four or more) levels, the right number of levels is highly dependent on the particular content being organized. There are many large taxonomies in use with tens of thousands of nodes arranged in ten or more levels. Many web portals drill down to well over four levels. And many file/folder structures in our personal computers go deeper than four levels.
Although there can be a taxonomy that is not used for website navigation, there is no website navigation without taxonomy. And a taxonomy is perforce a means of navigating the content being arranged. The word taxonomy is derived from the Greek taxis, arrangement (from tassein, to arrange), and -nomia, method.
Note that a navigation taxonomy should generally not go down deeper than a few levels.
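As a rough illustration of what "levels" means here, a taxonomy can be modeled as a nested mapping and its depth measured directly. This is only a sketch: the node labels and the helper names depth and paths are hypothetical, not taken from any particular CMS.

```python
# Hypothetical navigation taxonomy: each key is a node, each value holds its children.
taxonomy = {
    "Products": {
        "Software": {
            "Content Management": {},
            "Search": {},
        },
        "Services": {},
    },
    "Support": {},
}

def depth(tree):
    """Return the number of levels in a nested-dict taxonomy (0 if empty)."""
    if not tree:
        return 0
    return 1 + max(depth(children) for children in tree.values())

def paths(tree, prefix=()):
    """Yield the path from the top level to every node, as a tuple of labels."""
    for label, children in tree.items():
        here = prefix + (label,)
        yield here
        yield from paths(children, here)

print("levels:", depth(taxonomy))
for path in paths(taxonomy):
    print(" > ".join(path))
```

Each printed path is the breadcrumb trail a visitor would follow to reach that node, so the longest path is the deepest drill-down the navigation requires.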
A thesaurus is an arrangement of terms (words or phrases), with a simple hierarchy of parent-child relationships described as broader terms (BT) and narrower terms (NT).
But thesauri also introduce synonym relationships: sets of equivalent terms that may be substituted for one another, usually with one of them designated the preferred term (PT).
In addition, thesauri allow arbitrary associative relationships between terms. A See also reference is an example of a related term (RT); a See reference usually directs the reader from a non-preferred term to its preferred term.
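The thesaurus relationships described above can be sketched as a small data structure. This is a minimal illustration under assumed names: the Term and Thesaurus classes, their methods, and the sample terms are all hypothetical, not a standard API.

```python
from dataclasses import dataclass, field

@dataclass
class Term:
    label: str                                   # the preferred term (PT)
    broader: set = field(default_factory=set)    # BT
    narrower: set = field(default_factory=set)   # NT
    related: set = field(default_factory=set)    # RT

class Thesaurus:
    def __init__(self):
        self.terms = {}      # preferred label -> Term
        self.preferred = {}  # any label, synonym or preferred -> preferred label

    def add_term(self, label, synonyms=()):
        self.terms[label] = Term(label)
        self.preferred[label] = label
        for synonym in synonyms:
            self.preferred[synonym] = label

    def add_hierarchy(self, broader, narrower):
        # BT and NT are two views of the same parent-child link.
        self.terms[broader].narrower.add(narrower)
        self.terms[narrower].broader.add(broader)

    def add_related(self, a, b):
        # Associative links are recorded in both directions.
        self.terms[a].related.add(b)
        self.terms[b].related.add(a)

    def lookup(self, label):
        """Resolve any synonym to the entry for its preferred term."""
        return self.terms[self.preferred[label]]

t = Thesaurus()
t.add_term("Dogs", synonyms=["Canines"])
t.add_term("Bulldogs")
t.add_term("Kennels")
t.add_hierarchy("Dogs", "Bulldogs")
t.add_related("Dogs", "Kennels")

entry = t.lookup("Canines")
print(entry.label, entry.narrower, entry.related)
```

Storing synonyms in a separate lookup table keeps the hierarchy attached to preferred terms only, which is one common way to keep a controlled vocabulary unambiguous.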
Ontologies share the hierarchical structure of taxonomies, but they make extra demands on the objects they include.
Explicit rules or axioms describe the relationship between a node (e.g., a parent or container) and the objects included in that node (children or contained objects).
The Linnaean biological taxonomy is an ontology in which each species is also a member of the containing genus.
The Semantic Web uses ontologies that conceptualize (describe in terms agreed to by participants in a community of discourse) some domain of phenomenal knowledge in a formal way that allows computers to make inferences about a term from its containing relationships. For example, a computer can infer that a bulldog is a dog, at least in some contexts.
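The kind of inference described here can be sketched with a simple chain of "is a" assertions. This is an illustrative toy, not a Semantic Web implementation: the PARENT_OF table and the function names are hypothetical, and real ontology languages support far richer axioms than a single parent per term.

```python
# Hypothetical "is a" assertions: each term maps to the class that contains it.
PARENT_OF = {
    "bulldog": "dog",
    "dog": "mammal",
    "mammal": "animal",
}

def ancestors(term, parents=PARENT_OF):
    """Return every class that contains `term`, nearest first."""
    found = []
    # Stop at the top of the hierarchy, or if the assertions loop back on themselves.
    while term in parents and parents[term] not in found:
        term = parents[term]
        found.append(term)
    return found

def is_member(term, candidate):
    """Infer membership by following the containing relationships upward."""
    return candidate in ancestors(term)

print(ancestors("bulldog"))
print(is_member("bulldog", "animal"))
```

Nothing in the table states that a bulldog is an animal; the conclusion follows from the chain of containing relationships, which is the sense of inference the paragraph above describes.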
A single facet in a faceted classification scheme typically has an enumerated set of possible values. For example, a size facet might take the values small, medium, or large. However, a facet may itself contain a taxonomy of possible values.
So a faceted classification might contain many taxonomies.
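A faceted scheme of this kind might be sketched as follows, with one flat enumerated facet and one facet whose values come from a taxonomy. The facet names, values, and helper functions are hypothetical and chosen only for illustration.

```python
# Hypothetical facets. "size" is a flat enumerated list; "topic" holds a taxonomy.
SIZE = {"small", "medium", "large"}
TOPIC = {"Animals": {"Dogs": {"Bulldogs": {}}, "Cats": {}}}

def flatten(tree):
    """Collect every label in a nested-dict taxonomy."""
    labels = set()
    for label, children in tree.items():
        labels.add(label)
        labels |= flatten(children)
    return labels

FACETS = {"size": SIZE, "topic": flatten(TOPIC)}

def invalid_facets(item):
    """Return the facet names whose value is not allowed for that facet."""
    return [name for name, value in item.items()
            if value not in FACETS.get(name, set())]

def select(items, **wanted):
    """Return the items that match every requested facet value."""
    return [item for item in items
            if all(item.get(name) == value for name, value in wanted.items())]

catalog = [
    {"size": "small", "topic": "Bulldogs"},
    {"size": "large", "topic": "Cats"},
    {"size": "small", "topic": "Cats"},
]
print(select(catalog, size="small", topic="Cats"))
print(invalid_facets({"size": "huge", "topic": "Dogs"}))
```

Because each facet is validated and filtered independently, facets can be combined in any order, which is what distinguishes faceted navigation from drilling down a single hierarchy.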
At a minimum, there will be different taxonomies for groups working with different content. In the grand vision of Enterprise Content Management (ECM), a single uber-taxonomy may have a place, with different groups using different sections of the master taxonomy.
But in large enterprises with multiple public Internet and private intranet websites, each site likely has its own navigation taxonomy. And it would be a big surprise if records management and document management taxonomies resembled web content management (WCM) taxonomies.
There are two distinct stages in information organization: first categorization, then classification.
First comes the design and construction of a taxonomy, thesaurus, ontology, or faceted classification. At this stage the terms, keywords, ideas, concepts, memes, topics, facets, etc. are identified. These are the categories. However they are arranged, they are a controlled vocabulary. This process is categorization. Each node in the resulting classification scheme should provide a globally unique meme ID.
Next comes the classification of content into the categories. This can be described as attaching metadata to the documents, "tagging" them with the appropriate terms or keywords. In advanced memography, they are tagged with globally unique meme IDs to support a high-precision memetic search.
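The two stages can be sketched in a few lines. Everything specific here is an assumption made for illustration: the text does not say how unique IDs are minted, so this sketch derives them as UUIDs from a made-up namespace URL, and it classifies documents by naive substring matching, which a real system would replace with something more careful.

```python
import uuid

# Stage 1, categorization: build the controlled vocabulary.
# Assumption: IDs are name-based UUIDs under a hypothetical namespace URL.
NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "https://example.org/vocabulary")

def mint_id(label):
    """Return a stable, unique ID for a category label."""
    return str(uuid.uuid5(NAMESPACE, label.lower()))

vocabulary = {mint_id(label): label
              for label in ("taxonomy", "thesaurus", "ontology")}

# Stage 2, classification: attach category IDs to each document as metadata.
def classify(text, vocab=vocabulary):
    """Tag a document with the ID of every category whose label it mentions."""
    lowered = text.lower()
    return {term_id for term_id, label in vocab.items() if label in lowered}

documents = {
    "doc-1": "A thesaurus adds synonyms to a taxonomy.",
    "doc-2": "An ontology supports inference.",
}
tags = {doc_id: classify(text) for doc_id, text in documents.items()}

def search(term_id):
    """Find documents by category ID rather than by free-text keyword."""
    return sorted(doc_id for doc_id, ids in tags.items() if term_id in ids)

print(search(mint_id("taxonomy")))
```

Searching on the ID rather than the label means the label can later be renamed or translated without retagging the documents.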