The Semantic Web
One of the biggest challenges facing today’s information society is information overload, a problem that keeps growing with the enormous size of the WWW. The Web gives us access to millions of resources, regardless of their physical location or language.
To deal with this huge amount of information, new business models have emerged on the Web, such as commercial search engines (of which Google is by far the most important).
Because the Web is expected to keep growing, search engines will probably find it difficult to maintain the quality of their results. Furthermore, search engines only find static content and are unaware of the dynamic part of the Web (pages built from databases).
We believe that current-generation search technology has reached its limits. To handle the continuous growth of the WWW (in size, languages and formats), it is essential to exploit other kinds of information.
This is where the Semantic Web comes in.
Quotations related to the Semantic Web
"The Semantic Web is an extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in cooperation." [Tim Berners-Lee, James Hendler, Ora Lassila, Scientific American, May 2001]
"For the Web to scale, tomorrow's programs must be able to share and process data even when these programs have been designed totally independently. The Semantic Web is a vision: the idea of having data on the web defined and linked in a way that it can be used by machines not just for display purposes, but for automation, integration and reuse of data across various applications." [W3C 2001]
"The real power of the Semantic Web will be realized when people create many programs that collect Web content from diverse sources, process the information and exchange the results with other programs. The effectiveness of such software agents will increase exponentially as more machine-readable Web content and automated services (including other agents) become available. The Semantic Web promotes this synergy: even agents that were not expressly designed to work together can transfer data among themselves when the data come with semantics." [Tim Berners-Lee, James Hendler, Ora Lassila, Scientific American, May 2001]
Understanding the Semantic Web
Today’s Web is based on HTML, which specifies how a page should be formatted so that humans can read it. HTML gives information-retrieval techniques nothing they can use to improve their results: they can rely only on the words that make up a page’s content, and are therefore restricted to keyword matching.
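As a rough illustration of this limitation, the sketch below runs a plain keyword search over three invented pages. The page names, their text and the function `keyword_search` are all hypothetical and exist only for this example; real search engines are far more sophisticated, but they start from the same raw material, the words on the page.

```python
# Hypothetical mini-corpus: page name -> page text (illustrative data only).
PAGES = {
    "zoo.html": "The jaguar is a large cat native to the Americas.",
    "cars.html": "The new Jaguar saloon has a powerful engine.",
    "cats.html": "Big cats such as the panther hunt at night.",
}


def keyword_search(query, pages):
    """Return the names of pages whose text contains every query word."""
    words = query.lower().split()
    hits = []
    for name, text in pages.items():
        # Normalise the page into a set of lower-case words without punctuation.
        tokens = {token.strip(".,").lower() for token in text.split()}
        if all(word in tokens for word in words):
            hits.append(name)
    return sorted(hits)


# Both the animal page and the car page match: the word is the same,
# and nothing in the text tells the program which meaning is intended.
print(keyword_search("jaguar", PAGES))
```

The search cannot separate the two senses of the word, and it misses the page about big cats entirely because that page never uses the query word.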
On the Semantic Web, pages do not just store content as an unconnected body of words in a document; they also include its meaning and structure. Picture 1 shows how the Semantic Web’s languages build upon XML; the language pyramid includes RDF, RDFS, OIL, DAML+OIL and OWL.

Picture 1: The Web’s language pyramid.
These languages are much richer than HTML and make it possible, to varying degrees, to represent the meaning and structure of content (the inter-relation of concepts). This helps make Web content comprehensible and usable for software agents, services and innovative knowledge-based business models. We will see the Web change gradually from a medium for retrieving information into one to which we can delegate tasks that it carries out on its own.
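To give a feel for what representing meaning and structure can look like, here is a minimal sketch based on the subject-predicate-object statements (triples) that RDF uses as its data model. It is plain Python rather than a real RDF toolkit. All `ex:` identifiers, the `ex:describes` property and the helper functions `match` and `pages_about` are assumptions made up for this example; only `rdf:type` corresponds to an actual RDF property.

```python
# Each statement is a (subject, predicate, object) triple.
# All "ex:" names are hypothetical and used for illustration only.
TRIPLES = [
    ("ex:zoo.html", "ex:describes", "ex:Jaguar_animal"),
    ("ex:cars.html", "ex:describes", "ex:Jaguar_car"),
    ("ex:Jaguar_animal", "rdf:type", "ex:Animal"),
    ("ex:Jaguar_car", "rdf:type", "ex:Car"),
]


def match(triples, s=None, p=None, o=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def pages_about(triples, cls):
    """Return the pages that describe something of the given class."""
    things = {s for s, _, _ in match(triples, p="rdf:type", o=cls)}
    return sorted(
        s for s, _, o in match(triples, p="ex:describes") if o in things
    )


print(pages_about(TRIPLES, "ex:Animal"))
```

Because each page is linked to a concept rather than to a bare word, a program can ask for pages about animals and receive only the relevant one, without inspecting any page text.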
The information-overload problem can be partially addressed by giving the Web artificial intelligence. Software agents can exhibit various levels of intelligent behaviour, from simply reacting to stimuli up to adapting and learning, where the agents learn users’ likes and dislikes. This will free users from irrelevant information, only ‘bothering’ them with information of real value. Instead of the current scenario, in which the user must actively sift through all the information and open programs, this work will be delegated to autonomous software agents. Over time, the tasks users want to carry out will become more and more complex. The agents will have to ‘learn’ to work within ‘social environments’ in which they will need to collaborate, compete or negotiate with other agents. The quality and usability of the Semantic Web’s infrastructure depend on the progress made in these three areas.
Recently, the US and EU governments have recognised the importance of the Semantic Web and have launched programmes (DAML and IST Action Line III.4.1) dedicated to financing research aimed at developing core technology for the Semantic Web.
Despite the great advantages the Semantic Web promises, its success or failure will depend, just as with the WWW, in large part on the ease of access to, and availability of, varied and high-quality content. Many problems remain to be solved before this can happen, including, but not limited to:
- The availability of content. Currently, there is little Semantic Web content available. The content of existing websites must be converted into Semantic Web content, including static HTML pages, existing XML content, dynamic content, multimedia and web services.
- Availability, development and evolution of ontologies. Ontologies are the key element, because they provide the semantics of the Semantic Web’s content. A great effort is needed to create a commonly used ontology for the Semantic Web, to provide infrastructures suitable for developing, managing and mapping ontologies, and, bearing in mind the decentralised nature of the Web, to adequately control the evolution of ontologies and of the annotations that refer to them.
- Scalability. A great effort will be necessary in order to organise Semantic Web content, store it and provide the mechanisms to find it. All these tasks must be carried out and coordinated in a scalable way, since semantic solutions must be prepared for the Semantic Web’s massive growth.
- Multi-language support. This problem already exists in today’s Web, and must be taken into account in the Semantic Web. Any approach to the Semantic Web should provide mechanisms for accessing information in different languages, allowing content to be created and accessed regardless of the original language used by those who provide or use it.
- Display. Intuitive display of Semantic Web content will become ever more important in tackling the growing information overload, since users need to recognise easily the content relevant to their aims. New techniques that differ from the traditional hypertext approach (typical of the current Web) must be explored.
- Stability of Semantic Web languages. Finally, standardisation in this new field is urgently needed, so as to allow the creation of technology appropriate for the Semantic Web.
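To make the ontology item in the list above a little more concrete, the following sketch shows, under heavily simplified assumptions, two things an ontology infrastructure has to support: reasoning over a class hierarchy and mapping a term from one vocabulary onto another. The class names, the `zoo2:Felid` term and the functions `superclasses` and `is_a` are hypothetical and not taken from any real ontology or library.

```python
# Hypothetical ontology: each class points to its direct superclass.
SUBCLASS_OF = {
    "Jaguar": "BigCat",
    "BigCat": "Mammal",
    "Mammal": "Animal",
}

# Hypothetical mapping from a term in a second vocabulary to the first one.
TERM_MAPPING = {"zoo2:Felid": "BigCat"}


def superclasses(cls, hierarchy):
    """Return every ancestor of cls, nearest first."""
    found = []
    seen = set()
    # The 'seen' set guards against accidental cycles in the hierarchy.
    while cls in hierarchy and cls not in seen:
        seen.add(cls)
        cls = hierarchy[cls]
        found.append(cls)
    return found


def is_a(cls, target, hierarchy, mapping=None):
    """Check whether cls equals target or is one of its subclasses.

    If a mapping is given, cls is first translated into the
    vocabulary used by the hierarchy.
    """
    cls = (mapping or {}).get(cls, cls)
    return cls == target or target in superclasses(cls, hierarchy)


print(superclasses("Jaguar", SUBCLASS_OF))
print(is_a("zoo2:Felid", "Mammal", SUBCLASS_OF, TERM_MAPPING))
```

Even in this toy form, the evolution problem is visible: renaming or moving a single class in `SUBCLASS_OF` silently changes the answers for every annotation and mapping that refers to it.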
Further information
Portal for the Semantic Web: http://www.semanticweb.org/
Projects underway on the Semantic Web financed by the European Commission: http://www.ercim.org/publication/Ercim_News/enw51/stork.html
The Semantic Web in the press
Investment reports on the impact of the Semantic Web
