What is the World Wide Web?

The World Wide Web (abbreviated in this document as the "Web") is a collection of electronically linked documents (called pages) that are accessible from the Internet. The words document and page both refer to a single file and are synonymous. You navigate between Web pages using two types of links:

A Web site is a group of related pages residing on a Web server. A Web site can range from a single page with no links to an extensive interlinked site with hundreds of pages. A user typically visits a Web site via a home page. The term home page has two meanings. From the perspective of a Web site designer, a home page is the Web page that automatically loads when readers access your site on the Web. A home page usually has a file name of index.html on Web servers that use the UNIX operating system, or index.htm on Web servers that can handle only three characters for the file name extension. Windows typically uses three-character file extensions. From the perspective of a person using the Web, the term home page refers to the Web page that automatically loads when the user first starts a browser. A site's home page typically contains hyperlinks to other pages within the Web site and hyperlinks to pages elsewhere on the Web. For example, when you visit www.microsoft.com, the Web server delivers the home page for the site, and your browser displays it.
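The default file names mentioned above can be illustrated with a short Python sketch. It is not how any particular Web server is implemented; the function name find_home_page and the order in which names are tried are assumptions made for illustration. Given the file names in a directory, it returns the first default name that is present, using an exact (case-sensitive) comparison.

```python
# Hypothetical default names, tried in this order (an assumption for this sketch).
DEFAULT_NAMES = ("index.html", "index.htm")


def find_home_page(file_names, default_names=DEFAULT_NAMES):
    """Return the first default page name found in file_names, or None.

    The comparison is exact, so "Index.html" does not match "index.html".
    """
    available = set(file_names)
    for name in default_names:
        if name in available:
            return name
    return None


if __name__ == "__main__":
    print(find_home_page(["about.html", "index.htm", "logo.gif"]))
```

Real Web servers let the administrator configure the list of default names, so the tuple above stands in for that setting.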

A Web server is a computer running software that stores Web documents and delivers them to Web browsers. The Web server software must be installed and running at all times so that it can respond to requests for Web pages. The server accepts requests for documents from other computers, and then delivers those documents to the clients running a Web browser. The Web browser then formats those documents and displays them to the user. Different vendors develop and sell different Web server software packages, as described in the following list.

For a Web site to work, it must be hosted on a Web server. The following list describes various Web hosting options:

A Web browser is a software package that formats documents sent by a Web server and displays them for viewing. Many Web browsers are available, including Netscape's Navigator and Microsoft's Internet Explorer. Of course, Netscape Navigator's popularity has declined significantly in recent years. Figure 1 illustrates the relationship between a Web server and a Web browser.

Figure 1: Relationship between a Web server and Web client
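The request-and-response relationship between a Web server and a Web client can be sketched in Python using only the standard library. This is a minimal illustration, not a production server: the page content, the PageHandler class, and the fetch_home_page function are all hypothetical names chosen for this example. A tiny server runs on the local machine, and a client requests one page from it.

```python
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical page content that the server stores and delivers.
PAGE = b"<html><body><h1>Home page</h1></body></html>"


class PageHandler(BaseHTTPRequestHandler):
    """Answers every GET request with the same stored page."""

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()
        self.wfile.write(PAGE)

    def log_message(self, format, *args):
        pass  # keep the example quiet; the default logs each request


def fetch_home_page():
    """Start a local server, request one page as a client, and return it."""
    server = HTTPServer(("127.0.0.1", 0), PageHandler)  # port 0: any free port
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        connection = HTTPConnection("127.0.0.1", port, timeout=5)
        connection.request("GET", "/index.html")
        response = connection.getresponse()
        status, body = response.status, response.read()
        connection.close()
        return status, body
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    status, body = fetch_home_page()
    print(status, body.decode())
```

The client here only retrieves the document. A real Web browser would go on to format the returned HTML and display it to the user.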

Navigating the Web

To visit a Web site, you enter an Internet address in a format called a Uniform Resource Locator (URL) in the Web browser. The Web browser then downloads that Web page from the Web server, then formats and displays the page. Every page on the Web has a unique address, much like the address of your house. The URL identifies the exact location of a Web page on the Web. Note that the term Uniform Resource Identifier (URI) is often used in place of the term URL.

An example of a URL is shown in the following figure. This is the URL for this Web site.

Figure 2: Parts of a URL

The following list describes the parts of the preceding URL:

  1. Type of protocol used to transfer data. A protocol is the set of rules that describes how information is transferred across the Internet between clients and servers. The protocol used by the Web to transfer data is called the Hypertext Transfer Protocol (HTTP). The Internet, and most Web browsers, support other protocols. The most common protocols are listed in Table 1.

    Prefix      Description
    file://     Opens a file on a mounted disk volume
    http://     Opens a World Wide Web page
    ftp://      Connects to a server using the File Transfer Protocol
    gopher://   Connects to a Gopher server
    telnet://   Connects to a server using Telnet
    news:       Connects to a Usenet newsgroup
    mailto:     Sends an electronic mail message
    snews:      Opens a secure newsgroup connection
    shttp://    Opens a secure World Wide Web connection
    Table 1: Common Internet protocols
  2. Domain name. Identifies the computer that is used to store the Web page. It usually begins with "www" but does not require those characters. You can think of the domain name as the Internet name of the computer running the Web server containing the Web page.
  3. Directory path. This is the location on the Web server computer's disk where the Web page is stored. Directory paths are hierarchical. That is, one directory may contain another directory. These directories ultimately contain Web pages and other support files.
  4. Document name. This is the name of the Web page. The Web page frequently has a suffix of ".html" or ".htm". These suffixes are abbreviations for Hypertext Markup Language (HTML), which is the computer language used to represent Web pages. As mentioned, index.html and index.htm have special meaning: Web servers use them as the default Web page for a directory. Web servers on other operating systems use other default Web page names, such as "default.html" or "default.asp". Note that UNIX file names are case sensitive. Thus, the files named "Index.html" and "index.html" are not the same. Windows file names are NOT case sensitive.
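
The parts of a URL described in the preceding list can be separated with Python's standard urllib.parse module, as in the sketch below. The address http://www.example.com/docs/web/index.html is a placeholder chosen for illustration, not the URL shown in Figure 2, and the function name split_url is hypothetical.

```python
import posixpath
from urllib.parse import urlsplit


def split_url(url):
    """Split a URL into the four parts described in the list above."""
    parts = urlsplit(url)
    # URL paths always use "/" as the separator, so posixpath applies
    # regardless of the operating system running this code.
    directory, document = posixpath.split(parts.path)
    return {
        "protocol": parts.scheme,    # for example "http"
        "domain": parts.hostname,    # Internet name of the server computer
        "directory": directory,      # hierarchical directory path
        "document": document,        # name of the Web page file
    }


if __name__ == "__main__":
    # Placeholder address (an assumption for this example).
    print(split_url("http://www.example.com/docs/web/index.html"))
```

For the placeholder address, the protocol is "http", the domain name is "www.example.com", the directory path is "/docs/web", and the document name is "index.html".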