File systems, files, directories and devices


Devices:

Unix sees a device as either stream oriented (called a "character" device in most Unix
documentation) or block oriented. A stream device carries a sequential stream of data, in one
direction or both, between the CPU and the device. A block device, on the other hand, can be
accessed randomly in any order and transfers data in chunks referred to as blocks. A block is
the smallest addressable unit of the device. Typical examples of a block device include a hard
disk, a DVD or any kind of disk storage device, including solid state. A stream device would be
the network, a serial or printer port, or a USB port when not used for data storage.

Regardless of orientation, all devices are identified in the kernel by a major and a minor
number, although in newer versions of both Linux and Solaris this identification scheme
is now largely symbolic.
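
As a rough illustration, the major and minor numbers of a device file can be read from user space. The sketch below uses Python's os.stat, os.major and os.minor on /dev/null. The helper name device_numbers is made up for this handout, the system reports stream devices as "character" devices, and the actual numbers differ between Unix flavors.

```python
import os
import stat


def device_numbers(path):
    """Return (kind, major, minor) for a device file, or None if path is not a device."""
    info = os.stat(path)
    if stat.S_ISCHR(info.st_mode):
        kind = "stream (character)"
    elif stat.S_ISBLK(info.st_mode):
        kind = "block"
    else:
        return None
    # st_rdev packs the major and minor numbers into a single value.
    return kind, os.major(info.st_rdev), os.minor(info.st_rdev)


if __name__ == "__main__":
    print(device_numbers("/dev/null"))
```

Reading these numbers needs no special privileges because only the file's metadata is examined, not the device itself.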

Devices are bound to (loaded into) the kernel via software called a device driver. They
are accessed logically through the device driver. The device driver then communicates directly
(through the CPU, the bus and sometimes DMA or Direct Memory Access) with the device.

[Figure: CoopIO-1.gif]

All devices appear as files in the file system (we will explain file systems in a moment) and
are usually rooted in the /dev directory. An example of a SCSI disk in Linux might be /dev/sdb1
and in Solaris /dev/dsk/c0t0d0s1. Normally as a user you needn't worry much about devices,
only be aware that they exist. As a non-root user it would normally be impossible to
write to or otherwise do any damage to a device (if the system is configured properly).
Access to devices is always through several layers of software (utility, API, file system,
kernel, device driver, etc.) and then through several layers of hardware (buffer, cache, CPU,
bus and finally the device itself and any remotely attached devices, as in SCSI LUNs).

[Figure: zk-0201U.gif]

A fairly decent (albeit technical) explanation of devices and device drivers (if you want to
know the gory details) is located here.

Special Devices:

/dev/null -- Also known as the "bit bucket" or "black hole", this virtual file discards all contents written to it. This is typically used to throw away unwanted data streams, such as log files.

/dev/random -- This is a virtual file which returns random numbers (subject to the limitations of Random Number Generators in Computing). It uses system noise to generate random numbers and blocks if not enough entropy in the noise is available. It is commonly used by programs that absolutely need high quality random data (such as SSH when generating an encryption key).

/dev/urandom -- Same as /dev/random, except it always returns random numbers, even if there is not enough entropy available in the system noise. In that case pseudorandom numbers are generated by an algorithm, which varies with the type of Unix system.
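
Because these special devices behave like files, an ordinary program can open them with normal file calls. The following is a minimal sketch in Python; the function names discard and random_bytes are invented for this example. /dev/random is left out on purpose because, as noted above, it may block.

```python
def discard(data):
    """Write data to the bit bucket and return how many bytes were accepted."""
    with open("/dev/null", "wb") as sink:
        return sink.write(data)


def random_bytes(count):
    """Read count bytes from /dev/urandom, which does not wait for entropy."""
    with open("/dev/urandom", "rb") as source:
        return source.read(count)


if __name__ == "__main__":
    print(discard(b"unwanted output"), "bytes thrown away")
    print(random_bytes(8).hex())
```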

Terminal devices:

The Unix terminal is a simple device that acts much like a file. Hardwired terminals are rarely used today, but terminal emulation is still used by telnet, ssh and xterm. Use tty to display the name of the current terminal device. Try cat /etc/motd > terminal-device-name, or any other Unix command that reads from or writes to the device. Unlike other devices (particularly stream devices), terminal drivers perform a lot of additional processing to be more adaptable to humans, such as buffering, terminal addressing using escape codes, line disciplines, flow control (XON/XOFF), modem control, and terminal characteristics such as local echo and synchronous or asynchronous operation.
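
The tty experiment can also be tried from a program. This is only a sketch: terminal_name and send_to_device are hypothetical helper names, and the code assumes file descriptor 0 is standard input, as is conventional on Unix.

```python
import os


def terminal_name(fd=0):
    """Return the terminal device attached to file descriptor fd, or None."""
    if not os.isatty(fd):
        return None  # redirected to a file or pipe, so there is no terminal
    return os.ttyname(fd)


def send_to_device(device_path, text):
    """Write text to a device file, much as `cat file > device` would."""
    with open(device_path, "w") as device:
        return device.write(text)


if __name__ == "__main__":
    name = terminal_name()
    if name is None:
        print("standard input is not a terminal")
    else:
        send_to_device(name, "hello from " + name + "\n")
```

When run from an interactive login the message appears on your own screen, because the terminal device is the screen as far as the program is concerned.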

Storage:

Simply put, storage is anything the computer can use to maintain state. This state can be
permanent or semi-permanent. Semi-permanent storage would include random-access memory. Permanent
storage is what we will be talking about here, and it is typically used for storing logical
chunks of data known as files. A file is a collection of data and is the smallest unit usable
by humans. Think of a file as the papers in a file folder. They are all referenced, bound with
and accessed from that file folder. So it is with computer files. There are a number of
technologies in Unix for storing files. These can include tape, disk, CD-ROM, DVD and solid
state devices like USB drives or SD and Compact Flash cards. Some drives are logical and are
accessed across the network, such as NAS or Network Attached Storage, or are virtual, such as a
virtual disk file in some virtual host environments. There are also larger scale attached drives
like SCSI and SAN (Storage Area Network). HMH deploys almost all of these technologies.

Disks:

Notes: http://en.wikipedia.org/wiki/Hard_disk

A disk, also known as an HDD or hard disk drive (as opposed to the floppy drives of ye olden
days), is a non-volatile (essentially permanent) storage device that stores digitally encoded
data on rapidly rotating rigid (i.e. hard) platters with magnetic surfaces. Strictly speaking,
"drive" refers to the motorized mechanical aspect that is distinct from its medium, such as
a tape drive and its tape, or a floppy disk drive and its floppy disk. Early HDDs had removable
media; however, an HDD today is typically a sealed unit (except for a filtered vent hole to
equalize air pressure) with fixed media.

Disk geometry and characteristics

HDDs record data by magnetizing ferromagnetic material directionally, to represent either a 0
or a 1 binary digit (state). They read the data back by detecting the magnetization of the
material. A typical HDD design consists of a spindle that holds one or more flat circular
disks called platters, onto which the data is recorded. The platters are made from a
non-magnetic material, usually aluminum alloy or glass, and are coated with a thin layer of
magnetic material.

The platters are spun at very high speeds. Information is written to a platter as it rotates
past devices called read-and-write heads that operate very close over the magnetic surface.
The read-and-write head is used to detect and modify the magnetization of the material
immediately under it. There is one head for each magnetic platter surface on the spindle,
mounted on a common arm. An actuator arm (or access arm) moves the heads on an arc (roughly
radially) across the platters as they spin, allowing each head to access almost the entire
surface of the platter as it spins. The arm is moved using a voice coil actuator or in
some older designs a stepper motor.

Cylinder-head-sector

Notes: http://en.wikipedia.org/wiki/Cylinder-head-sector

Cylinder-head-sector, also known as CHS, was an early method of mapping between the geometric
coordinates (cylinder/head/sector) of data on a disk's surface and the linear addressing system
used by the disk's filesystem (logical block address or LBA). Though CHS values no longer
have a direct physical relationship to the data stored on disks, pseudo CHS values (which
can be translated by disk electronics or software) are still used by many utility programs.

[Figure: 360px-Cylinder_Head_Sector.png]

Logical block addressing

Notes: http://en.wikipedia.org/wiki/Logical_Block_Addressing

Data on a single disk is now addressed using LBA, or logical block addressing, in which every block is simply numbered in sequence starting from zero.
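
The conventional translation between a CHS coordinate and a logical block address can be sketched as follows. Cylinders and heads count from 0 while sectors count from 1. The geometry used in the example (16 heads per cylinder, 63 sectors per track) is only an illustrative assumption, and the function names are made up for this handout.

```python
def chs_to_lba(cylinder, head, sector, heads_per_cylinder, sectors_per_track):
    """Translate a cylinder/head/sector coordinate into a logical block address."""
    if not 0 <= head < heads_per_cylinder:
        raise ValueError("head out of range")
    if not 1 <= sector <= sectors_per_track:
        raise ValueError("sectors are numbered from 1")
    return (cylinder * heads_per_cylinder + head) * sectors_per_track + (sector - 1)


def lba_to_chs(lba, heads_per_cylinder, sectors_per_track):
    """Translate a logical block address back into (cylinder, head, sector)."""
    cylinder, remainder = divmod(lba, heads_per_cylinder * sectors_per_track)
    head, sector_index = divmod(remainder, sectors_per_track)
    return cylinder, head, sector_index + 1


if __name__ == "__main__":
    # Assumed pseudo geometry: 16 heads per cylinder, 63 sectors per track.
    print(chs_to_lba(1, 0, 1, 16, 63))
    print(lba_to_chs(1008, 16, 63))
```

The first sector of cylinder 1 comes out as block 1008 because a full cylinder in this assumed geometry holds 16 x 63 = 1008 sectors.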

Zone Bit Recording

Notes: http://en.wikipedia.org/wiki/Zone_bit_recording

I am not a disk "geek" and I don't want to get too deep into this; however, note that current
disk drives use Zone Bit Recording, where the number of sectors per track depends on the
track number. The disk drive will still report an SPT (number of sectors per track) value so
that CHS calculations can be made, but that value has little to do with the disk drive's true geometry.

Spindle

The spindle of a hard disk is the spinning axle on which the platters are mounted.

[Figure: 300px-Hard_drive-en.png]

Volumes:

In modern computing it is advantageous to group storage disks into collections known as
volumes. These then become known as "logical volumes" and usually appear to the operating
system as one single disk, but deep down inside at the hardware level there are multiple spindles,
read-write heads, cylinders and sectors all appearing as one logical drive. This ability exists
because it helps boost performance. In ye olden days it was common for database
administrators (particularly Oracle) to try to separate database storage onto separate spindles
for performance gains. This thinking is no longer required thanks to volumes and volume management.

Volumes also led the way to RAID, or redundant array of inexpensive disks, a technology
that allows computer users to achieve high levels of storage reliability from low-cost and
less reliable PC-class disk-drive components by arranging the devices into
arrays for redundancy. "RAID" is now used as an umbrella term for computer data storage schemes
that can divide and replicate data among multiple hard disk drives. The different
schemes/architectures are named by the word RAID followed by a number, as in RAID 0, RAID 1, etc.
RAID's various designs pursue two key goals: increased data reliability and/or increased
input/output performance. When multiple physical disks are set up to use RAID technology, they
are said to be in a RAID array. This array distributes data across multiple disks, but like a
disk volume the array is seen by the computer user and operating system as one single disk. RAID
can be set up to serve several different purposes.
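
To make "distributes data across multiple disks" a little more concrete, here is a simplified sketch of RAID 0 style striping, the simplest scheme, which spreads blocks round-robin over the disks and provides no redundancy. The function name and the stripe unit parameter are assumptions made for this example; real controllers and volume managers are considerably more involved.

```python
def stripe_location(logical_block, disk_count, blocks_per_stripe_unit=1):
    """Map a logical block of the array to (disk index, block index on that disk)."""
    if disk_count < 1 or blocks_per_stripe_unit < 1:
        raise ValueError("disk_count and blocks_per_stripe_unit must be positive")
    # Which stripe unit the block falls in, and where inside that unit.
    unit, offset_in_unit = divmod(logical_block, blocks_per_stripe_unit)
    # Stripe units are dealt out to the disks in turn, one row at a time.
    row, disk = divmod(unit, disk_count)
    return disk, row * blocks_per_stripe_unit + offset_in_unit


if __name__ == "__main__":
    for block in range(6):
        print(block, stripe_location(block, disk_count=3))
```

Consecutive logical blocks land on different disks, which is why several spindles can work on one large request at the same time.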

Notes: http://en.wikipedia.org/wiki/Redundant_array_of_independent_disks

Partitions:

Notes: http://en.wikipedia.org/wiki/Disk_partitioning

As on MS-DOS based systems (Windows), Unix disks are divided into logical groups called partitions.
A partition may use a portion of the disk or the entire disk. In Unix-based and Unix-like operating
systems such as Linux and Mac OS X, it is possible to create multiple partitions (also known in
the Solaris operating system and the BSD based operating systems as "slices") on a disk device.
Each partition can be used for a file system or as a swap partition.

Multiple partitions allow directories such as /tmp, /usr, /var, or home directory space to be
allocated their own file system. Such a scheme has a number of potential advantages: if one
file system gets corrupted, the rest of the data (the other file systems) stays intact, minimizing
data loss; specific file systems can be mounted read-only, or with the execution of setuid files
disabled (thus enhancing security); performance may be enhanced due to less disk head travel.
However, the disadvantage of subdividing the drive into fixed-size partitions is that a file
system in one partition may become full, even though other file systems still have plenty of
usable space.
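
One way to see how full each file system is, in the spirit of the df command, is to ask the kernel with statvfs. The sketch below is only an illustration: the function name space_report is invented here, and the list of mount points is an assumption that may not match your system.

```python
import os


def space_report(mount_point):
    """Return (total_bytes, available_bytes) for the file system holding mount_point."""
    info = os.statvfs(mount_point)
    total = info.f_blocks * info.f_frsize      # size of the whole file system
    available = info.f_bavail * info.f_frsize  # space a non-root user may still use
    return total, available


if __name__ == "__main__":
    # Assumed mount points; directories that do not exist are skipped.
    for directory in ("/", "/tmp", "/usr", "/var"):
        if os.path.isdir(directory):
            total, available = space_report(directory)
            print(directory, total, available)
```

If two of these directories live in the same file system they report identical figures, which is a quick way to tell whether they were given partitions of their own.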

A good partitioning scheme requires the user to predict how much space each partition will need,
which may be a difficult task, especially for new users. Logical Volume Management, often used
in servers, increases flexibility by allowing data in volumes to expand onto separate physical
disks (which can be added when needed); another option is to resize existing partitions when
necessary.

File systems:

The Unix file system (often also written as filesystem) is a method of storing and organizing
computer files and the data they contain to make it easy to find and access them. Unix file
systems usually use a data storage device such as a hard disk or CD-ROM and maintain the
physical location of the files on it. They might instead provide access to data on a file server
by acting as clients for a network protocol (e.g., NFS or SMB clients), or they may
be virtual and exist only as an access method for virtual data (e.g., procfs). A file system is
distinct from a directory service or a registry. It is the file system's job to remember
where you stored your files and to be able to retrieve them for you on demand. At HMH there are
several file systems in use, including NAS or Network Attached Storage, which uses a protocol
called NFS or Network File System.

Each disk-based file system is normally stored in its own disk partition.

Directories:

Notes: http://en.wikipedia.org/wiki/Unix_directory_structure

In Unix-like operating systems, the Unix directory structure is a convention of organization within a file system.

To use the analogy of a physical file cabinet: if the separate drawers of the cabinet represent
the highest level of sub-directories in the file system, then the room the cabinet is in
represents the root directory.

The directory structure is hierarchical: it begins with the root file system and extends downward using the forward slash "/" as the delimiter in the path name. The further down you go, the more slashes are used.

Directories can contain files or other directories called sub-directories.

Sub directories:

Sub directories are directories under the root (/) directory or under other directories below that level. In the root file system, directories can be created or renamed only by the system administrator. Normal non-privileged users cannot create directories directly under the root file system but are usually assigned to a lower directory in the structure such as /home or /export/home. A directory is then created under one of those directories, usually named after your login name. For example, a person named Andy Johnson would be assigned the username johnsona and given a "home" directory of /home/johnsona or /export/home/johnsona. This is considered your "home" directory and is where you land whenever you log in. It is normally where all the files you create are stored and is also where your environment files live.
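
A path name can be taken apart to show the hierarchy it describes. The short sketch below uses Python's pathlib on the example home directories from this section; the helper names are made up for illustration, and depth is only meaningful for absolute paths.

```python
from pathlib import PurePosixPath


def path_components(path):
    """Split an absolute path into the root and each directory level below it."""
    return PurePosixPath(path).parts


def depth(path):
    """Number of levels below the root directory (absolute paths only)."""
    return len(PurePosixPath(path).parts) - 1


if __name__ == "__main__":
    print(path_components("/home/johnsona"))
    print(depth("/export/home/johnsona"))
```

The first component is always the root directory itself, and each later component is one step further down the tree.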

Root directory or root file system:

The root file system is the primary file system on a Unix system. As the name implies, it is the file system on which the operating system is stored and which it uses for file storage. There is a special command called chroot which can change the root for a given login (job) session.

The root directory is the directory on Unix-like operating systems that contains all other directories and files on the system and which is designated by a forward slash ( / ).

The use of the word root in this context derives from the fact that this directory is at the very top of the directory tree diagram (which resembles an inverted tree) that is commonly used to represent a filesystem. Strictly speaking, there is only one root directory in your system, which is denoted by / (forward slash). It is the root of your entire file system and cannot be renamed or deleted.

/root:

On Linux (and some AT&T derived Unixes), there is also a directory which is named /root.
Confusingly, it is not a root directory in the sense of this article, but rather the home
directory of the Superuser login "root". We will talk about the root login in a later class.

Linux directory structure

Notes: http://en.wikipedia.org/wiki/Filesystem_Hierarchy_Standard

/ -->

   /bin  - Stands for "binaries"; contains some fundamental utilities needed by a system
         administrator. As a failsafe, these were placed in a separate directory so that they
         could be placed on a separate disk or disk partition in case the main drive failed.

   /sbin  - System binaries (historically often statically linked), also meant originally to
          be on a separate partition.

   /usr - Holds executables, libraries, and shared resources that are not system critical:
        X11, KDE, Perl, etc. (The name "Unix System Resources" is a post hoc backronym.)

   /boot - Usually a separate partition which contains the bootstrap files needed at boot time.

   /dev - short for devices. Contains file representations of every peripheral device attached 
        to the system.

   /etc - Contains configuration files and some system databases.

   /home - contains the home directories for the users. On Solaris this is usually in /export/home.

   /lib - This is the repository of all integral UNIX system libraries.

   /lost+found  - Each partition has its own lost+found directory. Its purpose, as its name
                implies, is to be a storage bin for files that become lost from their
                original directory. Only the system administrator needs to worry about
                this directory.

   /mnt - Temporarily mounted filesystems.

   /media - Mount points for removable media such as CD-ROMs and pen drives.

   /var - Short for "variable." A place for files that may change often, such as the
        contents of a database, log files (usually stored in /var/log), email stored
        on a server, etc.

   /opt - This originally meant optional software applications but has really come to
          mean any software installed that did not come with your Linux distribution,
          so as to avoid contention with file names or software patches being applied in the
          root file system. There are several schools of thought on this directory, and some
          system administrators (myself included) and some distributions use /usr/local for
          the same purpose.

   /proc - This is a special directory used by the kernel. Actually /proc is a virtual
         directory, because it does not really exist on disk. It contains information about
         the kernel itself. There are numbered entries that correspond to all processes
         running on the system, and there are also named entries that permit access to the
         current configuration of the system. Many of these entries can be viewed as text files.

   /root - The home directory for the superuser root. 

   /sys  - Modern Linux distributions include a /sys directory as a virtual filesystem (sysfs,
         comparable to /proc, which is a procfs), which exposes information about, and allows
         configuration of, the devices connected to the system.

   /tmp - A place for temporary files. Most Unix systems clear this directory upon start up. 
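
The numbered entries under /proc mentioned in the list above can be collected with a few lines of code. This is a sketch that assumes a Linux-style /proc; the function name numbered_entries and its proc_root parameter are invented for this example, and on a system without /proc it simply returns an empty list.

```python
import os


def numbered_entries(proc_root="/proc"):
    """Return the numeric entry names (process IDs) found under proc_root, sorted."""
    if not os.path.isdir(proc_root):
        return []  # this system has no such virtual directory
    return sorted(int(name) for name in os.listdir(proc_root) if name.isdecimal())


if __name__ == "__main__":
    print(numbered_entries())
```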

Solaris directory structure

Much like Linux above, but Solaris also contains some additional pseudo directories. These include /net, which is used by
the Automounter, an NFS based network file mounting system not unlike Microsoft Windows UNC drive linking. In Solaris
/sys is replaced by /system, and /platform is a hardware specific set of system libraries supporting certain hardware architectures.

Files:

File types

Every item in a UNIX file system can be defined as belonging to one of four possible types:

Ordinary files

Ordinary files can contain text, data, or program information. An ordinary file cannot contain
another file, or directory. An ordinary file can be thought of as a one-dimensional array of
bytes.

Directories

As previously mentioned, directories are containers that can hold files and other directories.
A directory is actually implemented as a file that has one entry for each item contained within
the directory. Each entry in a directory file contains only the name of the item and a numerical reference to the location of the item (its inode number).
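
The name-plus-number pairs that make up a directory can be listed directly. The sketch below uses Python's os.scandir; the numerical reference is commonly called the inode number, and directory_entries is just a name chosen for this example.

```python
import os


def directory_entries(path):
    """Return a sorted list of (name, inode number) pairs stored in a directory."""
    with os.scandir(path) as entries:
        return sorted((entry.name, entry.inode()) for entry in entries)


if __name__ == "__main__":
    for name, inode in directory_entries("/"):
        print(inode, name)
```

The ls -i command prints the same pairing of inode numbers and names.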

Special files

Special files represent input/output (i/o) devices, like a tty (terminal), a disk drive, or a
printer. As mentioned before Unix treats such devices like files.
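
The three types described so far can be told apart by asking the file system for an item's mode. This is a minimal sketch; the function name file_type and the labels it returns are chosen for this handout rather than taken from any standard tool.

```python
import os
import stat


def file_type(path):
    """Classify a path as an ordinary file, a directory or a special (device) file."""
    mode = os.stat(path).st_mode
    if stat.S_ISREG(mode):
        return "ordinary file"
    if stat.S_ISDIR(mode):
        return "directory"
    if stat.S_ISCHR(mode) or stat.S_ISBLK(mode):
        return "special file"
    return "other"


if __name__ == "__main__":
    for path in ("/etc/passwd", "/", "/dev/null"):
        if os.path.exists(path):
            print(path, "->", file_type(path))
```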

Links

A link is a pointer to another file. Think of links as aliases, or another name by which to locate a file.
Without getting too deep into this, there are two types of links in a Unix file system: symbolic
and hard. A hard link appears for all purposes to be the same as the file that
it references. Hard links can sometimes be difficult to locate because they share the same
identification (inode) as the file they reference, and for that reason they can be dangerous and
should generally be avoided. A file's data is not actually deleted while it is still being
referenced, and this includes references by hard links. Also for this reason hard links can only
exist on the same partition. Symbolic or "soft" links (also known as "symlinks"), on the other
hand, are merely pointers to another file and can be easily located, since their file type
appears as an "l" in directory listings.
Since symlinks are pointers (or guide posts) they can point into different file systems or
even partitions. Symlinks are handy when an application is expecting a directory to exist
in a certain path but, because of disk space limitations, a new file system was created to
accommodate the directory, or the directory had to be moved for some other reason. Think of a
symlink as a logical detour sign.
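
The difference between the two kinds of link can be observed with a small experiment. The sketch below works in a scratch directory; the file names and the function name link_demo are made up for illustration.

```python
import os
import tempfile


def link_demo(directory):
    """Create one hard link and one symlink to a file, then remove the original name."""
    original = os.path.join(directory, "original.txt")
    hard = os.path.join(directory, "hard.txt")
    soft = os.path.join(directory, "soft.txt")
    with open(original, "w") as handle:
        handle.write("hello\n")
    os.link(original, hard)      # second name for the same inode
    os.symlink(original, soft)   # small pointer file holding the path
    same_inode = os.stat(original).st_ino == os.stat(hard).st_ino
    link_count = os.stat(original).st_nlink
    os.remove(original)          # removes one name, not the data
    with open(hard) as handle:
        content = handle.read()
    # The symlink still exists but now points at a name that is gone.
    dangling = os.path.islink(soft) and not os.path.exists(soft)
    return {
        "same_inode": same_inode,
        "link_count": link_count,
        "content_via_hard_link": content,
        "symlink_dangling": dangling,
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        print(link_demo(scratch))
```

After the original name is removed the data is still reachable through the hard link, while the symlink is left pointing at nothing.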


Unix For Busy People was created by Kevin P. Inscoe and is licensed under a
Creative Commons Attribution 3.0 United States License.

Back to main web site - http://unixforbusypeople.com

Page last modified on February 09, 2010, at 02:38 PM EST