An analysis of the I/O characteristics of
OpenVMS® languages
This document contains PROPRIETARY and CONFIDENTIAL information and such information may not be disclosed to others for any purpose without written permission from Touch Technologies, Inc.
Touch Technologies, Inc. (TTI) has prepared this publication for use by TTI personnel, licensees, and customers. This information is protected by copyright. No part of this document may be photocopied, reproduced or translated to another language without prior written consent of Touch Technologies Incorporated.
TTI believes the information described in this publication is accurate and reliable; much care has been taken in its preparation. However, no responsibility, financial or otherwise, is accepted for any consequences arising out of the use of this material.
The information contained herein is subject to change without notice and should not be construed as a commitment by Touch Technologies, Inc.
The following are trademarks of Touch Technologies, Inc., and may be used only to describe products of Touch Technologies, Inc.:
DYNAMIC TAPE ACCELERATOR, INTOUCH 4GL, CleanDisk, REMOTE TAPE FACILITY, DYNAMIC LOAD BALANCER
The following are trademarks of Digital Equipment Corporation, and may be used only to describe products of Digital Equipment Corporation:
DBMS, DCL, DECNET, RDB, RMS, OpenVMS, VMS
Traditional third-generation languages are very I/O intensive. As the number of users accessing data files increases, the I/O system becomes a severe bottleneck.
Purpose
The purpose of this manual is to compare how well different languages handle I/O and how that affects performance. The languages compared include a number of traditional OpenVMS/VMS 3GLs (COBOL, FORTRAN, ...) as well as INTOUCH, a high-performance 4GL.
In addition, possible solutions to I/O bottlenecks will be discussed.
Topics to be covered include:
The QIO can be a direct I/O or a buffered I/O depending on the
device being accessed. I/Os to disk, tapes, etc. are direct I/Os.
I/Os to DECNET, terminals, etc. are buffered I/Os.
1.2.1 Direct I/Os
In a direct I/O, VMS accesses the user's buffer directly for the I/O operation. The device driver's START I/O routine takes the data in the user's buffer and submits it to the controller. The controller then submits the I/O request to the device.
The application cannot issue another I/O request until the first I/O operation is completed. Direct I/Os are designed for synchronous operations.
A synchronous I/O operation is when one I/O must complete before
another I/O request can be issued.
1.2.2 Buffered I/Os
A buffered I/O is when VMS makes a copy of the user's buffer. The copied buffer is allocated out of Non-paged dynamic memory (NPAGEDYN).
VMS uses the copied buffer in NPAGEDYN to complete the requested I/O operation. Thus, the application can issue a new I/O request to the user's buffer without having to wait for the first I/O to complete. Buffered I/Os are designed for asynchronous operations.
An asynchronous I/O operation is when a second I/O request can be submitted before the previous I/O request has been completed.
LANGUAGE I/O BENCHMARKS
Digital-provided traditional third generation languages (3GLs), by default, perform synchronous buffered I/Os when issuing screen write requests. A synchronous buffered I/O is a buffered I/O where the first I/O must complete before the next I/O can be issued.
Each time the application requests multiple screen writes, 3GLs perform a buffered I/O, wait for the I/O to complete, perform another buffered I/O, wait for the I/O to complete, and so on. The application cannot overlap I/Os.
Terminals perform buffered I/Os so that the application doesn't have to wait for an I/O to complete before issuing another I/O request. However, 3GLs, by default, wait for the I/O to complete, negating the advantage provided by buffered I/Os.
A multiple screen write request is issued such as:
print ... in BASIC
display .... in COBOL
print ... in FORTRAN
printf (...); in C
For each screen write request, a synchronous buffered I/O is
performed. If you have 1000 write requests, each request must be
completed before the next write request can be issued.
2.1.1 Building a Buffer
INTOUCH is a high-performance 4GL designed to reduce I/O overhead. INTOUCH does this by packetizing the I/O requests.
INTOUCH packetizes the I/O requests by building a buffer the size of the SYSGEN parameter MAXBUFCNT. The first screen write request is put in the buffer. If an additional screen write request is issued, INTOUCH puts this request in the buffer. When either the buffer is full, or 1/10 of a second has gone by, INTOUCH issues a SINGLE asynchronous buffered I/O request.
The effect of issuing one asynchronous buffered I/O for multiple
write requests is a reduction in QIOs. A reduction in QIOs
results in a reduction in CPU time and a reduction in elapsed time.
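The packetizing idea described above can be sketched in Python. This is only an illustration of coalescing many small writes into a few larger ones; it is not INTOUCH's implementation. The class name, the `sink` callback, the `flush_bytes` default, and the injectable `clock` are all assumptions made for the example, and the timer is checked only when a write arrives, which is a simplification.

```python
import time
from typing import Callable


class PacketizingWriter:
    """Coalesce many small write requests into a few larger ones.

    Hypothetical sketch: flush_bytes stands in for the buffer size and
    flush_interval for the 1/10-second limit described in the text.
    """

    def __init__(self, sink: Callable[[bytes], None], flush_bytes: int = 1200,
                 flush_interval: float = 0.1,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._sink = sink                  # called once per "real" I/O
        self._flush_bytes = flush_bytes
        self._flush_interval = flush_interval
        self._clock = clock
        self._pending = bytearray()
        self._first_write_at = 0.0
        self.io_count = 0                  # number of I/Os actually issued

    def write(self, data: bytes) -> None:
        if not self._pending:
            # Remember when the oldest buffered request arrived.
            self._first_write_at = self._clock()
        self._pending += data
        full = len(self._pending) >= self._flush_bytes
        stale = self._clock() - self._first_write_at >= self._flush_interval
        if full or stale:
            self.flush()

    def flush(self) -> None:
        # Issue one I/O for everything buffered so far.
        if self._pending:
            self._sink(bytes(self._pending))
            self._pending.clear()
            self.io_count += 1


def count_to(n: int, writer: PacketizingWriter) -> None:
    """Mimic the benchmark loop: write each number on its own line."""
    for i in range(1, n + 1):
        writer.write(f"{i}\r\n".encode("ascii"))
    writer.flush()


if __name__ == "__main__":
    chunks = []
    writer = PacketizingWriter(chunks.append, flush_bytes=2000)
    count_to(1000, writer)
    print(f"1000 write requests became {writer.io_count} I/O(s)")
```

The number of I/Os issued depends on the chosen buffer size and on timing, so the sketch makes no claim to reproduce the benchmark figures.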
2.1.2 Benchmark
A simple program was written to benchmark various programming languages' speed in performing multiple screen write requests. This program was written to count from 1 to 1000 and print each number to the screen. The following languages were used: BASIC, COBOL, FORTRAN, C, and INTOUCH.
The results of running this benchmark program are listed below. This test was run on an OpenVMS 3100.
Language | Buffered I/Os | CPU Time Seconds | Elapsed Seconds |
---|---|---|---|
BASIC | 1,000 | 2.76 | 8.00 |
FORTRAN | 1,003 | 2.68 | 8.00 |
COBOL | 1,002 | 2.87 | 8.00 |
C | 1,050 | 2.86 | 7.00 |
INTOUCH | 3 | .79 | 6.00 |
Because INTOUCH packetizes the screen I/O requests and sends the buffered I/Os asynchronously, the total number of buffered I/Os is significantly reduced. As a result, CPU time is also reduced significantly.
INTOUCH dramatically improves performance and removes bottlenecks
associated with screen I/O and interactive applications.
2.2 File I/Os
Digital-provided 3GLs, by default, use RMS to issue I/Os to data
files. The 3GL application issues an I/O request to RMS. RMS then
issues a QIO. The QIO is a direct I/O when accessing a data file.
2.2.1 Populating a Data File
There are many RMS options that can be activated when opening a data
file, but, by default, the traditional 3GLs use almost none of the
RMS options. The use of the RMS options is complex and often not
well understood.
2.2.2 Benchmark
A program was run to benchmark the number of direct I/Os that various programming languages use when populating (writing data to) a data file. The program populated an empty indexed data file by writing 1000 records.
The results of running the benchmark program are listed below. This test was run on an OpenVMS 3100.
Language | Direct I/Os | CPU Time Seconds | Elapsed Seconds |
---|---|---|---|
BASIC | 1,814 | 8.92 | 70.00 |
FORTRAN | 422 | 6.40 | 23.00 |
COBOL | 1,814 | 9.55 | 62.00 |
INTOUCH | 164 | 2.66 | 11.00 |
BASIC and COBOL both performed 1,814 direct I/Os, whereas FORTRAN performed only 422. This is because FORTRAN uses the RMS deferred I/O option when writing new data records. With deferred I/O, records are written to local data buffers first. Once the local data buffers are all full, they are written out to the data file, decreasing the number of direct I/Os.
INTOUCH performed only 164 direct I/Os. The results of the benchmark program show that the number of direct I/Os is lowest when using INTOUCH. INTOUCH also uses the RMS deferred I/O option, but INTOUCH packetizes the I/O requests. Packetized requests are written to the data file either when the buffers are all full or after one second has elapsed.
In addition, INTOUCH dynamically controls local and global data buffering to further reduce the direct I/Os to the data file.
INTOUCH, by default, optimizes I/O operations. The programmer does not have to know special programming to utilize this INTOUCH feature.
Because INTOUCH reduces the direct I/Os, both elapsed time and CPU
time are dramatically reduced. (In elapsed time, over twice as fast
as FORTRAN, and almost six times faster than COBOL!!)
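The ratios quoted above can be checked directly against the benchmark table. The short Python sketch below does that arithmetic; the dictionary and function names are illustrative, and the figures are copied from the table in 2.2.2.

```python
# Figures copied from the file-population benchmark table (1000 records).
elapsed_seconds = {"BASIC": 70.00, "FORTRAN": 23.00, "COBOL": 62.00, "INTOUCH": 11.00}
direct_ios = {"BASIC": 1814, "FORTRAN": 422, "COBOL": 1814, "INTOUCH": 164}


def ratio_to_intouch(measurements: dict) -> dict:
    """Return each 3GL's measurement divided by the INTOUCH measurement."""
    base = measurements["INTOUCH"]
    return {lang: value / base
            for lang, value in measurements.items() if lang != "INTOUCH"}


if __name__ == "__main__":
    for lang, ratio in ratio_to_intouch(elapsed_seconds).items():
        print(f"{lang}: {ratio:.2f}x the elapsed time of INTOUCH")
    for lang, ratio in ratio_to_intouch(direct_ios).items():
        print(f"{lang}: {ratio:.2f}x the direct I/Os of INTOUCH")
```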
2.2.4 Reading Data Records
When reading a data record in a sequential file, 3GLs, by default, read 16 blocks of data at a time.
INTOUCH dynamically adjusts the number of blocks read at one time to
significantly reduce the number of direct I/Os. In addition,
INTOUCH performs read ahead operations. The read ahead option tells
RMS to read a second buffer of data as the first one is being
processed by the application.
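To see why the number of blocks read per request matters, the following Python sketch counts how many read requests are needed to scan a file sequentially. The 720-block file size and the 72-block transfer size are hypothetical values chosen for illustration; they are not taken from the benchmark, and the sketch ignores any I/Os needed to open the file.

```python
def direct_reads(file_blocks: int, blocks_per_read: int) -> int:
    """Number of read requests needed to scan file_blocks sequentially."""
    if blocks_per_read <= 0:
        raise ValueError("blocks_per_read must be positive")
    # Ceiling division: a final partial transfer still costs one request.
    return -(-file_blocks // blocks_per_read)


if __name__ == "__main__":
    file_blocks = 720  # hypothetical file size, in disk blocks
    for blocks_per_read in (16, 72):
        count = direct_reads(file_blocks, blocks_per_read)
        print(f"{blocks_per_read} blocks per read: {count} read requests")
```

Read-ahead does not change this count; it overlaps the next read with processing of the current buffer, which affects elapsed time rather than the number of I/Os.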
2.2.5 Benchmark
A program was written to benchmark the number of direct I/Os each programming language uses when reading 1300 records from a sequential file. The results of running the benchmark program are listed below. This test was run on an OpenVMS 3100.
Language | Direct I/Os | CPU Time Seconds | Elapsed Seconds |
---|---|---|---|
BASIC | 47 | 4.06 | 7.0 |
FORTRAN | 44 | 5.56 | 9.0 |
COBOL | 45 | 4.14 | 6.0 |
INTOUCH | 10 | 3.01 | 6.0 |
Because INTOUCH dynamically adjusts the number of blocks read at one
time and does read aheads, the number of direct I/Os performed by
INTOUCH was only 10. The 3GLs had to perform over 40 direct I/Os
for the same operation. The CPU time required by INTOUCH was also
1.05 to 2.55 seconds LESS than that of the 3GLs.
2.2.6 Updating Data Records
When updating a data record, traditional 3GLs, by default, fetch the
data record, make the requested change, then write the record to
disk. This operation is done for each record.
2.2.7 Updating Data Record Example
A buffer of data is read: records A, B and C are read into the buffer.
--------------------------------
|    A    |    B    |    C     |
--------------------------------
Record A is updated. The traditional 3GLs then write the WHOLE buffer back out to disk, even though only record A has been changed.
Record B now needs to be updated. The traditional language reads 16 blocks of data from disk again if needed, updates record B, then rewrites the WHOLE buffer back out to disk.
INTOUCH packetizes the requested I/O updates. The packetizing of
the I/Os is internal to INTOUCH.
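The difference between rewriting the buffer after every record update and writing each changed buffer once can be sketched as a simple count. Since the packetizing is internal to INTOUCH, the Python below is only a model of the general idea, not a description of how INTOUCH does it. The function names and the figure of 100 records per buffer are assumptions.

```python
from typing import Iterable


def writes_per_record(updated_records: Iterable[int]) -> int:
    """One buffer rewrite for every updated record (the 3GL default above)."""
    return sum(1 for _ in updated_records)


def writes_per_dirty_buffer(updated_records: Iterable[int],
                            records_per_buffer: int) -> int:
    """One rewrite for each buffer that holds at least one updated record."""
    dirty_buffers = {record // records_per_buffer for record in updated_records}
    return len(dirty_buffers)


if __name__ == "__main__":
    records = range(1300)        # record numbers 0..1299, all updated
    records_per_buffer = 100     # hypothetical; depends on record and buffer size
    print("rewrite after every update:", writes_per_record(records))
    print("rewrite once per changed buffer:",
          writes_per_dirty_buffer(records, records_per_buffer))
```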
2.2.8 Benchmark
A program was written to benchmark the number of direct I/Os each programming language uses when updating 1300 data records in a sequential file. The results of running the benchmark program are listed below. This test was run on an OpenVMS 3100.
Neither FORTRAN nor C has language syntax that allows it to update data in a sequential file, so those languages were left out of this benchmark.
Language | Direct I/Os | CPU Time Seconds | Elapsed Seconds |
---|---|---|---|
BASIC | 1,429 | 9.12 | 60.00 |
COBOL | 1,428 | 8.50 | 59.00 |
INTOUCH | 20 | 6.46 | 8.00 |
Because INTOUCH packetizes the I/O requests, the number of direct I/Os is dramatically reduced. INTOUCH performed only 20 direct I/Os as opposed to the other languages performing OVER 1400 direct I/Os. CPU time and elapsed time are also significantly lower.
If you have a 3GL, you can reduce the I/O overhead by using the Digital provided RMS options.
REDUCING FILE I/O BOTTLENECKS
There are steps that can be taken to reduce the I/O overhead. The steps are:
By analyzing file I/O operations, hot files (those with high I/O counts) are identified. Hot files consume valuable I/O resources. Once identified, hot files can be moved to your fastest disk devices. Files with high read/write ratios are excellent candidates for local and global data buffering, or can be moved to a RAM disk.
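Once per-file read and write counts have been collected, ranking the files is straightforward. The Python sketch below assumes the counts are already available; the file names and numbers are made up, and the 75% read threshold is the one this manual gives for global buffering candidates in section 3.5.2.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FileIO:
    name: str
    reads: int
    writes: int

    @property
    def total(self) -> int:
        return self.reads + self.writes

    @property
    def read_percent(self) -> float:
        return 100.0 * self.reads / self.total if self.total else 0.0


def hot_files(stats: List[FileIO], top_n: int = 3) -> List[FileIO]:
    """Files with the highest total I/O counts, busiest first."""
    return sorted(stats, key=lambda f: f.total, reverse=True)[:top_n]


def buffering_candidates(stats: List[FileIO],
                         min_read_percent: float = 75.0) -> List[FileIO]:
    """Files whose I/O is mostly reads, so buffering is likely to pay off."""
    return [f for f in stats if f.read_percent >= min_read_percent]


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    stats = [
        FileIO("SALES_MASTER.DAT", reads=9000, writes=1000),
        FileIO("ORDERS.DAT", reads=3000, writes=3000),
        FileIO("AUDIT.LOG", reads=100, writes=4000),
    ]
    for f in hot_files(stats, top_n=2):
        print(f"{f.name}: {f.total} I/Os, {f.read_percent:.0f}% reads")
    print("buffering candidates:", [f.name for f in buffering_candidates(stats)])
```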
Two major actions can be taken to reduce the I/O bottlenecks caused by files with high I/O counts:
Speeding up a file's I/O operations can be accomplished by moving the
file to a faster or less busy device or by moving the file across
multiple spindles (as in a shadow set). Both read and write
operations can be sped up using these methods.
3.3 Eliminating I/O Operations
Eliminating file I/O operations can be accomplished in a number of ways. Some of these ways include:
Method | Result |
---|---|
host based data caching | speeds up file reads |
RMS global buffering | speeds up file reads |
RMS file converts | speeds up both reads and writes |
RMS local buffering | speeds up both reads and writes |
disk defragmentation | speeds up both reads and writes |
file defragmentation | speeds up both reads and writes |
Note
Both RMS local buffering and global buffering can be requested for a specific file.
Host based data caching uses free memory for high-speed data caching.
I/O requests to the file are intercepted by the caching system. If
the I/O request is a write operation, the data is passed to the disk
device. No speed up occurs. If a read I/O request is intercepted and
the requested data is already in the memory data cache, the request is
satisfied with a very fast memory move. No actual I/O to the disk
occurs. Host based data caching systems are available from a number
of commercial software vendors.
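The behavior described above (writes pass through to disk, reads are satisfied from memory when possible) is that of a write-through read cache. The Python sketch below models it with a dictionary standing in for the disk. The class name, the least-recently-used eviction policy, and the capacity are assumptions; commercial caching products may work differently.

```python
from collections import OrderedDict


class WriteThroughReadCache:
    """Toy model of a host-based data cache sitting in front of a disk."""

    def __init__(self, disk: dict, capacity: int) -> None:
        self._disk = disk              # block number -> data (stand-in for the disk)
        self._capacity = capacity      # maximum number of cached blocks
        self._cache = OrderedDict()    # least recently used block first
        self.hits = 0
        self.disk_reads = 0
        self.disk_writes = 0

    def read(self, block: int):
        if block in self._cache:
            # Satisfied by a memory move; no disk I/O occurs.
            self.hits += 1
            self._cache.move_to_end(block)
            return self._cache[block]
        self.disk_reads += 1
        data = self._disk[block]
        self._cache[block] = data
        if len(self._cache) > self._capacity:
            self._cache.popitem(last=False)   # evict least recently used
        return data

    def write(self, block: int, data) -> None:
        # Writes always go to the disk, so they are not sped up.
        self.disk_writes += 1
        self._disk[block] = data
        if block in self._cache:
            self._cache[block] = data         # keep the cached copy current


if __name__ == "__main__":
    disk = {n: f"block {n}" for n in range(10)}
    cache = WriteThroughReadCache(disk, capacity=4)
    for block in (1, 2, 1, 1, 3, 2):
        cache.read(block)
    print(f"hits={cache.hits} disk_reads={cache.disk_reads}")
```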
3.5 RMS Buffering
RMS moves data from the disk into memory buffers. From the buffers, data is moved into the application program. Whenever the requested data cannot be found in a data buffer, RMS must access the disk to find the data. Accessing the disk is much slower than getting information from a data buffer.
RMS provides two types of file data buffers: local data buffers and global data buffers.
Local data buffers are not shared among processes. Local buffers can only be accessed by the process that they were created for. When RMS opens an indexed file, by default it creates two local data buffers.
Global data buffers are shared among processes. Global buffers can be accessed by all processes that have the file open. By default RMS does not create any global data buffers.
File I/Os can be reduced using either or both of these buffering
methods. However, increased buffering requires additional system
resources. To avoid running out of system resources, both SYSGEN
and AUTHORIZATION (SYSUAF) parameter changes are needed.
3.5.1 RMS Local Buffering
RMS indexed files with high file I/O counts can benefit from increased local buffering. As the number of local buffers is increased, more I/O requests can be satisfied from the local buffer cache in memory, reducing the number of disk I/Os. In some cases, even write requests can be sped up using local buffering (for deferred write operations).
The number of local buffers used by RMS indexed files can be set on either a per-process or system-wide basis. In either case, the Digital provided SET RMS command is used to specify the number of local buffers.
For example, to set the number of local buffers used for indexed files for ALL users on the system to eight, the following DCL command is used:
$ SET RMS/SYSTEM/INDEX/BUFFER=8
To set the number of local buffers used for indexed files for JUST THIS PROCESS to ten, the following DCL command is used:
$ SET RMS/INDEX/BUFFER=10
The SET RMS command takes effect the next time a file is opened.
3.5.2 RMS Global Buffering
RMS based hot files with high read I/O percentages (75% or greater) can benefit from increased global buffering. As the number of global buffers is increased, more read I/O requests can be satisfied from the global buffer cache. Write requests are written directly to the disk and are not sped up by global buffering.
To specify the number of global buffers to be used on a file, the file must be closed. To set the number of global buffers on file MYFILE.DAT to thirty, the following DCL command is used:
$ SET FILE myfile.dat/GLOBAL=30
After the global buffers are set up on a cluster, the global buffers
are created the first time the file is accessed cluster-wide.
3.5.3 An RMS Global Buffering Example
Global buffering uses address space and may use physical pages off
the free list. For example, suppose you have a file that has 2 buckets
and you set up 30 global buffers. When the file is first accessed, 60
pages of address space are allocated (2 buckets x 30 global buffers)
to the user's process. The number of physical pages allocated for
the first accessor to the file can be from 0 to 60 pages, depending
on what the user is doing.
The second accessor to the file would not use any additional
physical pages because global buffers are shared among processes.
The second accessor would, however, have 60 pages of address space
allocated to their process.
So, for each user accessing the file, an additional 60 pages of
address space are allocated. However, no additional physical memory
pages are used, because those are shared.
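The arithmetic in this example can be written out as a short Python sketch. The function names are illustrative; the factor of 2 and the 30 global buffers are the values used in the example above, and the physical page figure is an upper bound, since the text notes that actual usage can be anywhere from 0 to 60 pages.

```python
def pages_per_accessor(pages_per_buffer: int = 2, global_buffers: int = 30) -> int:
    """Address space mapped into each process that opens the file."""
    return pages_per_buffer * global_buffers


def footprint(accessors: int, pages_per_buffer: int = 2,
              global_buffers: int = 30) -> dict:
    """Total address space grows per accessor; physical pages are shared."""
    per_process = pages_per_accessor(pages_per_buffer, global_buffers)
    return {
        "address_space_pages": accessors * per_process,
        # Upper bound only: one shared copy, however many accessors there are.
        "max_physical_pages": per_process if accessors > 0 else 0,
    }


if __name__ == "__main__":
    for accessors in (1, 2, 10):
        print(accessors, footprint(accessors))
```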
3.5.4 Monitoring RMS Cache Hits
VMS versions 5.0 and higher provide a utility for monitoring RMS
buffer caching activity.
3.5.5 Statistics Option
To perform RMS monitoring, the file to be monitored must first have the statistics option set. The statistics option takes up a small amount of space in the file header. However, there is no overhead in collecting statistics because VMS always collects this data. The statistics option just allows the user to display the data.
In order to SET the statistics option on a file, the file must be closed. To set statistics on the file MYFILE.DAT, the following DCL command is used:
$ SET FILE myfile.dat/STATISTICS
After the statistics option has been set on the file, the following MONITOR command is used:
$ MONITOR RMS/FILE=myfile.dat/ITEM=CAC
The Digital provided MONITOR RMS utility provides both LOCAL and GLOBAL buffer caching information. The higher the cache hit percent shown in the display, the better the I/O performance of the file.
OpenVMS/VMS Monitor Utility
RMS CACHE STATISTICS on node TTI
1-DEC-1989 21:52:11
(Index) SALES_MASTER.DAT;1
Active Streams: 2
Statistic | CUR | AVE | MIN | MAX |
---|---|---|---|---|
Local Cache Hit Percent | 37.00 | 36.65 | 0.00 | 40.00 |
Local Cache Attempt Rate | 51.16 | 5.53 | 0.00 | 51.16 |
Global Cache Hit Percent | 57.00 | 57.02 | 0.00 | 100.00 |
Global Cache Attempt Rate | 31.89 | 3.50 | 0.00 | 31.89 |
Global Buf Read I/O Rate | 13.95 | 1.48 | 0.00 | 13.95 |
Global Buf Write I/O Rate | 0.00 | 0.00 | 0.00 | 0.00 |
Local Buf Read I/O Rate | 0.00 | 0.02 | 0.00 | 0.33 |
Local Buf Write I/O Rate | 0.00 | 0.00 | 0.00 | 0.00 |
If only global buffers are set, the Local Cache Hit Percent will be
zero, because VMS looks in the local buffers before looking in the
global buffers. If the requested data is not in the local buffers,
the global buffers are searched for the data. If the data is not in
the global buffers, VMS gets the data from disk. VMS then puts the
data in a global buffer, since the global buffers were the last place
VMS checked for the data.
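The lookup order described above (local buffers, then global buffers, then disk) can be modeled in a few lines of Python. This is a simplified sketch: the buffers are unbounded dictionaries, the counter names are invented, and the hit percent is assumed to be hits divided by attempts, which may not match exactly how MONITOR computes its figures.

```python
from typing import Dict


def new_stats() -> Dict[str, int]:
    return {"local_attempts": 0, "local_hits": 0,
            "global_attempts": 0, "global_hits": 0, "disk_reads": 0}


def fetch(bucket: int, local: dict, global_: dict, disk: dict,
          stats: Dict[str, int]):
    """Look in the local buffers, then the global buffers, then on disk."""
    stats["local_attempts"] += 1
    if bucket in local:
        stats["local_hits"] += 1
        return local[bucket]
    stats["global_attempts"] += 1
    if bucket in global_:
        stats["global_hits"] += 1
        return global_[bucket]
    stats["disk_reads"] += 1
    data = disk[bucket]
    global_[bucket] = data   # the global buffers were the last place checked
    return data


def hit_percent(hits: int, attempts: int) -> float:
    return 100.0 * hits / attempts if attempts else 0.0


if __name__ == "__main__":
    disk = {1: "bucket 1", 2: "bucket 2"}
    local, global_ = {}, {}          # only global buffers ever get filled
    stats = new_stats()
    for bucket in (1, 2, 1, 1):
        fetch(bucket, local, global_, disk, stats)
    print("local hit %:", hit_percent(stats["local_hits"], stats["local_attempts"]))
    print("global hit %:", hit_percent(stats["global_hits"], stats["global_attempts"]))
```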
3.5.6 SYSGEN Parameter Changes
RMS global buffering requires increased use of VMS global pages and global sections. In addition, some RMS related SYSGEN parameters must be changed. The following MINIMUM SYSGEN parameter values are recommended when global buffering is specified:
SYSGEN Parameter Name | Minimum Value |
---|---|
GBLPAGFIL | 16384 |
RMS_GBLBUFQUO | 16384 |
GBLPAGES | 50000 |
GBLSECTIONS | 800 |
Both RMS local buffering and global buffering require increased use of VMS locking, address space and synchronization resources. The following MINIMUM SYSGEN parameter values are recommended when either local buffering or global buffering is specified:
SYSGEN Parameter Name | Minimum Value |
---|---|
IRPCOUNT | 500 |
LOCKIDTBL | 4000 |
LOCKIDTBL_MAX | 16000 |
PQL_MENQLM | 600 |
RESHASHTBL | 2500 |
SRPCOUNT | 4500 |
VIRTUALPAGECNT | 35000 |
PQL_MPGFLQUO | 35000 |
PQL_MBYTLM | 35000 |
To view the number of global sections and global pages used you can enter:
$ INSTALL:==$INSTALL/COMMAND
$ INSTALL LIST/GLOBAL/SUMMARY
Summary of Local Memory Global Sections
272 Global Sections Used, 21964/13036 Global Pages Used/Unused
The SYSGEN parameter GBLSECTIONS is the total number of global
sections.
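If the summary line is captured to a file or log, the figures can be pulled out and turned into a usage percentage. The Python sketch below parses the line format shown in the sample output above; the regular expression and function name are assumptions, and other VMS versions may format the line differently.

```python
import re

# Matches the summary line as it appears in the sample output above.
SUMMARY_RE = re.compile(
    r"(\d+) Global Sections Used,\s*(\d+)/(\d+) Global Pages Used/Unused"
)


def parse_global_summary(line: str) -> dict:
    """Extract section and page counts from an INSTALL summary line."""
    match = SUMMARY_RE.search(line)
    if match is None:
        raise ValueError("not a global section summary line")
    sections_used, pages_used, pages_unused = (int(g) for g in match.groups())
    pages_total = pages_used + pages_unused
    return {
        "sections_used": sections_used,
        "pages_used": pages_used,
        "pages_unused": pages_unused,
        "pages_total": pages_total,
        "pages_used_percent": 100.0 * pages_used / pages_total if pages_total else 0.0,
    }


if __name__ == "__main__":
    sample = "272 Global Sections Used, 21964/13036 Global Pages Used/Unused"
    info = parse_global_summary(sample)
    print(f"{info['pages_used']} of {info['pages_total']} global pages used "
          f"({info['pages_used_percent']:.1f}%)")
```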
3.6 Disk Defragmentation
Disk defragmentation is the process that causes files to become
physically contiguous. Contiguous files can be accessed with fewer
I/O operations than non-contiguous files. The two ways to defragment
a disk are to do a full BACKUP and RESTORE to the target disk or to
use a commercially available disk defragmentation product.
3.7 RMS File Conversion
As RMS based files are written to, they become internally fragmented and disorganized. Over time, both read and write operations cause extra physical I/O operations to the RMS file due to this fragmentation. The Digital provided CONVERT utility can be used to defragment and reorganize RMS files. To convert the file MYFILE.DAT, at the DCL prompt enter:
$ CONVERT myfile.dat myfile.new
$ RENAME myfile.new myfile.dat;     (note the trailing ";")
This two-step process safely converts and reorganizes an RMS file.
Note
If the CONVERT fails, DO NOT DO THE RENAME. THIS ENSURES THE INTEGRITY OF YOUR ORIGINAL UNCONVERTED FILE.
If you don't have the time to defragment all of your disks, you can instead defragment your most badly fragmented hot files one at a time.
VMS provides a way to defragment individual files. There are three steps to the defragmentation process:
A .FDL is a file definition language file. This file can be used with the Digital provided VMS CONVERT utility to defragment a file. To create a .FDL for the file MYFILE.DAT you would use the following DCL command:
$ ANALYZE/RMS/FDL MYFILE.DAT
The ANALYZE command creates a file called MYFILE.FDL. The .FDL is a
text file containing a description of MYFILE.DAT.
3.8.2 Customize the .FDL file
Using the text editor of your choice, edit the .FDL file and insert the text "best_try_contiguous yes" as shown:
FILE
    best_try_contiguous yes     <--- the inserted text
    ALLOCATION nnn
    ORGANIZATION xxx
    .
    .
    .
The Digital provided CONVERT utility can be used to defragment and reorganize your files using a .FDL. Any time you change an .FDL you need to do a convert. To convert and defragment the file MYFILE.DAT, at the DCL prompt enter:
$ CONVERT/FDL=myfile.fdl myfile.dat myfile.new
$ RENAME myfile.new myfile.dat;     (note the trailing ";")
Note
If the CONVERT fails, DO NOT DO THE RENAME. THIS ENSURES THE INTEGRITY OF YOUR ORIGINAL UNCONVERTED FILE.
Be sure to ALWAYS use the /FDL qualifier when doing a CONVERT. If the /FDL qualifier is not used, the CONVERT will eliminate the best_try_contiguous yes setting from the .FDL.