3 Management Functional Areas
Last update PvD

3.5 PM
Performance Management

Overview

  1. Standard
  2. Purposes
  3. Quality of Service
  4. Issues

Standard

[M.3400]:  Performance Management (PM) provides functions to evaluate and report the behavior of telecommunication equipment and the effectiveness of the network or network element.  Its role is to gather statistical data for the purpose of monitoring and correcting the behavior and effectiveness of the network, NE or equipment and to aid in planning and analysis.  As such, it is carrying out the performance measurement phase of Recommendation M.20.

A TMN collects Quality of Service (QoS) data from NEs and supports the improvements in QoS.  The TMN may request QoS data reports to be sent from the NE, or such a report may be sent automatically on a scheduled or threshold basis.  At any time, the TMN may modify the current schedule and/or thresholds.  Reports from the NE on QoS data may consist of raw data which is processed in the TMN, or the NE may be capable of carrying out analysis of this data before the report is sent.
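
As an illustration of such schedule- and threshold-based reporting, the following minimal Python sketch shows an NE-side collector that sends raw QoS data on a fixed schedule and immediately on a threshold crossing;  all names, the callback interface and the 15-minute default period are illustrative assumptions, not taken from any TMN standard:

    # Minimal sketch of scheduled and threshold-triggered QoS reporting by an NE.
    # Names, the callback interface and the default period are illustrative only.
    import time
    from typing import Callable, Dict, Optional


    class QosCollector:
        """Collects raw QoS counters and emits reports towards the TMN."""

        def __init__(self, send_report: Callable[[Dict], None],
                     report_interval_s: int = 900,
                     thresholds: Optional[Dict[str, float]] = None):
            self.send_report = send_report                # callback towards the TMN
            self.report_interval_s = report_interval_s    # scheduled reporting period
            self.thresholds = thresholds or {}            # e.g. {"bit_error_rate": 1e-6}
            self.counters: Dict[str, float] = {}
            self._last_report = time.monotonic()

        def set_threshold(self, name: str, value: float) -> None:
            """The TMN may modify thresholds (and, similarly, the schedule) at any time."""
            self.thresholds[name] = value

        def update(self, name: str, value: float) -> None:
            """Record a measurement; report on threshold crossing or when the schedule expires."""
            self.counters[name] = value
            limit = self.thresholds.get(name)
            if limit is not None and value > limit:
                self.send_report({"type": "threshold", "parameter": name,
                                  "value": value, "threshold": limit})
            if time.monotonic() - self._last_report >= self.report_interval_s:
                self.send_report({"type": "scheduled", "data": dict(self.counters)})
                self._last_report = time.monotonic()

Whether the NE sends raw data or pre-analysed results would only change what the send_report callback carries.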

Quality of Service includes monitoring and recording of parameters relating to:

In general, Performance Management must provide tools to perform the following tasks:

Performance monitoring
Performance monitoring involves the continuous collection of data concerning the performance of the NE.  Acute fault conditions will be detected by alarm surveillance methods.  Very low rate or intermittent error conditions in multiple equipment units may interact, resulting in poor service quality, and may not be detected by alarm surveillance.  Performance monitoring is designed to measure the overall quality, using monitored parameters, in order to detect such degradation.  It may also be designed to detect characteristic patterns before signal quality has dropped below an acceptable level.
The basic function of performance monitoring is to track system, network or service activities in order to gather the appropriate data for determining performance (a small sketch follows after this list of tasks).
Performance (management) control
Performance analysis
Performance data may require additional processing and analysis in order to evaluate the performance level of the entity.
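
A minimal sketch of the monitoring and analysis tasks above, assuming hypothetical names and 15-minute collection intervals:  raw per-interval counters are reduced to an error rate, and a crude trend check flags slow degradation that would not raise an acute alarm:

    # Sketch of performance monitoring/analysis: reduce raw per-interval counters
    # to an error rate and flag slow degradation; names and window are assumptions.
    from collections import deque


    class ErrorRateMonitor:
        def __init__(self, window: int = 96):              # e.g. 96 x 15 min = one day
            self.samples = deque(maxlen=window)

        def add_interval(self, errored_units: int, total_units: int) -> float:
            """Store one collection interval and return its error rate."""
            rate = errored_units / total_units if total_units else 0.0
            self.samples.append(rate)
            return rate

        def degrading(self, factor: float = 2.0) -> bool:
            """Crude trend check: recent rate well above the long-term average."""
            if len(self.samples) < 10:
                return False
            recent = sum(list(self.samples)[-5:]) / 5
            baseline = sum(self.samples) / len(self.samples)
            return baseline > 0 and recent > factor * baseline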

Performance Management usually concerns huge amounts of data.  Handling such amounts of data is a problem in itself.  The main point, however, is that data is not information;  it lacks purpose.  Specifying the purpose may allow reduction/aggregation of data at an earlier stage and/or at a lower level.


Purposes for Performance information

The main purposes for Performance Management are:

Capacity Management
Predict network usage (trend), predict performance (QoS problems) and adapt capacity (i.e. extend or reduce the network, modify allocations, priorities, etc).
Overcapacity leads to poor profitability, whereas insufficient capacity causes severe quality loss;  the right balance is important.  This is independent of any specific user, but related to the overall traffic on equipment, i.e. capacity and usage, and consequently to QoS.  It does require the collection and aggregation of quite a large amount of data from dispersed sources to a central destination.  The results are preferably presented as an 'SML view' of the network:  e.g. a traffic matrix.
Capacity Management includes real-time Performance Monitoring and Performance Control for what is commonly called Traffic Management.
Maintenance Surveillance, i.e. problem detection and prediction
Detect equipment malfunction through performance problems and predict problems by monitoring quality trends (e.g. noise levels, error rates, laser deterioration, etc).
Instead of central collection of performance data to check proper operation, one may compare values (typically 'rates', i.e. events per unit of time) against their thresholds at suitable places across the network;  exceeding such a threshold should signal a performance alarm:  the so-called Threshold Crossing Alarm (TCA), potentially realised through Leaky Bucket techniques (a sketch follows at the end of this subsection).  However, probably not all such monitoring can be implemented as threshold crossings;  the need for some collection remains, but no aggregation is required (i.e. basically a distributed application).
User service quality
This is specific to each service used by the user, potentially even to each service instance.  The relevant QoS parameters achieved for that service should be recorded, similar to the detailed billing record (potentially integrated with the CDR), in particular when they do not meet the standard and fines are probable {there is considerable overlap with AM}.
Specific issues are:

Typically, one requires very little information when everything is all right (i.e. a summary is sufficient);  when there are problems, however, nearly 'all' available data is wanted:  one needs the capability to 'drill down', i.e. to request more detailed information in specific areas.
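
As referred to under Maintenance Surveillance, one possible leaky-bucket realisation of a Threshold Crossing Alarm is sketched below;  the class name and the parameter values are purely illustrative:

    # One possible leaky-bucket realisation of a Threshold Crossing Alarm (TCA).
    # Parameters are illustrative assumptions.
    class LeakyBucketTca:
        """Raises a performance alarm when the error rate persistently exceeds
        the leak rate; isolated errors drain away unnoticed."""

        def __init__(self, leak_per_interval: float, alarm_level: float):
            self.leak = leak_per_interval     # tolerated errors per interval
            self.alarm_level = alarm_level    # bucket fill that triggers the TCA
            self.fill = 0.0

        def interval(self, errors: int) -> bool:
            """Feed the error count of one interval; return True on a TCA."""
            self.fill = max(0.0, self.fill + errors - self.leak)
            return self.fill >= self.alarm_level


    # Example: tolerate on average 2 errors per interval; alarm when the backlog
    # corresponds to roughly 10 excess errors.
    tca = LeakyBucketTca(leak_per_interval=2, alarm_level=10)
    for count in [0, 1, 5, 6, 7, 4, 3]:
        if tca.interval(count):
            print("Threshold Crossing Alarm")
            break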


Quality of Service

The performance of a system can best be judged along the (ITU-T) recommendations for Quality of Service (QoS):  per type of request (service/action), assess:

Dependability
against outages;  commonly called 'availability';
Accuracy
against errors;  typically 'error rates';
Speed
delay and throughput, potentially synchronisation:  wander and jitter (delay variation);

for the (connection) phases

of a telecom service. {See [Q.822] ?}.
The above parameters are basically independent, but sometimes related by definition.  E.g. for transmission, when the bit errors ('accuracy') exceed a certain level over a certain interval, the connection is assumed to be out of service ('dependability').  That makes sense, but it complicates the processing and reporting of such values.
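
As an illustration of such a relation by definition, the sketch below is loosely modelled on the common rule that a run of consecutive severely errored seconds puts a path into the unavailable state;  the run length of 10 and the simplified exit condition are assumptions for illustration only:

    # Illustration of 'accuracy' feeding into 'dependability': a run of
    # `run_length` consecutive severely errored seconds starts an outage.
    from typing import List


    def unavailable_seconds(severely_errored: List[bool], run_length: int = 10) -> int:
        """Count seconds spent in the unavailable ('out-of-service') state."""
        unavailable = 0
        run = 0
        in_outage = False
        for ses in severely_errored:
            run = run + 1 if ses else 0
            if not in_outage and run >= run_length:
                in_outage = True
                unavailable += run          # the whole run counts as unavailable
            elif in_outage:
                if ses:
                    unavailable += 1
                else:
                    in_outage = False       # simplified: one clean second ends the outage
        return unavailable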

Typical QoS parameters are:

The important thing is that the relevant QoS parameters are different for each service, or even for each (service) application.  E.g. for voice and video, the bit error rate can be quite high before it is even noticed, whereas data transport is rather vulnerable to bit errors.  Some services are extremely sensitive to delay variation (e.g. jitter for voice), whereas for others (data transport) this is irrelevant.
It implies that a service is qualified by its QoS parameters. When the service is used for various purposes/­applications, multiple service grades should be offered.  It is not only the level of a particular QoS parameter which varies with the service (e.g. availability 99.9% versus 99%);  new QoS parameters may appear and others may disappear:  it's a different set of QoS parameters, and it leads to distinct service classes (typically a standard class, a premium class and customised classes).  In formalised form it becomes a contract:  a Service Level Agreement (SLA) or a Service Level Guarantee (SLG).
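
A small sketch of how such distinct service classes could be captured as different sets of QoS parameters rather than merely different levels;  all class names and figures are invented for illustration:

    # Sketch: distinct service classes as different *sets* of QoS parameters,
    # not just different levels.  All class names and figures are invented.
    standard_class = {
        "availability_pct": 99.0,
        "bit_error_rate_max": 1e-6,
    }

    premium_voice_class = {
        "availability_pct": 99.9,
        "one_way_delay_ms_max": 150,
        "jitter_ms_max": 30,          # delay variation matters for voice ...
        # ... while a bit-error-rate bound is of little interest here
    }

    premium_data_class = {
        "availability_pct": 99.9,
        "bit_error_rate_max": 1e-9,   # data transport is sensitive to bit errors
        # ... while jitter is irrelevant and therefore not part of the contract
    }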

This view has a serious impact on Performance Management and, as a consequence, on the network and its management.  When the distinct services are expressed in their QoS parameters, these should be assured by the network and measured by Performance Management.  It implies that network capacity planning should use those QoS parameters to size the network, and that network management should be capable of measuring and reporting them.  It requires a performance model of (each of the components in) the network to predict service performance {and SLA guarding}.  It is likely to lead to grouping/allocating network resources for specific service classes.
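
One classical example of such a performance model, shown purely as an illustration of sizing a resource against a QoS target, is the Erlang B formula, which relates offered traffic and the number of circuits to the call-blocking probability;  real networks require a model per technology and per service class:

    # Erlang B: blocking probability for `servers` circuits offered `traffic` erlang,
    # using the standard iterative recursion; shown only as a sizing illustration.
    def erlang_b(traffic: float, servers: int) -> float:
        b = 1.0
        for n in range(1, servers + 1):
            b = traffic * b / (n + traffic * b)
        return b


    def circuits_needed(traffic: float, max_blocking: float) -> int:
        """Smallest number of circuits meeting the blocking (QoS) target."""
        n = 1
        while erlang_b(traffic, n) > max_blocking:
            n += 1
        return n


    print(circuits_needed(20.0, 0.01))      # e.g. 20 erlang at <= 1% blocking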

Note that the above-mentioned QoS parameters are only some parameters in basic form;  one may provide them in a more elaborate form.  For example, the parameter dependability (or availability) can be presented as a single value (e.g. 99.73%), but also in a more elaborate form like 'total outage', 'number of outages', 'average outage', 'longest outage'.
Presentation is important for perception and appreciation.
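
A sketch of deriving both the single availability figure and the more elaborate presentation from a list of outage durations;  the field names and example figures are assumptions:

    # Derive both the single availability percentage and the more elaborate
    # presentation from a list of outage durations (minutes) in a reporting period.
    from typing import Dict, List


    def availability_report(outage_minutes: List[float], period_minutes: float) -> Dict:
        total = sum(outage_minutes)
        return {
            "availability_pct": 100.0 * (1 - total / period_minutes),
            "total_outage_min": total,
            "number_of_outages": len(outage_minutes),
            "average_outage_min": total / len(outage_minutes) if outage_minutes else 0.0,
            "longest_outage_min": max(outage_minutes, default=0.0),
        }


    # One 30-day month with three outages: ~99.73% available overall, but the
    # breakdown reveals a single outage of a full hour.
    print(availability_report([12.0, 45.0, 60.0], period_minutes=30 * 24 * 60))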


Issues

Performance management requires gathering of statistics, maintenance and examination of logs, and adaptation of equipment and system behavior.  It concerns a huge amount of data;  intelligent reduction is paramount.  This assumes purpose and performance models.
The challenges are:

Modelling
Reporting
Acquisition
Aggregation of data into information

From the above, it will be clear that there is no simple 'final solution'.  Apart from the current non-ideal starting point, requirements will evolve and the implementation will have to adapt.

The basic cycle is:  measure (& collect & aggregate) → analyse/­deduce → plan → adapt.

{Figure:  PM Planning & Evaluation cycle}
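
A bare skeleton of this cycle, with the four phases as hypothetical callables;  the 15-minute period is an arbitrary assumption:

    # Skeleton of the basic PM cycle: measure -> analyse/deduce -> plan -> adapt.
    # The four phase functions are hypothetical placeholders.
    import time


    def pm_cycle(measure, analyse, plan, adapt, period_s: float = 900.0):
        while True:
            data = measure()            # collect & aggregate raw counters
            findings = analyse(data)    # deduce utilisation / QoS problems
            actions = plan(findings)    # decide on capacity or control changes
            adapt(actions)              # apply to the network, or to the metrics themselves
            time.sleep(period_s)

The same skeleton serves both levels of abstraction mentioned below:  adapt may act on the network or on the set of monitored metrics.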

Note however that this cycle is applicable at two levels of abstraction:

  1. for Performance Management (adapt network, i.e. improve utilisation:  capacity/­quality management);  and
  2. for determining relevant performance indicators (adapt metrics:  i.e. improve Performance Management).

Performance management gives good insight into the operation of the system and supplies clues for performance enhancements.  The handling of overload conditions, for both the managed network and the management system, is a Performance Management issue (it may cause a 'Performance Alarm', which is forwarded to Fault Management);  it is a very treacherous subject, as it is difficult to exercise (traffic control) commands in an overloaded system (but essential for survival).


Further references:

SLA creation
Some simple rules for the creation of SLAs.
Capacity Management process cycles
More information on network planning and traffic control.
