

# Improving the performance of execution time control by using a hardware Time Management Unit

Kristoffer Nyborg Gregertsen, Department of Engineering Cybernetics

Ada-Europe 2012, Stockholm, 2012-06-14



## Summary

- Ada 2012 brings execution time control for interrupt handling
- This makes low overhead even more important
- Designed a specialized Time Management Unit (TMU)
- Shown to significantly reduce execution time control overhead



## Outline

- Background and motivation
- Execution time control for interrupt handling
- Implementation of Ada 2012 execution time control
- Time Management Unit (TMU)
- Conclusion



## Background and motivation

- Scheduling analysis needs the worst-case execution time (WCET)
- WCET is hard to determine on modern architectures:
  - Deep pipelines
  - Branch prediction and speculative execution
  - Multi-level caches and DRAM refresh cycles
  - Multi-core with shared memory and coherent caches
- Reported 30-50% overestimation of WCET
- Also very pessimistic: real WCET >> average execution time (ET)
- Using WCET as budget $\implies$ low utilization
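
As a rough illustration of the last bullet, with purely hypothetical numbers that are not from the talk: take a periodic task with period $T = 20$ ms, an estimated WCET of $C_{\text{WCET}} = 10$ ms, and an average execution time of $\bar{C} = 2$ ms. Budgeting by WCET reserves

$$U_{\text{reserved}} = \frac{C_{\text{WCET}}}{T} = \frac{10}{20} = 0.5, \qquad U_{\text{average}} = \frac{\bar{C}}{T} = \frac{2}{20} = 0.1$$

so on average four fifths of the processor time reserved for this task would go unused by it.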



- Dynamic control, not just static analysis
- Mechanism:
  - Execution time measurement and monitoring
  - Handler called when timer expires
- Policy:
  - Task overrun handling
  - Execution time servers
  - Support for advanced scheduling policies...
- Still need some timing analysis to set budgets



- Package Ada.Execution\_Time:
  - Type CPU\_Time and function Clock
- Execution time monitoring for single tasks:
  - Child package Timers with tagged type Timer
  - Set with a protected handler that is called when the timer expires
- Execution time budgets for dynamic task groups:
  - Child package Group\_Budgets with tagged type Group\_Budget
- Ravenscar: no timers or group budgets



- Task execution time is defined as:
  - Time spent executing that task...
  - including services performed on behalf of the task
- Execution time of interrupt handlers:
  - Implementation-defined which task is charged
  - Implementations charge the interrupted task
  - Introduces inaccuracy in execution time measurement for tasks
  - Raised as an issue...
- Also applies to other languages, POSIX...
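
For readers more familiar with POSIX than Ada, the nearest counterpart to reading a task's execution time clock is the per-thread CPU-time clock. The C sketch below is only an illustration of that analogue, not code from the talk; the function name `thread_cpu_time_ns` is hypothetical. Whether interrupt handling time ends up in this value depends on the operating system, which is the accuracy concern raised above.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Returns the CPU time consumed so far by the calling thread, in
 * nanoseconds, using the POSIX per-thread CPU-time clock. */
int64_t thread_cpu_time_ns(void)
{
    struct timespec ts;
    int rc = clock_gettime(CLOCK_THREAD_CPUTIME_ID, &ts);

    assert(rc == 0);   /* the clock is expected to be available */
    (void)rc;          /* silence unused warning when NDEBUG is set */
    return (int64_t)ts.tv_sec * 1000000000LL + (int64_t)ts.tv_nsec;
}
```

Calling the function before and after a piece of work and subtracting the two readings gives the CPU time charged to the thread for that work.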



www.ntnu.no

- Is it right to charge the interrupted task?
- Separate execution time measurement:
  - Improves accuracy for tasks
  - Allows tighter task budgets
  - Testing and diagnostics...
- Full execution time control for interrupts:
  - Provide interrupt timers
  - Unexpectedly high interrupt rates
  - Bursts due to errors...
  - Design and usage errors...
- Low overhead is important!











## Interrupt handling – reality

[Figure: interrupt handling timeline. Labels: Task, LL handler, Handler, Task clock, Interrupt clock, Overhead.]



- Discussed at IRTAW-14 in Portovenere, autumn 2009
- Total execution time for interrupt handling:
  - Rivas and González Harbour
  - Implemented for MaRTE OS
- Separate execution time measurement for interrupts:
  - Gregertsen and Skavhaug
  - Implemented on GNATforAVR32
  - Initially used interrupt priorities
  - Updated to use Interrupt\_Id after discussion
- Workshop forwarded both proposals
- Now in the draft ISO standard for Ada 2012!



```
package Ada.Execution_Time is
   --  Existing declarations (CPU_Time, Clock, ...) are not shown.

   Interrupt_Clocks_Supported : constant Boolean :=
     implementation-defined;
   Separate_Interrupt_Clocks_Supported : constant Boolean :=
     implementation-defined;

   function Clock_For_Interrupts return CPU_Time;

private
   --  Not specified by the language.
end Ada.Execution_Time;
```

NTNU – Trondheim Norwegian University of Science and Technology

```
with Ada.Interrupts;

package Ada.Execution_Time.Interrupts is

   --  Execution time charged to the given interrupt.
   function Clock (Interrupt : Ada.Interrupts.Interrupt_Id)
     return CPU_Time;

   --  True if a separate clock is kept for the given interrupt.
   function Supported (Interrupt : Ada.Interrupts.Interrupt_Id)
     return Boolean;

end Ada.Execution_Time.Interrupts;
```



## Interrupt timer proposal

```
with Ada.Execution_Time.Timers;

package Ada.Execution_Time.Interrupts.Timers is

   type Interrupt_Timer (I : Ada.Interrupts.Interrupt_Id)
     is new Ada.Execution_Time.Timers.Timer
       (Ada.Task_Identification.Null_Task_Id'Access)
     with private;

private
   --  Private part not shown.
end Ada.Execution_Time.Interrupts.Timers;
```

- Implemented in GNATforAVR32
- Not to be included in Ada 2012...



Kristoffer Nyborg Gregertsen, Execution time control using a Time Management Unit

## Atmel AVR32 UC3 series

- Atmel AVR32 architecture:
  - 32-bit RISC
  - Efficient ISA
  - 4 interrupt levels
  - Atmel Norway
- UC3 microcontroller series:
  - Second implementation
  - Embedded control applications
  - Integrated SRAM
  - 16 to 64 KB SRAM
  - Up to 60 MHz



## GNATforAVR32

- GNU Ada Compiler (GNAT) for AVR32 architecture:
  - GNU Compiler Collection (GCC)
  - GNAT front-end $\rightarrow$ AVR32 back-end
- Bare-board Ravenscar run-time environment:
  - Open Ravenscar Kernel by UPM
  - Used by ESA's LEON space application processor
  - Real-time kernel integrated with GNARL
  - Ported to UC3 microcontroller series
- Small code size and low memory requirements



## Ada 2012 implementation

- Similarities between the real-time clock (RTC) and execution time clocks:
  - Same clock and alarm abstraction
  - Use the COUNT / COMPARE timer for both clocks
  - Reset and reprogram on clock change
  - Tick-less clocks
- Interrupt handling:
  - A registered handler is allocated a clock from a pool
  - Change clock before calling handler
  - Store interrupted clock on stack
- Low overhead, but can it be reduced further?
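
The "reset and reprogram on clock change" scheme above can be sketched in C against a simulated timer. This is a minimal model under stated assumptions, not the GNATforAVR32 run-time code: the names `hw_timer`, `et_clock`, `clock_switch`, `interrupt_enter` and `interrupt_leave` are hypothetical, the timer is a plain struct rather than real hardware, and the 32-bit COUNT / COMPARE width is taken from the conclusion slide.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NO_ALARM UINT64_MAX

/* Simulated 32-bit hardware timer (stands in for COUNT / COMPARE). */
typedef struct { uint32_t count, compare; } hw_timer;

/* One execution time clock, for a task or for an interrupt. */
typedef struct {
    uint64_t base;   /* time accumulated up to the last clock change */
    uint64_t alarm;  /* absolute expiry time, or NO_ALARM */
} et_clock;

static hw_timer timer;
static et_clock *active;   /* clock currently being charged */

/* Reprogram COMPARE for the time left until the clock's alarm.
 * If that does not fit in 32 bits, COMPARE saturates and the alarm
 * handler would have to reprogram it again later. */
static void program_compare(const et_clock *c)
{
    uint64_t left = (c->alarm > c->base) ? c->alarm - c->base : 0;
    timer.compare = (left > UINT32_MAX) ? UINT32_MAX : (uint32_t)left;
}

/* Clock change: charge elapsed cycles, reset COUNT, reprogram COMPARE. */
void clock_switch(et_clock *next)
{
    assert(next != NULL);
    if (active != NULL)
        active->base += timer.count;
    timer.count = 0;
    active = next;
    program_compare(next);
}

/* Current value of a clock, including cycles not yet charged. */
uint64_t clock_read(const et_clock *c)
{
    return c->base + ((c == active) ? timer.count : 0);
}

/* Interrupt entry returns the interrupted clock; the caller keeps it
 * on its stack and hands it back on exit. */
et_clock *interrupt_enter(et_clock *irq_clock)
{
    et_clock *interrupted = active;
    clock_switch(irq_clock);
    return interrupted;
}

void interrupt_leave(et_clock *interrupted)
{
    clock_switch(interrupted);
}

/* Simulation helper: let the timer advance by n cycles. */
void simulate_cycles(uint32_t n) { timer.count += n; }
```

Every clock change in this model costs a read of COUNT, a reset, and a COMPARE computation in software, which is the overhead the last bullet asks about.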



## Time Management Unit (TMU)

- HW timer specialized for execution time control:
  - 64-bit COUNT / COMPARE registers
  - Interrupt line asserted when $\text{COUNT} \ge \text{COMPARE}$
  - Atomic swapping of COUNT / COMPARE values
  - Swap triggered by a write to the final swap register
- Memory-mapped interface:
  - Portable to different architectures
  - Easy to use, no special instructions
- Functional specification in SystemC
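
The behaviour listed above can be captured in a small C model. It is a sketch under assumptions, not the SystemC specification: the names `tmu_model`, `tmu_context_switch` and friends are hypothetical, and the model assumes that after a swap the staging registers hold the previous COUNT / COMPARE values so that software can save them.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint64_t compare, count;            /* active COMPARE / COUNT */
    uint64_t swap_compare, swap_count;  /* staging (swap) registers */
} tmu_model;

/* Reset: COMPARE registers all ones, COUNT registers zero. */
void tmu_reset(tmu_model *t)
{
    assert(t != NULL);
    t->compare = t->swap_compare = UINT64_MAX;
    t->count = t->swap_count = 0;
}

/* Interrupt line is asserted when COUNT >= COMPARE. */
bool tmu_irq(const tmu_model *t) { return t->count >= t->compare; }

/* Simulation helper: COUNT advances with the clock. */
void tmu_tick(tmu_model *t, uint64_t cycles) { t->count += cycles; }

/* Exchange the active and staged values in one indivisible step. */
static void tmu_do_swap(tmu_model *t)
{
    uint64_t c = t->count, m = t->compare;
    t->count = t->swap_count;
    t->compare = t->swap_compare;
    t->swap_count = c;      /* assumption: old values become readable */
    t->swap_compare = m;
}

/* Clock change: stage the next clock's values, trigger the swap, and
 * read back the values of the clock that was running. On hardware the
 * swap is triggered by the write to the final swap register. */
void tmu_context_switch(tmu_model *t,
                        uint64_t next_count, uint64_t next_compare,
                        uint64_t *prev_count, uint64_t *prev_compare)
{
    assert(t != NULL && prev_count != NULL && prev_compare != NULL);
    t->swap_compare = next_compare;
    t->swap_count = next_count;
    tmu_do_swap(t);
    *prev_count = t->swap_count;
    *prev_compare = t->swap_compare;
}
```

In this model the software no longer computes the time remaining until an alarm on every clock change; it only moves two 64-bit values in and two out.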








## Memory map

| Offset | Register            | Reset state |
|--------|---------------------|-------------|
| 0x00   | TMU_COMPARE_HI      | 0xffffffff  |
| 0x04   | TMU_COMPARE_LO      | 0xffffffff  |
| 0x08   | TMU_COUNT_HI        | 0           |
| 0x0c   | TMU_COUNT_LO        | 0           |
| 0x10   | TMU_SWAP_COMPARE_HI | 0xffffffff  |
| 0x14   | TMU_SWAP_COMPARE_LO | 0xffffffff  |
| 0x18   | TMU_SWAP_COUNT_HI   | 0           |
| 0x1c   | TMU_SWAP_COUNT_LO   | 0           |
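
One way to express this memory map in C is a struct of 32-bit registers whose offsets are checked at compile time. This is a sketch under assumptions: the names `tmu_regs` and `tmu_read_count` are hypothetical, the base address is not given in the slides and is therefore left out, and the high-low-high read loop is a portable fallback for targets without an atomic 64-bit load (the talk notes that AVR32 can load 64-bit values directly).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Register block laid out as in the memory map table. */
typedef struct {
    volatile uint32_t compare_hi;       /* 0x00 TMU_COMPARE_HI      */
    volatile uint32_t compare_lo;       /* 0x04 TMU_COMPARE_LO      */
    volatile uint32_t count_hi;         /* 0x08 TMU_COUNT_HI        */
    volatile uint32_t count_lo;         /* 0x0c TMU_COUNT_LO        */
    volatile uint32_t swap_compare_hi;  /* 0x10 TMU_SWAP_COMPARE_HI */
    volatile uint32_t swap_compare_lo;  /* 0x14 TMU_SWAP_COMPARE_LO */
    volatile uint32_t swap_count_hi;    /* 0x18 TMU_SWAP_COUNT_HI   */
    volatile uint32_t swap_count_lo;    /* 0x1c TMU_SWAP_COUNT_LO   */
} tmu_regs;

/* Compile-time checks that the struct matches the table. */
_Static_assert(offsetof(tmu_regs, count_hi) == 0x08, "COUNT_HI offset");
_Static_assert(offsetof(tmu_regs, swap_count_lo) == 0x1c,
               "SWAP_COUNT_LO offset");
_Static_assert(sizeof(tmu_regs) == 0x20, "register block size");

/* Read the 64-bit COUNT with two 32-bit loads. The high word is read
 * again afterwards; if it changed, the low word wrapped between the
 * loads and the read is retried. */
uint64_t tmu_read_count(const tmu_regs *regs)
{
    uint32_t hi, lo;

    assert(regs != NULL);
    do {
        hi = regs->count_hi;
        lo = regs->count_lo;
    } while (hi != regs->count_hi);
    return ((uint64_t)hi << 32) | lo;
}
```

On real hardware a `tmu_regs` pointer would be set to the peripheral's base address; for testing on a host, an ordinary `tmu_regs` variable can stand in for the device.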



## TMU implementation for UC3

- Implemented for UC3 by a master's student:
  - High-speed bus $\rightarrow$ peripheral bus
  - Bound to peripheral bus clock for synchronous design
  - Interface like other AVR32 peripherals
  - Interrupt control registers
  - Disabled by default
- Main change is the move to the peripheral bus:
  - Increased access latency
  - Reduced predictability
- Possible to use the local CPU bus







## Ada 2012 implementation with TMU

- Take advantage of powerful AVR32 instructions:
  - Load / store 64-bit values
  - Atomic access to COUNT / COMPARE
  - Load / store several registers
  - Efficient swap operation
- Only a few changes needed in the run-time environment:
  - Interface to TMU
  - Clock interface $\rightarrow$ two HW clocks
  - Context switch
- Tested with synthesizable UC3 code



## Performance improvements

| Test              | Improvement (CPU cycles) | Reduction (%) |
|-------------------|--------------------------|---------------|
| Context switch    | 65                       | 54            |
| Interrupt handler | 30                       | 25            |
| Timing event      | 4                        | 4             |
| Interruption cost | 42                       | 21            |

- Compared to implementation without TMU
- Significant overhead reductions
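
The table gives each saving both in cycles and as a percentage, so the cost without the TMU can be estimated by dividing one by the other. This back-of-envelope reading assumes the Reduction column is the saving relative to the implementation without the TMU, and it is only as precise as the rounded percentages; the results are estimates, not figures reported in the talk:

$$C_{\text{without}} \approx \frac{\Delta C}{r}: \qquad \frac{65}{0.54} \approx 120, \quad \frac{30}{0.25} = 120, \quad \frac{4}{0.04} = 100, \quad \frac{42}{0.21} = 200 \ \text{cycles}$$

for context switch, interrupt handler, timing event and interruption cost respectively. Under the same assumption, the costs with the TMU would be roughly 55, 90, 96 and 158 cycles.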



## Conclusion

- Execution time control for interrupts in Ada 2012:
  - Total and separate execution time measurement
  - Low overhead is important!
- Implementation on GNATforAVR32:
  - Tick-less measurement with a 32-bit timer
  - Non-standard interrupt timer
  - Acceptable overhead, but could be reduced...
- Time Management Unit:
  - Specialized 64-bit timer for execution time control
  - Implemented and tested with AVR32 UC3
  - Significantly reduces overhead
