Efficiently Using C and C++

layout: true
name: blank
styling: styling.css
styling-by: Martin Weitzel

.stylehint[
Styled with [{{styling}}]({{styling}}) by {{styling-by}}
]

---
layout: true
name: plain
copyright: (CC) BY-SA
branding:  [Dipl.-Ing. Martin Weitzel](http://tbfe.de)
customer:

<!--
  *****************************************************************************
  Template used for pages NOT referring to any Info-Graphic
  *****************************************************************************
  The following attributes are mandatory FOR THE TEMPLATE PAGE and should
  simply be left empty if not meaningful.

copyright: will be reproduced in each page footer first
  branding: will be reproduced in each page footer next
  customer: will be reproduced in each page footer last

As the above attributes are part of several page templates a global replace
  should be used for consistent changes.

On pages USING THIS TEMPLATE the following attributes must be set:

header: ## and header text (i.e. including the markdown formatting indicator)

-->

.pagefooter[
{{copyright}}: {{branding}} {{customer}}
]

---
layout: true
name: linkinfo
copyright: (CC) BY-SA
branding:  [Dipl.-Ing. Martin Weitzel](http://tbfe.de)
customer:

<!--
  *****************************************************************************
  Template used for pages INTRODUCING a new Info-Graphic
  *****************************************************************************
  On this kind of page a size-reduced version of the whole info graphic will
  be reproduced and occupies approximately 2/3 of the page width. So only add
  little information, preferably links to later pages dealing with single
  sections of the info graphic.

On pages USING THIS TEMPLATE the following attributes must be set:

graphic: file path of the info graphic EXCLUDING the suffix.
  header: ## and header text (i.e. including the markdown formatting indicator)

-->

.infographic[
[![Info-Graphic](InfoGraphics/{{graphic}}.png)](InfoGraphics/{{graphic}}.png
                "Click to open - add [CTRL+] SHIFT for new [tabbed] window")
]

.pagefooter[
{{copyright}}: {{branding}} {{customer}}[]
]

---
layout: true
name: withinfo
copyright: (CC) BY-SA
branding:  [Dipl.-Ing. Martin Weitzel](http://tbfe.de)
customer:

<!--
  *****************************************************************************
  Template used for pages dealing with a SPECIFIC SECTION of an Info-Graphic
  *****************************************************************************
  On such pages a link to the info graphic is reproduced in the top-right
  corner (or maybe elsewhere depending on the style sheet), so there are no
  restrictions with respect to the space available for the content of the page.

On pages USING THIS TEMPLATE the following attributes must be set:

graphic: file path of the info graphic EXCLUDING the suffix
  section: specific section in the info graphic this page refers to
  header: ## and header text (i.e. including the markdown formatting indicator)
-->

.infolink.right[
[Click here for Info-Graphic  
{{graphic}}](InfoGraphics/{{graphic}}.png "add [CTRL+] SHIFT for own [tabbed] window")  
{{section}}
]

.pagefooter[
{{copyright}}: {{branding}} {{customer}}[]
]

---
template: blank
name: frontmatter

.title[
    [Efficient C and C++](#online_version)
]

.subtitle[
    Considerations for Embedded Developers
]
.author.pull-left[
    Dipl.-Ing. Martin Weitzel  
    Technische Beratung für EDV  
    http://tbfe.de  
]
.client.pull-right[
    For PLC2-Days 2014   
    May 20 to May 22  
    http://www.plc2.de
]

---
template: plain
class: agenda
name: preface
header: ## Preface

One of the messages about C and C++ I have tried to spread over the last
twenty years is this:

There is little to no language-inherent overhead:

* Neither in C - when compared to assembler

* Nor in C++ - when compared to C.._[]

.N[
Once you understand how language features map to the hardware, you'll
be able to do all the cherry-picking that gives you the best from C
and C++ while easily avoiding unexpected large memory footprint or
performance penalties.
]

.F[:
As long as the optimizations possible for a classical tool-chain are limited
by separate compilation, and while object modules in static or shared
libraries must be compiled with little knowledge of how and where they are
used, there is some impact from C++ exceptions, for which code may have to
be generated that is never used.
]

---
template: plain
class: agenda
name: agenda_part_1
header: ## Agenda - Part 1

-------------------------------------------------

Efficiently Using C (from the embedded viewpoint)

-------------------------------------------------

* [What to Consider in Advance?](#initial_c_considerations)

* [Understand the Tool-chain](#tool_chain)

* [Assembler Code Analysis](#assembler_code)

* [Runtime Performance Analysis](#runtime_performance)

* [Understand the Memory Model](#memory_model)

* [C at the Lowest Level](#low_level_c)

---
template: plain
name: initial_c_considerations
header: ### What to Consider in Advance?

* Why use C at all? (instead of assembler)

* What are you striving for?

  * Reduce Development Time?

  * Flexible Components for Reuse?

  * Optimize Memory Footprint?

  * Optimize Execution Speed?

  * … (Anything else?) …

.N[
The answer may well be different for distinct parts of your project!
]

---
template: plain
name: tool_chain
header: ### Understand Your Toolchain

Even if you use an IDE which offers a *Build* at the press of a button,
you should have a basic understanding of:

* The various types of files, usually identified by their suffix.

* The various tools and

  * which files they take as input

  * and which files they produce as output.

.N[
If for nothing else, this knowledge will at least help when you later need
to *put your detective's hat on* and find out what you cannot explain from
immediate evidence or easily accessible sources.
]

To give concrete examples, the following pages assume the GNU toolchain as
it is typically used for Zynq (cross-) building.

---
template: plain
name: translation_unit
header: #### Translation Units

A file constituting a Translation Unit is usually taken as (direct) input
by the compiler. Typically one single [Object File](#object_module) with
the same basename but the suffix `.o` results from a Translation Unit.

When using GCC, the source file suffix `.c` or `.cpp` determines the base
language (C or C++) and the option `-c` has to be added so that translation
stops with object files and does not continue with linking.

It may be necessary to override the default language standard chosen by the
compiler and select a specific one, for example:

* `gcc -c -std=c99 ...` to compile C according to the C99 standard or
* `g++ -c -std=c++0x ...` or `g++ -c -std=c++11 ...` for the C++11 standard.._[]

(More to be demonstrated live and on request.)

.F[:
Whether `gcc` or `g++` is used as the command mostly influences the
[Link Loader](#link_loader) step, which is not actually run here.
]

---
template: plain
header: #### Header Files

A Header File is usually taken as *indirect input* by the compiler because
some [Translation Unit](#translation_unit) refers to it with an `#include`
directive.

* Header Files are typically used for sharing information that should not be
  duplicated at several places.

* They usually can be recognized by their file name suffix `.h`.._[]

Often one header file refers to another one. To avoid processing the same
header multiple times, [Include Guards](#include_guard) are a standard technique.

.F[:
Note that this suffix is not mandatory and will sometimes be chosen differently
for good reason (like for automatically created header files) or because of
local style rules (like naming C++ headers `*.hh` or `*.hpp`).
]

---
template: plain
name: include_guard
header: #### Include Guards

An Include Guard has the form:

```
#ifndef SOME_HEADER_H
#define SOME_HEADER_H

… (actual content) …

#endif
```

Include Guards may be omitted if multiple inclusion does not cause problems,
at the price of (very) slightly slower compilation.._[]
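As a minimal sketch, the following simulates the same guarded header being included twice in one translation unit. The guard macro `POINT_H`, `struct point` and `point_sum` are made-up names for illustration only.

```c
/* --- first inclusion of the (hypothetical) header point.h --- */
#ifndef POINT_H
#define POINT_H

struct point { int x; int y; };

static inline int point_sum(struct point p) { return p.x + p.y; }

#endif

/* --- second inclusion: POINT_H is already defined, so the
       preprocessor skips everything up to the matching #endif --- */
#ifndef POINT_H
#define POINT_H

struct point { int x; int y; };   /* would otherwise be a redefinition error */

static inline int point_sum(struct point p) { return p.x + p.y; }

#endif
```

Without the guard the second copy of `struct point` would stop the compilation.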

.F[:
If the latter is a real issue, include guards might (also) be placed around
the directives that include some file, but modern compilers achieve much the
same effect by internal optimization techniques, which for GCC are further
discussed here:
http://gcc.gnu.org/onlinedocs/cppinternals/Guard-Macros.html
]

---
template: plain
header: #### Library Header Files

Functions from the standard library are usually not explicitly declared.
Instead their prototype declarations are included from a library header:

```
#include <stdio.h>
#include <string.h>

… (more content) …
```

In simple cases it *may* work in C to call a library function without
including its associated header, but this is not recommended and can
even fail at runtime in strange ways.._[]

.N[
You should probably consider enabling compiler warnings for this case,
or maybe compile your code as C++, where missing prototypes for called
functions always cause compilation errors.
]

.F[:
Actually the crux here is a subtle difference between explicit prototypes
and implicitly assumed declarations for functions accepting single precision
`float` arguments, and in the case of variadic argument lists and `NULL`
pointers ... little you are likely to stumble over, but nevertheless a potential
cause of unexplainable failures after seemingly unrelated changes to the code.
]

---
template: plain
name: object_module
header: #### Object Modules

Object Modules are

* the typical input._[] for the [Link Loader](#link_loader) and

* recognized by their file name suffix `.o`.

.N[
Object Modules contain only preliminary addresses for branches in code and
for global data references and register such items in relocation tables which
are also part of the object module file.
]

The relocation tables will later be used to adjust such preliminary addresses
when Object Modules are combined to an [Executable Program](#executable_program).

.F[:
Besides the object file names a linker script serves as input, controlling in
detail how modules are combined. More on this can be found here:
http://www.delorie.com/gnu/docs/binutils/ld_6.html
]

---
template: plain
header: #### Object Modules (Other Uses)

Some other commands that may reveal useful or interesting information on
object modules are:

* `size` - summarize memory requirements
* `nm` - show symbols to be relocated and their (preliminary) values
* `objdump` - various information including disassembly (option `-S`)

.N[
If you have set up a typical cross development environment, the compiler and
the above tools will probably be prefixed with your target architecture,
e.g. `arm-linux-gnueabi-size` etc.._[]
]

(More to be demonstrated live and on request.)

.F[:
Such *Cross-Tools* are part of the *GNU Binutils* for which a brief overview
can be found here: http://www.gnu.org/software/binutils/
]

---
template: plain
name: link_loader
header: #### The Link Loader

The Link Loader combines object modules named directly or copied from a
[Static Library](#static_library) into an [Executable Program](#executable_program).

* For [Static Linking](#static_linking) the result is complete.

* In case of [Dynamic Linking](#dynamic_linking) another step will complete it
  when the file produced by the Link Loader is finally executed.

.N[
What the Link Loader also adds is a [Runtime Startup Module](#runtime_startup) that
will arrange everything so that `main` can finally be called like an ordinary
subroutine.._[]
]

(More to be demonstrated live and on request.)

.F[:
Historically there were different Runtime Startup Modules for C and C++.
As it appears, recent versions of the GNU toolchain no longer provide
this separation.
]

---
template: plain
name: static_linking
header: #### Static Linking

This is mainly the process of matching defined and expected (symbolic)
references in the set of modules forming an executable.

* For unsatisfied symbolic references the [Link Loader](#link_loader) fetches
  modules from [Static Libraries](#static_library) defining these symbols ...

* ... but of course this may lead to additional unsatisfied references.._[]

.F[:
This ping-pong between resolving references and adding new unresolved ones
can sometimes require that libraries are named to the linker in a special
order, and for libraries which mutually reference each other it may even
require arcane linker options like `-Wl,--start-group` and `-Wl,--end-group`.
]

---
template: plain
name: static_library
header: #### Static Libraries

A Static Library is more or less a container for [Object Modules](#object_module)
that gets searched for members *defining* a reference *expected* by some
object module that is already in the set of modules to be linked.

Such library members are then copied by the [Link Loader](#link_loader) into the
[Executable Program](#executable_program), extending the set of modules to be
linked.

Under Unix/Linux

* typically `.a` is the file name suffix of Static Libraries and

* the tool for maintaining them is `ar`.._[]

.F[:
You may wonder why a special tool is used to maintain Static Libraries and why
their content is not simply put into some directory. The answer is "for historic
reasons": in early Unix, object modules were often substantially smaller than
disk blocks. A module that just wrapped a system call (like `read` or `write`)
would be well below 100 bytes, and using one file per module would have wasted
a lot of disk space. Remember: disks of only(!) 10..20 MByte were first "huge",
then "large" and later just plain "standard" in those times ...
]

---
template: plain
name: dynamic_linking
header: #### Dynamic Linking

There are two main steps in this process: the first happens exactly once, the second
each time an [Executable Program](#executable_program) is run.

* For unsatisfied references the [Link Loader](#link_loader) checks whether they are
  defined in a [Dynamic Library](#dynamic_library) (aka Shared Object) and adds the
  information required for the next step to the executable.

* When the program is started, the open references of the dynamically linked
  Executable Program are actually connected with the [Dynamic Library](#dynamic_library).

If the latter is not yet in use by some other process (or pre-loaded) it will then
be loaded into memory for execution.._[]

.F[:
Physical loading may be deferred and happen by trapping page faults when some
library code is actually executed.
]

---
template: plain
name: dynamic_library
header: #### Dynamic Libraries

A Dynamic Library is much like a container for a number of object modules with
open references between each other already resolved, but no `main`-program to
take over control.

* Under Unix/Linux typically `.so` is used as a suffix and

* the tool for maintaining such libraries is the [Link Loader](#link_loader) too.

---
template: plain
name: executable_program
header: #### Executable Program

An Executable Program may have been linked statically, dynamically, or with a mix
of both.

* In the first case it can be run stand-alone - so, static linking is also what is
  required for programs running on the Bare Metal.

* The tool `ldd` is useful to determine if a program was dynamically linked and which
  [Dynamic Libraries](#dynamic_library) an executable actually references.

---
template: plain
name: unix_kernel
header: #### The Unix/Linux Kernel

In its purpose the Unix/Linux Kernel is similar to a [Dynamic Library](#dynamic_library)
as

* it adds centrally implemented functionality to [Executable Programs](#executable_program),

* though the actual linking mechanism is not based on addresses but on small numbers
  exchanged when entering and leaving the kernel code,

* and the user process gains additional privileges by passing this call gate.

E.g. on ARM (i.e. in a Linux system as it runs on Zynq) the special machine instruction
`svc` is used with the number of the system call in `r7`.._[]

(More to be demonstrated live and on request.)

.F[:
With the tools already presented it is not too hard to investigate such details: you need
to extract some system call wrapper like `read.o` from the (static version of the) standard
library using `ar x` and then disassemble it with `objdump -S` ...
]

---
template: plain
name: assembler_code
header: ### Assembler Code Analysis

Now and then you will probably need to investigate how some C or C++ code for
your system is translated. Here are the options:

* Apply `objdump -S` on an [Object Module](#object_module) or
  [Executable Program](#executable_program).

* Compile a [Translation Unit](#translation_unit) to a file with assembler code
  using the option `-S` (upper case!). The result will be named like the
  translation unit source file but with the suffix `.s` (lower case!).

.N[
A nice tool to analyze (short sequences of) assembler code online is this:
http://gcc.godbolt.org/
]

We will occasionally use one of the above methods from now on to give live
examples, maybe driven by the questions you ask.

---
template: plain
name: runtime_performance
header: ### Performance Measurements

Now and then you will probably need to investigate how some C or C++ code for
your system performs.

* To measure execution time of some piece of code the author has often used
  his [CxTime Framework] which is documented separately.

* It builds on an idea from the book [Efficient C] (which is long out of print now)
  to put the code in question in a loop and automatically adjust a repetition
  count.
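The idea of the self-adjusting repetition loop can be sketched as follows. This is only an illustration under assumptions and not the interface of the CxTime Framework: the names `seconds_per_call` and `code_under_test` as well as the doubling strategy are made up here.

```c
#include <time.h>

/* Hypothetical code under test; the volatile variable keeps the work
   from being optimized away. */
static volatile unsigned sink;
static void code_under_test(void) { sink += 1u; }

/* Returns the estimated CPU seconds per call of fn. The repetition
   count is doubled until one measurement runs for at least
   min_seconds (or an upper limit of repetitions is exceeded). */
double seconds_per_call(void (*fn)(void), double min_seconds)
{
    unsigned long reps = 1;
    for (;;) {
        clock_t start = clock();
        for (unsigned long i = 0; i < reps; ++i)
            fn();
        double elapsed = (double)(clock() - start) / CLOCKS_PER_SEC;
        if (elapsed >= min_seconds || reps > (1UL << 30))
            return elapsed / (double)reps;
        reps *= 2;
    }
}
```

A call like `seconds_per_call(code_under_test, 0.5)` then yields an estimate that still includes the overhead of the loop and the indirect call, which is one of the shortcomings when measuring very basic operations.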

[Efficient C]: http://www.amazon.de/Efficient-C-Thomas-Plum/dp/0911537058
[CxTime Framework]: http://tbfe.de/Projects/readme.html

We may use the above method later (if time allows) on as many concrete examples
as you like and then will probably also see its shortcomings if you try to
measure very basic things (like single C operators for basic types).

.N[
It should not be too hard to adapt the technique to Bare Metal Programming
on Zynq by implementing a simple counter in the PL and replacing the call to
the library function `clock` with a function that reads this counter.
]

---
template: plain
name: memory_model
header: ### Understand the Memory Model

On the following pages the basic hardware and memory model onto which the C language
constructs are mapped is explained.

---
template: plain
name: cpu_and_memory
header: #### The CPU and the Memory

The CPU mainly

* fetches and stores pieces of data from and to memory,

* interpreting the bits as raw values according to several types (which differ in
  size and in the meaning of their bit patterns),

* possibly transforming those bit patterns into new ones.

.N[
If you have never been in touch with C: its basic operations map to common
ALU operations like add, shift, ... thereby giving access to frequently required
machine instructions.._[]
]
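A small sketch of such basic operations: the first function uses operators that typically map to single ALU instructions, the second composes a rotation from two shifts and an OR, since C has no rotate operator. The function names are hypothetical, and whether a compiler turns `rotl32` into a single rotate instruction depends on the compiler and target.

```c
#include <stdint.h>

/* Basic C operators that typically map to single ALU instructions. */
uint32_t alu_demo(uint32_t a, uint32_t b)
{
    return ((a + b) << 2) & 0xFFu;   /* add, shift left, bitwise AND */
}

/* Rotate left by n bits, composed from two shifts and an OR.
   Masking keeps both shift counts below 32, as shifting a 32-bit
   value by its full width would be undefined behavior. */
uint32_t rotl32(uint32_t value, unsigned n)
{
    n &= 31u;
    return (value << n) | (value >> ((32u - n) & 31u));
}
```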

(More to be demonstrated live and on request.)

.F[:
Interestingly, while it is possible to work with bit-wise operations to AND, OR, XOR
or shift bit patterns, C has no operator for bit rotation, where the excess on one end
is filled in on the other. So, sometimes, when high performance is valued higher than
portable code, you may need to fall back to inserting assembler code in between your C
statements.
]

---
template: plain
name: code_memory
header: #### Code Memory

The CPU works based on the instructions stored here and fetched from the address the
Program Counter (aka. Instruction Pointer) points to.

---
template: plain
name: data_memory
header: #### Data Memory

Data memory should be considered separately for different kinds of data.

##### Global Data

All global variables as well as local variables declared `static` live at
fixed addresses assigned by the [Link Loader](#link_loader). They are
initialized when the program is loaded for execution.

##### Stack Data

All subroutine arguments and also all variables local to functions live
on the stack. They are accessed with addresses relative to the stack
pointer.._[]

##### Heap Data

Another possibility is to acquire and release memory dynamically. This
then is always referred to via [pointers](#pointer).
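The three kinds of data memory described above can be seen side by side in a small sketch. All names (`global_counter`, `count_calls`, `make_buffer`) are hypothetical.

```c
#include <stdlib.h>

int global_counter = 0;        /* global data: fixed address */

/* Returns how often it has been called so far. */
int count_calls(void)
{
    static int calls = 0;      /* static local: fixed address too,
                                  the value survives between calls */
    int increment = 1;         /* stack data: exists only during this call */
    calls += increment;
    global_counter = calls;
    return calls;
}

/* Returns a heap-allocated array of n zero-initialized ints (or NULL);
   the caller has to release it with free(). */
int *make_buffer(size_t n)
{
    return calloc(n, sizeof(int));
}
```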

.F[:
Technically heap memory grows towards stack memory and, depending on the
hardware, some means of protection will be necessary to keep one from
overwriting the other.
]

---
template: plain
name: subroutine_call
header: #### Stack Frames

For an executing program which is currently inside a function, called
by some other function ... etc. ... called by `main` the stack holds a
number of Stack Frames of which each can be assigned to an active
function call.._[]

##### Calling Subroutines

Work is split here between

* the caller that needs to store arguments and jump to the subroutine
* the callee that needs to make space for its local variables.

##### Returning from Subroutines

On return from a subroutine

* the callee has to undo space reservation for locals, then
* jump back to the caller
* which has to undo the argument transfer and
* receive the return value (if any).

---
template: plain
name: runtime_startup
header: #### The Runtime Startup Module

The lowest (or depending on your view highest) stack-frame belongs to
the part which called the `main` subroutine:

* This is the Runtime Startup Module, which

* typically needs to do some work that cannot be done in C.

On the other hand, the Runtime Startup Module may also call library
code.._[]

.F[:
A C Runtime Startup Module has to close every still open `FILE` and
execute the functions registered with `atexit`. In addition a C++ Runtime
Startup Module has to call global Constructors before and global Destructors
after the call to `main`, and may also provide a landing strip for uncaught
exceptions.
]

---
template: plain
class: agenda
name: low_level_c
header: ### C at the Lowest Level

C allows a programmer to work very close to the hardware, which may be one
of the reasons why C is so strongly present in certain domains since it
first entered the scene nearly 40 years ago (and C++ too, as it contains
C as a subset).._[]

.N[
In the following sections C is viewed as if it were a kind of
portable assembler.
]

The swing to a high(er) level view of C leading more or less directly to C++
starts with [part 2](#agenda_part_2).

.F[:
Another reason is of course the vast amount of existing libraries, some highly
optimized for a special area; it is the same reason why Fortran is still used
in some scientific applications.
]

---
template: plain
name: arithmetic_types
header: #### Arithmetic Types

Built into C are a number of basic arithmetic types (integral and floating), but
the ISO/ANSI standard only defines minimum requirements.

* If you need specific sizes look for the types defined in the standard
  library header [`stdint.h`].

* For the implementation specific properties of these types see library
  header [`limits.h`].
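A brief sketch of how both headers are typically used. The variable and function names are hypothetical.

```c
#include <limits.h>
#include <stdint.h>

/* Exact-width types from <stdint.h>: the size is part of the name. */
uint8_t status_byte = 0xFFu;
int32_t sample      = -100000;

/* Implementation-specific properties from <limits.h>. */
unsigned bits_per_int(void)
{
    return (unsigned)(sizeof(int) * CHAR_BIT);
}

/* Arithmetic narrowed back to uint8_t wraps around modulo 256. */
uint8_t next_status(uint8_t s)
{
    return (uint8_t)(s + 1u);
}
```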

[`stdint.h`]: http://en.cppreference.com/w/c/types/integer
[`limits.h`]: http://en.cppreference.com/w/c/types/limits

(More to be demonstrated live and on request.)

---
template: plain
name: flow_control
header: #### Flow Control

C has the usual structured flow control statements like

* `if` … and `if` … `else`
* `while` and `do` … `while`
* `for` …
* `switch` … `case` …

and furthermore a `goto`-statement, which is limited to branching within
a single subroutine.

Branching back in the call stack to any active function is possible by
means of a pair of library functions: [`setjmp`] takes a register snapshot
(at least program counter and stack pointer) and stores it in a buffer
(typically at a global location), to be restored later via [`longjmp`].

.N[
There is no other library support necessary so that this technique
also works on the Bare Metal.
]
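A minimal sketch of branching back over several active function calls. The names `recovery_point`, `deepest`, `middle` and `run_guarded` are hypothetical.

```c
#include <setjmp.h>

static jmp_buf recovery_point;      /* holds the register snapshot */

static void deepest(int fail)
{
    if (fail)
        longjmp(recovery_point, 1); /* branch back to the setjmp below */
}

static void middle(int fail) { deepest(fail); }

/* Returns 0 if the nested calls returned normally and -1 if control
   came back via longjmp. */
int run_guarded(int fail)
{
    if (setjmp(recovery_point) != 0)
        return -1;                  /* arrived here via longjmp */
    middle(fail);
    return 0;
}
```

The stack frames of `middle` and `deepest` are simply abandoned, so nothing they would have done after the jump (like releasing resources) is executed.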

[`setjmp`]: http://en.cppreference.com/w/c/program/setjmp
[`longjmp`]: http://en.cppreference.com/w/c/program/longjmp

(More to be demonstrated live and on request.)

---
template: plain
name: pointer
header: #### Working with Addresses

Variables can be defined as pointers, i.e. to hold the address of some
other variable.

* The operation `*` is an indirect memory access via some pointer and

* the operation `&` takes the address of a variable.

Usually pointers carry a type with them which is the type that results
from dereferencing and also influences pointer arithmetic.
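Both operations and the effect of the pointer type on arithmetic are shown in the following sketch. The function names are hypothetical.

```c
/* Swaps two ints through their addresses. */
void swap_ints(int *a, int *b)
{
    int tmp = *a;    /* '*' reads the int that a points to */
    *a = *b;         /* ... and can also be used for writing */
    *b = tmp;
}

/* Sums n ints; incrementing p advances by sizeof(int) bytes,
   not by a single byte. */
int sum_ints(const int *p, int n)
{
    int total = 0;
    for (const int *end = p + n; p != end; ++p)
        total += *p;
    return total;
}
```

A caller uses `&` to hand over addresses, e.g. `swap_ints(&x, &y)`.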

(More to be demonstrated live and on request.)

---
template: plain
name: void_pointers
header: #### Working with Addresses

Besides being typed, pointers can be generic (`void *`), i.e. point to nothing specific.

**Because generic pointers in C are compatible with typed pointers, there is a
hole in the type checking system allowing code like the following:**

```
double d;
void *p = &d;
int *ip = p;
*ip = 42;
```

(More to be demonstrated live and on request.)

---
template: plain
name: const_qualifier
header: #### Data Qualified with `const`

Applying the `const` qualifier to some piece of data causes compile-time
write protection, i.e. each attempt to modify the data after initialization
will result in a compile error.._[]

.N[
There is no runtime overhead incurred by `const` and frequently the compiler
can optimize constants away completely, i.e. issue code as if the initializing
value were used literally everywhere the constant had been used.
]
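A small sketch of `const` in use; the names are hypothetical. Whether the constants are really folded into the code or placed in read-only memory depends on compiler and target.

```c
static const int table_size = 4;               /* may be folded into the code */
static const int squares[]  = { 0, 1, 4, 9 };  /* candidate for read-only memory */

/* The pointee is const: the function promises not to modify the data. */
int sum_table(const int *values, int n)
{
    int total = 0;
    for (int i = 0; i < n; ++i)
        total += values[i];
    /* values[0] = 1;   <- would be rejected at compile time */
    return total;
}

int sum_squares(void)
{
    return sum_table(squares, table_size);
}
```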

.F[:
In hosted environments with memory protection through an MMU the compiler may
also arrange for global `const`-qualified entities to be physically write
protected, so that any attempt to modify such - e.g. by means of an
[Enforced Type Conversion](#casts) to a non-`const`-protected entity - will
cause program termination.
]

---
template: plain
name: volatile_qualifier
header: #### Data Qualified with `volatile`

Applying the `volatile` qualifier limits the optimizations a compiler might
apply if the same memory location is

* read more than once with no intervening writes or

* written more than once with no intervening reads.

.N[
Using `volatile` may or may not be sufficient for memory mapped access to device
registers, as it would still allow out of order processing or caching by
the hardware.._[]
]
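The typical use is a polling loop, sketched here with an ordinary variable standing in for a device status register so that it also runs on a hosted system. The names `fake_status` and `wait_ready` are hypothetical.

```c
#include <stdint.h>

/* Stand-in for a device status register; on real hardware this would
   be a fixed address taken from the data sheet. */
static volatile uint32_t fake_status = 0;

/* Polls until bit 0 is set or max_polls reads have been done and
   returns the number of reads. Because of volatile the compiler must
   really read the location in each loop iteration. */
unsigned wait_ready(volatile uint32_t *status, unsigned max_polls)
{
    unsigned polls = 0;
    while (polls < max_polls) {
        ++polls;
        if (*status & 1u)
            break;
    }
    return polls;
}
```

Without `volatile` a compiler would be free to read the status once and reuse the value for all iterations.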

.F[:
Technically speaking `volatile` does not require that the compiler places barriers,
nor does it guarantee write-through caching behavior - at least not according to the
minimum requirements as written down in the C and C++ standards.
]

---
template: plain
name: casts
header: #### Enforced Type Conversions

In C any type conversion can be explicitly enforced by means of a cast operation.

* Between arithmetic types conversions happen automatically, so explicit type
  conversions are not strictly necessary (but may help to avoid warnings), but

* with an enforced conversion the value of a pointer can be specified as
  an integral value.

.N[
This is especially useful in embedded programming as it allows accessing
memory mapped devices without resorting to assembler.
]
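The conversions in both directions can be sketched as follows. To keep the example runnable on a hosted system an ordinary variable (`fake_register`) stands in for the device register; it and the function names are hypothetical.

```c
#include <stdint.h>

/* Stand-in for a 16-bit device register. */
static uint16_t fake_register;

/* Writes value to the 16-bit location whose address is given as a
   plain integer, the way a register address is taken from a data sheet. */
void write_reg16(uintptr_t address, uint16_t value)
{
    volatile uint16_t *reg = (volatile uint16_t *)(void *)address; /* integer -> pointer */
    *reg = value;
}

/* Returns the address of the stand-in register as an integral value. */
uintptr_t fake_register_address(void)
{
    return (uintptr_t)(void *)&fake_register;                      /* pointer -> integer */
}
```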

---
template: plain
name: raw_memory_access
header: #### Raw Memory Access

This is easy in C because via an enforced type conversion a pointer may be set
to any value and then dereferenced. The following lines could be from a
Bare Metal Program, an operating system kernel, or a device driver:._[]

```
#include <stdint.h>   /* uint16_t */

#define UART1_BASE (0xE0001000)
…
#define UART1_CtrlRegAddr ((uint16_t *) (UART1_BASE + 0x30))
…
/* constant pointer to a volatile 16-bit register */
volatile uint16_t * const CtlUART1 = UART1_CtrlRegAddr;
```

(More to be demonstrated live and on request.)

.F[:
Note that in the example shown the pointer itself is constant, i.e. it may not
be changed to a different value after its initialization. What is volatile is
the content of the memory word accessed through the dereferenced pointer.
]

---
template: plain
name: enumerations
header: #### Enumerations

Enumerations allow giving names to constants.._[]

Here is an example:

```
enum Color { Red, Blue, Green, White, Black };
/* by default =0,   =1,    =2,    =3,    =4 */
```

And another one:

```
enum { UART_TxFullBit = (1 << 4) };
…
while ((*CtlUART1 & UART_TxFullBit) != 0)
    ; /* wait while the transmitter register is still full */
```

.F[:
There is no real type-safety enforced with `enum`-s in C because they
are assignment compatible with all the builtin arithmetic types.
]

(More to be demonstrated live and on request.)

---
template: plain
name: data_structures
header: ### C Data Structures

The support for builtin data structures in C is rather limited:._[]

Basically there are just

* [Arrays](#array) and

* [Structures](#struct).

The former represents a number (which is to be specified at compile time) of
**values of the same type**, one of which may be selected using the index
operator (`[…]`, a pair of square brackets).

The latter represents a collection of named **values of possibly different types**
accessed with either `.name` or `->name` appended to a variable with
`struct`-type or a pointer to `struct`-type respectively.

.F[:
C++ has much more to offer on this side but often only at the price that heap
memory is used (by default) and hence such library support might exclude itself
from being (easily) applied in small Bare Metal Projects.
]

---
template: plain
name: array
header: #### Arrays

Arrays are formed from [Builtin Data Types](#arithmetic_types) or
[Structures](#struct) allocated in adjacent memory.

Any padding required for alignment is part of the element type itself (and so
included in `sizeof(T)`); no additional padding is inserted between elements.
Therefore the following holds for any type `T`:

* For `T data[N]` it is always true that

* `sizeof data` == `N * sizeof data[0]`.
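This relation is the basis of a common idiom to determine the number of elements of an array, sketched here with an element type that usually contains padding. `struct sample`, `ARRAY_LEN` and `sample_count` are hypothetical names.

```c
#include <stddef.h>

/* On most targets padding follows 'tag', so that sizeof(struct sample)
   is a multiple of the alignment of 'value'. */
struct sample {
    double value;
    char   tag;
};

/* Number of elements of a true array (not of a pointer!). */
#define ARRAY_LEN(a) (sizeof (a) / sizeof (a)[0])

static struct sample samples[5];

size_t sample_count(void)
{
    return ARRAY_LEN(samples);
}
```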

(More to be demonstrated live and on request.)

---
template: plain
name: array_indexing
header: #### Arrays

When an array `T arr[N]` is indexed by an integral value `i` with the
operation `arr[i]`, the index is automatically scaled with `sizeof(T)`.

As indexing is zero-based the address effectively calculated is:

`(T*)((char *)&arr[0] + i*sizeof(T))`
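For `T` being `int` the calculation can be written out by hand as in the following sketch; the function name `element_address` is hypothetical.

```c
#include <stddef.h>

/* Address of element i computed by hand: byte address of the first
   element plus the index scaled with the element size. */
int *element_address(int *first, size_t i)
{
    return (int *)((char *)first + i * sizeof(int));
}
```

For an array `int arr[4]` the result of `element_address(arr, 2)` is the same address as `&arr[2]`.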

.N[
Despite the required scaling, indexing into an array is a very inexpensive
operation, at least if there is only one dimension and the elements are of
a basic type.
]

Otherwise there is some overhead for the multiplication required to scale
the index, but advanced hardware often has dedicated fast multipliers for
this purpose.._[]

(More to be demonstrated live and on request.)

.F[:
Scaling is especially cheap on the ARM (as used in the Zynq) as long as the scaling
factor is a small power of two, because there is an auto-scaling addressing mode
that will left-shift the index prior to adding it to the array base address.
]

---
template: plain
name: array_vs_pointer
header: #### Arrays vs. Pointers

Arrays are said to *decay into pointers* in each use except after `sizeof`
(determine bytes in memory used) and `&` (taking address).

Therefore with `T arr[N]`

* `arr` and `&arr` have different types but the same value, while

* `arr+1` and `&arr+1` have different types **and** different values.
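The difference becomes visible when the step width of both pointers is measured in bytes, as in this sketch for `int arr[4]`. The helper names are hypothetical.

```c
#include <stddef.h>

static int arr[4];

/* Distance in bytes between two addresses. */
static ptrdiff_t byte_distance(const void *from, const void *to)
{
    return (const char *)to - (const char *)from;
}

/* arr decays to int* and points to the first element, while
   &arr has type int (*)[4] and points to the whole array. */
ptrdiff_t step_of_element_pointer(void) { return byte_distance(arr, arr + 1); }
ptrdiff_t step_of_array_pointer(void)   { return byte_distance(&arr, &arr + 1); }
```

The first function yields `sizeof(int)`, the second `sizeof arr`, i.e. four times as much.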

(More to be demonstrated live and on request.)

---
template: plain
name: array_arguments
header: #### Arrays as Arguments

When handed as an argument to a function, an array always decays to a pointer
to its first element, because in the argument list

* the meaning of `T arg[]`

* is the same as `T *arg`.

.N[
To hand an array over to some function, its size also needs to be handed over
in an additional argument, as the function has no other means to find it out.._[]
]
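A sketch of both variants: handing over the element count separately, and wrapping an array of fixed size in a `struct` so that the size stays part of the type. The names `sum_values`, `struct int4` and `sum_int4` are hypothetical.

```c
#include <stddef.h>

/* 'const int values[]' means exactly 'const int *values' here, so the
   element count has to travel in a separate argument. */
long sum_values(const int values[], size_t count)
{
    long total = 0;
    for (size_t i = 0; i < count; ++i)
        total += values[i];
    return total;
}

/* Wrapped in a struct the array does not decay, so the callee can
   still determine the number of elements with sizeof. */
struct int4 { int v[4]; };

long sum_int4(const struct int4 *a)
{
    return sum_values(a->v, sizeof a->v / sizeof a->v[0]);
}
```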

(More to be demonstrated live and on request.)

.F[:
But there is a trick to wrap the array with a `struct` ...
]

---
template: plain
name: struct
header: #### Structures

Structures act as containers around builtin types, arrays and other structures.

* Addressing inside the structure is via an offset from the structure start.

* Therefore often some address arithmetic will be necessary at runtime, but ...

* ... as the offset of an element in a `struct` is fixed and known at compile
  time, almost any modern processor has an addressing mode allowing exactly this
  addition with little or no runtime overhead.
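What the compiler does for a member access can be written out by hand with `offsetof`, as in this sketch. `struct message` and `length_by_offset` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

struct message {
    uint8_t  kind;
    uint32_t length;    /* typically preceded by padding for alignment */
    uint16_t checksum;
};

/* Reads 'length' by hand: structure start address plus a fixed offset
   known at compile time. This is what msg->length boils down to. */
uint32_t length_by_offset(const struct message *msg)
{
    const char *base = (const char *)msg;
    return *(const uint32_t *)(base + offsetof(struct message, length));
}
```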

(More to be demonstrated live and on request.)

---
template: plain
name: how_to_continue_1
header: ## Now it's up to you ...

... how we proceed:._[]

* Answer your questions?

* Continue with more live examples?

* Delve a bit deeper into the C library and what it has to offer for
  embedded programmers?

.F[:
The presentation so far had approximately 40 pages and the total time slot
is 90 minutes. Assuming we have spent between one and two minutes per page
we are either near the end now or may have some time left.
]

---
template: plain
name: agenda_part_2
header: ## Agenda - Part 2

---------------------------------------------------

Efficiently Using C++ (from the embedded viewpoint)

---------------------------------------------------

* [What to Consider in Advance?](#initial_cpp_considerations)

* [Stepping from C to C++](#from_c_to_cpp)

* [Introducing Structures](#introducing_struct)

* [Introducing Classes](#introducing_class)

* [Reuse based on Classes](#has_a_relation)

* [Specializing General Concepts](#is_a_relation)

* [Reusability Through Templates](#genericity)

---
template: plain
name: initial_cpp_considerations
header: ### What to Consider in Advance?

* Why use C++ at all? (instead of C or any other language)

* Are you sure you want the "Full OO-Stack" ...

* ... or just do some cherry-picking?

* What are you striving for?

* Reduce Development Time?

* Flexible Components for Reuse?

* Optimize Memory Footprint?

* Optimize Execution Speed?

* … (Anything else?) …

---
template: plain
name: from_c_to_cpp
header: ## Stepping from C to C++ (The Basics)

A typical learning sequence is this (maybe skipping the first and
even some parts of the second step):._[]

* No subroutines, lots of separate variables, much code duplication

* Common code to subroutines, handing over single variables

* Related variables in structures, handing over `struct`-pointers

* Turning structures into classes

.F[:
We will try to cover all the above topics in this talk but the depth
to which we will go also depends on your questions.
]

---
template: plain
name: from_c_to_cpp_advanced
header: ## Stepping Deeper into C++ (Advanced Concepts)

When heading for advanced topics, the following will become areas of
interest (not necessarily in this order as the topics are independent
of each other).._[]

* Parametrizing types with templates

* Virtual member functions for handling small variations

* Adaptability through virtual functions

* Adaptability achieved with templates

* Template Meta-Programming

.F[:
We will not necessarily cover all these topics in this talk. It depends
on how much time we spend with the basics, on your questions and general
interest in advanced C++.
]

---
template: plain
name: copy_and_paste_programming
header: ### Copy&Paste-Programming (1)

Our starting point might look as follows:

```
/* store min/max/average-measurements */
double val;
int cnt= 0;
double min, max, avg, sum= 0.0;
…
while (get_sensor(&val)) {
    if (++cnt == 1) min= max= val;
    else if (min > val) min= val;
    else if (max < val) max= val;
    sum+= val;
    avg= sum / cnt;
}
```

---
template: plain
name: copy_and_paste_programming_2
header: #### Copy&Paste-Programming (2)

If this is now required for a second sensor, the following might be
added by copying and pasting the source code and making some changes
to the variable names:

```
… (as before) …
double val2;
int cnt2= 0;
double min2, max2, avg2, sum2= 0.0;
while (get_sensors(&val, &val2)) {
    … (code as before) …
    if (++cnt2 == 1) min2= max2= val2;
    else if (min2 > val2) min2= val2;
    else if (max2 < val2) max2= val2;
    sum2+= val2;
    avg2= sum2 / cnt2;
}
```

---
template: plain
name: copy_and_paste_programming_3
header: #### Copy&Paste-Programming (3)

Then, sometime later again, a third sensor will be added following
the same *"copy & paste & change what has to be changed"* procedure:

```
… (variables as before) …
double val3;
int cnt3= 0;
double min3, max3, avg3, sum3= 0.0;
double oldvalues[20];
while (get_sensors(&val, &val2, &val3)) {
    … (code as before) …
    if (++cnt3 == 1) min3= max3= val3;
    else if (min3 > val3) min3= val3;
    else if (max3 < val3) max3= val3;
    …
}
```

Just to illustrate a change in requirements, let us assume the average is
calculated a bit differently this time, maybe from the last 20 values
(therefore the array `oldvalues`), so that it follows the measured value
in a slightly smoothed way.

---
template: plain
name: common_code_to_functions_1
header: ### Moving Common Code to Functions (1)

Once the repeated code becomes obvious and is seen as a burden to
maintain (assume we expect to get many more duplications), the next
step may be to move the common calculations into a function:

```
void calc_minmax_and_avg(double val, double *pmin,
                         double *pmax, double *psum,
                         int *pcnt, double *pavg) {
    if (++(*pcnt) == 1) *pmin= *pmax= val;
    else if (*pmin > val) *pmin= val;
    else if (*pmax < val) *pmax= val;
    *psum+= val;
    *pavg= *psum / *pcnt;
}
```

---
template: plain
name: common_code_to_functions_2
header: #### Moving Common Code to Functions (2)

The code that was duplicated many times is replaced with function calls:

```
…
get_sensors(&val, &val2, &val3);
calc_minmax_and_avg(val, &min, &max, &sum, &cnt, &avg);
calc_minmax_and_avg(val2, &min2, &max2, &sum2, &cnt2, &avg2);
calc_minmax_and_avg_smoothened(val3, &min3, &max3, &sum3,
                               &cnt3, oldvalues, 20, &avg3);
…
```
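
The slides only call `calc_minmax_and_avg_smoothened` and never define it. A
possible sketch follows; it assumes that `oldvalues` is used as a ring buffer
of `n` entries and that `*psum` holds the sum of the values currently in that
window. The parameter names are assumptions derived from the call above:

```cpp
// Sketch only: min/max as before, average over the last n values.
void calc_minmax_and_avg_smoothened(double val, double *pmin, double *pmax,
                                    double *psum, int *pcnt,
                                    double *oldvalues, int n, double *pavg) {
    const int slot = *pcnt % n;          // ring-buffer position for the new value
    if (++(*pcnt) == 1) *pmin = *pmax = val;
    else if (*pmin > val) *pmin = val;
    else if (*pmax < val) *pmax = val;
    if (*pcnt > n)                       // window is full:
        *psum -= oldvalues[slot];        // drop the value leaving the window
    oldvalues[slot] = val;
    *psum += val;
    const int used = (*pcnt < n) ? *pcnt : n;   // values currently in the window
    *pavg = *psum / used;
}
```

As in the other variants, `*pcnt` and `*psum` must start at zero. The sketch
does not guard against the counter overflowing after very long runs.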

---
template: plain
name: common_code_to_functions_3
header: #### Moving Common Code to Functions (3)

In C++, references could be used instead of pointers; they are
dereferenced automatically inside the function:

```
void calc_minmax_and_avg(double val, double &rmin,
                         double &rmax, double &rsum,
                         int &rcnt, double &ravg) {
    if (++rcnt == 1) rmin= rmax= val;
    else if (rmin > val) rmin= val;
    else if (rmax < val) rmax= val;
    rsum+= val;
    ravg= rsum / rcnt;
}
```

.N[
Behind the scenes references are just *differently dressed pointers*._[]
and the assembler code generated is no different from that generated for
pointers.
]

.F[:
Or rather `const` pointers because a reference will always point to (be an
alias for) the same entity to which it has been initialized.
]
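
The equivalence can be illustrated with two reduced variants of the function.
This is a sketch only; the names `add_sample_ptr`, `add_sample_ref` and
`is_alias` are invented for this comparison:

```cpp
// Pointer version: the caller passes addresses, the callee dereferences explicitly.
void add_sample_ptr(double val, double *psum, int *pcnt) {
    *psum += val;
    ++(*pcnt);
}

// Reference version: same effect, the dereferencing is implicit.
void add_sample_ref(double val, double &rsum, int &rcnt) {
    rsum += val;
    ++rcnt;
}

// A reference is an alias: taking its address yields the address of the original.
bool is_alias(const double &r, const double *p) {
    return &r == p;
}
```

Whether both versions compile to identical machine code depends on compiler
and options, but there is no reason for them to differ.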

---
template: plain
name: common_code_to_functions_4
header: #### Moving Common Code to Functions (4)

The calls then would omit the explicit address operators:

```
…
get_sensors(&val, &val2, &val3);
calc_minmax_and_avg(val, min, max, sum, cnt, avg);
calc_minmax_and_avg(val2, min2, max2, sum2, cnt2, avg2);
calc_minmax_and_avg_smoothened(val3, min3, max3, sum3, cnt3,
                               oldvalues, 20, avg3);
…
```

But at assembly level the result is the same (typically).

---
template: plain
name: introducing_struct
header: ### Combining Related Data into a `struct`

Handing over related data as separate arguments is inconvenient and
error-prone, so the next step comes quite naturally:

```
struct min_max_avg {
    double min;
    double max;
    double sum;
    int cnt;
    double avg;
};
```

---
template: plain
name: introducing_struct_2
header: #### Combining Related Data into a `struct` (2)

This changes the functions doing the calculations to:

```
void calc_minmax_and_avg(double val, struct min_max_avg *p) {
    if (++(p->cnt) == 1) p->min= p->max= val;
    else if (p->min > val) p->min= val;
    else if (p->max < val) p->max= val;
    p->sum+= val;
    p->avg= p->sum / p->cnt;
}
```

---
template: plain
name: introducing_struct_3
header: #### Combining Related Data into a `struct` (3)

Looking more closely reveals that the average does not necessarily need to be
part of this `struct`; it could as well be calculated by some other helper
function:

```
void calc_minmax_and_avg(double val, struct min_max_avg *p) {
    if (++(p->cnt) == 1) p->min= p->max= val;
    else if (p->min > val) p->min= val;
    else if (p->max < val) p->max= val;
    p->sum+= val;
}

double get_average(const struct min_max_avg *p) {
    return p->sum / p->cnt;
}
```

---
template: plain
name: introducing_struct_4
header: #### Combining Related Data into a `struct` (4)

```
struct min_max_avg data, data2, …
…
get_sensors(&val, &val2, &val3);
calc_minmax_and_avg(val, &data);
calc_minmax_and_avg(val2, &data2);
…
…  = get_average(&data);
…
```

The loss or gain in efficiency depends on how much is saved by only
handing over one argument compared to the address arithmetic that now
must take place inside the functions.
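
Put together, the `struct`-based version might look like the following sketch.
It assumes the variant without the `avg` member; the helper
`init_min_max_avg` is hypothetical and only added so that the example starts
from a defined state:

```cpp
struct min_max_avg {
    double min;
    double max;
    double sum;
    int cnt;
};

// Hypothetical helper: brings the record into a defined initial state.
void init_min_max_avg(struct min_max_avg *p) {
    p->min = p->max = p->sum = 0.0;
    p->cnt = 0;
}

// Only one pointer is handed over; members are reached via fixed offsets.
void calc_minmax_and_avg(double val, struct min_max_avg *p) {
    if (++(p->cnt) == 1) p->min = p->max = val;
    else if (p->min > val) p->min = val;
    else if (p->max < val) p->max = val;
    p->sum += val;
}

// The average is derived on demand instead of being stored.
double get_average(const struct min_max_avg *p) {
    return p->sum / p->cnt;
}
```

Note that `get_average` must not be called before the first value has been
stored, as it would divide by zero.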

---
template: plain
name: introducing_struct_5
header: #### Combining Related Data into a `struct` (5)

Another step in the direction of hiding data could be to agree with the
clients (other parts of the program using the `struct`) that they only
access the content via helper functions:

```
double get_minimum(const struct min_max_avg *p) {
    return p->min;
}
double get_maximum(const struct min_max_avg *p) {
    return p->max;
}
```

.N[
This leads directly to the demand that the compiler should inline small
functions, because otherwise a simple `mov` at assembler level would be
replaced with a typically much more costly subroutine call.
]
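
A sketch of how such helpers are typically made available for inlining: they
are defined as `inline` functions (usually in a header), so the compiler sees
the body at every call site. Note that `inline` is a hint only and the
decision remains with the compiler. The `struct` layout is assumed to be the
one without the `avg` member:

```cpp
struct min_max_avg {
    double min;
    double max;
    double sum;
    int cnt;
};

// With the body visible, a call can collapse into a single load instruction.
inline double get_minimum(const struct min_max_avg *p) {
    return p->min;
}
inline double get_maximum(const struct min_max_avg *p) {
    return p->max;
}
```

Looking at the generated assembler code (with optimization enabled) is the
only reliable way to verify that inlining actually took place.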

---
template: plain
name: from_struct_to_class
header: ### From `struct`-s to `class`-es

As soon as abstract data types have been introduced as shown so
far, many C++ features fall easily into place as they solve some
problem or make otherwise inelegant and cumbersome to maintain
code more pleasant, like:

* Object Notation

* Access Protection

* Constructors and Destructors

---
template: plain
name: operations_as_members
header: ####  Operations as Members

The notation is natural and easily understood:

```
struct MinMaxAvg {
    double min;
    double max;
    … (etc as before) …
    void set_value(double val);
    double get_minimum() const;
    double get_maximum() const;
    double get_average() const;
};
```

---
template: plain
name: implicit_this_pointer
header: ####  Implicit `this` Pointer

.N[
Prepending `this->` is an implicit assumption for all access to members.
]

```
void MinMaxAvg::set_value(double val) {
    if (++cnt == 1) min= max= val;
    else if (min > val) min= val;
    else if (max < val) max= val;
    sum+= val;
}
double MinMaxAvg::get_minimum() const {
    return min;
}
double MinMaxAvg::get_maximum() const {
    return max;
}
double MinMaxAvg::get_average() const {
    return sum / cnt;
}
```

---
template: plain
name: object_notation
header: ####  Object Notation

At the call site the variable to which the function applies stands out
more visibly.

```
MinMaxAvg data, data2;
…
data.set_value(val);
data2.set_value(val2);
…
```

.N[
So far moving from a `struct` to a `class` adds notational convenience
and gives no reason for code overhead compared to C.
]
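
The claim can be made plausible with a sketch that places the C-style record
next to the variant with member functions. The size comparison reflects what
typical compilers do and is an assumption of this example, as is the reduced
set of member functions:

```cpp
// C-style record as used on the earlier pages.
struct min_max_avg {
    double min;
    double max;
    double sum;
    int cnt;
};

// Same data plus non-virtual member functions.
struct MinMaxAvg {
    double min;
    double max;
    double sum;
    int cnt;
    void set_value(double val);
    double get_average() const;
};

void MinMaxAvg::set_value(double val) {
    if (++cnt == 1) min = max = val;
    else if (min > val) min = val;
    else if (max < val) max = val;
    sum += val;
}

double MinMaxAvg::get_average() const {
    return sum / cnt;
}

// Non-virtual member functions occupy no space inside the object.
static_assert(sizeof(MinMaxAvg) == sizeof(min_max_avg),
              "member functions add no per-object overhead");
```

As there is no constructor yet, an object should be created zeroed, for
example with `MinMaxAvg data = MinMaxAvg();`.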

---
template: plain
name: access_protection
header: ####  Access Protection

At this point access protection can be added with little effort:

```
class MinMaxAvg {
private:
    double min;
    double max;
    … (etc as before) …
public:
    void set_value(double val);
    double get_minimum() const;
    … (etc as before) …
};
```

Again this is a compile time feature only and there is no reason for
any overhead in the code.

---
template: plain
name: constructors
header: ####  Constructors

In the C version with `struct`-s the problem of reliable initialisation had
been neglected for brevity.._[]

```
class MinMaxAvg {
private:
    … (etc as before) …
public:
    MinMaxAvg() : sum(0.0), cnt(0) {}
    … (etc as before) …
};
```

Initialization is now guaranteed for:

```
MinMaxAvg data, data2, …
```

.F[:
As long as the variables had all been global no problem would have occurred
since C guarantees 0-initialisation in this case. Otherwise one additional
subroutine for initialisation could have been supplied ... but it would not be
**guaranteed** to be called, as is the case with constructors.
]
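
A complete sketch of the class with its constructor is shown below. The member
function bodies are written inside the class for brevity, and the accessor
`get_count` is hypothetical, added only to make the initial state observable:

```cpp
class MinMaxAvg {
private:
    double min;
    double max;
    double sum;
    int cnt;
public:
    // Runs for every object, whether global, local or a member of another class.
    MinMaxAvg() : sum(0.0), cnt(0) {}

    void set_value(double val) {
        if (++cnt == 1) min = max = val;
        else if (min > val) min = val;
        else if (max < val) max = val;
        sum += val;
    }
    double get_minimum() const { return min; }
    double get_maximum() const { return max; }
    double get_average() const { return sum / cnt; }
    int get_count() const { return cnt; }   // hypothetical accessor
};
```

As on the page above, `min` and `max` are left uninitialised by the
constructor, so they are meaningful only after the first `set_value` call.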

---
template: plain
name: how_to_continue_2
header: ## Now it's up to you ...

... how we proceed:._[]

* Answer your questions?

* Continue with more live examples?

* Delve a bit deeper into more C++ features or the C++ library and of what
  use they are for an embedded developer?

.F[:
The presentation so far had approximately 25 pages and the total time slot
is 90 minutes. Assuming we have spent between two and three minutes per page
we are either near the end now or have some time left.
]

---
template: plain
name: literature
header: ## Some literature mentioned ...

The (new) book [Realtime C++] by Christopher Michael Kormanyos especially
deals with how to profit from using C++ in Embedded Projects.

http://www.springer.com/computer/communication+networks/book/978-3-642-34687-3