# Pointers III: Pointers to Pointers and More Here's where we cover some intermediate and advanced pointer usage. If you don't have pointers down well, review the previous chapters on [pointers](#pointers) and [pointer arithmetic](#pointers2) before starting on this stuff. ## Pointers to Pointers If you can have a pointer to a variable, and a variable can be a pointer, can you have a pointer to a variable that it itself a pointer? Yes! This is a pointer to a pointer, and it's held in variable of type pointer-pointer. Before we tear into that, I want to try for a _gut feel_ for how pointers to pointers work. Remember that a pointer is just a number. It's a number that represents an index in computer memory, typically one that holds a value we're interested in for some reason. That pointer, which is a number, has to be stored somewhere. And that place is memory, just like everything else^[There's some devil in the details with values that are stored in registers only, but we can safely ignore that for our purposes here. Also the C spec makes no stance on these "register" things beyond the `register` keyword, the description for which doesn't mention registers.]. But because it's stored in memory, it must have an index it's stored at, right? The pointer must have an index in memory where it is stored. And that index is a number. It's the address of the pointer. It's a pointer to the pointer. Let's start with a regular pointer to an `int`, back from the earlier chapters: ``` {.c .numberLines} #include int main(void) { int x = 3490; // Type: int int *p = &x; // Type: pointer to an int printf("%d\n", *p); // 3490 } ``` Straightforward enough, right? We have two types represented: `int` and `int*`, and we set up `p` to point to `x`. Then we can dereference `p` on line 8 and print out the value `3490`. But, like we said, we can have a pointer to any variable... so does that mean we can have a pointer to `p`? In other words, what type is this expression? ``` {.c} int x = 3490; // Type: int int *p = &x; // Type: pointer to an int &p // <-- What type is the address of p? AKA a pointer to p? ``` If `x` is an `int`, then `&x` is a pointer to an `int` that we've stored in `p` which is type `int*`. Follow? (Repeat this paragraph until you do!) And therefore `&p` is a pointer to an `int*`, AKA a "pointer to a pointer to an `int`". AKA "`int`-pointer-pointer". Got it? (Repeat the previous paragraph until you do!) We write this type with two asterisks: `int **`. Let's see it in action. ``` {.c .numberLines} #include int main(void) { int x = 3490; // Type: int int *p = &x; // Type: pointer to an int int **q = &p; // Type: pointer to pointer to int printf("%d %d\n", *p, **q); // 3490 3490 } ``` Let's make up some pretend addresses for the above values as examples and see what these three variables might look like in memory. The address values, below are just made up by me for example purposes: |Variable|Stored at Address|Value Stored There| |-|-|-| |`x`|`28350`|`3490`---the value from the code| |`p`|`29122`|`28350`---the address of `x`!| |`q`|`30840`|`29122`---the address of `p`!| Indeed, let's try it for real on my computer^[You're very likely to get different numbers on yours.] and print out the pointer values with `%p` and I'll do the same table again with actual references (printed in hex). |Variable|Stored at Address|Value Stored There| |-|-|-| |`x`|`0x7ffd96a07b94`|`3490`---the value from the code| |`p`|`0x7ffd96a07b98`|`0x7ffd96a07b94`---the address of `x`!| |`q`|`0x7ffd96a07ba0`|`0x7ffd96a07b98`---the address of `p`!| You can see those addresses are the same except the last byte, so just focus on those. On my system, `int`s are 4 bytes, which is why we're seeing the address go up by 4 from `x` to `p`^[There is absolutely nothing in the spec that says this will always work this way, but it happens to work this way on my system.] and then goes up by 8 from `p` to `q`. On my system, all pointers are 8 bytes. Does it matter if it's an `int*` or an `int**`? Is one more bytes than the other? Nope! Remember that all pointers are addresses, that is indexes into memory. And on my machine you can represent an index with 8 bytes... doesn't matter what's stored at that index. Now check out what we did there on line 9 of the previous example: we _double dereferenced_ `q` to get back to our `3490`. This is the important bit about pointers and pointers to pointers: * You can get a pointer to anything with `&` (including to a pointer!) * You can get the thing a pointer points to with `*` (including a pointer!) So you can think of `&` as being used to make pointers, and `*` being the inverse---it goes the opposite direction of `&`---to get to the thing pointed to. In terms of type, each time you `&`, that adds another pointer level to the type. |If you have|Then you run|The result type is| |:-|:-:|:-| |`int x`|`&x`|`int *`| |`int *x`|`&x`|`int **`| |`int **x`|`&x`|`int ***`| |`int ***x`|`&x`|`int ****`| And each time you use dereference (`*`), it does the opposite: |If you have|Then you run|The result type is| |:-|:-:|:-| |`int ****x`|`*x`|`int ***`| |`int ***x`|`*x`|`int **`| |`int **x`|`*x`|`int *`| |`int *x`|`*x`|`int`| Note that you can use multiple `*`s in a row to quickly dereference, just like we saw in the example code with `**q`, above. Each one strips away one level of indirection. |If you have|Then you run|The result type is| |:-|:-:|:-| |`int ****x`|`***x`|`int *`| |`int ***x`|`**x`|`int *`| |`int **x`|`**x`|`int`| In general, `&*E == E`^[Even if `E` is `NULL`, it turns out, weirdly.]. The dereference "undoes" the address-of. But `&` doesn't work the same way---you can only do those one at a time, and have to store the result in an intermediate variable: ``` {.c} int x = 3490; // Type: int int *p = &x; // Type: int *, pointer to an int int **q = &p; // Type: int **, pointer to pointer to int int ***r = &q; // Type: int ***, pointer to pointer to pointer to int int ****s = &r; // Type: int ****, you get the idea int *****t = &s; // Type: int ***** ``` ### Pointer Pointers and `const` If you recall, declaring a pointer like this: ``` {.c} int *const p; ``` means that you can't modify `p`. Trying to `p++` would give you a compile-time error. But how does that work with `int **` or `int ***`? Where does the `const` go, and what does it mean? Let's start with the simple bit. The `const` right next to the variable name refers to that variable. So if you want an `int***` that you can't change, you can do this: ``` {.c} int ***const p; p++; // Not allowed ``` But here's where things get a little weird. What if we had this situation: ``` {.c .numberLines} int main(void) { int x = 3490; int *const p = &x; int **q = &p; } ``` When I build that, I get a warning: ``` warning: initialization discards ‘const’ qualifier from pointer target type 7 | int **q = &p; | ^ ``` What's going on? The That is, we're saying that q is type `int **`, and if you dereference that, the rightmost `*` in the type goes away. So after the dereference, we have type `int *`. And we're assigning `&p` into it which is _a pointer to_ an `int *const`, or, in other words, `int *const *`. But `q` is `int **`! A type with different `const`ness on the first `*`! So we get a warning that the `const` in `p`'s `int *const *` is being ignored and thrown away. We can fix that by making sure `q`'s type is at least as `const` as `p`. ``` {.c} int x = 3490; int *const p = &x; int *const *q = &p; ``` And now we're happy. We could make `q` even more `const`. As it is, above, we're saying, "`q` isn't itself `const`, but the thing it points to is `const`." But we could make them both `const`: ``` int x = 3490; int *const p = &x; int *const *const q = &p; // More const! ``` And that works, too. Now we can't modify `q`, or the pointer `q` points to. ## Multibyte Values {#multibyte-values} We kinda hinted at this in a variety of places earlier, but clearly not every value can be stored in a single byte of memory. Things take up multiple bytes of memory (assuming they're not `char`s). You can tell how many bytes by using `sizeof`. And you can tell which address in memory is the _first_ byte of the object by using the standard `&` operator, which always returns the address of the first byte. And here's another fun fact! If you iterate over the bytes of any object, you get its _object representation_. Two things with the same object representation in memory are equal. If you want to iterate over the object representation, you should do it with pointers to `unsigned char`. Let's make our own version of [`memcpy()`](#man-memcpy) that does exactly this: ``` {.c} void *my_memcpy(void *dest, const void *src, size_t n) { // Make local variables for src and dest, but of type unsigned char const unsigned char *s = src; unsigned char *d = dest; while (n-- > 0) // For the given number of bytes *d++ = *s++; // Copy source byte to dest byte // Most copy functions return a pointer to the dest as a convenience // to the caller return dest; } ``` (There are some good examples of post-increment and post-decrement in there for you to study, as well.) It's important to note that the version, above, is probably less efficient than the one that comes with your system. But you can pass pointers to anything into it, and it'll copy those objects. Could be `int*`, `struct animal*`, or anything. Let's do another example that prints out the object representation bytes of a `struct` so we can see if there's any padding in there and what values it has^[Your C compiler is not required to add padding bytes, and the values of any padding bytes that are added are indeterminate.]. ``` {.c .numberLines} #include struct foo { char a; int b; }; int main(void) { struct foo x = {0x12, 0x12345678}; unsigned char *p = (unsigned char *)&x; for (size_t i = 0; i < sizeof x; i++) { printf("%02X\n", p[i]); } } ``` What we have there is a `struct foo` that's built in such a way that should encourage a compiler to inject padding bytes (though it doesn't have to). And then we get an `unsigned char *` to the first byte of the `struct foo` variable `x`. From there, all we need to know is the `sizeof x` and we can loop through that many bytes, printing out the values (in hex for ease). Running this gives a bunch of numbers as output. I've annotated it below to identify where the values were stored: ``` 12 | x.a == 0x12 AB | BF | padding bytes with "random" value 26 | 78 | 56 | x.b == 0x12345678 34 | 12 | ``` On all systems, `sizeof(char)` is 1, and we see that first byte at the top of the output holding the value `0x12` that we stored there. Then we have some padding bytes---for me, these varied from run to run. Finally, on my system, `sizeof(int)` is 4, and we can see those 4 bytes at the end. Notice how they're the same bytes as are in the hex value `0x12345678`, but strangely in reverse order^[This will vary depending on the architecture, but my system is _little endian_, which means the least-significant byte of the number is stored first. _Big endian_ systems will have the `12` first and the `78` last. But the spec doesn't dictate anything about this representation.]. So that's a little peek under the hood at the bytes of a more complex entity in memory. ## The `NULL` Pointer and Zero These things can be used interchangeably: * `NULL` * `0` * `'\0'` * `(void *)0` Personally, I always use `NULL` when I mean `NULL`, but you might see some other variants from time to time. Though `'\0'` (a byte with all bits set to zero) will also compare equal, it's _weird_ to compare it to a pointer; you should compare `NULL` against the pointer. (Of course, lots of times in string processing, you're comparing _the thing the pointer points to_ to `'\0'`, and that's right.) `0` is called the _null pointer constant_, and, when compared to or assigned into another pointer, it is converted to a null pointer of the same type. ## Pointers as Integers You can cast pointers to integers and vice-versa (since a pointer is just an index into memory), but you probably only ever need to do this if you're doing some low-level hardware stuff. The results of such machinations are implementation-defined, so they aren't portable. And _weird things_ could happen. C does make one guarantee, though: you can convert a pointer to a `uintptr_t` type and you'll be able to convert it back to a pointer without losing any data. `uintptr_t` is defined in ``^[It's an optional feature, so it might not be there---but it probably is.]. Additionally, if you feel like being signed, you can use `intptr_t` to the same effect. ## Pointer Differences As you know from the section on pointer arithmetic, you can subtract one pointer from another^[Assuming they point to the same array object.] to get the difference between them in count of array elements. Now the _type of that difference_ is something that's up to the implementation, so it could vary from system to system. To be more portable, you can store the result in a variable of type `ptrdiff_t` defined in ``. ``` {.c} int cats[100]; int *f = cats + 20; int *g = cats + 60; ptrdiff_t d = g - f; // difference is 40 ``` And you can print it by prefixing the integer format specifier with `t`: ``` {.c} printf("%td\n", d); // Print decimal: 40 printf("%tX\n", d); // Print hex: 28 ``` ## Pointers to Functions Functions are just collections of machine instructions in memory, so there's no reason we can't get a pointer to the first instruction of the function. And then call it. This can be useful for passing a pointer to a function into another function as an argument. Then the second one could call whatever was passed in. The tricky part with these, though, is that C needs to know the type of the variable that is the pointer to the function. And it would really like to know all the details. Like "this is a pointer to a function that takes two `int` arguments and returns `void`". How do you write all that down so you can declare a variable? Well, it turns out it looks very much like a function prototype, except with some extra parentheses: ``` {.c} // Declare p to be a pointer to a function. // This function returns a float, and takes two ints as arguments. float (*p)(int, int); ``` Also notice that you don't have to give the parameters names. But you can if you want; they're just ignored. ``` {.c} // Declare p to be a pointer to a function. // This function returns a float, and takes two ints as arguments. float (*p)(int a, int b); ``` So now that we know how to declare a variable, how do we know what to assign into it? How do we get the address of a function? Turns out there's a shortcut just like with getting a pointer to an array: you can just refer to the bare function name without parens. (You can put an `&` in front of this if you like, but it's unnecessary and not idiomatic.) Once you have a pointer to a function, you can call it just by adding parens and an argument list. Let's do a simple example where I effectively make an alias for a function by setting a pointer to it. Then we'll call it. This code prints out `3490`: ``` {.c .numberLines} #include void print_int(int n) { printf("%d\n", n); } int main(void) { // Assign p to point to print_int: void (*p)(int) = print_int; p(3490); // Call print_int via the pointer } ``` Notice how the type of `p` represents the return value and parameter types of `print_int`. It has to, or else C will complain about incompatible pointer types. One more example here shows how we might pass a pointer to a function as an argument to another function. We'll write a function that takes a couple integer arguments, plus a pointer to a function that operates on those two arguments. Then it prints the result. ``` {.c .numberLines} #include int add(int a, int b) { return a + b; } int mult(int a, int b) { return a * b; } void print_math(int (*op)(int, int), int x, int y) { int result = op(x, y); printf("%d\n", result); } int main(void) { print_math(add, 5, 7); // 12 print_math(mult, 5, 7); // 35 } ``` Take a moment to digest that. The idea here is that we're going to pass a pointer to a function to `print_math()`, and it's going to call that function to do some math. This way we can change the behavior of `print_math()` by passing another function into it. You can see we do that on lines 22-23 when we pass in pointers to functions `add` and `mult`, respectively. Now, on line 13, I think we can all agree the function signature of `print_math()` is a sight to behold. And, if you can believe it, this one is actually pretty straight-forward compared to some things you can construct^[The Go Programming Language drew its type declaration syntax inspiration from the opposite of what C does.]. But let's digest it. Turns out there are only three parameters, but they're a little hard to see: ``` {.c} // op x y // |-----------------| |---| |---| void print_math(int (*op)(int, int), int x, int y) ``` The first, `op`, is a pointer to a function that takes two `int`s as arguments and returns an `int`. This matches the signatures for both `add()` and `mult()`. The second and third, `x` and `y`, are just standard `int` parameters. Slowly and deliberately let your eyes play over the signature while you identify the working parts. One thing that always stands out for me is the sequence `(*op)(`, the parens and the asterisk. That's the giveaway it's a pointer to a function. Finally, jump back to the _Pointers II_ chapter for a pointer-to-function [example using the built-in `qsort()`](#qsort-example).