VServer IP Setup (A Journey with Bash)

0. The Players/Setup

0.1 Variables

  the vserver script knows and uses the following shell
  variables when it comes to deciding how to set up network
  interfaces and ipv4root:

    IPROOT, IPROOTDEV, IPROOTMASK, IPROOTBCAST, and NODEV

  while you might already know the first four, the last one
  will probably sound unfamiliar, as it can't be specified
  in the .conf file yet[1]. NODEV is set by the --nodev
  option passed to the script. it disables the network alias
  setup, but not the chbind, which might give unexpected
  results.

0.2 The Pieces

  a look at the script shows that the actual work is broken
  down into smaller pieces, designed according to 'divide
  et impera' (divide and conquer)

  - ifconfig_iproot() will set up the aliases
  - setipopt() will generate the --ip list

  as those are different parts of the script, they are
  described independently of each other.

1. The Network Aliases

1.1 Basic Requirements

  the following is required to execute the code creating
  an alias of an existing interface:

  - NODEV is not set (--nodev option not given)
  - IPROOT is not empty and does not have the values
    IPROOT="0.0.0.0" or IPROOT="ALL"

1.2 For each entry ...

  then for each entry in IPROOT (entries are separated by
  spaces, as in every good shell script) the following
  assignments and checks are done ...

  - is there a ':' in this entry, if
    o YES, then the left part is the <dev>
    o NO,  then the <dev> is IPROOTDEV

  - is there a '/' in this entry, if
    o YES, then the right part is the <mask>
    o NO,  then the <mask> is IPROOTMASK

  whatever remains of the entry is the <ip>. after those
  checks, we can assume that <dev>, <ip> and <mask> were
  either specified or assigned the 'default' values (which
  might be empty).

1.3 Is there a Device?

  now the script checks whether <dev> is non empty (which
  means either IPROOTDEV or the device part of this entry
  wasn't empty), and if so, does the following (if not, it
  continues with the next entry):

  - if a vlan device was specified (<dev>.<vlan>) some vlan
    setup (vconfig add ...) is done and a fake base address
    (127.0.0.1) is assigned.

  the next step is very interesting, as it introduces some
  non-deterministic behaviour, due to an uninitialized pair
  of variables[2] ...

1.4 Determinism? Sometimes!

  the values collected up to now for <dev>, <ip> and <mask>
  are completed by a <bcast>, which is set to IPROOTBCAST
  (which can be empty), and fed into a (C++/C) tool called
  'ifspec', which aims to give the 'device specification'
  in a 'shell usable' way ...

  called like this

    # ifspec <dev> <ip> <mask> <bcast>

  it basically does the following:

  - if <ip> is non empty, output ADDR=<ip>
  - otherwise try to get the ipaddr from the interface
    (via SIOCGIFADDR), and if successful, output that in
    the same format.

  the same is then done for <mask> with
  NETMASK=/SIOCGIFNETMASK and for <bcast> with
  BCAST=/SIOCGIFBRDADDR, except for one detail: if the
  <bcast> isn't specified and can't be retrieved from the
  kernel, the tool tries to compute the value in the
  following manner:

    <bcast> = (<ip> & <mask>) | ~<mask>

  which is perfectly right if both <ip> and <mask> have
  either been specified or returned by the kernel, and
  interesting[2] if not ...

1.5 Information Feedback

  the next step is simple: the generated output of ifspec
  is evaluated, and IPROOTMASK and IPROOTBCAST are updated
  to the reported values ...

  the actual interface alias is created with

    ifconfig <dev>:<name> <ip> \
        netmask $IPROOTMASK broadcast $IPROOTBCAST

  after that, the next entry is processed.

1.6 Summary so far

  - a device/mask specified in an entry has priority over
    the 'defaults' IPROOTDEV/IPROOTMASK.
  - the mask/bcast is 'calculated' or 'retrieved' from the
    kernel if not specified via entry or
    IPROOTMASK/IPROOTBCAST
  - an entry has this format:

      [<dev>:]<ip>[/<mask>]

    where <ip> and <mask> have to be in Dotted Quad Octet
    notation (aaa.bbb.ccc.ddd)

2. The IPV4Root

2.1 The List of IPs

  when it comes to the ipv4root setup, only the entries in
  IPROOT matter, while IPROOTMASK is silently ignored
  (IPROOTDEV and IPROOTBCAST are not relevant)

  - if IPROOT is empty, then "0.0.0.0" is used instead
  - if IPROOT="ALL", then a tool called 'listdevip'
    generates a list of all configured IPs (including
    lo/127.0.0.1 and whatever else is configured)
  - otherwise for each entry the optional <dev>: part is
    removed and --ip is prepended ...

2.2 The chbind tool

  the sequence of --ip <ip>[/<mask>] pairs generated in the
  previous step is passed to the chbind tool as arguments,
  actually restricting the environment to those addresses
  (max 16 for now)[3]

  the funny part here is that the tool is capable of
  understanding /XX netmasks, where the shell script and
  ifconfig are not, which can give some funny results like
  'Invalid IP number or netmask:'

2.3 Conclusions here

  - don't rely on the IPROOTMASK fallback
  - don't use /XX masks, unless you know what you are
    doing (see Dirty Tricks)
  - do not specify more than the allowed number of IPs
  - be careful with an empty IPROOT or IPROOT="ALL"

3. Dirty Tricks (Examples)

3.1 The existing Interface

    eth0  Link encap:Ethernet  HWaddr 52:54:00:12:34:56
          inet addr:192.168.0.1  Bcast:192.168.0.255  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

    lo    Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          UP LOOPBACK RUNNING  MTU:16436  Metric:1

  a) you want to allow one IP, for example 192.168.0.1, for
     a vserver, but don't want the script to set up an
     alias ...

       IPROOT="192.168.0.1"

     - because IPROOTDEV isn't specified, and the entry
       doesn't contain an interface, <dev> is empty and no
       alias is created.

  b) you want to allow more than one IP, but no aliases

       IPROOT="192.168.0.1 127.0.0.1"

3.2 For a specific device

    xyz0  Link encap:Ethernet  HWaddr 00:00:00:00:00:00
          inet addr:10.0.0.1  Bcast:10.0.255.255  Mask:255.255.0.0
          UP BROADCAST RUNNING NOARP  MTU:1500  Metric:1

  a) you want to add one alias for this interface

       IPROOT="xyz0:10.0.1.1"

     (the alias will be xyz0:<name>)

  b) you want to add more than one alias on the same
     interface/network

       IPROOT="xyz0:10.0.1.1 xyz0:10.0.1.2"

     (the aliases will be xyz0:<name> and xyz0:<name>1)

3.3 For a specific network

  a) an additional alias for eth0, with ip 172.16.0.1/20

       IPROOT="eth0:172.16.0.1/255.255.240.0"

4. Error Messages

4.1 From ifconfig (Aliasing)

  a) broadcast: Unknown host
     SIOCSIFADDR: Invalid argument

     - those are usually caused by an uninitialized
       interface, which results in funny values for the
       ifconfig statement
     + make sure that the <dev> you want to create an
       alias for is up

  b) SIOCSIFADDR: No such device
     eth2:ZZZZ: ERROR while getting interface flags: No such device

     - this is a good sign that you specified an interface
       which doesn't exist ...

  c) SIOCSIFNETMASK: Invalid argument
     Invalid IP number or netmask: 24

     - a netmask was specified in the /XX format, which
       cannot be handled correctly by the script (for now)

4.2 From chbind

  a) Segmentation fault

     - you managed to call 'chbind --ip'; please share the
       knowledge of how you did it

  b) Invalid netmask: 256.1.1.2

     - obviously the netmask is wrong

  c) Invalid IP number or host name: 256.0.0.1

     - obviously the ip/hostname is wrong

5. And the Future?

5.1 Useful Enhancements

  - extend the script to actually understand /XX netmasks,
    and convert them for ifconfig
  - add an option to display the actual ifconfig statements
    and IPOPT lists (would avoid a lot of questions)
  - fix the ifspec bug, and do some sanity checks regarding
    netmasks and interfaces ...
  - add a 'cleanup' option to 'remove' the aliases and
    mounts done by an 'enter' on a stopped vserver.
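
  the first enhancement (understanding /XX netmasks) could
  look roughly like the sketch below. this is only an
  illustration and not code from the vserver script: the
  function names prefix_to_mask and normalize_entry are made
  up here, and the sketch assumes bash with 64 bit shell
  arithmetic.

```bash
#!/bin/bash
# hypothetical helpers, NOT part of the vserver script

# prefix_to_mask <bits>
#   prints the dotted quad netmask for a /XX prefix length,
#   returns 1 if <bits> is not an integer between 0 and 32
prefix_to_mask() {
    local bits=$1 mask
    case $bits in
        ''|*[!0-9]*) return 1 ;;        # empty or not a number
    esac
    bits=$((10#$bits))                  # force base 10 (e.g. "08")
    [ "$bits" -le 32 ] || return 1
    # set the upper <bits> bits of a 32 bit word
    mask=$(( (0xffffffff << (32 - bits)) & 0xffffffff ))
    echo "$(( (mask >> 24) & 255 )).$(( (mask >> 16) & 255 )).$(( (mask >> 8) & 255 )).$(( mask & 255 ))"
}

# normalize_entry <entry>
#   rewrites an IPROOT entry with a /XX mask so that it
#   carries a dotted quad mask; entries without a mask or
#   with a dotted quad mask are printed unchanged
normalize_entry() {
    local entry=$1 mask
    case $entry in
        */*.*) echo "$entry" ;;         # mask is already dotted quad
        */*)   mask=$(prefix_to_mask "${entry##*/}") || return 1
               echo "${entry%/*}/$mask" ;;
        *)     echo "$entry" ;;         # no mask given
    esac
}
```

  with these helpers, normalize_entry "eth0:172.16.0.1/20"
  prints eth0:172.16.0.1/255.255.240.0, which is the entry
  used in 3.3 a). a wrong prefix length such as /33 makes
  both functions return a non zero status instead of
  handing a funny value to ifconfig.
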

5.2 Internal Changes

  - use iproute2 instead of ifconfig
  - check the interface name length, to avoid collisions[4]

6. The End

[1] it could be useful to specify this on a 'profile' basis;
    this way, a test profile could leave out the network
    stuff ...

[2] struct {
        unsigned long addr;
        unsigned long mask;
    } solved;

    (jack, enrico, please fix this in both branches)

[3] this can be changed in the kernel, but the ip comparison
    is linear, so each packet will be checked against all
    addresses ...

[4] actually the max length of a network interface name is
    determined by

      #define IFNAMSIZ 16

    which means 15 chars and one zero, so the usual eth0
    alias 'eth0:abcdefghij' will have 10 chars left for the
    vserver name and the suffix. if the name is longer than
    9 chars, the suffix will be ignored (which gives nice
    misconfigurations)

(C) 2003 Herbert Pötzl
-------------------------------------------
Permission is granted to copy, distribute and/or modify this
document under the terms of the GNU Free Documentation
License, Version 1.2 or any later version published by the
Free Software Foundation.