MSS in TCP

The maximum size of the packets that TCP sends can have a major impact on bandwidth, because it is more efficient to send the largest possible packet size on the network.

TCP controls this maximum size, known as the Maximum Segment Size (MSS), for each TCP connection. For directly attached networks, TCP computes the MSS by taking the MTU size of the network interface and subtracting the protocol headers to arrive at the size of the data in the TCP packet. For example, Ethernet with an MTU of 1500 would result in an MSS of 1460 after subtracting 20 bytes for the IPv4 header and 20 bytes for the TCP header.
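As a minimal sketch of that arithmetic in C (assuming IPv4 and a TCP header with no options, as in the example above):

/* MSS = interface MTU minus the IPv4 and TCP headers
 * (20 bytes each when no options are present). */
#define IPV4_HDR_LEN 20
#define TCP_HDR_LEN  20

static int mss_from_mtu(int mtu)
{
    return mtu - IPV4_HDR_LEN - TCP_HDR_LEN;   /* e.g. 1500 -> 1460 */
}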

The TCP protocol includes a mechanism for both ends of a connection to advertise the MSS to be used over the connection when it is established. Each end uses the OPTIONS field in the TCP header to advertise a proposed MSS. The MSS that is chosen is the smaller of the two advertised values. If one endpoint does not provide its MSS, then 536 bytes is assumed, which is bad for performance.
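On sockets-based systems, this advertised value can be capped and inspected through the TCP_MAXSEG socket option. A sketch, assuming a Linux-style environment (setting the option before connect() limits the MSS this end advertises; reading it after connect() returns the value in effect):

#include <netinet/in.h>
#include <netinet/tcp.h>   /* TCP_MAXSEG */
#include <sys/socket.h>

/* Sketch: cap the MSS we advertise, then read back the MSS
 * in effect once the connection is established. */
int set_and_get_mss(int sock, const struct sockaddr *peer, socklen_t peerlen)
{
    int mss = 1400;        /* example cap, below the Ethernet-derived 1460 */
    socklen_t len = sizeof(mss);

    setsockopt(sock, IPPROTO_TCP, TCP_MAXSEG, &mss, sizeof(mss));
    if (connect(sock, peer, peerlen) < 0)
        return -1;

    getsockopt(sock, IPPROTO_TCP, TCP_MAXSEG, &mss, &len);
    return mss;            /* effective MSS for this connection */
}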

The problem is that each TCP endpoint only knows the MTU of the network it is attached to; it does not know the MTU of any other networks that might sit between the two endpoints. So TCP only knows the correct MSS if both endpoints are on the same network. Therefore, TCP handles the advertising of the MSS differently depending on the network configuration, if it wants to avoid sending packets that would require IP fragmentation to cross networks with smaller MTUs.

The MSS value advertised by the TCP software during connection setup depends on whether the other end is a local system on the same physical network (that is, the systems have the same network number) or is on a different (remote) network.

MTU and MRU

http://www.networkers-online.com/blog/2016/03/understand-mtu-and-mru-the-full-story/

==========================

 

Understand MTU and MRU – The Full Story

MTU, or maximum transmission unit, is a topic that pops up every once in a while in different discussions. Although it is a simple concept, it causes a lot of confusion, especially for those who are new to the field. MTU typically becomes a concern during network changes, such as adding a new vendor's equipment or upgrading to new software. One reason for this is the difference in implementations between vendors, or even between different OS versions or equipment from the same vendor. Here is an example of such confusion: MTU and ping size confusion.

On the other hand, MRU, or maximum receive unit, which I was discussing today with a colleague, is not talked about as much, probably because it rarely pops up in problems or configuration requirements, and it typically matches the MTU by default, although it doesn't have to.

Let’s build up these concepts from scratch.

What is a data packet?

A packet is the single unit of data that is routed between a source and a destination on the network. Each packet contains information that helps devices route or switch it to its destination, plus the actual data, known as the payload.

What is MTU?


The maximum transmission unit (MTU) is the largest length of packet that can be transmitted out of an interface toward a destination. When the word MTU is used plainly, we are typically referring to the interface MTU; when talking about a protocol MTU (e.g., IP MTU, MPLS MTU), we are typically referring to the maximum payload of the protocol itself.

 

Whether the headers are included or not is an implementation detail that can vary from one box to another and from one OS to another, so it should always be tested, especially when operating in a multivendor environment.

 

We can't really understand how MTU plays a part in network operations without understanding the concept of path MTU.

Path MTU is the lowest MTU of any interface on the path between the source and the destination. Path MTU is a very important aspect because it has a huge impact on the overall performance of the network and the end-user experience.

The image below presents an analogy to clarify the path MTU concept. The height of each yellow rectangle represents the exit interface MTU of a router (don't confuse this with bandwidth). The packet in the diagram easily fits the MTU of the interfaces in the first segment, connecting routers A and B; in the second segment the MTU is smaller, and such a big packet doesn't fit through the yellow rectangle (interface MTU) in one chunk. Therefore the path MTU is the MTU of the second exit interface (the smallest), because the whole path can only pass packets that fit this MTU.

Path MTU Analogy
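A sketch of that definition in code, assuming the exit-interface MTUs along the path are already known (in reality they have to be discovered, as described below):

/* The path MTU is simply the minimum of the exit-interface
 * MTUs along the path. */
static int path_mtu(const int link_mtus[], int n_links)
{
    int min = link_mtus[0];
    for (int i = 1; i < n_links; i++) {
        if (link_mtus[i] < min) {
            min = link_mtus[i];
        }
    }
    return min;   /* e.g. {1500, 1400, 1500} -> 1400 */
}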

 

Default interface MTU values (source: Wikipedia):

Media                            | MTU (bytes)                                                          | Notes
Internet IPv4 path MTU           | at least 68, max of 64 KB                                            | Practical path MTUs are generally higher. Systems may use Path MTU Discovery to find the actual path MTU.
Internet IPv6 path MTU           | at least 1280, max of 64 KB (up to 4 GB with the optional jumbogram) | Practical path MTUs are generally higher. Systems must use Path MTU Discovery to find the actual path MTU.
Ethernet v2                      | 1500                                                                 | Nearly all IP-over-Ethernet implementations use the Ethernet v2 frame format.
Ethernet jumbo frames            | 1501–9198                                                            | The limit varies by vendor. For correct interoperation, the whole Ethernet network must have the same MTU. Jumbo frames are usually only seen in special-purpose networks.
PPPoE over Ethernet v2           | 1492                                                                 | Ethernet v2 MTU (1500) minus the PPPoE header (8).
PPPoE over Ethernet jumbo frames | 1493–9190                                                            | Ethernet jumbo frame MTU (1501–9198) minus the PPPoE header (8).

What happens if a packet size is bigger than path MTU?

 

If a host is sending packets that are bigger than the path MTU and IPv4 is in play, these packets will be fragmented if the Don't Fragment (DF) bit is not set. If they can't be fragmented, they will be dropped by the device processing them, and an ICMP "fragmentation needed" message will be sent to the source to warn it about the problem. Fragmentation is generally a bad thing; it increases network overhead, consumes router resources, and results in many unwanted side effects.

When the source receives "fragmentation needed" ICMP packets, it needs to lower its packet size to match, to avoid its packets being dropped by routers along the path.

If IPv6 is in play, routers do not fragment packets in transit, so such large packets will be dropped and an ICMPv6 "Packet Too Big" message will be sent to the source to inform it that it needs to lower the packet size to avoid the drops.

 

There are some mechanisms that are used solely to avoid these problems in the first place. Two of them are PMTUD and TCP MSS adjust.

 

What is path MTU discovery (PMTUD) ?

 

Path MTU discovery is a standardized mechanism used by end hosts to avoid fragmentation or packet drops. The basic idea is that the source host assumes the path MTU is equal to its exit interface MTU and sends all packets on the path with the DF bit set. If any packet is bigger than the path MTU, it will be dropped by a router along the path, and an ICMP message will be sent to the source to inform it that it needs to lower the packet size.

The host continues this process until it determines a suitable packet size, and keeps probing so it can detect any changes in the path; alternatively, it can clear the DF bit and allow the packets to be fragmented.

The process is pretty similar when using IPv6, with the difference that in-transit fragmentation is not allowed in IPv6 and there is no DF bit to set.
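As a concrete sketch of how an application opts into PMTUD, here is roughly what it looks like with the Linux-specific IP_MTU_DISCOVER and IP_MTU socket options (a platform assumption; other systems expose this differently):

#include <netinet/in.h>   /* IP_MTU_DISCOVER, IP_PMTUDISC_DO, IP_MTU (Linux) */
#include <sys/socket.h>

/* Sketch: force the DF bit via the kernel's PMTUD mode, then read
 * the path MTU the kernel has learned for this connected socket. */
int learn_path_mtu(int sock)
{
    int mode = IP_PMTUDISC_DO;   /* always set DF; never fragment locally */
    int mtu = 0;
    socklen_t len = sizeof(mtu);

    setsockopt(sock, IPPROTO_IP, IP_MTU_DISCOVER, &mode, sizeof(mode));

    /* ... send traffic; a send larger than the path MTU fails with
     * EMSGSIZE once an ICMP "fragmentation needed" message arrives ... */

    getsockopt(sock, IPPROTO_IP, IP_MTU, &mtu, &len);
    return mtu;   /* the kernel's current path MTU estimate for the peer */
}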

For full details refer to the following RFCs:

  1. RFC 1191, Path MTU Discovery, J. Mogul, S. Deering (November 1990)
  2. RFC 1981, Path MTU Discovery for IP version 6, J. McCann, S. Deering, J. Mogul (August 1996)

 

What is TCP MSS?

I wanted to touch briefly on the maximum segment size, known as TCP MSS.

TCP MSS is an option in the TCP header that is used independently by the two ends of a connection to advertise the maximum segment size that each host can accept on this connection. The maximum segment size is simply the maximum data payload that a TCP packet can carry on the connection.

This option can be manipulated by network operators using a feature known as TCP MSS adjust. The feature allows routers in the middle of the path, when configured to do so, to intercept and alter this value as a technique for avoiding the MTU problems mentioned above.

 

Lastly, MRU?

 

On the other hand, the maximum receive unit (MRU) is the largest packet size that an interface can receive, so it is an ingress interface parameter. In most cases the MRU equals the MTU, but that is not a requirement; you can configure different values for the MTU and MRU to achieve certain benefits.

 

What if received packets are bigger than the interface MRU?

 

If a device is receiving packets that are bigger than the interface MRU for some reason, the packets will be considered "too big" or oversized. Usually a counter will increment on the interface, and those packets will likely be dropped by the router's forwarding plane.

 

Can UDP use connect()?

A client application uses the CONNECT command to establish a connection between a local socket and a remote socket.

The command supports both blocking and nonblocking sockets. When the socket is in blocking mode, the function does not return until a connection with the remote peer is established or until an error is received. When the socket is in nonblocking mode, the function returns immediately with either a 36 EINPROGRESS return code or an error.

The CONNECT command performs differently depending on the socket type:

Stream (TCP) sockets
If the application has not already issued an explicit bind, the CONNECT command completes the bind of the socket. The API then attempts to establish a connection to the remote socket that is specified by the name parameter. You can call the CONNECT command only once. Issuing additional CONNECT commands results in a 56 EISCONN error.
Datagram (UDP) sockets
The CONNECT command enables an application to associate a socket with the socket name of a peer. The socket then is considered to be a connected UDP socket. You can call the CONNECT command multiple times with different peer names to change the socket association.
Rules:

  • Using the CONNECT command on a UDP socket does not change the UDP protocol from a connectionless to a connection-based protocol. The UDP socket remains connectionless. The primary benefit of using connected UDP sockets is to limit communication to a single remote application.
  • When a UDP socket becomes a connected UDP socket, it can no longer use the SENDTO and RECVFROM commands. Connected UDP sockets use the socket commands READ, WRITE, SEND, or RECV to communicate with the remote peer instead.
Tips:

  • For nonblocking sockets, use the SELECT command to determine when a connection has been established. Test for the ability to write to the socket.
  • A connected UDP socket can revert to an unconnected UDP socket by calling CONNECT with 0 or AF_UNSPEC specified in the domain field of the name parameter.

Format

>>-SOCKET--(--"CONNECT"--,--socketid--,--name--)---------------><

Parameters

socketid
The descriptor of the local socket.
name
Identifies the remote socket.

The format for the name parameter depends on the socket type:

AF_INET sockets (IPv4)
name = "domain portid ipaddress"
AF_INET6 sockets (IPv6)
name = "domain portid flowinfo ipaddress scopeid"

where

  • The domain value is the decimal number 2 for AF_INET and the decimal number 19 for AF_INET6.
  • The portid value is the port number.
  • The ipaddress value is the IP address of the remote host. It must be an IPv4 address for AF_INET and an IPv6 address for AF_INET6.
  • The flowinfo value must be 0.
  • The scopeid value identifies the interfaces that are applicable for the scope of the address that is specified in the ipaddress field. For a link-local IP address, the scopeid field can specify a link index, which identifies a set of interfaces. For all other scopes, the scopeid field must be set to 0. Setting the scopeid field to 0 indicates that any address type and scope can be specified.
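The same connected-UDP behavior is available in the C sockets API. A minimal sketch (the peer address and port here are made-up examples):

#include <arpa/inet.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Sketch: associate a UDP socket with one peer, exchange data with
 * send()/recv(), then dissolve the association via AF_UNSPEC. */
int connected_udp_demo(void)
{
    int sock = socket(AF_INET, SOCK_DGRAM, 0);
    struct sockaddr_in peer;

    memset(&peer, 0, sizeof(peer));
    peer.sin_family = AF_INET;
    peer.sin_port   = htons(9999);                    /* hypothetical port */
    inet_pton(AF_INET, "192.0.2.1", &peer.sin_addr);  /* example address */

    /* UDP remains connectionless; connect() just fixes the peer. */
    connect(sock, (struct sockaddr *)&peer, sizeof(peer));
    send(sock, "ping", 4, 0);   /* no destination argument needed now */

    /* Dissolve the association by reconnecting with AF_UNSPEC. */
    peer.sin_family = AF_UNSPEC;
    connect(sock, (struct sockaddr *)&peer, sizeof(peer));

    close(sock);
    return 0;
}

A side benefit of connecting a UDP socket is that asynchronous ICMP errors (such as port unreachable) are reported back to the application on subsequent socket calls.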

==================================================

https://www.ibm.com/support/knowledgecenter/en/SSLTBW_2.3.0/com.ibm.zos.v2r3.hala001/rexx_connect_r.htm

Can volatile and const be used together in C?

This is possible, and it is mostly used in embedded systems. An example is an interrupt status register. Because it is a status register, the program should not modify this variable, so it should be const. But the variable can be changed by the processor or hardware based on the interrupt condition, so when the program reads its value, it must read the actual value without any optimization. For this reason, the variable should also be declared volatile.

 

======================

In an embedded system, this is typically used to access hardware registers that can be read and are updated by the hardware, but that make no sense to write to (or might be an error to write to).

An example might be the status register for a serial port. Various bits will indicate if a character is waiting to be read or if the transmit register is ready to accept a new character (i.e., it's empty). Each read of this status register could result in a different value, depending on what else has occurred in the serial port hardware.

It makes no sense to write to the status register (depending on the particular hardware spec), but you need to make sure that each read of the register results in an actual read of the hardware – using a cached value from a previous read won’t tell you about changes in the hardware state.

A quick example:

unsigned int const volatile *status_reg; // assume these are assigned to point to the 
unsigned char const volatile *recv_reg;  //   correct hardware addresses


#define UART_CHAR_READY 0x00000001

int get_next_char()
{
    while ((*status_reg & UART_CHAR_READY) == 0) {
        // do nothing but spin
    }

    return *recv_reg;
}

 

Volatile Keyword Explanation

https://barrgroup.com/Embedded-Systems/How-To/C-Volatile-Keyword

 

The proper use of C's volatile keyword is poorly understood by many programmers. This is not surprising, as most C texts dismiss it in a sentence or two. This article will teach you the proper way to use it.

Have you experienced any of the following in your C or C++ embedded code?

  • Code that works fine–until you enable compiler optimizations
  • Code that works fine–until interrupts are enabled
  • Flaky hardware drivers
  • RTOS tasks that work fine in isolation–until some other task is spawned

If you answered yes to any of the above, it’s likely that you didn’t use the C keyword volatile. You aren’t alone. The use of volatile is poorly understood by many programmers. Unfortunately, most books about the C programming language dismiss volatile in a sentence or two.

[Proper use of volatile is part of the bug-killing Embedded C Coding Standard.]

C’s volatile keyword is a qualifier that is applied to a variable when it is declared. It tells the compiler that the value of the variable may change at any time–without any action being taken by the code the compiler finds nearby. The implications of this are quite serious. However, before we examine them, let’s take a look at the syntax.

Syntax of C’s volatile Keyword

To declare a variable volatile, include the keyword volatile before or after the data type in the variable definition. For instance, both of these declarations declare an unsigned 16-bit integer variable to be volatile:

volatile uint16_t x; 
uint16_t volatile y;

Now, it turns out that pointers to volatile variables are very common, especially with memory-mapped I/O registers. Both of these declarations declare p_reg to be a pointer to a volatile unsigned 8-bit integer:

volatile uint8_t * p_reg; 
uint8_t volatile * p_reg;

Volatile pointers to non-volatile data are very rare (I think I’ve used them once), but I’d better go ahead and give you the syntax:

uint16_t * volatile p_x;

And just for completeness, if you really must have a volatile pointer to a volatile variable, you’d write:

uint16_t volatile * volatile p_y;

Incidentally, for a great explanation of why you have a choice of where to place volatile and why you should place it after the data type (for example, int volatile * foo), read Dan Saks' column "Top-Level cv-Qualifiers in Function Parameters" (Embedded Systems Programming, February 2000, p. 63).

Finally, if you apply volatile to a struct or union, the entire contents of the struct or union are volatile. If you don’t want this behavior, you can apply the volatile qualifier to the individual members of the struct or union.
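A short sketch of that difference, using a hypothetical UART register block (the layout is made up for illustration):

#include <stdint.h>

/* Entire struct qualified: every member access is a real read/write. */
struct uart_regs
{
    uint8_t status;
    uint8_t data;
};

volatile struct uart_regs * p_uart = (volatile struct uart_regs *) 0x4000;

/* Only the status member is volatile; accesses to data may be optimized. */
struct uart_regs2
{
    uint8_t volatile status;
    uint8_t          data;
};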

Proper Use of C’s volatile Keyword

A variable should be declared volatile whenever its value could change unexpectedly. In practice, only three types of variables fit that description:

1. Memory-mapped peripheral registers

2. Global variables modified by an interrupt service routine

3. Global variables accessed by multiple tasks within a multi-threaded application

We’ll talk about each of these cases in the sections that follow.

Peripheral Registers

Embedded systems contain real hardware, usually with sophisticated peripherals. These peripherals contain registers whose values may change asynchronously to the program flow. As a very simple example, consider an 8-bit status register that is memory mapped at address 0x1234. It is required that you poll the status register until it becomes non-zero. The naive and incorrect implementation is as follows:

uint8_t * p_reg = (uint8_t *) 0x1234;

// Wait for register to read non-zero 
do { ... } while (0 == *p_reg);

This code will almost certainly fail as soon as you turn compiler optimization on. That's because the compiler will generate assembly language (shown here for a 16-bit x86 processor) that looks something like this:

  mov p_reg, #0x1234
  mov a, @p_reg
loop:
  ...
  bz loop

The rationale of the optimizer is quite simple: having already read the variable’s value into the accumulator (on the second line of assembly), there is no need to reread it, since the value will (duh!) always be the same. Thus, from the third line of assembly, we enter an infinite loop. To force the compiler to do what we want, we should modify the declaration to:

uint8_t volatile * p_reg = (uint8_t volatile *) 0x1234;

The assembly language now looks like this:

  mov p_reg, #0x1234
loop:
  ...
  mov a, @p_reg
  bz loop

The desired behavior is thus achieved.

Subtler sorts of bugs tend to arise when registers with special properties are accessed without volatile declarations. For instance, a lot of peripherals contain registers that are cleared simply by reading them. Extra (or fewer) reads than you intend could result in quite unexpected behavior in these cases.
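For example, with a clear-on-read status register you typically want exactly one hardware read per poll. You can make that explicit by latching the value into a local variable; a sketch (the register address and bit masks are made up):

#include <stdint.h>

#define ERR_PARITY  0x01
#define ERR_OVERRUN 0x02

uint8_t volatile * const p_status = (uint8_t volatile *) 0x1236;  /* hypothetical */

void check_errors(void)
{
    uint8_t status = *p_status;   /* exactly one read; the register clears on read */

    /* Test the latched copy, not the register, so we don't trigger
     * a second (state-destroying) hardware read. */
    if (status & ERR_PARITY)  { /* handle parity error */ }
    if (status & ERR_OVERRUN) { /* handle overrun error */ }
}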

Interrupt Service Routines

Interrupt service routines often set variables that are tested in mainline code. For example, a serial port interrupt may test each received character to see if it is an ETX character (presumably signifying the end of a message). If the character is an ETX, the ISR might set a global flag. An incorrect implementation of this might be:

bool gb_etx_found = false;

void main() 
{
    ... 
    while (!gb_etx_found) 
    {
        // Wait
    } 
    ...
}

interrupt void rx_isr(void) 
{
    ... 
    if (ETX == rx_char) 
    {
        gb_etx_found = true;
    } 
    ...
}

[NOTE: We’re not advocating use of global variables; this code uses one to keep the example short/clear.]

With compiler optimization turned off, this program might work. However, any half-decent optimizer will "break" the program. The problem is that the compiler has no idea that gb_etx_found can be changed within the ISR function, which never appears to be called.

As far as the compiler is concerned, the expression !gb_etx_found will have the same result every time through the loop; therefore, you must not ever want to exit the while loop. Consequently, all the code after the while loop may simply be removed by the optimizer. If you are lucky, your compiler will warn you about this. If you are unlucky (or you haven't yet learned to take compiler warnings seriously), your code will fail miserably. Naturally, the blame will be placed on a "lousy optimizer."

The solution is to declare the variable gb_etx_found to be volatile. After that change, this program will work as you intended.
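In other words, the fix is a one-line change to the declaration:

bool volatile gb_etx_found = false;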

Multithreaded Applications

Despite the presence of queues, pipes, and other scheduler-aware communication mechanisms in real-time operating systems, it is still possible that RTOS tasks will exchange information via a shared memory location (i.e., global storage). When you add a preemptive scheduler to your code, your compiler has no idea what a context switch is or when one might occur. A task asynchronously modifying a shared global is thus conceptually the same as the ISR scenario discussed above, so all shared global objects (variables, memory buffers, hardware registers, etc.) must also be declared volatile to prevent compiler optimization from introducing unexpected behaviors. For example, this code is asking for trouble:

uint8_t gn_bluetask_runs = 0;

void red_task (void) 
{   
    while (gn_bluetask_runs < 4) 
    {
        ...
    } 
    // Exit after 4 iterations of blue_task.
}

void blue_task (void) 
{
    for (;;)
    {
        ...
        gn_bluetask_runs++;
        ...
    }
}

This program will likely fail once the compiler’s optimizer is enabled. Declaring gn_bluetask_runs with volatile is the proper way to solve the problem.

[NOTE: We’re not advocating use of global variables; this code uses a global because it is explaining a relationship between volatile and global variables.]

[WARNING: Global variables shared by tasks and ISRs will also need to be protected against race conditions, e.g. by a mutex.]
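Note that volatile only forces real loads and stores; it does nothing to make a compound operation such as gn_bluetask_runs++ atomic. A minimal sketch of one way to protect the read-modify-write, assuming hypothetical disable_interrupts()/enable_interrupts() helpers (an RTOS mutex would serve the same purpose):

#include <stdint.h>

uint8_t volatile gn_bluetask_runs = 0;

/* Hypothetical platform-specific helpers (assumptions, not a real API). */
extern void disable_interrupts(void);
extern void enable_interrupts(void);

void increment_run_count(void)
{
    disable_interrupts();   /* enter critical section */
    gn_bluetask_runs++;     /* read-modify-write can no longer be interrupted */
    enable_interrupts();    /* leave critical section */
}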

Final Thoughts

Some compilers allow you to implicitly declare all variables as volatile. Resist this temptation, since it is essentially a substitute for thought. It also leads to potentially less efficient code.

Also, resist the temptation to blame the optimizer or turn it off when you encounter unexpected program behavior. Modern C/C++ optimizers are so good that I cannot remember the last time I came across an optimization bug. In contrast, I regularly come across failures by programmers to use volatile.

If you are given a piece of flaky code to “fix,” perform a grep for volatile. If grep comes up empty, the examples given here are probably good places to start looking for problems.