TCP Keepalives with Java to keep your load balancer happy

Thursday, February 2nd, 2012 by Gary Richards - Categories: IPVS / LVS, Java, Linux

So…

We have an IPVS load balancer configured in Direct Routing mode. Because of this, we have the shiny Netfilter connection tracking support for IPVS turned off.

Anyhow, along come our real servers, server A and server B, each running an instance of application X.

Application Y on servers C, D and E talks to the virtual server for the above real servers and everything's happy.

No real rocket science happening here, a lot of people have been using IPVS without Netfilter connection tracking for years…

Server F comes along, also running application Y, but makes significantly longer requests to the virtual server. Requests that are made and then take a fairly substantial time to return any data.

Application Y on server F regularly logs socket read timeouts (after hitting the ridiculously high socket read timeout that application Y appears to be configured with by default).

Further investigation shows that server F (up until the point that the read timeout occurs) still thinks that the connection is established. Similarly, server A (the real server that this request happened to end up on) still thinks that the connection is established. The machine running IPVS however, has removed the tracked connection from its internal connection tracking.

I see this using ipvsadm:
# ipvsadm -Lnc
IPVS connection entries
pro expire state       source              virtual              destination
TCP 13:20  ESTABLISHED 192.168.130.8:37183 192.168.130.200:9700 192.168.130.10:9700

Notice the expire field? It seems that every time the virtual server receives a packet destined for this connection the expire timeout is reset. So what’s the default timeout? Watching the same ipvsadm output as seen above when a new connection is established suggests 15 mins 02 secs… I wonder if we can confirm that somehow?

# ipvsadm -L --timeout
Timeout (tcp tcpfin udp): 900 120 300

So we can (almost). TCP connection timeout is 900 seconds (aka 15 mins).
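To put the two outputs side by side, here is a rough sketch that turns the expire column into seconds and works out roughly how long the connection has been idle. It assumes the expire value is always MM:SS, as in the sample above; the class and method names are made up for illustration.

```java
// Sketch only: assumes the "expire" column of `ipvsadm -Lnc` is MM:SS.
class IpvsExpire {
    // Converts an expire value such as "13:20" into seconds.
    static int toSeconds(String expire) {
        int seconds = 0;
        for (String part : expire.trim().split(":")) {
            seconds = seconds * 60 + Integer.parseInt(part);
        }
        return seconds;
    }

    // Approximate idle time: configured TCP timeout minus what is left on the entry.
    static int idleSeconds(String expire, int tcpTimeoutSeconds) {
        return tcpTimeoutSeconds - toSeconds(expire);
    }

    public static void main(String[] args) {
        // Values taken from the ipvsadm output shown above.
        System.out.println("expire in seconds: " + toSeconds("13:20"));
        System.out.println("idle for roughly:  " + idleSeconds("13:20", 900));
    }
}
```

Since a brand new connection shows slightly more than 15:00, treat the idle figure as approximate (it can come out a couple of seconds negative).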

So the reason our connection breaks is due to this timeout removing the tracked connection from IPVS’s own connection tracking table. Excellent, we know what’s causing it… So how do we fix it?

Most people would simply up the TCP timeout, but I don’t like that idea. If something is going to hold onto a connection for longer than 15 mins and not send any data over that connection, shouldn’t it help me confirm that its connection is still really alive, rather than me blindly increasing what seems like a fairly reasonable (if not already too high) timeout?

Enter TCP Keepalive.

It seems that keepalive is an optional feature of TCP. Fortunately both Linux and Java (which is what both applications X and Y are written in) seem to support it. So can we test that it would solve our problem?

I wrote a simple Java class to connect to our virtual server above and to enable TCP keepalive on the socket:

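import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.Socket;

class SocketTest {
  public static void main(String[] args) {
    new SocketTest();
  }

  public SocketTest() {
    // try-with-resources closes the socket when we are done
    try (Socket sock = new Socket("virtual.server.name", 9700)) {
      // Ask the OS to send TCP keepalive probes on this connection
      sock.setKeepAlive(true);
      // BufferedReader replaces the deprecated DataInputStream.readLine()
      BufferedReader input = new BufferedReader(new InputStreamReader(sock.getInputStream()));
      // Block waiting for data so the connection just sits there idle
      input.readLine();
    } catch (Exception e) {
      System.out.println(e.toString());
    }
  }
}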

The important line being:
sock.setKeepAlive(true);
This tells our socket to send keepalives. Great, let’s try it out…
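If you want to convince yourself that the option actually sticks without pointing anything at the load balancer, something like the following sketch should do. It opens a throwaway connection to a listener on the loopback interface, flips SO_KEEPALIVE on and reads it back. The class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Sketch only: checks that SO_KEEPALIVE can be enabled on a local connection.
class KeepAliveCheck {
    static boolean keepAliveEnabledOnLoopback() {
        // Port 0 lets the OS pick a free port; the connection completes via the
        // listen backlog, so we never need to call accept().
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket client = new Socket(server.getInetAddress(), server.getLocalPort())) {
            client.setKeepAlive(true);
            // Read the option back from the underlying socket.
            return client.getKeepAlive();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("SO_KEEPALIVE enabled: " + keepAliveEnabledOnLoopback());
    }
}
```

This only confirms the flag is set on the socket; it says nothing about when the first probe will actually go out on the wire.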

Ok, it runs, it connects (I’m using tcpdump to confirm) and… oh, beyond the three-way TCP handshake I see nothing.

I wonder how often the TCP keepalive is sent when enabled? A quick trip in the Googlecopter suggested 2 hours.

Yep, 2 hours… So, I open a socket, enable tcp keepalive on that socket, then I need to wait 2 hours before the first keepalive is sent. That’s not fun is it! I must be able to change the value….

Scary, Java doesn’t seem to be able to do anything other than enable or disable the SO_KEEPALIVE socket option on the underlying socket. So what is the underlying socket in this instance?

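Ok, it’s native Linux socket code, so what/how can I configure those? Linux itself does allow per-socket overrides (the TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT socket options), but Java’s Socket API only exposes the SO_KEEPALIVE on/off switch, so I can’t reach them from my code. That explains why I can’t do it from Java, so how do I change it?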
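A note for anyone reading this on a newer JDK: from Java 11 onwards, jdk.net.ExtendedSocketOptions exposes TCP_KEEPIDLE, TCP_KEEPINTERVAL and TCP_KEEPCOUNT, so the timings can be set per socket where the platform supports them. That was not available when I was testing this. A minimal sketch, assuming JDK 11+ (the class and method names here are made up, and the 600/60/4 values are just examples):

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.Socket;
import jdk.net.ExtendedSocketOptions;

// Sketch only: per-socket keepalive timings, assuming JDK 11+ and OS support.
class PerSocketKeepAlive {
    // Returns true if the per-socket timings were applied, false if unsupported.
    static boolean tune(Socket sock, int idleSecs, int intervalSecs, int probeCount) {
        try {
            sock.setKeepAlive(true);
            if (!sock.supportedOptions().contains(ExtendedSocketOptions.TCP_KEEPIDLE)) {
                return false; // fall back to the system-wide settings
            }
            sock.setOption(ExtendedSocketOptions.TCP_KEEPIDLE, idleSecs);
            sock.setOption(ExtendedSocketOptions.TCP_KEEPINTERVAL, intervalSecs);
            sock.setOption(ExtendedSocketOptions.TCP_KEEPCOUNT, probeCount);
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Applies example timings to an unconnected socket and reads the idle time back.
    // Returns -1 when the platform does not support the option.
    static int demoIdleSeconds() {
        try (Socket sock = new Socket()) {
            if (!tune(sock, 600, 60, 4)) {
                return -1;
            }
            return sock.getOption(ExtendedSocketOptions.TCP_KEEPIDLE);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("TCP_KEEPIDLE now: " + demoIdleSeconds());
    }
}
```

The rest of this post sticks with the system-wide settings, which work regardless of the Java version.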

Enter /proc and the magic that is system-wide settings for things like this on a Linux box.


$ cat /proc/sys/net/ipv4/tcp_keepalive_time
7200
$ cat /proc/sys/net/ipv4/tcp_keepalive_intvl
75
$ cat /proc/sys/net/ipv4/tcp_keepalive_probes
9

Now we’re onto something. The time before a keepalive is sent is 7200 seconds (or 2 hours). If a response isn’t received then we send another in 75 seconds, then another, then another, 9 times. So if my connection is idle, it could take up to 7200 + (75*9) = 7875 seconds before a dead peer is noticed.
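The same sum as a small sketch, in case you want to run it against whatever your own box is set to. The /proc paths are the ones shown above; the class and method names are just for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

// Sketch only: worst-case time for keepalive to give up on an idle, dead connection.
class KeepaliveMath {
    // Idle time before the first probe, plus every unanswered probe interval.
    static int worstCaseSeconds(int keepaliveTime, int keepaliveIntvl, int keepaliveProbes) {
        return keepaliveTime + keepaliveIntvl * keepaliveProbes;
    }

    // Reads one integer setting from /proc/sys/net/ipv4/ (Linux only).
    static int readSetting(String name) throws IOException {
        byte[] raw = Files.readAllBytes(Paths.get("/proc/sys/net/ipv4/" + name));
        return Integer.parseInt(new String(raw).trim());
    }

    public static void main(String[] args) throws IOException {
        int time = readSetting("tcp_keepalive_time");
        int intvl = readSetting("tcp_keepalive_intvl");
        int probes = readSetting("tcp_keepalive_probes");
        System.out.println("worst case: " + worstCaseSeconds(time, intvl, probes) + " seconds");
    }
}
```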

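Obviously that’s far too long for our load balancer. So what happens if we tweak these values slightly?
If I set keepalive_time to 600 seconds (or 10 mins)
And I set keepalive_intvl to 60 seconds
And I set keepalive_probes to cover 240 seconds of probing (keepalive_probes is a count, not a time, so that works out to 4 probes at 60 seconds each)
(That totals 14 mins, just under the 900 second IPVS timeout.)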

What happens?

Lo and behold, every 10 mins, the underlying Linux socket code sends a TCP keepalive to the server. Almost forgetting about the actual IPVS problem, what happens there? As expected, the TCP keepalive causes the expire counter of each entry in the IPVS connection tracking table to reset, and my long-running idle connections are no longer lost.
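The rule of thumb that falls out of this is that the first keepalive has to go out before the IPVS entry expires. A tiny sketch of that check, using the numbers from this post (the class and method names are hypothetical):

```java
// Sketch only: does the first keepalive arrive before the IPVS entry expires?
class KeepaliveVsIpvs {
    // True when an idle connection sends a probe before IPVS drops its entry.
    static boolean keepsEntryAlive(int keepaliveTimeSecs, int ipvsTcpTimeoutSecs) {
        return keepaliveTimeSecs < ipvsTcpTimeoutSecs;
    }

    // Seconds to spare between the first probe and the IPVS expiry (negative = too late).
    static int marginSeconds(int keepaliveTimeSecs, int ipvsTcpTimeoutSecs) {
        return ipvsTcpTimeoutSecs - keepaliveTimeSecs;
    }

    public static void main(String[] args) {
        // Linux default (7200) against the IPVS TCP timeout (900), then the tuned value (600).
        System.out.println("default ok? " + keepsEntryAlive(7200, 900));
        System.out.println("tuned ok?   " + keepsEntryAlive(600, 900));
        System.out.println("tuned margin: " + marginSeconds(600, 900) + " seconds");
    }
}
```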

So I don’t have to increase the load balancer timeout after all! Win win for everyone I’d say ;)
