VLF Group compute cluster – Cluster organization

Cluster organization

The cluster is composed of a head node named nansen and two distinct sets of cluster nodes. The first set, numbered cluster001 through cluster016, has 8 cores and 32 GB of RAM per node. These nodes are connected to the head node and to each other through an InfiniBand network, and they are collected together under one job queue named batch. This queue is intended primarily for MPI jobs, since InfiniBand is well suited to the type of network traffic that MPI generates. Outgoing network connections are handled with NAT (network address translation) rather than real routing.

InfiniBand network

  • Network: 192.168.2.0
  • Netmask: 255.255.255.0
  • Broadcast: 192.168.2.255
  • Default route: 192.168.2.1 (through nansen)

The second set of cluster nodes, numbered cluster017 through cluster026, has 12 cores and 64 GB of RAM per node. These nodes make up the job queue named batchnew. This queue is intended primarily for single-threaded or multi-threaded jobs, such as typical C, MATLAB, or Python code, where communication latency and throughput between nodes are not a primary concern. The nodes are connected to each other through a gigabit Ethernet switch. Outgoing connections are also handled with NAT.
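The two queues can be summarized as a small lookup table. The following is only an illustrative sketch built from the figures given on this page: the names Queue, QUEUES, and pick_queue are hypothetical and are not part of any scheduler or cluster tooling, and the rule "MPI jobs go to batch, everything else goes to batchnew" is a simplification of the guidance above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Queue:
    """Hypothetical description of one job queue, using the figures on this page."""
    name: str
    first_node: int
    last_node: int
    cores_per_node: int
    ram_gb_per_node: int
    interconnect: str

    @property
    def node_count(self) -> int:
        return self.last_node - self.first_node + 1

    @property
    def total_cores(self) -> int:
        return self.node_count * self.cores_per_node

    @property
    def total_ram_gb(self) -> int:
        return self.node_count * self.ram_gb_per_node

    def node_names(self) -> list:
        # Node names follow the clusterNNN pattern, e.g. cluster001.
        return [f"cluster{n:03d}" for n in range(self.first_node, self.last_node + 1)]


QUEUES = {
    "batch": Queue("batch", 1, 16, 8, 32, "InfiniBand"),
    "batchnew": Queue("batchnew", 17, 26, 12, 64, "gigabit Ethernet"),
}


def pick_queue(uses_mpi: bool) -> Queue:
    """Assumed rule of thumb: MPI jobs use batch, other jobs use batchnew."""
    return QUEUES["batch"] if uses_mpi else QUEUES["batchnew"]


if __name__ == "__main__":
    for q in QUEUES.values():
        print(q.name, q.node_count, "nodes,", q.total_cores, "cores,",
              q.total_ram_gb, "GB RAM,", q.interconnect)
```

By this count the batch queue offers 16 nodes with 128 cores in total, and the batchnew queue offers 10 nodes with 120 cores in total.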

Gigabit Ethernet network

  • Network: 192.168.1.0
  • Netmask: 255.255.255.0
  • Broadcast: 192.168.1.255
  • Default route: 192.168.1.1 (through nansen)
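
The parameters listed for the two private networks can be cross-checked with Python's standard ipaddress module. This is a minimal sketch, not a cluster tool: the NETWORKS table and the describe function are hypothetical names introduced here, and the addresses are copied from the lists on this page.

```python
import ipaddress

# (network address, netmask, default route) as listed on this page.
NETWORKS = {
    "InfiniBand": ("192.168.2.0", "255.255.255.0", "192.168.2.1"),
    "gigabit Ethernet": ("192.168.1.0", "255.255.255.0", "192.168.1.1"),
}


def describe(network: str, netmask: str, gateway: str) -> dict:
    """Derive the broadcast address and check that the gateway lies in the subnet."""
    net = ipaddress.ip_network(f"{network}/{netmask}")
    gw = ipaddress.ip_address(gateway)
    return {
        "cidr": str(net),
        "broadcast": str(net.broadcast_address),
        # Network and broadcast addresses are not assignable to hosts.
        "usable_hosts": net.num_addresses - 2,
        "gateway_in_network": gw in net,
    }


if __name__ == "__main__":
    for label, params in NETWORKS.items():
        print(label, describe(*params))
```

Both networks are /24 subnets, so each has room for 254 host addresses, and each default route address falls inside its own subnet.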