Sunday, April 29, 2012

vSphere 5 Host Network Design - 10GbE vDS Design



This design represents the highest-performance, most redundant and also most costly option for a vSphere 5 environment. It is entirely feasible to lose three of the four uplink paths and keep running without interruption, and most likely without any noticeable performance impact either. If you are looking for a bullet-proof and highly scalable configuration within the data centre, this is a great way to go.
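
To put a rough number on that failure tolerance, here is a minimal sketch in plain Python that walks through the failure scenarios for a single host. The per-traffic-type demand figures are purely illustrative assumptions, not measurements from this design:

    # Hypothetical check: does a host's steady-state traffic still fit as uplinks fail?
    # All demand figures below are illustrative assumptions, not measured values.
    UPLINK_GBPS = 10
    TOTAL_UPLINKS = 4

    demand_gbps = {
        "Management": 0.1,
        "vMotion": 2.0,
        "FT": 1.0,
        "Storage": 3.0,
        "VM networking": 2.5,
    }
    total_demand = sum(demand_gbps.values())

    for failed in range(TOTAL_UPLINKS):
        capacity = (TOTAL_UPLINKS - failed) * UPLINK_GBPS
        status = "OK" if total_demand <= capacity else "CONGESTED"
        print(f"{failed} uplink(s) failed: {capacity} Gbps available "
              f"for {total_demand:.1f} Gbps of demand -> {status}")

Even with three uplinks down, the single surviving 10GbE path comfortably carries this assumed steady-state load, which is why the design can tolerate the loss without interruption.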

The physical switch configuration might be slightly confusing to look at without explanation. Essentially what we have here are four Nexus 2000 series switches that are uplinked into two Nexus 5000 series switches. The green uplink ports in the design show that each 2K expansion switch has 40GbE of uplink capacity to the 5Ks. Layer 3 routing daughter cards are installed in the Nexus 5Ks, so traffic can be routed within the switched environment instead of going out to an external router. In other words, traffic from a host travels up through a 2K, hits a 5K and then comes back down where required. It isn't apparent from the design picture, but keep-alive traffic runs between the console ports of the two 5K switches.
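
To get a feel for the oversubscription at the 2K layer, here is a small back-of-the-envelope calculation. The 40GbE of uplink comes from the design above, while the host-facing port count is an assumed figure for illustration only:

    # Rough worst-case oversubscription estimate for one 2K expansion switch.
    # 40 Gbps of uplink is taken from the design; 32 host-facing 10GbE ports
    # is an assumed figure for illustration.
    UPLINK_GBPS = 40
    HOST_PORTS = 32
    HOST_PORT_GBPS = 10

    host_capacity = HOST_PORTS * HOST_PORT_GBPS
    ratio = host_capacity / UPLINK_GBPS
    print(f"Worst-case oversubscription: {ratio:.0f}:1 "
          f"({host_capacity} Gbps of host ports over {UPLINK_GBPS} Gbps of uplink)")

In practice the ratio only matters when many hosts transmit at line rate simultaneously, which day-to-day traffic rarely does.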

It is assumed that each host has four 10GbE NICs provided by 2 x dual-port PCIe expansion cards. All NICs are assigned to a single virtual Distributed Switch (vDS) and bandwidth control is performed using Load Based Teaming (LBT) in conjunction with Network IO Control (NIOC) and Storage IO Control (SIOC). A good write-up on how to configure NIOC shares can be found on the VMware Networking Blog; while that information is specific to 2 x 10GbE uplinks, it also holds true when using four 10GbE connections. LBT is a teaming policy that is only available on the vDS.
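
As a quick illustration of how NIOC shares translate into bandwidth, the sketch below divides a single contended 10GbE uplink according to a set of share values. The share numbers are assumptions for illustration only, not the exact values from the VMware Networking Blog post:

    # Minimal sketch: NIOC shares only bite when an uplink is contended, at which
    # point each traffic type is guaranteed its proportional slice of the link.
    # Share values below are illustrative assumptions.
    UPLINK_GBPS = 10

    shares = {
        "Management": 10,
        "vMotion": 25,
        "FT": 25,
        "iSCSI/NFS": 50,
        "VM traffic": 50,
    }
    total_shares = sum(shares.values())

    for traffic, share in shares.items():
        guaranteed = UPLINK_GBPS * share / total_shares
        print(f"{traffic:12s} {share:3d} shares -> ~{guaranteed:.2f} Gbps minimum under contention")

When the uplink is not congested, any traffic type can burst well beyond its guaranteed slice; the shares simply define the floor each type falls back to under contention.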

LACP is not used, as it would not be a good design choice for this configuration; there are very few implementations where LACP/EtherChannel would be valid. For a comprehensive write-up on the reasons why, please check out this blog post. A valid use case for LACP could be made when using the Nexus 1000V, as LBT is not available on that type of switch.

In order to gain the performance increase of Jumbo Frames for the storage layer, all networking components need to have Jumbo Frames enabled. This requires end-to-end configuration from the hosts, through the network and on to the storage arrays. There is a definite performance increase from incorporating Jumbo Frames, and this is outlined in the following blog post. It is important to note that enabling Jumbo Frames on the single switch allows all traffic to transmit at 9000 MTU. This means that Management, vMotion, FT and storage traffic will all use Jumbo Frames. VMs will not use Jumbo Frames unless this feature is enabled on the network adapter inside the guest OS of the VM.
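
Because the benefit evaporates (and storage connections can break) the moment a single hop is left at 1500 MTU, it is worth validating the path explicitly. The sketch below is a simple illustration of that end-to-end check; the hop names and MTU values are hypothetical:

    # Sketch of an end-to-end jumbo frame sanity check: every hop between the
    # VMkernel port and the array must support at least the target MTU.
    # Hop names and MTU values are hypothetical.
    TARGET_MTU = 9000

    path_mtus = {
        "vmkernel port (vmk1)": 9000,
        "vDS dvUplink": 9000,
        "Nexus 2K host port": 9216,
        "Nexus 5K fabric": 9216,
        "storage array port": 9000,
    }

    bottleneck = min(path_mtus, key=path_mtus.get)
    if path_mtus[bottleneck] >= TARGET_MTU:
        print(f"Jumbo frames OK end to end (smallest MTU {path_mtus[bottleneck]} at {bottleneck})")
    else:
        print(f"Jumbo frames will fail: {bottleneck} only supports MTU {path_mtus[bottleneck]}")

On the hosts themselves the usual real-world test is vmkping with the don't-fragment flag and a payload just under 9000 bytes (for example vmkping -d -s 8972 <array IP>) from each VMkernel port.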

Trunking needs to be configured on all physical switch to ESXi host uplinks to allow all VLAN traffic, including Management, vMotion, FT, VM networking and storage. Trunking at the physical switch enables the definition of multiple allowed VLANs at the virtual switch layer. All VLANs used must be able to traverse all uplinks simultaneously.
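
A simple way to reason about the trunk requirement is to check that every VLAN referenced by a distributed port group appears in the allowed list of every uplink trunk. The sketch below illustrates that check with hypothetical VLAN IDs, port group names and switch port names:

    # Minimal sketch: confirm every port group VLAN is allowed on every trunk.
    # VLAN IDs, port group names and switch port names are hypothetical.
    portgroup_vlans = {
        "Management": 10,
        "vMotion": 20,
        "FT": 30,
        "iSCSI-A": 40,
        "iSCSI-B": 41,
        "VM-Production": 100,
    }

    trunk_allowed_vlans = {
        "2K-A eth1/1": {10, 20, 30, 40, 41, 100},
        "2K-B eth1/1": {10, 20, 30, 40, 41, 100},
        "2K-C eth1/1": {10, 20, 30, 40, 100},      # VLAN 41 missing
        "2K-D eth1/1": {10, 20, 30, 40, 41, 100},
    }

    for trunk, allowed in trunk_allowed_vlans.items():
        missing = sorted(vid for vid in portgroup_vlans.values() if vid not in allowed)
        print(f"{trunk}: " + (f"missing VLANs {missing}" if missing else "all port group VLANs allowed"))

If any uplink is missing a VLAN, LBT can move a port group's traffic onto that uplink and silently black-hole it, which is why every VLAN must be allowed on every trunk.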

When running Cisco equipment there is the potential to use the Rapid Spanning Tree Protocol (802.1w) standard. This means there is no requirement to configure trunk ports with PortFast or to disable STP, as the physical switches will automatically identify these ports correctly. If you are running any other type of equipment, the safest option would probably be to disable STP and enable PortFast (or the vendor's equivalent edge-port setting) on each trunk port, but please refer to the switch manufacturer's documentation for confirmation.

Running vCenter and the vCenter database on the same cluster that they manage creates a dangerous circular dependency. It is therefore strongly recommended to give the environment a dedicated management cluster for vCenter and other high-level VMs, with the management cluster using virtual standard switches (vSS). One alternative to a dedicated management cluster is to run vCenter and its database on physical servers outside of vSphere.


*** Updates ***

05/05/2012 - Minor update to Jumbo Frames paragraph. Thanks to Eric Singer for his observations.

07/05/2012 - Moved diagram to top of article so that visitors wanting to reference design do not need to scroll down the article to view the diagram. Fixed IP address and VMkernel typos.


13 comments:

  1. Scott Lowe published this in his tech short takes and I felt it was great timing, as I had been thinking about a design like this. However, the design consideration that I've been struggling with is having jumbo frames enabled on a shared interface. My concern is this: let's say I have file servers, domain controllers, Exchange servers, all your typical systems, sitting in this awesome infrastructure. Communications between each server will be fine as all paths support jumbo frames. What happens when our clients, whose NICs are configured for the standard 1500 MTU, try to access these standard servers? Won't there be packet fragmentation? Would that cause issues for them?

    Great article BTW...

  2. Unless you modify the network settings within a VM it will continue to operate using the default packet size of 1500 MTU, and only the storage layer will operate at 9000 MTU. When a TCP connection is initiated between devices or servers, the handshake process will determine the packet size used for communication; if one device can only work at 1500 MTU then the handshake should take care of this. The performance increase of Jumbo Frames is real, but it requires end-to-end support, so if you don't have that then just leave it out of the design.

    Replies
    1. Ahh... I see where I was getting confused. I use SW iSCSI, so I typically configure the NICs to a 9000 MTU. However, that's for the ESX kernel, not the VMs that might run over those same NICs. Does that sound about right to you? I've always used jumbo frames BTW, however I've never run VM traffic with iSCSI over the same link.

      I presume then that vMotion and your management would also be using jumbo frames since they too are kernel ports running over the same NIC?

    2. Really good questions. With new deployments I presume from the start that we will implement Jumbo Frames for the storage network. For anything else it would need to come out as a requirement from the design considerations before enabling it. Usually I stick with the defaults unless I've got a good reason to change them.

    3. If I understand your picture correctly, you're running Management, vMotion and iSCSI over the same NICs, correct? So then by setting the NIC to 9000 for iSCSI you're also setting it for the other kernel-based features such as vMotion and Management. IMO, and IIRC according to Jason Boche, this shouldn't be a problem. Anyway, just an observation; once the whole thing clicked for iSCSI it made me think about the other kernel ports that would be sharing that link.

      Thanks for clearing things up. I'm probably going to look at a similar setup, although we don't need the Nexus 2Ks (yet), so probably just two 5596s. My only other concern was the 160 Gbps layer 3 limit, but I suspect that shouldn't be an issue 99% of the time, since layer 2 is still wire speed.

    4. Actually you raise a really good point I had not yet thought about. In my 1Gb designs I most often have a separate vSwitch for storage, so it is the only one running at 9000 MTU. By using a single switch as shown in this design, everything is now running with Jumbo Frames, excluding VMs.

  3. Hi Paul,

    Excellent post. It's important to note, though, that unless a VM is supposed to run switching/bridging software, all host-facing physical switch ports should be configured, as you state, with Portfast and BPDU Guard. This is regardless of physical switch manufacturer. If a VM is running switching/bridging software, then an STP protocol should be used, such as RSTP, as you mentioned. If one is worried about further BPDU security, they can use something like VMware's vShield App.

    This info can be found in the VMware Networking Blog, VDS Best Practices:

    http://blogs.vmware.com/networking/2011/11/vds-best-practices-rack-server-deployment-with-eight-1-gigabit-adapters.html

    Cheers!

    Mike Brown
    http://VirtuallyMikeBrown.com
    https://twitter.com/#!/VirtuallyMikeB
    http://LinkedIn.com/in/michaelbbrown

  4. Great article, something I was really looking for. What would you recommend for a 2 NIC setup? We have Dell M610 blades with one dual-port 10Gb mezz card. SAN traffic is separate with its own mezz card.

    Replies
    1. 2 x 10GbE ports will more than likely give you all the throughput you need, but it won't give you the best redundancy, as it introduces a single point of failure on the chassis: the dual-port mezz card. You can use the principles of this design, but it won't be as resilient, and should the mezz card fail you will lose all networking, except for storage, to all ESXi hosts within this chassis. That strikes me as being quite a concern, so make sure the business understands and accepts that risk.

  5. Paul, I run a similar environment (dual 10GbE NICs), but I am being told that since the server's motherboard is a single point of failure, adding a 2nd NIC doesn't add a ton of HA.

    Would you consider a single NIC with 2 x 10GbE ports (which a lot of blades run; they use a single mezz card) and then go to two interconnects out to two switches?

    A single 2-port NIC vs. dual 1-port NICs?

    Replies
    1. Having a second NIC in a rack server means that you are protected against NIC failure. There is a risk here but not a great one.

      Having a single 10Gb mezz card in a chassis means that all blades in that chassis now have a single point of failure. This is a very dangerous configuration from a resiliency/redundancy point of view.

      You should always consult the blade chassis vendor documentation to confirm a supported setup for VMware environments. The HP Virtual Connect cookbook is a very good resource, even if your equipment isn't HP.

  6. How can we accommodate two iSCSI port groups for port binding in the current implementation?

    Replies
    1. You can just create another portgroup with the same settings but a different VLAN. This will work fine. The load balancing method means that it will dynamically balance the load regardless of the traffic type.
