Thursday, October 9, 2014

RSPAN on Dell PowerConnect N-Series switches

What is RSPAN?
What is SPAN?

RSPAN means Remote Switched Port ANalyzer.  It is an extension of SPAN.  So to understand RSPAN (there's really not much to it), you need to understand what SPAN is and what it does.

SPAN is the same thing as port mirroring; the term is usually used in reference to Cisco products.  Port mirroring allows you to copy (or mirror) traffic from a given port (or ports) to another port (or ports).  The ports that are being copied are called source ports.  The ports that are being copied to are called destination or target ports.  You can specify which direction of traffic gets copied from the source ports: ingress, egress, or both.  If you don't specify a direction, it's usually both ingress and egress traffic being mirrored to the destination port.

Port mirroring is useful when it comes to diagnosing network problems.  By using port mirroring you can listen in on traffic in order to see what may be hosing a network.  You can always just do a normal Wireshark packet capture without using port mirroring, but then you're limited to listening only to the traffic that is destined to or flooded to the particular host running Wireshark.  Port mirroring allows your Wireshark client to capture traffic which may be destined to other devices, or which may be passing through ports that you're interested in.

Another very nice thing about port mirroring is that on some switches you can select the CPU as a source port.  This is useful if the switch has high CPU utilization and you want to know why.  In a healthy network, you shouldn't see too much traffic hitting the CPU because packet and frame matching should be handled by the switch ASICs, which process packets and frames much more efficiently than the CPU does.  So doing CPU packet captures with a Wireshark client plugged into the destination port is sort of like a doctor listening in on a patient's heart with a stethoscope.  Except here, you're listening in on packets hitting the switch CPU.

One caveat about the destination port is that, depending on the product, the device that is connected to the target port will not have access to network resources through that port.  It's solely a listening client.  It cannot participate.

What is RSPAN?

RSPAN means Remote Switched Port ANalyzer.  It builds off of ordinary SPAN and addresses some of the limitations of SPAN.  It allows you to probe deeper into the network.

What problems does RSPAN solve?

RSPAN makes it so that your source and destination ports do not have to be on the same switch.  That's the limitation that RSPAN allows you to break through.  Take the following example.  If you have several IP phones connected to two switches uplinked to one another, and you want their voice streams copied to, say, a VoIP analyzer appliance on yet another switch, you have to use RSPAN.  RSPAN is made precisely for this type of application.


If you want to do this, you must use RSPAN


















RSPAN is able to do this by introducing the concept of an RSPAN VLAN.  This is a VLAN which is specifically for bridging mirrored traffic from source ports to destination ports across switches.

Here's how it works.  This example uses three switches, two with the source ports, and another with the destination ports.
  1. Create a VLAN the ordinary way.  This VLAN is going to be the RSPAN VLAN.  This needs to be done on every switch that will carry the RSPAN VLAN.

    en
    config
    vlan data
    vlan 997
    end
  2. Configure the VLAN to be the RSPAN VLAN.  This also needs to be done on every switch that will carry the RSPAN VLAN.

    en
    config
    int vlan 997
    remote-span
    end
  3. Configure a monitor session on the switch with the source ports and designate one of its ports as a 'reflector' port.  Make sure that the reflector port is a trunk port which allows the RSPAN VLAN (switch mode trunk allows all VLANs by default).

    en
    config
    monitor session 1 source int gi1/0/1 both
    monitor session 1 destination remote vlan 997 reflector-port gi1/0/48
    monitor session 1 mode
    int gi1/0/48
    switch mode trunk
    end
  4. Configure a monitor session on the switch with the destination port or ports and specify the RSPAN VLAN.  Make sure the port on the other end of the source switch's reflector port is set to trunk mode which allows the RSPAN VLAN.

    en
    config
    monitor session 1 source remote vlan 997
    monitor session 1 destination interface gi1/0/20
    monitor session 1 mode
    int gi1/0/48
    switch mode trunk
    end
Bam.  That's it.  
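
To make the flow concrete, here is a minimal Python sketch of what the two monitor sessions above do.  It is only a toy model: the Frame class and the function names are made up for illustration, and it says nothing about how the switch hardware actually tags or forwards mirrored frames.

```python
from dataclasses import dataclass, replace
from typing import Optional

RSPAN_VLAN = 997  # same VLAN ID as in the steps above


@dataclass(frozen=True)
class Frame:
    payload: str
    seen_on_port: str           # port where the switch saw this frame
    vlan: Optional[int] = None  # None means "not in the RSPAN VLAN"


def mirror_on_source_switch(frame, source_ports, reflector_port):
    """Copy a frame seen on a source port into the RSPAN VLAN.

    Returns (egress_port, copy), or None when the port is not being mirrored.
    """
    if frame.seen_on_port not in source_ports:
        return None
    return reflector_port, replace(frame, vlan=RSPAN_VLAN)


def deliver_on_destination_switch(frame, destination_port):
    """Send anything that arrives in the RSPAN VLAN to the analyzer port."""
    return destination_port if frame.vlan == RSPAN_VLAN else None


voice = Frame(payload="RTP", seen_on_port="gi1/0/1")
egress, copy = mirror_on_source_switch(voice, {"gi1/0/1"}, "gi1/0/48")
print(egress, copy.vlan)
print(deliver_on_destination_switch(copy, "gi1/0/20"))
```

The point of the model is that the only thing tying the two switches together is the VLAN ID: the source switch puts copies into VLAN 997, and the destination switch hands whatever shows up in VLAN 997 to the analyzer port.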


Configuration example

Examples are always good.  I see no particular reason to not use the same scenario as before.  Let's use some specific ports this time.



















Based on what we already know about how to configure RSPAN, let's consider how we might go through this scenario.

  1. Create the RSPAN VLAN.  This is going to be done on all three switches.

    en
    config
    vlan data
    vlan 997
    exit
    int vlan 997
    remote-span
    end
  2. Create the monitor sessions on the top two N3048 switches.  Configure the appropriate ports as source ports.  Configure a reflector port.  Make sure that the reflector port is set to trunk mode, which allows the transit of the RSPAN VLAN.  This configuration will work for both of the N3048 switches.

    en
    config
    monitor session 1 source int gi1/0/1 both
    monitor session 1 source int gi1/0/2 both
    monitor session 1 source int gi1/0/3 both
    monitor session 1 source int gi1/0/11 both
    monitor session 1 source int gi1/0/12 both
    monitor session 1 source int gi1/0/13 both
    monitor session 1 destination remote vlan 997 reflector-port gi1/0/47
    monitor session 1 mode
    int gi1/0/47
    switch mode trunk
    end
  3. Create the monitor session on the bottom N2048 switch.  This is going to be the 'destination' switch, if you will (the switch with the destination port(s)).  Specify the RSPAN VLAN in the monitor session and the destination port.  Make sure that the port connecting to the source switch's reflector port is set to trunk mode, which allows the transit of the RSPAN VLAN.

    en
    config
    monitor session 1 source remote vlan 997
    monitor session 1 destination interface gi1/0/1
    monitor session 1 mode
    int gi1/0/48
    switch mode trunk
    end
That's it!  E facile.
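
Since both source switches take the same configuration, it can be handy to generate it instead of typing it.  The following is a minimal Python sketch that only strings together the commands already shown above; the function name and its parameters are my own invention, and it does not talk to a switch.

```python
def rspan_source_config(source_ports, rspan_vlan, reflector_port, session=1):
    """Return the CLI lines for a source switch, built from the commands above."""
    lines = ["en", "config"]
    for port in source_ports:
        lines.append(f"monitor session {session} source int {port} both")
    lines.append(
        f"monitor session {session} destination remote vlan {rspan_vlan} "
        f"reflector-port {reflector_port}"
    )
    lines.append(f"monitor session {session} mode")
    # The reflector port must be a trunk so that it carries the RSPAN VLAN.
    lines += [f"int {reflector_port}", "switch mode trunk", "end"]
    return lines


ports = [f"gi1/0/{n}" for n in (1, 2, 3, 11, 12, 13)]
print("\n".join(rspan_source_config(ports, 997, "gi1/0/47")))
```

Running it prints the same block as step 2, which you can paste into either of the source switches.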

Sunday, September 28, 2014

Demystifying VLT

What is VLT?

VLT means virtual link trunking.  It is a proprietary Force10 technology.  It does the same thing as Cisco's Virtual Port Channel (vPC) and other vendors' multi-chassis LAG (MLAG) features, which stretch an IEEE 802.1AX link aggregation group across two chassis.  The idea is exactly the same.  You create a pair of VLT peers.  Once VLT is up and running, you can take another switch and create a LAG split between those two VLT peers.

VLT peering causes the two switches to appear as a single virtual switch so that, say, another switch which splits a LAG between the two VLT peers doesn't realize that it's splitting its LAG between two switches.  It thinks it's connected to one switch.  VLT can do this by peering switches together and by syncing ARP and MAC entries between the switches.  So in the simplest VLT scenario, like the one depicted below, only the two switches at the top have VLT configuration and can be said to be participating in VLT.  The switch at the bottom has a single ordinary LAG connecting to the two switches at the top.  It does, however, have to match the type of LAG that the VLT switches are using--LACP or static.
This is the simplest VLT scenario possible.
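
The idea of syncing MAC entries between peers can be sketched in a few lines of Python.  This is a toy model only: the VltPeer class, the MAC address, and the Po110 name are hypothetical, and the real synchronization protocol between Force10 peers is not shown here.

```python
class VltPeer:
    """Toy VLT peer that shares MAC learning with its partner."""

    def __init__(self, name):
        self.name = name
        self.mac_table = {}  # MAC address -> port-channel it was learned on
        self.partner = None

    def pair_with(self, other):
        self.partner, other.partner = other, self

    def learn(self, mac, port_channel, from_partner=False):
        self.mac_table[mac] = port_channel
        # Pass the entry across the (imaginary) peer link exactly once.
        if self.partner is not None and not from_partner:
            self.partner.learn(mac, port_channel, from_partner=True)


peer1, peer2 = VltPeer("peer1"), VltPeer("peer2")
peer1.pair_with(peer2)
peer1.learn("00:11:22:33:44:55", "Po110")
print(peer2.mac_table)
```

Because both peers end up with the same table, it doesn't matter which member of the split LAG the bottom switch happens to send a frame on.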


















What problems does VLT solve?

I think of VLT as an in-between solution--somewhere between stacking and not stacking.  For instance, with VLT you have link redundancy with no blocking and full link utilization.  But you can get the same things by simply stacking the switches.  So why don't people just stack the switches to effectively make them one switch instead of bothering with VLT?  Good question, I'm glad you asked.  Because switches in VLT are not a true stack, they keep some level of independence, which makes upgrades relatively painless.  It may be possible to upgrade a switch stack without any downtime, but it's a terrible idea to try.  You can do it with VLT, though, with some caveats.**
Upgrading a stack is a nightmare for SLA-critical environments






















The non-stacking solution involves the use of STP.



















This is a fairly common setup.  This solution provides switch independence, which makes upgrades easier.  However, because there is a loop, STP must block a port.  This often causes issues because it's common for people to not tune STP parameters so that the proper port gets blocked.  You generally do not want the interswitch LAG to be blocking.  The problem with a blocked link, of course, is the loss of bandwidth.  You have an entire link that is in standby, which in this scenario is likely to be a 40 Gbps port, since S4820T switches have 40 Gbps uplinks.  That's a massive amount of wasted bandwidth.  There are hacky ways around this, such as implementing PVST or MSTP.  By configuring PVST or MSTP, you can avoid wasting bandwidth by selecting blocked ports on a per-VLAN basis, but it's a kludge.  There's more configuration and consideration involved, and people are unlikely to implement it properly.  VLT allows you to avoid all of that.

So in essence what VLT does is connect switches so that they can appear to be one switch from the outside of the VLT domain in order to allow for loopless redundancy, while having a level of independence between one another for easy upgrades.


Configuration example

These are the steps involved in getting it all set up for this particular scenario.  It's not one-size-fits-all.  I got this configuration sample here.  I omit parts of the configuration because there's a lot of redundancy, but I explain which parts are missing.

  • Configure RSTP.  One peer should have bridge priority 0; the other should have 4096.
    This is to ensure that the STP root is one of the VLT peers.
  • Check and open ports.
    Make sure that the ports are enabled.  FTOS has ports disabled by default.
  • Configure backup link.
    This generally uses management ports to maintain the health of the VLT peer status.
  • Configure VLTi port-channel.
    This is used to synchronize ARP and MAC information between the peers so that they can appear to be one switch from the outside.
  • Configure VLT domain (vlt domain 999, back-up destination, peer link port-channel)
  • Verify VLT domain and VLTi are working
    Sh vlt brief
    This is how you can know that the VLT is working.
  • Configure access port-channel on peer
    Here you configure the port channels on each peer and relate them to one another.
  • Create normal port channel, then specify that it is part of peer LAG (vlt-peer-lag)
    This refers to the port-channel on the switch outside of the VLT domain.




















Force10_VLTPeer1 Switch Configuration
Force10(conf)#hostname Force10_VLTPeer1

1. Configure RSTP
Force10_VLTPeer1#configure
Force10_VLTPeer1(conf)#protocol spanning-tree rstp
Force10_VLTPeer1(conf-rstp)#no disable
Force10_VLTPeer1(conf-rstp)#00:04:08: %STKUNIT1-M:CP %SPANMGR-5-STP_ROOT_CHANGE: RSTP root changed. My Bridge ID: 32768:0001.e88b.1b1d Old Root: 32768:0000.0000.0000 New Root: 32768:0001.e88b.1b1d
Force10_VLTPeer1(conf-rstp)#bridge-priority 0
Force10_VLTPeer1(conf-rstp)#00:04:17: %STKUNIT1-M:CP %SPANMGR-5-STP_ROOT_CHANGE: RSTP root changed. My Bridge ID: 0:0001.e88b.1b1d Old Root: 32768:0001.e88b.1b1d New Root: 0:0001.e88b.1b1d
Force10_VLTPeer1(conf-rstp)#end

This sets up RSTP on Peer 1, and configures it to be the root. You'll be doing the same on Peer 2, except you'll be using priority 4096.  

2. Check and open ports
Force10_VLTPeer1#show interfaces status | grep 24|60
Te 1/24               Down   Auto      Auto   --
Fo 1/60               Down   40000 Mbit Auto   --
Force10_VLTPeer1#configure
Force10_VLTPeer1(conf)#interface range te 1/24 , fo 1/60
Force10_VLTPeer1(conf-if-range-te-1/24,fo-1/60)#no shutdown
00:05:57: %STKUNIT1-M:CP %IFMGR-5-ASTATE_UP: Changed interface Admin state to up: Te 1/24
00:05:57: %STKUNIT1-M:CP %IFMGR-5-ASTATE_UP: Changed interface Admin state to up: Fo 1/60
Force10_VLTPeer1(conf-if-range-te-1/24,fo-1/60)#end

Here, you're just making sure that the ports are open.  You'll do it on both switches.

3a. Configure VLTi port-channel
Force10_VLTPeer1(conf)#interface port-channel 100
Force10_VLTPeer1(conf-if-po-100)#no switchport
Force10_VLTPeer1(conf-if-po-100)#no ip address
Force10_VLTPeer1(conf-if-po-100)#channel-member fortyGigE 1/60
Force10_VLTPeer1(conf-if-po-100)#no shutdown
00:07:00: %STKUNIT1-M:CP %IFMGR-5-ASTATE_UP: Changed interface Admin state to up: Po 100
Force10_VLTPeer1(conf-if-po-100)#end

This creates the LAG which is going to be used for the VLT Interconnect (VLTi).  You need to do it on both peers.

3b. Configure VLT Domain
Force10_VLTPeer2#sh ip interface brief | grep YES
ManagementEthernet 0/0   172.28.17.251   YES Manual up                    up
Force10_VLTPeer1#configure
Force10_VLTPeer1(conf)#vlt domain 999
Force10_VLTPeer1(conf-vlt-domain)#back-up destination 172.28.17.251
Force10_VLTPeer1(conf-vlt-domain)#peer-link port-channel 100
Force10_VLTPeer1(conf-vlt-domain)#end

This sets up the VLT domain.  It configures the domain ID (999), configures the backup link (for heartbeat and peer health), and sets the previously created port-channel to be the VLTi.  You'll do the same thing on the other peer, except that its back-up destination is the management IP of this switch (Peer 1).  Otherwise, it'll all be the same.

3c. Verify VLT Domain and VLTi (ICL) are working
(when Peer 2 configuration is completed)
Force10_VLTPeer1#show vlt brief
 VLT Domain Brief
------------------
 Domain ID:                    999
 Role:                         Secondary
 Role Priority:                32768
 ICL Link Status:              Up
 HeartBeat Status:             Up
 VLT Peer Status:              Up
 Local System MAC address:     00:01:e8:8b:1b:1d
 Remote System MAC address:    00:01:e8:8b:19:ac
Force10_VLTPeer1#

'show vlt brief' will tell you whether your VLT is functional.  
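
If you check this often, the output is easy to parse.  Here is a minimal Python sketch that reads the 'show vlt brief' text shown above and checks the three status fields.  The function names are my own, and I'm assuming the output always follows the "Key: value" layout of this sample; other FTOS versions may format it differently.

```python
SAMPLE = """\
 VLT Domain Brief
------------------
 Domain ID:                    999
 Role:                         Secondary
 Role Priority:                32768
 ICL Link Status:              Up
 HeartBeat Status:             Up
 VLT Peer Status:              Up
 Local System MAC address:     00:01:e8:8b:1b:1d
 Remote System MAC address:    00:01:e8:8b:19:ac
"""


def parse_vlt_brief(output):
    """Turn the 'Key: value' lines of 'show vlt brief' into a dict."""
    fields = {}
    for line in output.splitlines():
        # Split on the first colon only, so MAC addresses stay intact.
        key, sep, value = line.partition(":")
        if sep and value.strip():
            fields[key.strip()] = value.strip()
    return fields


def vlt_is_healthy(fields):
    """VLT is considered functional when all three status fields are Up."""
    checks = ("ICL Link Status", "HeartBeat Status", "VLT Peer Status")
    return all(fields.get(name) == "Up" for name in checks)


fields = parse_vlt_brief(SAMPLE)
print(fields["Role"], vlt_is_healthy(fields))
```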

4. Configure access port-channel on peer
Force10_VLTPeer1#configure
Force10_VLTPeer1(conf)#interface tengigabitethernet 1/24
Force10_VLTPeer1(conf-if-te-1/24)#port-channel-protocol lacp
Force10_VLTPeer1(conf-if-te-1/24-lacp)#port-channel 110 mode active
00:22:42: %STKUNIT1-M:CP %IFMGR-5-ASTATE_UP: Changed interface Admin state to up: Po 110
Force10_VLTPeer1(conf-if-te-1/24-lacp)#exit
Force10_VLTPeer1(conf-if-te-1/24)#exit
Force10_VLTPeer1(conf)#interface port-channel 110
Force10_VLTPeer1(conf-if-po-110)#switchport
Force10_VLTPeer1(conf-if-po-110)#vlt-peer-lag port-channel 110
Force10_VLTPeer1(conf-if-po-110)#end

This sets up the access port-channel--the port-channel which is going to connect to the S50, which is not part of the VLT domain.  Notice that the sample sets it up like an ordinary LACP port-channel, but under the port-channel, after configuring it to be in switchport mode, it declares that it's part of the peer LAG with vlt-peer-lag.  The same is done on the other peer.

Access Switch Configuration
Force10#show interfaces status | grep 45|47
Gi 1/45               Down   Auto      Auto   --
Gi 1/47               Down   Auto      Auto   --
Force10#configure
Force10(conf)#interface range gi 1/45 , gi 1/47
Force10(conf-if-range-gi-1/45,gi-1/47)#no shutdown
02:02:46: %STKUNIT1-M:CP %IFMGR-5-ASTATE_UP: Changed interface Admin state to up: Gi 1/45
02:02:46: %STKUNIT1-M:CP %IFMGR-5-ASTATE_UP: Changed interface Admin state to up: Gi 1/47
Force10(conf-if-range-gi-1/45,gi-1/47)#02:02:46: %STKUNIT1-M:CP %IFMGR-5-OSTATE_UP: Changed interface state to up: Gi 1/45
02:02:46: %STKUNIT1-M:CP %IFMGR-5-OSTATE_UP: Changed interface state to up: Gi 1/47
Force10(conf-if-range-gi-1/45,gi-1/47)#port-channel-protocol lacp
Force10(conf-if-range-gi-1/45,gi-1/47-lacp)#port-channel 110 mode active
02:04:19: %STKUNIT1-M:CP %IFMGR-5-ASTATE_UP: Changed interface Admin state to up: Po 110
Force10(conf-if-range-gi-1/45,gi-1/47-lacp)#02:04:21: %STKUNIT1-M:CP %LACP-5-PORT-GROUPED: PortChannel-110-Grouped: Interface Gi 1/45 joined port-channel 110
02:04:21: %STKUNIT1-M:CP %IFMGR-5-OSTATE_UP: Changed interface state to up: Po 110
02:04:22: %STKUNIT1-M:CP %LACP-5-PORT-GROUPED: PortChannel-110-Grouped: Interface Gi 1/47 joined port-channel 110

Configuration on this switch is simple.  You just set up the port-channel as though you're connecting to a single switch.  


Building on VLT

Once you've created the VLT domain and VLTi, if you want to make a more complex topology, all you need to do is create more access port-channels.  This is simple.  You just repeat step four but with the additional access port-channels, po111, and po112 on each peer.  Then you set up the port-channels on the switches which are on the other end of the access port-channels.





















Other considerations
This is a tutorial to explain the basics of VLT.  There are many different kinds of VLT topologies shown in the Force10 Configuration Guide.

You might wonder why you'd use VLT at all.  You could achieve many of the same results by using OSPF.  This is possible because OSPF does equal-cost load balancing.  By using OSPF for load balancing you can have it all: full utilization of all links and switch independence for easy upgrades.  Indeed, that's why Force10 uses OSPF for their spine and leaf architecture.  The only problem is that it doesn't allow you to extend layer 2 out to the core.  That makes it hard for devices on a common network to communicate with each other if they're on different switches.  So the type of non-blocking architecture that you employ really depends on the traffic flows in your network.

PC1 and PC2 cannot be in the same network if OSPF is being used for non-blocking architecture.
















Caveats

  • Don't try to implement a VLT architecture in a live production network unless you really know what you're doing.
  • When implementing VLT, use a configuration that you know works.  VLT has a lot of dependencies and caveats.  
  • Read the manual for caveats.  There is a giant list of them (use RSTP, don't do VLT with stacked switches, etc.).  VLT is not as simple as setting up a LAG, and it needs to be set up properly, and tested in order for it to work well.  Don't just think, 'Well, I understand the theory, and this should work.'  Don't implement VLT willy nilly.  


**Because VLT provides some switch independence, having VLT pairs running different firmware versions should not be an issue.  But I've run into issues with ARP entries not syncing across the VLT peers if they're not running the same firmware.  It's all documented in the release notes.  So much for seamless upgrades.  That said, it's still much better than upgrading stacked switches.  Bottom line: schedule a maintenance period for upgrades.  

Wednesday, August 27, 2014

Understanding FCoE

I have very little real world experience with FCoE because I never work with it.  It never comes up in my cases.  That makes it hard to learn about it.  However, one of the best ways for me to learn something is to research something, explain it, think of what questions an imaginary student might think to ask, research and find out the answers, write them all down, and then lay it all out in a coherent and consumable way.  The result is the following.

I talked with Brian, one of our trainers, about FCoE.  One of the things he said is 'The trick is to first learn FC.'  I think he's right.  So here goes.


Fibre-channel basics

Fibre-channel is different from Ethernet.  Ethernet is versatile, and thus, it is used for everything.  Fibre-channel networks are largely used for distributed storage and carrying SCSI commands and payloads from hosts to storage targets and vice versa.

iSCSI for instance, carries SCSI commands inside of IP packets, which are encapsulated inside of Ethernet frames.  The Ethernet networks carry the payloads like normal Ethernet traffic according to the way Ethernet switches handle their frames.

Fibre-channel, on the other hand, carries SCSI commands and payloads in Fibre-channel frames, period.  It has no use for Internet Protocol.  Fibre-channel functions as layer 1, 2, and sort-of layer 3.  I say sort-of because Fibre-channel has no directly correlative elements to IP addresses--but more on that later.  After those SCSI commands are encapsulated in Fibre-channel frames, those frames are handled by FC switches as Fibre-channel switches do.

FLOGI
With Fibre-channel, in order for devices to be able to communicate with one another, those devices, which are known as Nodes, must first log into the FC switch, a process known as FLOGI.  This is how the node gets registered in the fabric and can gain access to fabric services provided by this switch, or by other devices on the fabric.  As part of the login, the fabric hands the node what's called a Fibre-channel identifier, or FCID.  This address is used so that the switch can know how to pass FC frames from node to node.  Each FC frame has a source and destination FCID.

Fibre-channel port types
There are several different kinds of ports in a Fibre-channel fabric, but for now we'll deal with only three: N_Port, F_Port, and E_Port.  N_Port, or Node Port, is the port on a node, which is usually an HBA.  An F_Port, or Fabric Port, is a port on an FC switch which connects to an N_Port.  An E_Port, or Expansion Port, is a port on an FC switch which connects to another E_Port on a different FC switch.
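
The pairing rules in that paragraph fit in a tiny lookup.  This Python sketch only encodes the three port types described above; the names VALID_LINKS and link_is_valid are made up, and other Fibre-channel port types and topologies are deliberately left out.

```python
# Each allowed link is stored as an unordered pair of port types.
VALID_LINKS = {
    frozenset({"N_Port", "F_Port"}),  # node (HBA) to switch
    frozenset({"E_Port"}),            # switch to switch (E_Port on both ends)
}


def link_is_valid(port_a, port_b):
    """True when the two port types may be cabled together in this model."""
    return frozenset({port_a, port_b}) in VALID_LINKS


print(link_is_valid("N_Port", "F_Port"))
print(link_is_valid("N_Port", "E_Port"))
```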











WWNs and Zoning
In the blogs and articles that I've read about Fibre-channel, people compare the FCID to the IP address.  This sort of makes sense since the FCID is how FC switches know where to send frames.  Also, nodes request FCIDs in a way that's very similar to how hosts request IP addresses from DHCP servers in the Ethernet world.  You also hear comparisons made between WWNs, or World Wide Names, and MAC addresses.  This sort of works because WWNs are 64-bit addresses used to uniquely identify different elements within the network, similar to how MAC addresses are supposed to uniquely identify different elements in an Ethernet network.  Unlike MACs, however, WWNs are not found in the header of every FC frame.  FC switches know how to pass FC frames around only by virtue of the FCID.

What is the WWN for then?  Zoning.  Zoning is used to determine who can talk with whom.  It's like VLANs in that way, but better.  Zoning has some of the capabilities that private VLANs have, except natively.  For instance, you can have groups X and Y be zoned so that each can talk with group Z, but not with each other--like devices in different private VLAN communities talking with their primary VLAN but not each other.  You can also define different network elements with aliases.  There's greater control and granularity available with zoning than with VLANs.  Zoning configurations are usually based on WWNs, since WWNs uniquely identify different Fibre-channel elements.
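
Here is a minimal Python sketch of the X, Y, and Z example, assuming it is built from two zones that both contain Z.  The WWNs, alias names, and zone names are all invented for illustration, and a real switch stores and enforces zoning in its own way.

```python
# Aliases give friendly names to WWNs (all values here are made up).
ALIASES = {
    "host_x": "10:00:00:00:c9:aa:00:01",
    "host_y": "10:00:00:00:c9:aa:00:02",
    "array_z": "50:00:00:00:c9:bb:00:01",
}

# Two zones that share array_z; host_x and host_y never share a zone.
ZONES = {
    "x_to_z": {"host_x", "array_z"},
    "y_to_z": {"host_y", "array_z"},
}


def can_talk(wwn_a, wwn_b):
    """Two WWNs may communicate only if some zone contains both of them."""
    for members in ZONES.values():
        wwns = {ALIASES[alias] for alias in members}
        if wwn_a in wwns and wwn_b in wwns:
            return True
    return False


print(can_talk(ALIASES["host_x"], ALIASES["array_z"]))
print(can_talk(ALIASES["host_x"], ALIASES["host_y"]))
```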



FCoE at its highest level

If you're seasoned in networking, you understand that networking is all about encapsulation.  In that sense, FCoE is not intimidating.  It's just one more layer of encapsulation.  With Ethernet, application data is encapsulated into segments, which are encapsulated into packets, which are encapsulated into Ethernet frames, which are formed into bits on the wire.  In FCoE, SCSI commands are encapsulated in Fibre-channel frames, which are encapsulated in Ethernet frames, which are formed into bits on the wire.
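
The two encapsulation chains can be put side by side with a few lines of Python.  This is purely illustrative: the encapsulate helper is made up and just nests labels, it does not build real headers.

```python
def encapsulate(payload, layers):
    """Wrap the payload in each layer in turn, innermost layer first."""
    for layer in layers:
        payload = f"{layer}[{payload}]"
    return payload


print(encapsulate("application data", ["segment", "packet", "Ethernet frame"]))
# Ethernet frame[packet[segment[application data]]]
print(encapsulate("SCSI command", ["FC frame", "Ethernet frame"]))
# Ethernet frame[FC frame[SCSI command]]
```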

This is similar to how IPsec and GRE use encapsulation for tunneling, except those are layer 3 while Fibre-channel and Ethernet are layer 2.  In the case of FCoE, Fibre-channel networks are tunneled through Ethernet networks by encapsulating Fibre-channel frames inside of Ethernet frames.  As you can see, layer 4 segments and layer 3 packets are skipped over.  Fibre-channel has its own mechanisms which provide the functions of TCP/IP.  Just like one might use GRE over IPsec for securely connecting remote offices which are doing dynamic routing, one might use Fibre-channel over Ethernet for converged networking.  It's just more of the same.  Since it's easy to understand what's happening conceptually, the purpose of this post is to understand how this is all achieved, and through what mechanisms.

FCoE a bit deeper

FCoE can be further broken up into the following steps:

  1. In FCoE, an FCoE Node, usually abbreviated to Enode, creates the tunnel starting point.  The Enode is a host equipped with a converged network adapter, or CNA.  That device creates and tunnels Fibre-channel frames to a switch which natively understands both Fibre-channel and Ethernet.  It is known as the Fibre-channel Forwarder, or FCF.  In FCoE, the Enode and FCF are the tunnel endpoints.

    In the first step, the Enode discovers its FCoE termination point (FCF).  When the entire process of setting up FCoE virtual links and sessions is finished, the host will send FCoE traffic to this device which will translate those FCoE frames into FC frames.

    The FCF and FCoE discovery process has some similarities with how an IP phone might ask its connected switch to know which VLAN to use via LLDP MED.  The host solicits for a service with a multicast, like how a phone might ask its connected switch to know which VLAN and QoS parameters to use to generate voice frames.  Except here, the Enode is soliciting to know the FCoE VLAN and who its FCF is.
  2. After the client learns its FCoE VLAN and its FCF, it FLOGIs into the FCF.  After the Enode logs into the switch, the virtual FC links are set up.  
  3. After FLOGI, the host logs into its target (usually a Fibre-channel array).  This is called Port Login, or PLOGI. 
  4. The entire process of constructing virtual Fibre-channel links is managed by a protocol called FCoE Initialization Protocol, or FIP.  While the virtual links are up, keepalives are continually sent between the CNA-enabled host and the FCF to keep them active.  When a virtual link is no longer needed, it is torn down.  This is called FIP LOGOUT.
That's it.  After the Fibre-channel virtual links are set up between the CNA-enabled host and the FCF, the host can send FCoE frames to the FCF, which will then translate those FCoE frames into Fibre-channel frames to the rest of the Fibre-channel network.

The FIP process in more detail
  1. CNA-equipped hosts ask to discover FCoE VLANs with the well known multicast MAC address 01-10-18-01-00-02.  This is an address that FCFs know to respond to.
  2. The hosts learn about FCoE VLANs from the FCF.
  3. The hosts ask to discover FCFs.  The solicitation carries the Enode MAC, the Enode WWNN, and the Enode's FCoE frame size.
  4. The hosts learn about FCFs: Priority, FCF WWNN, Fabric Descriptor (VFID, FC-Map), and FKA ADV Period (the interval used for periodic advertisements and keepalives).
In this case, Priority tells the host which FCF to prefer when more than one responds.  After the hosts learn about the FCFs, they're able to log into the Fibre-channel fabric.  After the Enode learns of the FCF and the FCoE VLAN, it FLOGIs into the FCF.
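
The ordering matters: an Enode can't log in to an FCF it hasn't discovered yet.  The following Python sketch captures just that ordering, with PLOGI added as the final step described earlier.  The EnodeSession class and the step names are hypothetical, and no real FIP frames are exchanged.

```python
class EnodeSession:
    """Tracks how far an Enode has progressed toward talking to storage."""

    ORDER = ["vlan_discovery", "fcf_discovery", "flogi", "plogi"]

    def __init__(self):
        self.completed = []

    def complete(self, step):
        if len(self.completed) == len(self.ORDER):
            raise RuntimeError("all steps are already complete")
        expected = self.ORDER[len(self.completed)]
        if step != expected:
            raise RuntimeError(f"cannot run {step!r} before {expected!r}")
        self.completed.append(step)

    @property
    def ready(self):
        """True once every step, through PLOGI, has been completed."""
        return self.completed == self.ORDER


session = EnodeSession()
for step in EnodeSession.ORDER:
    session.complete(step)
print(session.ready)
```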

The FLOGI process
  1. The host sends a FIP FC ELS FLOGI to ff.ff.fe.  This process creates virtual N_Ports and virtual F_Ports. 
  2. The host receives a FC ELS ACC (FLOGI) from FCF.  This response includes a Fabric-provided MAC address (FPMA).  
The Enode uses its FIP MAC, which is the MAC address used to initially set up the tunnel between it and the FCF, to ask the FCF to log it in.  When it receives the FC ELS ACC, the Enode gets its FPMA from the FCF.  The FPMA consists of the FC-Map (the first three bytes) followed by the FCID (the last three bytes).  The FCID is used as the routable address in the frame after the FC frame gets out to the FC network.

You might wonder what an FC-Map is for since I haven't mentioned it until now.  The FC-Map is to identify different Fibre-channel fabrics since the FCF can work and deal with different FC fabrics--different FC networks with different domains and which do not communicate with one another.  By using an FPMA which has this information built into the Ethernet frame, confusion as to which FC fabric the FC payload is to be dropped onto can be avoided.
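
Since an FPMA is just the three-byte FC-Map followed by the three-byte FCID, building one and taking it apart again is simple arithmetic.  Here is a minimal Python sketch.  The function names are my own, the FCID is an arbitrary example, and I'm assuming the commonly used default FC-Map of 0E-FC-00; your fabric may be configured with a different value.

```python
DEFAULT_FC_MAP = 0x0EFC00  # assumed default; check what your FCF actually uses


def build_fpma(fc_map, fcid):
    """Concatenate a 24-bit FC-Map and a 24-bit FCID into a MAC address."""
    if not (0 <= fc_map <= 0xFFFFFF and 0 <= fcid <= 0xFFFFFF):
        raise ValueError("FC-Map and FCID must both fit in 24 bits")
    value = (fc_map << 24) | fcid
    return ":".join(f"{byte:02x}" for byte in value.to_bytes(6, "big"))


def split_fpma(mac):
    """Return (fc_map, fcid) from an FPMA written as aa:bb:cc:dd:ee:ff."""
    raw = bytes.fromhex(mac.replace(":", ""))
    if len(raw) != 6:
        raise ValueError("a MAC address is exactly six bytes")
    return int.from_bytes(raw[:3], "big"), int.from_bytes(raw[3:], "big")


mac = build_fpma(DEFAULT_FC_MAP, 0x010203)
print(mac)  # 0e:fc:00:01:02:03
print([hex(part) for part in split_fpma(mac)])
```

The first function is what happens toward the Enode (prepend the FC-Map to an FCID), and the second is what the FCF can do with a frame coming from the Enode (read the fabric from the first three bytes and the FCID from the rest).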

After the Enode is logged into the FCF, the virtual Fibre-channel links are established.  The Enode uses that virtual link to PLOGI to another device (FLOGI is to log into the fabric switch, PLOGI is for a node to connect to another node, e.g., storage).  After it uses its virtual links to PLOGI, it's ready to start communicating with other FC devices.

The Enode encapsulates the Fibre-channel frames into Ethernet.  The source MAC address used in the FCoE frame is the FPMA.  This is the MAC address the Enode uses in its Ethernet frames which are encapsulating the Fibre-channel frames when it talks to Fibre-channel storage.  The FCF sees the FC-Map (the first three bytes), which indicates the Fibre-channel fabric to use.  It strips off the Ethernet frame, and then forwards the contents out to the correct Fibre-channel fabric according to zoning rules as a Fibre-channel frame.  From that point on, the frame is no longer FCoE, and is treated and forwarded as normal Fibre-channel traffic.  It is passed along using the original FCID that the Enode received during the login process.


















When Fibre-channel frames come back from storage, or whatever the device is, the process occurs in reverse.  The Fibre-channel frame comes into one of the FCF's FC ports which is associated with an FC-Map, which is associated with the virtual Fibre-channel interface.  The FCF takes the destination FCID in the FC frame, adds the FC-Map to the beginning, and uses that as the destination MAC address when it encapsulates the frame with Ethernet.  It then passes it along the vF_Port to the Enode.  There's no need to reference any table for it to generate the FPMA.  The FCF knows what FC fabric the frame came in on, and adds the fabric identifier (FC-Map) to the destination FCID of the frame.

The FCF's E and vF ports are associated with the FC-Map, and thus, it knows how to generate the FPMA for the FCoE Frame.










That's it, but that's not all.  

The above details the FCoE process between the Enode and the FCF.  The purpose was to explain the meat and potatoes of FCoE--the mechanisms by which the tunneling works.  There are, however, more details in how FCoE is implemented in the real world.  The last bits to explain, at least in this post, are NPV and NPIV: N_Port Virtualization and N_Port ID Virtualization, respectively.

Many converged infrastructures will have a FIP Snooping Bridge running in NPV Mode in between Enodes and the FCF.  The reason is scalability.  The purpose of the FIP Snooping Bridge, or FSB, is to provide lots of available Fibre-channel connections between the hosts and the FCF by extending the FCoE fabric out with an FCoE-aware switch, which is capable of reliably carrying FC payloads to the FCF.  The FSB provides this functionality while at the same time functioning as an ordinary Ethernet switch.


The FSB does this in two ways.
  1. It provides transport for FCoE traffic by snooping in on the original FIP sessions between the Enode and FCF.  Based on the information gathered from the snooping, it creates ACLs which let only the registered Enodes communicate with the FCF on the FCoE VLANs, thus fulfilling Fibre-channel's zoning requirement.  The FSB also provides safe transport of FCoE traffic with the use of Enhanced Transmission Selection (ETS) and Priority Flow Control (PFC).  ETS provides a guaranteed allocation of link bandwidth on a per dot1p value basis, and PFC provides flow control on a per dot1p value basis.  This is necessary for FCoE to meet Fibre-channel's requirement of losslessness.
  2. It can provide lots of FC connections to the FCF with NPV Mode.  The FIP snooping bridge can allow for many Enode connections to the FCF by means of N_Port Virtualization.  In NPV Mode, the FSB poses as a node to the FCF.  It gathers logins from downstream nodes via its vF_Ports (see glossary) and sends them upstream via its Proxy vN_Port to the FCF's vF_Port.  This only works if the FCF is set up with N_Port ID Virtualization so that it can allow for more than one FCID to enter through its F_Port.   
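
The snooping-then-filtering behaviour in the first point can be sketched in a few lines of Python.  This is only a toy model under stated assumptions: the class and method names (Session, FipSnoopingBridge, snoop_login_accept, permits) are invented for illustration, and a real FSB programs hardware ACLs rather than checking a set in software.  The example addresses are taken from the FIP snooping output later in this post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Session:
    """One snooped FIP login (hypothetical structure for illustration)."""
    enode_port: str
    fcf_mac: str
    fcoe_mac: str
    vlan: int


class FipSnoopingBridge:
    def __init__(self):
        self.sessions = set()

    def snoop_login_accept(self, enode_port, fcf_mac, fcoe_mac, vlan):
        # Learned by watching the FCF accept a fabric login from the Enode.
        self.sessions.add(Session(enode_port, fcf_mac, fcoe_mac, vlan))

    def permits(self, in_port, src_mac, dst_mac, vlan):
        # Enode-to-FCF FCoE traffic is allowed only if it matches a snooped session.
        return Session(in_port, dst_mac, src_mac, vlan) in self.sessions


fsb = FipSnoopingBridge()
fsb.snoop_login_accept("Te 0/3", "54:7f:ee:56:55:49", "0e:fc:00:55:00:04", 1000)
print(fsb.permits("Te 0/3", "0e:fc:00:55:00:04", "54:7f:ee:56:55:49", 1000))
print(fsb.permits("Te 0/4", "0e:fc:00:55:00:04", "54:7f:ee:56:55:49", 1000))
```

The second check fails because the same FCoE MAC shows up on a port where no login was snooped, which is the kind of traffic the ACLs are meant to drop.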










NPV mode versus NPIV
I initially found this confusing because NPV and NPIV sound very similar, and FCoE has way too many acronyms, many of which are used interchangeably.

NPIV and NPV are similar, but do different things.  They're best used together.  What NPIV does is allow for multiple logins to occur through a single F_Port.  Without an NPIV-enabled switch, you cannot have more than one login via an F_Port.  Specifically, NPIV allows for more than one FCID association with an F_Port.

So you can use NPIV on its own.  Just plug the switch's F_Port into a downstream node that has several hosts identified with it.  The different nodes will make FLOGI requests, which pass to the NPIV-enabled switch.  The switch knows how to deal with those requests, and it will all work fine.

But you get the real benefit if you combine an NPIV-enabled switch with another FC switch in NPV Mode.  A switch in NPV Mode is called an NPV switch because it's doing N_Port virtualization: it's pretending to be a node.  You'll also hear people refer to a switch in NPV Mode as being in passthrough mode or functioning as an FCoE passthrough.  What this means is that the switch is not part of the Fibre-channel fabric.  There's an excellent blog post explaining it all here.  A switch in NPV Mode has an NP_Port (yes, another freaking acronym), which means Proxy N_Port.  Why isn't it then called a PN_Port?  Well, uhm... good question.  Anyway.  A Proxy N_Port on an NPV Mode-enabled switch takes fabric login requests from downstream nodes and passes them off to the upstream switch's F_Port.  It functions as a proxy for the downstream nodes, handing off fabric logins on behalf of the nodes to the FCF's NPIV-enabled F_Port.

Because the NPV switch's proxy N_Port is passing along multiple logins and Fibre-channel frames with different FCIDs, the upstream switch's F_Port must be able to accept those different logins and frames containing different FCIDs.  Therefore, the upstream switch must be NPIV-enabled.

In this way you can see that NPV and NPIV are not the same thing, but they do complement one another.  An FC switch in NPV Mode acts like a Fibre-channel hub and passes on connections to the upstream switch, which must have NPIV capability in order to accept those connections on a single port.
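
Here is a small Python model of that relationship.  It's a sketch under loud assumptions: the classes FPort and NpvSwitch, the WWPNs, the domain ID 0x55, and the way FCIDs are handed out are all invented for illustration and don't reflect how any real switch allocates addresses.  What it does show is the dependency described above: the NP_Port logs in first, and every host login it proxies afterwards only succeeds if the upstream F_Port has NPIV enabled.

```python
class FPort:
    """An F_Port on the upstream switch (hypothetical model)."""

    def __init__(self, npiv_enabled, domain_id=0x55):
        self.npiv_enabled = npiv_enabled
        self.domain_id = domain_id
        self.logins = {}  # WWPN -> FCID

    def login(self, wwpn):
        # Without NPIV, an F_Port accepts exactly one login.
        if self.logins and not self.npiv_enabled:
            raise RuntimeError("NPIV is disabled: only one login allowed on this F_Port")
        # Simplified FCID allocation: domain byte plus a running counter.
        fcid = (self.domain_id << 16) | (len(self.logins) + 1)
        self.logins[wwpn] = fcid
        return fcid


class NpvSwitch:
    """A switch in NPV Mode: its NP_Port proxies logins to the upstream F_Port."""

    def __init__(self, np_wwpn, upstream):
        self.upstream = upstream
        self.np_fcid = upstream.login(np_wwpn)  # the NP_Port's own login

    def proxy_login(self, host_wwpn):
        # Hand the downstream node's login to the upstream F_Port.
        return self.upstream.login(host_wwpn)


fcf_port = FPort(npiv_enabled=True)
npv = NpvSwitch("20:00:00:00:00:00:00:01", fcf_port)
for wwpn in ("20:01:00:00:00:00:00:0a", "20:01:00:00:00:00:00:0b"):
    print(wwpn, hex(npv.proxy_login(wwpn)))
```

Constructing the FPort with npiv_enabled=False makes the first proxy_login call raise, since the NP_Port's own login has already used up the single login the F_Port allows.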


Misc

FCoE passthrough versus multihop
Multihop connects two Fibre-channel switches with VE_Ports.  The first Fibre-channel switch is the FCF and serves as the FCoE termination point.  FC traffic passes between this switch and the next Fibre-channel switch.  This forms an FCoE interswitch link.
FCoE passthrough (NPV mode) connects an FCF to a FIP Snooping Bridge, with the FCF's VF_Port going to the FIP Snooping Bridge's VN_Port.  The FIP Snooping Bridge is pretending to be a node.  In this setup, FCoE traffic passes between the FCF and the FIP Snooping Bridge.  FCoE passes through the bridge and is terminated at the FCF.
NPV mode versus AG mode
There is no difference.  They're different acronyms which mean the same thing: N_Port virtualization.  The NPV switch appears as a node to the upstream switch.  It is invisible to the rest of the FC fabric.  AG is a Brocade acronym; NPV is an acronym invented by Cisco.
NPV mode versus native mode
NPV allows a Fibre-channel switch to behave like a passthrough for FCoE traffic and can save on Fibre-channel ports, since it allows several FCIDs to come in through one F_Port.  Those logins can then be sent to the NP_Port, which passes them along to an NPIV-enabled FCF.  In native mode, however, you can only have a single FCID come in through its F_Port, and thus only one FLOGI.


Configuration example

Examples are always good.  I ripped this configuration example from here:
http://partnerdirect.dell.com/sites/channel/Documents/Deploying-FCoE-Dell-Force10-MXL-Networking-Whitepaper.pdf

I'm going to go through this example line by line and explain what everything does, and make general comments about things.

Scenario: Force10 as FSB and Nexus 5000 as FCF.  

Configuring an MXL as an FSB
  1. Turn on FCoE
    feature FIP-snooping
    fip-snooping enable
    protocol lldp
    service-class dynamic dot1p

    This enables FIP snooping and LLDP.  LLDP is necessary for lossless Ethernet because DCBx exchanges the ETS and PFC settings inside LLDP TLVs.  The service class tells the switch to prioritize based on the dot1p tag in order to give priority to FCoE traffic.
  2. Configure default-VLAN
    default vlan-id 20

    This command configures the default VLAN to be 20, so that all ports are in VLAN 20 untagged by default.  That means the Enode is going to be using VLAN 20 untagged to talk FIP to the FCF in order to discover the FCoE VLAN and FCF identity.
  3. Configure uplink FCF switch-facing ports
    int te0/52
    portmode hybrid
    switchport
    fip-snooping port-mode fcf
    protocol lldp
    no advertise dcbx-tlv ets-reco
    dcbx port-role auto-upstream
    no shutdown

    Hybrid mode is Force10's Native VLAN mode.  It allows the port to accept both tagged and untagged Ethernet frames.  Switchport puts the port into L2 mode for switching.  The FIP snooping port-mode indicates that the connected switch on the other end of te0/52 is an FCF.  The DCBx port-role auto-upstream setting indicates that the switch learns its PFC/ETS settings from the upstream switch.
  4. Configure downlink server-facing ports
    int te0/1
    portmode hybrid
    switchport
    protocol lldp
    dcbx port-role auto-downstream
    span pvst edge-port

    Span pvst edge-port configures the port to skip past the STP process of checking for loops and to put the port into the forwarding state right away.
  5. Configure VLAN interfaces for switch and assign ports
    int vlan 20
    int vlan 1000
    tagged te0/1,52
    fip-snooping enable
    no shut

    This tags the Enode and FCF-facing ports for VLAN 1000.  The FCoE VLAN must be tagged, because the FCoE relies on the dot1p value inside of the dot1q tag for ETS and PFC.  
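
To show where the dot1p value actually lives, here's a short Python sketch that packs an 802.1Q tag for the FCoE VLAN.  The function name dot1q_tag is made up for illustration, and priority 3 is only an assumption (it's a commonly used value for FCoE, but nothing in the configuration above sets it explicitly).  The layout itself is standard: a 16-bit TPID of 0x8100 followed by 3 bits of priority, 1 drop-eligible bit, and a 12-bit VLAN ID.

```python
import struct


def dot1q_tag(vlan_id: int, priority: int, dei: int = 0) -> bytes:
    """Pack a 4-byte 802.1Q tag: TPID 0x8100 followed by the 16-bit TCI."""
    if not 0 <= vlan_id <= 0xFFF:
        raise ValueError("VLAN ID must fit in 12 bits")
    if not 0 <= priority <= 7:
        raise ValueError("dot1p priority must be 0-7")
    if dei not in (0, 1):
        raise ValueError("DEI must be 0 or 1")
    tci = (priority << 13) | (dei << 12) | vlan_id
    return struct.pack("!HH", 0x8100, tci)


# VLAN 1000 from the example above; priority 3 is an assumed value.
print(dot1q_tag(1000, 3).hex())  # 810063e8
```

An untagged frame has no such field at all, which is why ETS and PFC have nothing to act on unless the FCoE VLAN is tagged.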

Configuring a Nexus 5000 as FCF
  1. Add features
    feature fcoe
    feature npiv
    feature telnet
    feature lacp

    These enable FCoE and NPIV to allow it to accept multiple logins from its vF_Port.
  2. Configure FC ports and settings if using unified ports
    slot 2
    port 15-16 type fc
    copy run start
    reload

    Designate the FC ports.
  3. Configure VSANs and FCoE VLANs.
    vsan database
       vsan 2
    vlan 1000
       fcoe vsan 2

    Define a VSAN and associate it with the FCoE VLAN.  I didn't talk about VSANs before, but I mention them in the glossary a bit.  A VSAN basically does for SANs what a VLAN does for LANs.
  4. Configure individual ports from MXL
    int eth 1/1
    switchport mode trunk
    switchport trunk native vlan 20
    switchport trunk allowed vlan 20,1000
    no shut

    This configures the trunk to the MXL for VLAN 20 untagged and 1000 tagged.  The native VLAN 20 allows for the initial FIP communication between the Enode and the FCF, and the tagged VLAN 1000 functions as the FCoE VLAN.
  5. Configure vFC interfaces and bind them to the Ethernet port.
    int vfc1
    bind int eth 1/1
    no shut

    This configures the vFC and binds it to the Ethernet port facing the FSB.  Cisco and others usually recommend binding the vFC to the FIP MAC of the Enodes.
  6. Configure VSAN database matching VFC and FC interfaces.
    vsan 2 int vfc1
    vsan 2 int fc2/15

    This associates the VSAN with the vFC and FC interfaces, joining the FC and Ethernet fabrics together.
  7. Configure zones and zonesets.
    zone name blade1 vsan 2
       member int fc2/15
       member pwwn xx:xx:xx:xx:xx:xx:xx:xx

    zoneset name set1 vsan 2
       member blade1

    zoneset activate name set1 vsan 2

    This configures the zones.  
Validating FIP Snooping

MXL-B1-Rack3015# show fip-snooping sessions
Enode MAC 5c:f9:12:34:56:78
Enode Intf Te 0/3
FCF MAC 54:7f:ee:56:55:49
FCF Intf         Te 0/52
VLAN 1000
FCoE MAC 0e:fc:00:55:00:04
FC-ID 55:00:04
Port WWPN 20:01:5c:f9:dd:16:ef:26
Port WWNN 20:00:5c:f9:dd:16:ef:26

Here are the parameters of a single virtual FC link as shown by the FSB.  You have the FCoE VLAN (1000), and you have the FCoE MAC of the Enode (0e:fc:00:55:00:04).  It's not clear to me whether that's the FCID of the Enode or of the device it's got an FC connection to.  You can see that the last three bytes of the FCoE MAC (55:00:04) are the FC-ID, and the first three bytes (0e:fc:00) are the FC-Map.
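
If you want to check that relationship rather than eyeball it, a few lines of Python will do.  This is a rough sketch that assumes the output looks exactly like the sample above; the field list and the function names (parse_session, fcoe_mac_matches_fcid) are invented for illustration and aren't part of any Dell tool.

```python
# Field labels as they appear in the sample output above (assumed format).
FIELDS = ("Enode MAC", "Enode Intf", "FCF MAC", "FCF Intf", "VLAN",
          "FCoE MAC", "FC-ID", "Port WWPN", "Port WWNN")

SAMPLE = """\
Enode MAC 5c:f9:12:34:56:78
Enode Intf Te 0/3
FCF MAC 54:7f:ee:56:55:49
FCF Intf         Te 0/52
VLAN 1000
FCoE MAC 0e:fc:00:55:00:04
FC-ID 55:00:04
Port WWPN 20:01:5c:f9:dd:16:ef:26
Port WWNN 20:00:5c:f9:dd:16:ef:26
"""


def parse_session(text):
    """Turn 'label value' lines into a dict keyed by the known labels."""
    session = {}
    for line in text.splitlines():
        line = line.strip()
        for field in FIELDS:
            if line.startswith(field + " "):
                session[field] = line[len(field):].strip()
                break
    return session


def fcoe_mac_matches_fcid(session):
    """True if the last three bytes of the FCoE MAC equal the FC-ID."""
    octets = session["FCoE MAC"].lower().split(":")
    return ":".join(octets[3:]) == session["FC-ID"].lower()


session = parse_session(SAMPLE)
print(session["FCoE MAC"], session["FC-ID"], fcoe_mac_matches_fcid(session))
```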


Validating vFC
switch# show interface vfc 4

vfc4 is up
Bound interface is Ethernet1/4
Hardware is Virtual Fibre Channel
Port WWN is 20:02:00:0d:ec:6d:95:3f 
snmp link state traps are enabled
Port mode is F, FCID is 0x490100
Port vsan is 931
1 minute input rate 0 bits/sec, 0 bytes/sec, 0 frames/sec 
1 minute output rate 0 bits/sec, 0 bytes/sec, 0 frames/sec
0 frames input, 0 bytes 0 discards, 0 errors 
0 frames output, 0 bytes 0 discards, 0 errors
Interface last changed at Thu Mar 11 04:44:42 2010

Here you can see how the vFC is like a Fibre-channel port.  It has a WWpN and an associated FCID, it's an F_Port (going to the FSB's NP_Port), and it's associated with a VSAN.
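
The FCID in that output (0x490100) can be pulled apart into the three one-byte fields described in the glossary below: Domain ID, Area ID, and Port ID.  Here's a minimal Python sketch; the function name split_fcid is made up for illustration.

```python
def split_fcid(fcid: int) -> dict:
    """Split a 24-bit FCID into its Domain ID, Area ID, and Port ID bytes."""
    if not 0 <= fcid <= 0xFFFFFF:
        raise ValueError("FCID must fit in 24 bits")
    return {
        "domain": (fcid >> 16) & 0xFF,  # first byte
        "area": (fcid >> 8) & 0xFF,     # second byte
        "port": fcid & 0xFF,            # third byte
    }


parts = split_fcid(0x490100)
print({name: hex(value) for name, value in parts.items()})
```

For 0x490100 that gives a Domain ID of 0x49, an Area ID of 0x01, and a Port ID of 0x00.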


Glossary

AG Mode: Access gateway mode.  This is a Brocade word.  This is the same thing as NPV Mode.  This mode uses N_Port virtualization.  An AG presents itself to the fabric as a single node when it may be connecting several.  It uses F_Ports to connect to hosts, and an N_Port to connect to the FCF.  This saves on ports because a single N_Port can map to several F_Ports, each of which connects to an N_Port on the hosts.  This is in contrast to native mode, which requires a 1:1 E (ISL) to F (switch to node) to N (node).
CEE: Converged Enhanced Ethernet = IBM’s trademarked term for their implementation (trademarked April 18th 2007, 18 months after Cisco trademarked Cisco Data Center Ethernet).
CEE Port:
DCB: Datacenter bridging.
DCBx: Data Center Bridging Capabilities Exchange Protocol.  A discovery and capability exchange protocol that is used for conveying capabilities and configuration of the above features between neighbors to ensure consistent configuration across the network.
Domain ID: Eight-bit field in the FCID.
ENode: This means FCoE Node, which got abbreviated to ENode.  An ENode is an FCoE termination point on the CNA, which provides the functionality of a Fibre-channel HBA with lossless Ethernet capability.  These end devices produce VFC (virtual Fibre channel) interfaces as VN_Ports connecting to VF_Ports on the FCF.  These devices create the FCoE tunnel endpoint on the host end.
E_Port: Expansion port on the switch, which is used to connect to another switch.
FCID: N_Port identifier.  A 24-bit value which identifies an individual Fibre-channel host.  Unlike a WWN, an FCID is what is used in the source and destination fields of the Fibre-channel header.  The first byte in the FCID is the Domain ID, which identifies the switch the host logged in to.  The domain ID is unique to each switch.  The second byte is the Area ID, used to identify an N_Port that is connected to a switch.  The third is the Port ID, which is used to identify a single FC object on the Fibre-channel fabric.  The FCID is a bit like an IP address, where a WWN is like a MAC address.  FCIDs are uniquely assigned within a SAN.
FC-MAP: FCoE MAC Address Prefix.  The range is 0e:fc:00 to 0e:fc:ff.  The VN_Port is given a fabric-provided MAC address, or FPMA.  This is given to the Enode in the FIP advertisement.  An FC-MAP is like an identifier for individual SANs.  It must be configured on the NPV bridge, which helps the FCoE NPV bridge isolate misconnections to FCFs in other fabrics.
FCoE-FIP MAC: An Enode MAC address used to associate a CNA to the created VFC.
FCoE-WWN: The WWN of the CNA which will be used for zoning.
FDISC: Fabric Discovery. Subsequent logins from the same ENode for different users, applications, or virtual machines after an ENode performs an initial FLOGI to log in to a switch.  FC and FIP FDISC messages serve the same function in FC and FCoE networks, respectively. N_Ports send FC FDISC messages to the FC switch and VN_Ports send FIP FDISC messages to the FCF.
FIP: FCoE Initialization Protocol.  An L2 protocol for endpoint discovery and fabric association.  FIP frames have their own Ethertype (FIP).
FIP-ADV-Period: FIP keep alive period.  These are transmitted to the MAC address of the Enode from the FCF.  It's on average 128 seconds.
FLOGI: Fabric Login.  F_Port Login.  Logical connection to the FC switch.  This process gives an FCID to a node.  For FC devices, an N_Port logs in to the FC network by sending an FC FLOGI message to the F_Port of an FC switch.  For FCoE devices, a VN_Port logs in to the FC network by sending a FIP FLOGI message to the VF_Port of an FC switch.
FPMA: Fabric-Provided MAC Address.  It is also sometimes referred to as an FCoE MAC address.  This consists of two parts: a 24-bit FC-Map and a 24-bit FCID.
F_Port: Fabric port on the switch, which connects to a node point-to-point.
FSB: A FIP Snooping Bridge is an intermediate switch between the ENode and the FCF. By snooping on FIP packets during the discovery and login phases, intermediate bridges can implement dynamic data integrity mechanisms using ACLs that permit valid FCoE traffic between the ENode and FCF. Implementing such security mechanisms ensures that only valid FCoE traffic is allowed. This is FIP snooping.
FWWN: Fabric World Wide Name.  WWN on each port on a fabric switch.
NPAR: NIC partitioning done at the card.  The driving standards are Dell and QLogic.
These have an 'FCoE controller' for each of its Ethernet ports. The controller creates VN_Ports. NPAR does require a special switch. NPAR allows you to set up multiple apparent NICs according to a dot1q tag. Because dot1q has eight possible values this means that NPAR can only allow for eight virtual NICs.
N_Port: Node port on the host.
NPV Mode: N_Port virtualization.  This is a Cisco-specific word.  Brocade has it and calls it 'Access Gateway' mode.  A switch in NPV mode is invisible to the fibre-channel fabric.  It doesn't participate in fabric services.  This uses proxy N_Ports, which make requests for fabric logins on behalf of downstream nodes.  NPV and NPIV are commonly used together.
NPIV: N_Port ID Virtualization.  NPIV allows a single physical N_Port to have multiple WWpNs, and therefore multiple N_Port_IDs, associated with it.  An NPIV-enabled physical N_Port can subsequently issue additional commands to register more WWPNs and receive more N_Port_IDs (one for each WWPN).  Without NPIV, a host can only have a single WWPN per F_Port.  Therefore, without NPIV, only one fabric login is allowed.
NPV: This is similar to NPIV, but it instead uses an 'NP_Port' on the switch which requests WWPNs via NPIV on behalf of other N_Ports connected to it.  This is switch-based, rather than NPIV, which is host-based.
NP_Port: Proxy N_Port.  This connects to an F_Port and acts like a proxy for other N_Ports on the NPV-enabled switch.
N_Port_ID: is a 24-bit address assigned by the Fibre Channel switch during the FLOGI process.  The N_Port_ID is not the same as the World Wide Port Name (WWPN), although there is typically a one-to-one relationship between WWPN and N_Port_ID.
NWWN: Node World Wide Name.  An NWWN is valid for multiple ports that are on that node (this identifies the ports as network interfaces of a particular node).
PLOGI: N_Port login.  A request to log in to another N_Port, made before any data exchange between ports.  Hosts register their WWpNs with the name server, sending the mapping of WWpN to FCID.  The name server can then expose that map and allow for communication between devices.
PRLI: Process Login.  Finished.  Now devices can talk via SCSI.
PWWN: The same thing as World Wide Port Name, or WWpN.
VN_Port: Emulates an FC N_Port.
SCR: State Change Registration.  Nodes are able to send notifications to the name server which can be sent to all other nodes in the fabric regarding major changes to the fabric--stuff like nodes joining and leaving the fabric, switches joining or leaving the fabric, or changing the switch name.
SR-IOV: Single Root I/O virtualization. This is a PCI-SIG standard which allows a CNA to appear as multiple NICs in the operating system while using up only a single I/O resource.  It provides a mechanism by which a Single Root Function can appear to be multiple separate physical devices. In contrast to NPAR, NIC partitioning with SR-IOV is done by the operating system.  This does not require a special switch.
SPMA: (Server Provided MAC Addresses) MAC address that an ENode assigns to one of its ENode MACs and is not assigned to any other ENode MAC in the same FCoE VLAN. An SPMA can be associated with more than one VN_Port at that ENode MAC.
VE_Port: Emulates an FC E_Port.  V means that it's an Ethernet port.
VF_Port: Emulates an FC F_Port.  V means that it's an Ethernet port.
vFC: Virtual Fibre-channel interface. This is configured on the FCF. This is used for the FC part of the FCoE configuration. It must be bound to a physical Ethernet port or the MAC address of the CNA's FIP MAC. This is needed for connecting Ethernet and Fibre-channel services together. You must map a VLAN to a VSAN, map the VSAN to the VFC, and map the VFC to an Ethernet port or Enode's FIP MAC. So it's sort of a Fibre-channel port that has an associated VSAN as well as an associated Ethernet port/Enode FIP MAC.
VFID: Virtual fabric ID.
VSAN: Virtual SAN. A collection of ports from a set of connected Fibre Channel switches that form a virtual fabric. Ports within a single switch can be partitioned into multiple VSANs, despite sharing hardware resources. Conversely, multiple switches can join a number of ports to form a single VSAN. A VSAN is very similar to a VLAN in that sense, and the term was invented by Cisco.
WWN: World Wide Name. These are eight-byte addresses, usually represented in hex form. They're used in storage technologies like Fibre-channel, ATA, and SAS. They can be used to refer to a switch, individual ports on a switch, or nodes.
WWnN: World Wide Node Name. This is an example of a World Wide Name. It is an eight-byte number used as a unique identifier in Fibre-channel to identify a node in the Fibre-channel fabric.  This might be a multiport HBA.
WWpN: World Wide Port Name. This is an example of a World Wide Name. It is an eight-byte number used as a unique identifier in Fibre-channel to identify a port in the Fibre-channel fabric.  This might be an individual port, on say, a multiport HBA.

Welcome to Wo-Net

It's a terrible name, but I needed something.  Perhaps I'll change it to something else later.  The purpose of this blog is to help me understand networking concepts better.  Perhaps others can benefit from it as well.  It's inspired by other blogs like evilrouters and packetlife and it's in the same vein.

A little about myself:
  • My name is Andrew Waranowski.  
  • I work at Dell doing technical support.  
  • My certifications are: CCNA, CCNP, ACMA, VCA-DCV.  
  • I'm 30 years old.  
  • I currently live in Round Rock, TX, two minutes away from Dell HQ. 
  • I play bass guitar and trombone.  
I'm interested in networking because networks, especially ISPs, have become a utility on which the modern world runs.  Software, businesses, and applications all rely on computer networks.  I also think it's pretty cool how we're able to zip giant sums of meaningful information from one continent to another in milliseconds, and how realtime communication is made possible by encapsulating computer information into packets with source and destination addresses, which are passed along by special-purpose devices we call routers.  It's pretty freaking nuts where we are right now, and I think it's only the beginning.  So yeah, I think networking is interesting and important.

Aspects of networking that I'm particularly interested in are:
  • Software-defined networking
  • Security
  • Simplifying networking so that more networks use a common platform. 
  • The application of ordinary and well known engineering concepts to the engineering of networks and network devices.  
  • Redundancy, Fault tolerance, and redundancy tolerance
  • Network reliability
Anyway, I know a good deal, but wouldn't yet consider myself an expert.  So if I make mistakes, please feel free to correct me.  Enjoy.