Layer 3 Virtual Private Network (VPN) services are widely used to exchange enterprises' private routed traffic over a public infrastructure in a fast, reliable and secure way. Today, Layer 3 VPN services are mostly provisioned on the IP/MPLS core network, using a flat model among the core PE routers. Access to the L3 VPN is mostly provided over traditional transport networks such as SDH. SDH networks provide the required resiliency for most of the network elements and links involved in the service, and limit the Layer 3 scalability requirements to the relatively small size of the core network; however, they lack statistical multiplexing and the ability to perform local routing and content insertion within the aggregation network. In addition, due to the flat model, PE connectivity is mostly done without any traffic engineering, so hard QoS cannot be delivered to the end customers.

The Orckit-Corrigent Packet Transport Network (PTN) solution provides a novel hierarchical L3 VPN model which improves the scalability, resiliency and operational simplicity of Layer 3 VPNs, while at the same time enabling statistical multiplexing and the ability to perform local routing and content insertion within the aggregation network. This hierarchical L3 VPN model conceptually resembles that of H-VPLS (compared with flat VPLS) for L2 VPN services. In this model, core PEs are connected in a full LSP mesh with full BGP peering, and leaf PEs are connected only to core PEs. This approach provides a scalable, reliable and cost-effective solution with traffic engineering, Quality of Service (QoS) and ease of operation that allows optimal utilization of network resources.

This technical note describes the hierarchical L3VPN solution in Orckit-Corrigent's PTN portfolio.
L3 VPN deployment today
Figure 1 shows a typical deployment scenario of L3 VPN today. L3 VPN service delivery components are:
Customer Edge (CE) routers, operated by the VPN end customers
Provider Edge (PE) routers, operated by the service providers.
Aggregation network, typically SDH, interconnecting the CE and PE routers.
Core network (P routers), interconnecting the PE routers.
In the typical scenario, up to hundreds of PE routers are used to deliver the L3 VPN service. In order to interconnect those PE routers, a logical full mesh is required between them in two aspects:
Control plane – a full mesh of BGP peering to exchange VPN routes
Data plane – a full mesh of LSPs to exchange VPN traffic
The PEs in the core network are usually high-cost, high-scalability devices that support the required number of BGP peers, LSPs and the large routing tables needed to accommodate the entire network. Still, these devices do experience scalability limitations when it comes to a full mesh of LSPs. While a full mesh of LSPs is supported from a data plane perspective, establishing the LSPs in a stateful manner using the RSVP-TE protocol, which enables traffic engineering and fast protection, overloads the PE CPUs; hence the common practice is to establish these LSPs using the LDP protocol. The advantages of LDP for LSP establishment are:
It is stateless (i.e., no keep-alive messaging is required to maintain these LSPs; such keepalive messaging loads the PE routers’ CPUs)
It automatically connects any new PE to all existing PEs without the need for any manual provisioning.
On the other hand, it has two major disadvantages:
1. It lacks traffic engineering capabilities. VPN traffic can use a differentiated services QoS model, where traffic classes have different priorities, but no bandwidth can be reserved for the VPN users, so hard QoS with bandwidth reservations cannot be supported.
2. It lacks failure detection that can trigger fast protection mechanisms such as Fast Reroute (FRR). In fact, LDP relies on IGP convergence to re-distribute labels following any network failure event. Such convergence can take many seconds, leading to undesirable traffic hits.
The aggregation network, interconnecting the CE to PE routers, is traditionally an SDH network. While SDH networks provide sub-50ms resiliency for failures in the vast majority of network elements and links used to deliver the service, they lack statistical multiplexing and the ability to perform local routing and content insertion within the aggregation network.
Challenges with flat L3 VPN model
Global trends call for distributing L3 VPN service delivery into the aggregation network: exponentially increasing amounts of data traffic, which make the constant bit rate allocation of SDH networks commercially inviable; the surge in video traffic, which calls for distributed content caching in the aggregation network; and the introduction of LTE mobile services, which use an all-IP backhaul network. This distribution creates new challenges in delivering L3 VPN services:
The traffic pattern in core networks is well aggregated – due to the law of large numbers – and not very bursty, which allows modest over-provisioning of the core network to be statistically sufficient for meeting the service SLAs, without strict bandwidth reservation and guarantees. In contrast, traffic in access and aggregation networks is very bursty, and over-provisioning is not a commercially viable option. Hence, to allow statistical multiplexing in the aggregation network, traffic engineering tools providing bandwidth reservation must be used.
While it might be acceptable to rely on IGP convergence for failures in the core, as long as failures in the aggregation are recovered within sub-50ms by SDH, it is not acceptable to rely on IGP convergence for failures in the access and aggregation network.
Distributing L3 VPN service delivery from the core PE routers (hundreds of PEs) to the access and aggregation networks (tens of thousands of PEs) increases the scalability requirements in a manner proportional to the square of the number of PEs (on the order of 10,000² LSPs and BGP peerings in the network rather than 100²). This puts an enormous challenge on both the network element hardware and the operational effort required to provision, maintain and troubleshoot such networks.
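The quadratic growth above can be made concrete with a back-of-the-envelope sketch. The function names and the leaf/core split below are illustrative assumptions, not figures from any deployment; the dual-homing of each leaf to two core PEs follows the hierarchical model described later in this note.

```python
# Back-of-the-envelope comparison of control-plane adjacencies (BGP peerings,
# or equivalently bidirectional LSP pairs) in a flat full-mesh L3 VPN versus
# the hierarchical model. All numbers are illustrative.

def full_mesh_links(n: int) -> int:
    """Adjacencies in a full mesh of n PEs: n * (n - 1) / 2."""
    return n * (n - 1) // 2

def hierarchical_links(leaves: int, cores: int, homes: int = 2) -> int:
    """Core full mesh, plus each leaf PE homed to `homes` core PEs."""
    return full_mesh_links(cores) + leaves * homes

flat = full_mesh_links(10_000)                      # every PE peers with every other
hier = hierarchical_links(leaves=9_900, cores=100)  # 100 core PEs, dual-homed leaves

print(f"flat full mesh: {flat:,} adjacencies")      # ~50 million
print(f"hierarchical:   {hier:,} adjacencies")      # ~25 thousand
```

The same 10,000-PE network drops from roughly 50 million adjacencies to roughly 25 thousand, which is why the hierarchy moves the scalability burden off the leaf nodes.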
The operational teams of access and aggregation networks typically lack deep knowledge and expertise in operating complex IP networks. An operational model which resembles that of SDH is required: a GUI-based Network Management System (NMS) rather than a textual Command-Line Interface (CLI), with static, deterministic and predictable connection-oriented provisioning and OAM tools.
While high-end, high-scalability, high-cost and high-power devices may be acceptable for the core network, access nodes are required to be low cost and low power, and are often also required to be environmentally hardened and to work across an extended temperature range. This requirement conflicts directly with the order-of-magnitude increase in scalability – the CPU and memory resources available in access nodes are significantly lower, not higher, than in core nodes.
L3VPN hierarchical model
In the hierarchical L3VPN model, each leaf PE is connected to one or two core PEs, and a full LSP mesh is maintained only between the core PEs. This contrasts with the flat L3VPN model, which requires a full LSP mesh between all PEs, where each PE maintains routes for all the VPNs of its connected sites.
Figure 2 shows the hierarchical L3VPN model. The PEs in the access and core networks are provisioned with VPN routing and forwarding entities (VRFs). Customer Edge (CE) routers exchange VPN routing information with the access PEs using static routes or IGP routing protocols such as OSPF. Each leaf PE is logically connected only to the L3VPN core PEs; no connectivity is required between leaf PEs.
Each leaf PE learns the VPN routes from its connected CEs and advertises them to its associated core PEs. The VPN route advertisement uses the BGP protocol with multiprotocol extensions (MP-BGP), where each advertised route is associated with an MPLS label. VPN routes advertised in this manner are known as labeled VPN routes. Core PEs also advertise VPN routes (learned from leaf PEs) to all other core PEs, using the same MP-BGP mechanism. However, and this is a key aspect of the hierarchical model, core PEs advertise only a default VPN route for each VPN to their associated leaf PEs. In this manner, leaf PEs store and maintain routes only for their connected CEs, while core PEs store and maintain VPN routes for all the VPNs in the network. Core PEs exchange BGP updates (BGP peering) with all other core PEs and with their associated leaf PEs, while leaf PEs establish BGP peering only with the core PEs.
VPN traffic is exchanged between leaf and core PEs over MPLS LSPs and is further identified by the label assigned in the MP-BGP advertisements. This label is used at the PEs to direct the traffic to the correct VRF, where the traffic is then routed using traditional IP routing techniques (e.g., based on the destination IP address).
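The two-step lookup just described, VPN label selects the VRF, then ordinary longest-prefix IP routing runs inside that VRF, can be sketched in a few lines. This is a conceptual model only, not vendor code; the VRF name, label value, next-hop names and prefixes are all invented for illustration, and a leaf PE's VRF is shown holding just its local CE route plus the default route advertised by a core PE.

```python
# Conceptual sketch (not vendor code) of label-to-VRF demultiplexing on a
# leaf PE, followed by longest-prefix-match routing inside the selected VRF.
import ipaddress

class Vrf:
    def __init__(self, name):
        self.name = name
        self.routes = {}  # ip_network prefix -> next-hop name

    def add_route(self, prefix, next_hop):
        self.routes[ipaddress.ip_network(prefix)] = next_hop

    def lookup(self, dst):
        """Traditional IP routing: longest prefix that contains dst wins."""
        dst = ipaddress.ip_address(dst)
        best = max((p for p in self.routes if dst in p),
                   key=lambda p: p.prefixlen, default=None)
        return self.routes[best] if best is not None else None

# Table built from MP-BGP labeled VPN routes: MPLS label -> VRF.
label_to_vrf = {}

vrf_a = Vrf("CUST-A")                        # hypothetical customer VRF
vrf_a.add_route("10.1.0.0/24", "CE1")        # local route learned from the CE
vrf_a.add_route("0.0.0.0/0", "core-PE1")     # default route from the core PE
label_to_vrf[3001] = vrf_a                   # label 3001 is illustrative

def forward(label, dst_ip):
    vrf = label_to_vrf[label]   # step 1: the VPN label selects the VRF
    return vrf.lookup(dst_ip)   # step 2: normal IP lookup inside the VRF

print(forward(3001, "10.1.0.7"))    # local /24 wins -> CE1
print(forward(3001, "172.16.5.9"))  # no specific route -> default -> core-PE1
```

Note how the small leaf table (one local route plus a default) is exactly what makes the leaf PE cheap: any destination it does not know is simply handed to the core PE, which holds the full VPN routing table.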
The hierarchical model greatly relaxes the scalability requirements of leaf PEs, both in the control plane (BGP peering with core nodes only; no BGP peering is required with other leaf PEs) and in the data plane (LSPs to the core PEs only, rather than to other leaf PEs, and a relatively small routing table holding the local routes plus two default routes to the core PEs per VPN, as opposed to the full routing table, which needs to be stored only at core PEs).
Cost and power dissipation
The relief in scalability requirements in the leaf nodes directly translates to a reduced cost, power consumption and operational complexity of the leaf nodes, which comprise the majority of network elements in the network.
Quality of Service (QoS)
In both the MPLS-TP and IP/MPLS portions of the network, traffic engineering capabilities allow optimal utilization of network resources. The PTN PE provides network-wide traffic engineering, advanced QoS and Connection Admission Control (CAC) for SLA assurance. LSPs connecting leaf to core PEs can be provisioned statically, confining all the scalability requirements to the Path Computation Element (PCE) in the NMS, while RSVP-TE signaling provides traffic engineering for LSPs connecting core PEs. With this scheme, statistical multiplexing can be performed effectively in the aggregation network without degrading QoS or jeopardizing compliance with the customers' SLAs.
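The CAC function mentioned above can be illustrated with a minimal sketch: a new LSP's bandwidth reservation is admitted on a link only if it fits within the link's remaining reservable bandwidth. The class, the 80% reservable fraction, and the LSP names and bandwidths below are all illustrative assumptions, not parameters of any product.

```python
# Minimal sketch of Connection Admission Control (CAC) on a single link:
# admit a new LSP reservation only if it fits the remaining reservable
# bandwidth. All names and numbers are illustrative.

class Link:
    def __init__(self, capacity_mbps, reservable_fraction=0.8):
        # Only a fraction of raw capacity is offered for reservations,
        # leaving headroom for control traffic and bursts (assumed policy).
        self.reservable = capacity_mbps * reservable_fraction
        self.reserved = 0.0

    def admit(self, lsp_name, bw_mbps):
        """Reserve bandwidth for the LSP if it fits; otherwise reject it."""
        if self.reserved + bw_mbps <= self.reservable:
            self.reserved += bw_mbps
            return True
        return False

link = Link(capacity_mbps=1000)        # 1 Gb/s link -> 800 Mb/s reservable
print(link.admit("vpn-blue", 500))     # True:  500 of 800 Mb/s reserved
print(link.admit("vpn-red", 250))      # True:  750 of 800 Mb/s reserved
print(link.admit("vpn-green", 100))    # False: 850 would exceed 800 Mb/s
```

Because every admitted LSP has reserved bandwidth, statistical multiplexing can fill the unreserved headroom without ever starving committed traffic, which is the mechanism behind the hard-QoS claim above.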
As described above, each leaf PE is associated with two core PEs to provide a backup VRF in case of a core PE failure, where only one of the core PEs is active at a time.
MPLS-TP LSPs with end-to-end MPLS-TP LSP protection (based on ITU-T G.8131), or IP/MPLS LSPs provisioned using RSVP-TE with FRR, deliver sub-50ms protection and restoration against failures in the nodes and links interconnecting the leaf PEs with the core PEs.
In the core network, IP/MPLS LSPs transport the traffic between the core PEs, enabling sub-50ms recovery using RSVP-TE signaling with FRR protection.
Operation and Maintenance (OAM)
For L3VPN fault detection, MPLS OAM monitors the network. MPLS-TP OAM monitors the MPLS-TP PW/LSP paths between the leaf PEs and the core PEs, while IP/MPLS OAM, which includes ping, traceroute and BFD at the LSP layer, is used between the core PEs or between leaf and core PEs. The bidirectional nature of MPLS-TP LSPs, together with the fact that the OAM path follows the data path, maintains an SDH-like operational model in the aggregation network and greatly simplifies the operation of the vast majority of the network, as shown in Figure 3.
The hierarchical model limits the number of BGP peers, LSPs and routes in the leaf nodes, replacing a mesh with a tree structure that is much easier to provision, maintain and troubleshoot than the flat L3 VPN model.
Traditional L3 VPNs are managed using a textual command-line interface (CLI) by highly trained technicians with deep knowledge of complex IP networks. A GUI-based, service-oriented Network Management System (NMS), with IP/MPLS and MPLS-TP as the underlying packet technologies, supports both static and dynamic control planes for end-to-end service provisioning while maintaining A-to-Z point-and-click provisioning for both options. It allows operators to provision LSPs and PWs statically, similar to the way it is done in legacy transport networks, without IP or routing protocols.
The NMS enables full management of L3VPN services through end-to-end graphical views, including node/VRF add/remove/edit, wizard-based addition and duplication of VPN endpoints, automatic any-to-any OAM activation for connectivity verification, PM reports, and more.
Orckit-Corrigent CM-4000 end-to-end PTN solutions present a scalable, resilient, cost-effective and simple-to-operate hierarchical L3VPN model. The hierarchical model reduces the number of LSPs, routes and BGP peerings, which in turn reduces the cost and complexity of the access PTN nodes. In addition, the use of MPLS-TP as the infrastructure of the L3VPN service enables sub-50ms restoration time, enhanced OAM tools, quality of service and traffic engineering, and efficient multicast distribution.

Orckit-Corrigent's CM-View NMS provides full GUI-based management of L3VPN services through end-to-end graphical views and a rich set of automated OAM tools. The NMS eliminates the need for costly, complex and error-prone CLI provisioning, significantly simplifying operation and reducing OPEX.