Private Cloud Infrastructure: IaaS or NFVi

Intended audience: C-level executives, technical managers, salespeople, solution architects, technical support & operations teams & any cloud enthusiast

Disclaimer - This article does not use any fancy business vocabulary. It is a clear, to-the-point presentation covering technical aspects only.

In my previous article, I talked about choosing between a public & a private cloud for your organization. In this article, we will talk specifically about building a private cloud infrastructure.


What is private cloud infrastructure? 


When an organization decides not to host its applications on a public cloud platform (AWS, Azure, or Google Cloud) and instead builds its own infrastructure that is scalable, vendor-agnostic, reachable (connected) from anywhere & economical in the long run, the result is called a private cloud infrastructure. Physically, it is a collection of COTS (commercial off-the-shelf) servers, storage arrays, switches & cables in a server rack (or multiple racks). In practice it looks like this -


Image By Asad Khan


Converting Private cloud into infrastructure as a service (IaaS/NFVi)

Now, let's talk about building a solution on top of the hardware, because purchasing the hardware alone does not make it a cloud solution. A type-1 hypervisor (either KVM or ESXi) is installed on this hardware, where it also acts as the host OS kernel. To manage the whole system, a cloud management system (CMS) is needed on top of the hypervisor; this is OpenStack in most cases, or VMware vSphere in some cases. Refer to the figure below -


Image By Asad Khan
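The hypervisor layer described above only works if the CPU exposes hardware virtualization extensions. As a minimal sketch, assuming an x86 Linux host, the check below looks for the `vmx` (Intel VT-x) or `svm` (AMD-V) flag in `/proc/cpuinfo`-style text and for the `/dev/kvm` device. The function names `virtualization_flags` and `kvm_ready` are hypothetical helpers made up for this illustration, not part of any tool.

```python
from pathlib import Path


def virtualization_flags(cpuinfo_text: str) -> set[str]:
    """Return the hardware virtualization flags (vmx/svm) found in cpuinfo text."""
    found: set[str] = set()
    for line in cpuinfo_text.splitlines():
        # On x86 Linux, every logical CPU has a "flags : ..." line.
        if line.startswith("flags"):
            _, _, value = line.partition(":")
            found |= {"vmx", "svm"} & set(value.split())
    return found


def kvm_ready(cpuinfo_text: str, kvm_device_present: bool) -> bool:
    """KVM needs CPU support plus the /dev/kvm device created by the kernel module."""
    return bool(virtualization_flags(cpuinfo_text)) and kvm_device_present


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    # Fall back to empty text on hosts without /proc/cpuinfo (assumption: non-Linux).
    text = cpuinfo.read_text() if cpuinfo.exists() else ""
    print("virtualization flags:", sorted(virtualization_flags(text)))
    print("KVM usable:", kvm_ready(text, Path("/dev/kvm").exists()))
```

If the flags are missing on a server that should support them, virtualization is often simply disabled in the BIOS/UEFI settings.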


This is where the solution architect comes in, turning this stack into an IaaS (in general terms) or an NFVi (in telco terminology). NFVi stands for NFV infrastructure, also known as Telco Cloud infrastructure. The Telco Cloud infrastructure remains the same whether it is used for 4G, 5G, or 6G.


Solution Architects

There are two types of solution architects: one who builds the infra (known as the NFVi architect) and one who builds the applications over the infra (known as the application architect). We will talk about the NFVi architect here & discuss the application architect later in this article. The NFVi architect will consider the following technical points while building the solution. I am not including the usual utilities & environmental requirements (AC, power, backup, etc.) in this discussion.


Disclaimer - Do not be afraid of the technical terms used below. If you are not familiar with them, that is OK. There are many things in this world which we don't know. Just read the article.


Hardware (CPU/RAM)

The compute nodes should be blade servers with high CPU performance, hyperthreading support, NUMA-aware hardware, a large amount of RAM, and hugepage support. Which vendor will give all these features at the lowest cost? Will it be HP, Dell, Nokia AirFrame, Inspur, or any other vendor?
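To make the hugepages requirement concrete, here is a minimal sketch that reads the hugepage counters Linux exposes in `/proc/meminfo` and converts them to reserved memory. The helper names `hugepage_summary` and `reserved_hugepage_gib` are hypothetical, and the sample figures in the usage note are invented for illustration.

```python
def hugepage_summary(meminfo_text: str) -> dict[str, int]:
    """Extract hugepage counters from /proc/meminfo-style text."""
    wanted = {"HugePages_Total", "HugePages_Free", "Hugepagesize"}
    values: dict[str, int] = {}
    for line in meminfo_text.splitlines():
        key, sep, rest = line.partition(":")
        if sep and key in wanted:
            # Hugepagesize is reported in kB; the other two are page counts.
            values[key] = int(rest.split()[0])
    return values


def reserved_hugepage_gib(summary: dict[str, int]) -> float:
    """Memory reserved for hugepages, in GiB (pages x page size in kB)."""
    kib = summary.get("HugePages_Total", 0) * summary.get("Hugepagesize", 0)
    return kib / (1024 * 1024)
```

On a real host you would pass in the contents of `/proc/meminfo`. For example, a host reporting 64 pages of 1048576 kB each has 64 GiB set aside for hugepage-backed workloads such as DPDK-based VNFs.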


Hardware (Storage arrays) 

Choose disks that suit the customer: SSD, NVMe, or a combination of both? Should we have separate storage arrays, or should the disks be attached within the compute nodes (called HCI - hyper-converged infrastructure)? Should we use Ceph as a combined solution for both block & object storage, or a non-Ceph solution like HPE 3PAR? Again, the best choice at the lowest cost - HPE, Dell EMC Unity, NetApp, or any other vendor?
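One reason the storage decision affects cost so much is that a replicated Ceph pool stores every object several times, so usable capacity is only a fraction of the raw disk capacity. The sketch below is a rough sizing aid, not a Ceph tool. The function name `usable_capacity_tb` is hypothetical, the default of 3 replicas matches the usual default for replicated pools, and the 0.85 fill limit is an assumed safety margin you should adjust to your own policy.

```python
def usable_capacity_tb(nodes: int, disks_per_node: int, disk_tb: float,
                       replicas: int = 3, fill_limit: float = 0.85) -> float:
    """Estimate usable capacity (TB) of a replicated storage cluster.

    raw capacity = nodes x disks per node x disk size
    usable       = raw / replicas x fill_limit
    """
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    if not 0 < fill_limit <= 1:
        raise ValueError("fill_limit must be in the range (0, 1]")
    raw_tb = nodes * disks_per_node * disk_tb
    return raw_tb / replicas * fill_limit


if __name__ == "__main__":
    # Hypothetical cluster: 10 nodes, 12 disks each, 4 TB per disk.
    print(f"{usable_capacity_tb(10, 12, 4.0):.1f} TB usable")
```

With these invented numbers, 480 TB of raw disk yields roughly 136 TB of usable space, which is the figure to compare against a quote for a dedicated storage array.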


Hardware (network adapters NICs)

The available options are Intel and Mellanox NICs. Decide between 2 or 3 NIC pairs per blade server, the maximum capacity (up to 25 Gbps), and which data-path options must be supported: OVS, DPDK, or SR-IOV. All the blade servers are interconnected & further connected to the rest of the world through leaf & spine switches. The choices for high-quality switches include HP, Dell, Arista, Cisco, etc.
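When sizing the leaf & spine fabric, a common sanity check is the oversubscription ratio of each leaf switch: the total server-facing bandwidth divided by the total uplink bandwidth towards the spines. The sketch below shows the arithmetic. The function name `oversubscription_ratio` and the port counts in the example are assumptions for illustration only.

```python
def oversubscription_ratio(server_ports: int, server_port_gbps: float,
                           uplinks: int, uplink_gbps: float) -> float:
    """Leaf oversubscription = server-facing bandwidth / spine-facing bandwidth."""
    downlink_gbps = server_ports * server_port_gbps
    total_uplink_gbps = uplinks * uplink_gbps
    if total_uplink_gbps <= 0:
        raise ValueError("a leaf switch needs at least one uplink with capacity")
    return downlink_gbps / total_uplink_gbps


if __name__ == "__main__":
    # Hypothetical leaf: 48 x 25 Gbps server ports, 6 x 100 Gbps uplinks.
    ratio = oversubscription_ratio(48, 25, 6, 100)
    print(f"oversubscription {ratio:.1f}:1")
```

A ratio of 1:1 means the fabric is non-blocking; higher ratios trade east-west throughput for fewer uplinks and spine ports, which is a cost decision the architect has to make explicitly.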


Operating system/Hypervisor

The available options are RHEL, CentOS, Ubuntu & a few more Linux flavors. RHEL comes with a licensing cost & a support agreement, while the others rely on free community-based support. If you already have in-house expertise in Linux, there is no need to spend your $$ on RHEL. All of them support KVM as a type-1 hypervisor.


Cloud management system (CMS)

The most preferred choice is OpenStack because it is free & has growing community support. Within OpenStack you also have flavors: Red Hat OpenStack Platform (also called RHOSP, which comes with a licensing cost from Red Hat), the free OpenStack from openstack.org, Mirantis OpenStack, etc. Other choices are VMware vSphere with the ESXi hypervisor (at a much higher cost than OpenStack) & Xen by Citrix (with almost no customer base). You also have to consider whether an external SDN solution such as Nuage VSP, Cisco ACI, or VMware NSX needs to be integrated with the CMS. Refer to the figure below to understand better -

Now take a pause & go back through the terms I used in the last 5 paragraphs. Are you really familiar with them? If yes, then you're a real technical person who can talk about building cloud infrastructure. If no, then you need a lot (literally a lot) of study & experience on these topics. None of this is a piece of cake.


Image By Asad Khan


So, the NFVi architect can only be an industry veteran with a lot of hands-on experience in server hardware, Linux, IP networks & at least one CMS such as OpenStack or VMware. Very few people fulfill these criteria. In my personal view, companies should not compromise on the technical capabilities of this person. If they don't have an in-house employee for this role, the architect can be hired from outside on a contract basis for a fixed duration. Building the infrastructure is a one-time job, so the architect is hardly required after that.


The second type of architect (the application architect) builds applications over the infrastructure, and this job is continuous in nature. The growing number of solution architects that you see in the market these days all fall under this second category: people with certifications like AWS-CSA, Azure professional, GCP SA, etc.


Skills required 

The skills required to manage this infrastructure include Linux (strong sysadmin), IP networks (all basic CCNA-level concepts), a bit of bash scripting, and OpenStack (all admin-level tasks). Going forward, containers and Kubernetes will be added to this list. These skills are in really high demand in the market right now & that demand will keep increasing.


A common mistake in technical support

An in-house technical support team of professionals with the above skills can really manage the show & save OPEX. A mistake that organizations commonly make is to outsource technical support or ask the product vendor to provide it. This creates dependency & vendor lock-in on the support side. Remember that nobody knows your solution better than you, so, except for SW bugs, all operational support capability should be developed in-house. This will pay off in the long run.


Skills required (continued) 

PS core professionals who will manage the 5G core from the telco side will remain in demand. These are the people who are already working on VoLTE & IMS core solutions.

A big chunk of the workforce will be DevOps professionals who develop, automate, and integrate applications over this infra. These professionals will not be traditional telco employees, because this is a core IT area that has now opened up in telecom. To get these skills, you need to understand the DevOps toolchain & learn a few of its tools (Git, Jenkins, Chef, Puppet, coding tools, languages, etc.).


Last but not least are the IP network professionals, i.e. the routing & switching people who manage the connectivity backbone for this infrastructure. This needs skills in traditional networking like MPLS, BGP, OSPF, etc. along with the newer virtualization networking concepts such as CNIs, OVS, etc.


Keep evaluating yourself & your organization on the aspects discussed in this article if you want to develop the skills or build the infrastructure. See you in the next article.