Enhancing Microservices Communication - Service Mesh
Cloud Native Application Architecture (a.k.a. CNAA) introduced the idea of microservices, which has changed application design and deployment patterns through two key factors: containers (e.g. Docker) and an orchestration layer (e.g. Kubernetes). It has also brought a new approach to keeping network communication reliable, fast, secure, observable and manageable, which ultimately resulted in the "Service Mesh".
This blog introduces what a Service Mesh is and why it matters, its fundamental concepts, its major features, and two key products implementing those features.
Lastly, you will get a recommended blueprint of an architecture for building a Service Mesh.
What Is a Service Mesh, and Why?
A Service Mesh is a dedicated infrastructure layer for handling service-to-service communication in a fast and resilient manner for microservices. It takes responsibility for reliably delivering requests from service to service through the complex topology of services that comprises a CNAA, in which a single application might consist of hundreds of services, each service might have thousands of instances, and each of those instances might be in a constantly changing state as it is dynamically scheduled by an orchestration layer like Kubernetes.
The goal of a Service Mesh is to get requests handled in a reliable, fast, secure, observable, manageable and resilient manner throughout this mesh of services.
(Source : A sidecar for your service mesh)
The original model of the Service Mesh can be traced through the evolution of the typical web application architecture, such as the three-tiered app. In this model, the application comprises three logical layers, each kept separate: a presentation layer, a business layer, and a data layer. There was no "mesh"; communication between the layers was handled by code built into each layer.
When this typical architecture was pushed to very high scale, in order to safely handle traffic spikes, the application layer was split into many services (sometimes called microservices), and the tiers gradually changed into a topology. In the early days of these systems, a generalized communication layer with new functionality, such as Circuit Breaking, Load Balancing, Dynamic Routing, and Service Discovery, typically took the form of client libraries, like Twitter's Finagle, Netflix's Hystrix, and Google's Stubby. These were the first Service Meshes, as they effectively acted as dedicated infrastructure for managing service-to-service communication.
With this library-based approach, service developers have to spend a lot of time building service-to-service communication functionality at the microservice level rather than focusing on the business logic, and this gets even worse when multiple technologies are in use (such as multiple programming languages like Java, PHP, Node, Python, etc.), because the same effort must be duplicated across different languages.
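The kind of functionality these client libraries provide can be illustrated with a minimal circuit breaker, loosely in the spirit of Hystrix. This is a simplified sketch with hypothetical names, not the actual library API:

```python
import time

class CircuitBreaker:
    """Minimal circuit breaker sketch: after `max_failures` consecutive
    failures the circuit opens, and further calls fail fast until
    `reset_timeout` seconds have passed."""

    def __init__(self, max_failures=3, reset_timeout=30.0):
        self.max_failures = max_failures
        self.reset_timeout = reset_timeout
        self.failures = 0
        self.opened_at = None  # timestamp when the circuit opened

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_timeout:
                raise RuntimeError("circuit open: failing fast")
            self.opened_at = None  # half-open: allow one trial call
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.monotonic()
            raise
        self.failures = 0  # success resets the failure count
        return result
```

Every service team, in every language it uses, ends up reimplementing and maintaining logic like this, which is exactly the duplication the Service Mesh removes.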
Since the most complex challenge in realizing microservices is not building the services themselves but establishing reliable and observable communication between them, this combination of complexity and criticality motivates the need for a dedicated service-to-service communication layer decoupled from the application. Offloading all such service-to-service communication functionality to a separate layer keeps the application independent of it. That is where the Service Mesh comes in.
What Are the Differences from an API Gateway?
The key objective of an API Gateway is to expose downstream microservices as managed APIs and to contain the business functionality that composes or mashes up multiple downstream microservices. It therefore encapsulates the compositions of microservices, which can change over time, and hides them from external clients; the API Gateway makes sure this complexity stays hidden.
Hence, an API Gateway offers a single entry point for external clients, such as a mobile app or web app used by end users to access services (mostly external traffic), whereas a Service Mesh is a dedicated infrastructure layer for handling service-to-service communication (mostly internal traffic), which allows you to decouple and offload most of the application network functions from your service code.
Logical Extension of TCP/IP
A microservice comprises business logic and network functions that manage service-to-service communication in a reliable, fast, secure, observable, manageable and resilient manner.
Decoupled from the application, this service-to-service communication functionality forms a network model that sits at a layer which is a logical extension of TCP/IP; that layer is the Service Mesh. Just as TCP/IP abstracts the mechanics of reliably delivering bytes between network endpoints while handling network failures, the Service Mesh abstracts the mechanics of reliably delivering requests between services.
A proxy is used in the Service Mesh to achieve the desired service-to-service communication. In most cases, the proxy is deployed using the sidecar pattern.
In microservices with a sidecar proxy deployed, the network functions are split into primitive network functions, which remain with the service, and application network functions, which belong to the sidecar proxy; microservices no longer communicate directly with other microservices, as shown in the diagram below.
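This split can be sketched conceptually: the service hands every outbound request to its local sidecar, and the sidecar applies application network functions (here, simple retries) before forwarding. All names here are hypothetical illustrations, not any real proxy's API:

```python
class Sidecar:
    """Conceptual sidecar: applies application network functions
    (retries, in this sketch) on behalf of the service."""

    def __init__(self, transport, max_retries=2):
        self.transport = transport  # callable doing the actual network send
        self.max_retries = max_retries

    def forward(self, target, request):
        last_error = None
        for _ in range(self.max_retries + 1):
            try:
                return self.transport(target, request)
            except ConnectionError as err:
                last_error = err  # retry transient failures
        raise last_error


class Service:
    """The service only talks to its local sidecar,
    never directly to other services."""

    def __init__(self, sidecar):
        self.sidecar = sidecar

    def call(self, target, request):
        return self.sidecar.forward(target, request)
```

Because the retry logic lives in the sidecar, the service code stays language-agnostic and free of communication concerns.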
The Control Plane in a Service Mesh, a centralized management utility, is quite useful for supporting Service Mesh features such as Service Discovery, Dynamic Routing, Access Control, Observability, and so on.
To enable the desired service-to-service communication, a Service Mesh offers much network-related functionality. The most common features offered by a Service Mesh include Circuit Breaking, Timeouts/Retries, Dynamic Routing, Load Balancing, TLS termination, and observability.
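One of these features, Dynamic Routing, can be pictured as weighted traffic splitting between two versions of a service, for example during a canary rollout. The following is a hypothetical sketch of the idea, not a real mesh API:

```python
import random

def weighted_route(destinations, rng=random.random):
    """Pick a destination according to its traffic weight, e.g. for a
    canary rollout (90% to v1, 10% to v2). `destinations` maps a
    destination name to a weight; weights are assumed to sum to 1.0."""
    r = rng()
    cumulative = 0.0
    for name, weight in destinations.items():
        cumulative += weight
        if r < cumulative:
            return name
    return name  # guard against floating-point rounding
```

In a real mesh this decision is made by the sidecar proxy under rules pushed down from the control plane, so traffic can be shifted without redeploying any service.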
Pros and Cons
While being adopted as a critical component of the cloud native ecosystem, the Service Mesh has become a primary option for enabling reliable, fast, secure, observable and manageable service-to-service communication, yet there is still an extensive roadmap ahead to be explored. The Service Mesh, like TCP/IP before it, will continue to be pushed further down into the underlying infrastructure.
Implementations - Istio vs Linkerd
Istio and Linkerd are the most popular open-source Service Mesh implementations. Since they are both designed for CNAA, they follow a largely similar architecture and feature set, including Circuit Breaking, Timeouts/Retries, Dynamic Routing, Load Balancing, TLS termination, HTTP/2 & gRPC proxying, observability, etc., but differ in implementation mechanisms, for instance in the sidecar proxy (Istio uses Envoy, while Linkerd is built on Netty and Finagle).
As drawn in the diagram below, unlike Istio, Linkerd implements most of the major Service Mesh features, such as Dynamic Routing and Load Balancing, inside the sidecar proxy itself, so the cost of running the sidecars can be considerably higher than with Istio at scale.
(Source : A sidecar for your service mesh)
It is highly recommended to have the API Gateway complement the abilities the Service Mesh lacks, so that you can take full advantage of the Service Mesh in your development work.
As diagrammed below, by exposing a single entry point with specific business functionality to external clients, API services in the API Gateway can call downstream microservices via the Service Mesh, offloading application network functions to the mesh (API Gateway → Service Mesh → Microservices).
In addition, because Istio and Linkerd follow a largely similar architecture and feature set, Istio is also recommended for the Service Mesh implementation due to its centralized design (Mixer, Pilot, Istio-Auth, etc.), which can save a lot of resources.
Lastly, because the Service Mesh should be a good option for accelerating the mainstream adoption of containers and microservices in our ecosystem, Rakuten will keep an eye on its feasibility to bring value to us.
Also, we are hiring. If you are interested in working as a container engineer, please contact us here.