Jitsi Videobridge AutoScale
May 01, 2020

Jitsi Videobridge Autoscaling with AWS

Jitsi Videobridge (JVB) acts as the media server, so it is the component that consumes the most resources. Scaling it becomes a necessity when traffic in your system starts to increase. The Jitsi performance test shows that a single videobridge can handle 1000 streams on a c5.xlarge server at a bitrate of 550 Mbps. This blog describes how to autoscale Jitsi Videobridge only. Scaling the whole platform is a different problem that requires multiple shards, load balancing, geo cascading and so on, which will be discussed in a future post.

Why is Jitsi autoscaling necessary?

  • Jitsi Videobridge is the component that handles media in Jitsi, so it is the server that consumes the most resources. When traffic increases (i.e. the number of video/audio streams passing through the videobridge), server resources such as CPU and bandwidth max out. On most servers, bandwidth reaches its limit first. Note that the number of streams is not the number of participants but the number of streams going across the server. For example, in a meeting of 5 people, roughly 20 streams flow across the server, assuming everyone has their video on (each of the 5 participants receives video from the other 4).
  • As the number of streams increases, so do the bandwidth and CPU usage of the server.
    The number of streams grows with both the number of participants and the number of parallel conferences. Eventually the server maxes out, and on most occasions bandwidth is the limit that is hit.

  • This means we need more servers to handle the load. If the traffic does not vary, we can simply add more servers to match it. But in most scenarios the traffic varies, and the solution needs to adjust accordingly. This is where autoscaling is necessary: servers spin up or shut down according to the traffic conditions.

  • There are a few platforms or methods to handle autoscaling.

    • Using AWS native services
    • Using Azure native services
    • Using Google Cloud native services
    • Using cloud provider servers with Kubernetes to manage the clusters
    • Using bare metal servers with Kubernetes to manage the clusters

    In this blog we discuss the first method, relying completely on AWS services to handle the scaling end to end.
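
To make the stream count from the 5-person example above concrete, here is a rough capacity sketch. It assumes every participant sends video and receives one stream from each other participant, and it ignores simulcast, last-N limits and audio. The function names are hypothetical, and the figure of 1000 streams per bridge is the performance-test number quoted at the top of this post, not a guarantee.

```python
import math
from typing import Iterable

# Performance-test figure quoted in this post: about 1000 streams
# per videobridge on a c5.xlarge. Treat it as an assumption.
STREAMS_PER_BRIDGE = 1000


def video_streams(participants: int) -> int:
    """Video streams forwarded by the bridge for one conference.

    Each participant receives one stream from every other participant,
    so a conference of n people needs n * (n - 1) forwarded streams.
    """
    if participants < 2:
        return 0
    return participants * (participants - 1)


def bridges_needed(conference_sizes: Iterable[int]) -> int:
    """Estimate how many bridges a set of parallel conferences needs."""
    total = sum(video_streams(n) for n in conference_sizes)
    return max(1, math.ceil(total / STREAMS_PER_BRIDGE))


print(video_streams(5))           # 20
print(bridges_needed([5] * 120))  # 120 conferences * 20 streams = 2400 -> 3
```

The estimate grows quadratically with conference size, which is why a handful of large meetings can load a bridge as much as many small ones.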

Which AWS services are used?

Mainly the following services are used:

  • AWS EC2 - All the servers are c5.xlarge EC2 instances (chosen considering both compute and network bandwidth capabilities).

  • AWS EC2 Auto Scaling groups - These groups are responsible for the autoscaling functionality of the servers. We can set scale-up and scale-down policies, create launch configurations, and adjust parameters such as the minimum and maximum capacity of the group. We will explain more in a bit.

  • AWS SQS - A message queue that holds details about the servers in the group. In this setup it carries the instance IDs of servers that need to shut down.

  • AWS CloudWatch - CloudWatch monitors the servers and fires an alarm when the defined thresholds are reached.

High-Level Architecture of the system

[Figure: Jitsi Videobridge autoscaling architecture]

Main components

  • Jitsi-Meet - The web component of the system. All the videobridges in the group connect to it.
  • Jitsi Videobridge - The media server of Jitsi, which routes all the streams to clients.
  • Queue - A queue of messages containing details of the servers affected when threshold limits are reached.
  • JVB monitoring - Monitors all JVB servers and fires an alarm when a threshold limit is reached. The alarm is then used by the autoscaling group.

The Setup

We won’t go into details, but here is a high-level overview of the steps involved.

  • Install Jitsi-Meet on an EC2 server.

  • Install JVB on an EC2 server and create a script on it that checks SQS every minute. Finally, create an AMI of the JVB server.

  • Create a launch configuration in EC2 - This configuration is used to spin up new servers when scaling up. For the configuration, use the AMI created in the previous step. Choose the instance type as per your requirements (we recommend c5.xlarge or above). Fill in the other settings as appropriate.

  • Create an autoscaling group

    1. Choose the launch configuration created above
    2. Choose the VPC and subnets in which you are going to set this up
    3. Select the group size keeping in mind the traffic conditions. For example, if you do not wish to have more than 10 servers in the group, you can set the “Maximum Capacity” to 10.
    4. Skip the scaling policy for now
    5. Create a notification for all event types

  • Go to the created autoscaling group and create a dynamic scaling policy. Select “Target Tracking scaling” and set the metric type to “Network out”. Set the threshold value based on the network bandwidth of the server. For example, c5.xlarge supports up to 10 Gbps (1.25 GB per second); keeping a safe margin, we can choose 70% of this, which is 875000000 bytes. Create another policy with network out set to 375000000 bytes (30% of the same figure).

  • Create lifecycle hooks for both instance launch and instance terminate. The hooks are what pass details about the server to the queue.

  • Create a queue using SQS

    • This queue will receive the instance IDs of servers that need to shut down.

    [Figure: JVB cron job reading the queue]
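
The threshold arithmetic in the scaling policy step can be checked with a few lines of Python. This is only a sketch: the helper names, group name and policy names are hypothetical, and the 10 Gbps figure is the nominal c5.xlarge link speed used in the example above. The dictionary follows the keyword arguments of the boto3 `put_scaling_policy` call with the predefined `ASGAverageNetworkOut` metric. How that metric is aggregated per period is worth confirming in the CloudWatch documentation before relying on these numbers.

```python
def network_out_target(link_gbps: int, utilisation_percent: int) -> int:
    """Bytes per second at the given percentage of the link speed."""
    bits_per_second = link_gbps * 1_000_000_000
    return bits_per_second * utilisation_percent // 100 // 8


def target_tracking_policy(group: str, name: str, target_bytes: int) -> dict:
    """Build keyword arguments for a target tracking policy on network out."""
    return {
        "AutoScalingGroupName": group,
        "PolicyName": name,
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingConfiguration": {
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "ASGAverageNetworkOut",
            },
            "TargetValue": float(target_bytes),
        },
    }


upper = network_out_target(10, 70)
lower = network_out_target(10, 30)
print(upper, lower)  # 875000000 375000000

# "jvb-group" and "jvb-network-out-70" are placeholder names.
policy = target_tracking_policy("jvb-group", "jvb-network-out-70", upper)

# With boto3 installed and AWS credentials configured, the policy could
# then be applied with:
#   boto3.client("autoscaling").put_scaling_policy(**policy)
```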

The Process

  • When the threshold limits defined in the scaling policies are reached, the autoscaling group tries to either spin up or shut down servers.
  • If it decides to shut down a server,
    • it picks one of the servers in the group and tries to shut it down. But since we have set up a lifecycle hook, the termination is paused and the instance ID is put on the queue.
    • Each server checks SQS every minute to see whether it needs to shut down, by comparing its own instance ID with the instance IDs on the queue.
    • If the server's instance ID matches an instance ID on the queue, it executes a script to shut down gracefully. This ensures that the instance only shuts down once its conferences are over.
  • If it decides to spin up new servers,
    • the new servers spin up based on the launch configuration.
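
The per-minute check each server performs can be sketched as follows. This is not the exact script used in this setup. It assumes the terminate lifecycle hook delivers its JSON notification straight to SQS, with the `LifecycleTransition` and `EC2InstanceId` fields that Auto Scaling puts in lifecycle notifications, and that the bridge's statistics expose a `conferences` count. The function names and sample IDs are hypothetical.

```python
import json
from typing import Iterable, Optional

TERMINATING = "autoscaling:EC2_INSTANCE_TERMINATING"


def shutdown_request_for(bodies: Iterable[str],
                         my_instance_id: str) -> Optional[dict]:
    """Return the terminate notification addressed to this instance, if any."""
    for body in bodies:
        try:
            msg = json.loads(body)
        except json.JSONDecodeError:
            continue  # ignore bodies that are not JSON
        if not isinstance(msg, dict):
            continue
        if (msg.get("LifecycleTransition") == TERMINATING
                and msg.get("EC2InstanceId") == my_instance_id):
            return msg
    return None


def drained(stats: dict) -> bool:
    """True when the bridge reports no active conferences."""
    return int(stats.get("conferences", 0)) == 0


# Example message bodies, as they might be read from the queue.
sample = [
    json.dumps({"Event": "autoscaling:TEST_NOTIFICATION"}),
    json.dumps({
        "LifecycleTransition": TERMINATING,
        "EC2InstanceId": "i-0aaa",
        "LifecycleHookName": "jvb-terminate",
        "AutoScalingGroupName": "jvb-group",
    }),
]

print(shutdown_request_for(sample, "i-0aaa") is not None)  # True
print(shutdown_request_for(sample, "i-0bbb"))              # None
print(drained({"conferences": 2}))                         # False
```

In a real cron job, the message bodies would come from the SQS `receive_message` call and the instance ID from the EC2 instance metadata endpoint. Once `drained` reports true, the script would stop the bridge and call `complete_lifecycle_action` so the paused termination can continue.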

Conclusion

This blog only explains at a high level how autoscaling is handled in AWS. The setup is too complex to explain in a single blog. We will discuss other methods in future blog posts.

If you are interested in hiring us to scale your solution, please contact us at support@telzee.io and visit telzee.io for more details.

