How Many Gunicorn Workers Should I Set Up? - Thoughts on Gunicorn


If you develop web applications with Python web frameworks (Django, Flask, etc.), you've probably heard the term Gunicorn at least once. In this post, I'll summarize what Gunicorn is, why we actually use it, and what you should consider when using it.
Gunicorn, short for Green Unicorn (pronounced either "gunicorn" or "gee-unicorn"; both pronunciations are used abroad as well), is a Python web server, specifically a Web Server Gateway Interface (WSGI) server. It is used to deploy web applications reliably and efficiently.
If you're wondering what the Web Server Gateway Interface (WSGI) is, here is the Wikipedia definition:
"The Web Server Gateway Interface (WSGI) is a Python framework for the interface between web servers and web applications." - wikipedia
In other words, WSGI is an interface specification that connects web servers and Python web applications, and Gunicorn is one of the servers that implement it. Gunicorn uses a multiprocess architecture built around worker processes. This lets it handle many requests concurrently and cope with large amounts of traffic, improving the performance and reliability of applications.
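To make the interface concrete, here is a minimal sketch of a WSGI application written without any framework. It only assumes the calling convention from the WSGI specification: a callable that receives `environ` and `start_response` and returns an iterable of bytes. The name `application` is chosen for illustration only.

```python
def application(environ, start_response):
    """A bare WSGI callable: the server calls this once per request."""
    body = b"Hello, World!"
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    # Tell the server the status line and response headers.
    start_response("200 OK", headers)
    # The return value is an iterable of byte strings forming the body.
    return [body]
```

Frameworks such as Flask and Django expose a callable of this same shape, which is why a WSGI server like Gunicorn can run any of them.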
Other WSGI servers include uWSGI and Waitress, but Gunicorn is the most popular.
So what exactly are the web server and the web application in this definition? They are summarized below.
Web server: "A computer program that accepts HTTP requests from clients such as web browsers and returns web pages such as HTML documents." - wikipedia
Nginx and Apache are typical examples.
Web application: "An application program that can be used from a web browser over the internet or an intranet." - wikipedia
In plain terms, the API servers we typically build are web applications.
These definitions can be confusing on their own, so the relationship between the elements can be summarized in a diagram as follows.

As mentioned above, Gunicorn uses a multiprocess architecture to provide concurrency and parallelism.
Python web applications such as Flask or Django run the server as a single process. This means that once a request arrives, there is only one flow of execution through the server until the response is returned.
If a web application takes about 3 seconds to handle each request, the flow looks something like this:

In the above case, simultaneous requests can cause problems.
Suppose three users make requests to the above server at the same time. How much time will it take for all users to receive a response?

As shown above, the requests are handled one after another, so user WEB C has to wait for the two earlier requests and receives a response only after 9 seconds (3 requests x 3 seconds).
As a result, a single-process server can become overwhelmed as more users use the service, and response times get noticeably worse.
To address this, we use Gunicorn to run the server in multiple processes. Requests are then handled in parallel, so the server stays stable and users wait less.
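The arithmetic behind this example can be checked with a small simulation. This is only a sketch under simplifying assumptions: every request arrives at time 0, every request takes the same time, and each worker handles one request at a time. The function name `completion_times` is made up for illustration.

```python
import heapq


def completion_times(num_requests, seconds_per_request, workers=1):
    """Return the time at which each request finishes, in arrival order."""
    # Each heap entry is the time at which one worker becomes free.
    free_at = [0] * workers
    heapq.heapify(free_at)
    finished = []
    for _ in range(num_requests):
        # The next request goes to whichever worker is free first.
        start = heapq.heappop(free_at)
        end = start + seconds_per_request
        finished.append(end)
        heapq.heappush(free_at, end)
    return finished


print(completion_times(3, 3, workers=1))  # [3, 6, 9]
print(completion_times(3, 3, workers=3))  # [3, 3, 3]
```

With a single worker the third user waits 9 seconds, while with three workers all three users get a response after 3 seconds.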

To run a Flask application with Gunicorn, you can execute the following steps.
```shell
pip install gunicorn
```

```python
# test.py
from flask import Flask


def create_app():
    app = Flask(__name__)

    # Various setup steps needed to run the app

    @app.route('/')
    def home():
        return 'Hello, World!'

    return app
```

```shell
gunicorn -b "0.0.0.0:5000" "test:create_app()"
```
The -b option binds the server to 0.0.0.0:5000, which means the application runs on port 5000 and can be reached at http://localhost:5000.
Since there is only one worker when configured like this, you can use the -w option to set the number of workers and achieve parallelism.
```shell
gunicorn -w 4 -b "0.0.0.0:5000" "test:create_app()"
```
If you start the server from a shell script, you can set the worker count through an environment variable. Create a file named run_server.sh with the following contents.

```shell
#!/bin/bash

# Set the number of workers
export GUNICORN_WORKERS=4

# Run Gunicorn with as many workers as the environment variable specifies
gunicorn -w "$GUNICORN_WORKERS" your_app_module:app
```
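The same idea can also be expressed in a Gunicorn configuration file, which is plain Python. The sketch below assumes the `GUNICORN_WORKERS` variable from the script above and falls back to one worker when it is missing or invalid. The helper `workers_from_env` is a hypothetical name, while `bind` and `workers` are regular Gunicorn settings.

```python
# gunicorn.conf.py (illustrative sketch)
import os


def workers_from_env(default=1):
    """Read the worker count from GUNICORN_WORKERS, falling back to a default."""
    raw = os.environ.get("GUNICORN_WORKERS", "")
    try:
        value = int(raw)
    except ValueError:
        return default
    # Ignore zero or negative values, which make no sense as a worker count.
    return value if value > 0 else default


# Settings read by Gunicorn when this file is used as its config.
bind = "0.0.0.0:5000"
workers = workers_from_env()
```

A file like this can be passed with the -c option, for example `gunicorn -c gunicorn.conf.py your_app_module:app`.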
When adjusting the number of workers, the focus should be on how many requests the server can answer and how quickly and reliably it does so, so we consider the following.
Gunicorn uses 1 worker by default. However, the official Gunicorn documentation recommends 2 to 4 workers per server core.
"This number should generally be between 2-4 workers per core in the server. Check the FAQ for ideas on tuning this parameter."
This means you can choose an initial worker count in advance from the basic specification (number of cores) of your deployment server.
For example, following the official documentation, when deploying a web application on an 8-core server, set the number of Gunicorn workers to between 16 and 32.
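The 2 to 4 workers per core guideline is simple enough to compute directly. The following is a small sketch of that arithmetic. The function name `recommended_worker_range` is made up for illustration, and the detected core count is only a starting point, not a tuned value.

```python
import multiprocessing


def recommended_worker_range(cores):
    """Return the (minimum, maximum) worker count for 2 to 4 workers per core."""
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return 2 * cores, 4 * cores


low, high = recommended_worker_range(8)
print(f"8 cores -> {low} to {high} workers")  # 8 cores -> 16 to 32 workers

# The same calculation for the machine this script runs on.
detected = multiprocessing.cpu_count()
print(detected, recommended_worker_range(detected))
```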
Also consider traffic analysis, response time testing, and similar measurements when setting the number of workers. If the service is not yet live, you can settle on a worker count in advance by load testing the server.
You can perform a simple test by following the steps below.
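As one possible way to run such a test without extra tools, the sketch below sends requests concurrently from a thread pool and reports response times. It is an illustration only, not a replacement for a real load testing tool. The helper names `fetch`, `timed_call`, and `load_test` are hypothetical, and the target URL in the comment assumes the server started earlier on port 5000.

```python
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def fetch(url):
    """Send one GET request and read the whole response body."""
    with urllib.request.urlopen(url, timeout=30) as response:
        response.read()


def timed_call(send):
    """Run one request and return how long it took in seconds."""
    start = time.perf_counter()
    send()
    return time.perf_counter() - start


def load_test(send, total_requests, concurrency):
    """Call `send` total_requests times, with up to `concurrency` in flight."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        latencies = list(pool.map(lambda _: timed_call(send), range(total_requests)))
    return {
        "requests": len(latencies),
        "avg_seconds": sum(latencies) / len(latencies),
        "max_seconds": max(latencies),
    }


# Example call against a locally running server (assumed URL):
# print(load_test(lambda: fetch("http://localhost:5000/"), total_requests=30, concurrency=3))
```

Running the same call against servers started with different -w values shows how the average and maximum response times change with the worker count.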
A tool such as JMeter can be used for this. We'll summarize it and link to it later.
Gunicorn is an indispensable WSGI server when using a Python web framework. It helps deploy web applications efficiently by mitigating the performance degradation caused by handling requests in a single process.
By choosing the right settings and the right number of workers, and by testing in advance, you can find out how well your server will handle large amounts of traffic.