Someone might even suggest a GPU instance on AWS EC2. The minimal spot price is $1.14/hour ($840/month), but spot instances are not stable (they can be terminated), and stable on-demand instances cost 3 times more. We tried to get such an instance on our old, trusted AWS account, but it required raising the AWS G instances limit (through a confusing UI where you have to specify vCPU cores). That took a week and then required explaining why we need the instance, and so on. We don't recommend this route unless you are ready to waste a lot of time.

The most interesting option is the Vast.ai platform, which also lets you play with the model at minimal expense. After that, we will also consider running the model on a plain SSH instance.

Vast.ai looks like a fresh technological idea of a new age. Comparing Vast.ai with regular hosting is like comparing Airbnb with hotel websites. It has no SLA guarantees (only each host's actual track record), but it is much cheaper and gives you a lot of options. However, there is a technical peculiarity: instead of direct SSH access to the server, you get SSH access to a Docker container spawned on another user's machine. Luckily, you can specify which image to run and proxy the SSH port, so you can integrate several such instances smoothly even into a real-time application of any complexity. At this price, a profitable application can keep 2-3 hosts running and load-balance requests between them: if one goes down, the others keep serving requests.

So, let's spawn our Vast.ai GPT-J instance. I will show how it works step by step. On the Vast.ai site, click the "Edit Images & Config" button, scroll to "Enter the name of docker image", and type: devforth/gpt-j-6b-gpu

To run the model on a plain server instead, install Docker with the NVIDIA Container Toolkit. Once that is done, you are ready to run the Docker image:

docker run -p 8080:8080 --gpus all --rm -it devforth/gpt-j-6b-gpu

After that you should be able to make HTTP requests to the public IP of your server, e.g. with Postman, curl, or any other HTTP client.

Already thinking about how you will use the model? Here are some facts. Example of call: POST http://yourServerPublicIP:8080/generate/ with a body that carries the prompt in a "text" field:

"text": "Client: Hi, who are you?\nAI: I am Vincent and I am barista!\nClient: What do you do every day?\nAI:",
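As a rough illustration of the POST call to /generate/ and of keeping 2-3 hosts running, here is a minimal Python sketch that tries each host in turn and falls back to the next one if a request fails. It makes several assumptions that the post does not confirm: the body is sent as JSON containing only the "text" field (the real API may expect more fields), the response is returned as raw text because its format is not described here, and the names build_payload, http_post, and generate are made up for this example.

```python
import json
import urllib.request
from typing import Callable, Optional, Sequence


def build_payload(prompt: str) -> bytes:
    # Assumption: a JSON body with only the "text" field shown in the post.
    return json.dumps({"text": prompt}).encode("utf-8")


def http_post(url: str, body: bytes, timeout: float = 60.0) -> str:
    # Plain stdlib POST; returns the raw response text because the
    # response format of the service is not described in the post.
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")


def generate(
    prompt: str,
    hosts: Sequence[str],
    post: Callable[[str, bytes], str] = http_post,
) -> str:
    # Try each host in order; move on when one is unreachable or errors out.
    body = build_payload(prompt)
    last_error: Optional[BaseException] = None
    for host in hosts:
        url = f"http://{host}:8080/generate/"
        try:
            return post(url, body)
        except OSError as exc:  # URLError, HTTPError and timeouts are OSErrors
            last_error = exc
    raise RuntimeError("no GPT-J host answered") from last_error
```

A call could look like `generate("Client: Hi, who are you?\nAI:", ["yourServerPublicIP"])`, with more hosts added to the list as you spawn them. Trying hosts in a fixed order is simple failover rather than real load balancing; shuffling the list before each call would spread requests more evenly.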