Unexpected and Unpredictable Shiny App Terminations with ShinyProxy

Hello everyone,

I am reaching out for insights or solutions to an issue we are encountering with our ShinyProxy deployment. We are running ShinyProxy 3.0.2 on an AWS EC2 instance with Amazon Linux 2023, and we have been observing that occasionally our Shiny app stops unexpectedly and unpredictably after running normally for some time (in fact, most of the time). Afterwards, users are always able to reload the app successfully.

There are no errors in the container logs, and the container monitoring suggests that the containers are simply being stopped. However, we have noticed the following error messages in the ShinyProxy logs:

  1. A DockerRequestException indicating that ShinyProxy is attempting to access information about a Docker container that appears to be non-existent:
Caused by: com.spotify.docker.client.exceptions.DockerRequestException: Request error: GET unix://localhost:80/containers/{container_id}/json: 404, body: {
    "message": "No such container: {container_id}"
}
  2. An error related to a failed proxy request, potentially due to a prematurely closed connection:
ERROR 7801 --- [XNIO-1 I/O-2] io.undertow.proxy : UT005028: Proxy request to /proxy_endpoint/{session_id}/highcharts-x.y.z/modules/timeline.js failed
java.io.IOException: UT001000: Connection closed

In the ShinyProxy logs, we additionally found some recurring errors which are (at least temporally) independent of the “lost” containers (still, we suspect they share the same root cause):

2023-11-06 14:10:33.672 ERROR 7816 --- [pool-1-thread-13] e.o.containerproxy.service.ProxyService  : [user=user_id proxyId=proxy_id specId=spec_id] Failed to remove proxy

and

eu.openanalytics.containerproxy.ContainerProxyException: Failed to stop container

We also observed several “Exception handling request to …” errors, e.g.:

2023-10-10 21:43:52.779 ERROR 7990 --- [XNIO-1 task-2] io.undertow.request                      : UT005023: Exception handling request to /proxy_endpoint/moin_static170rc3/favicon.ico

This error also occurred for a range of other files (which, by the way, are not part of our app), such as:

/proxy_endpoint/pics/reservation.png
/proxy_endpoint/moin_static157/robots.txt
/proxy_endpoint/images/ico_clear.gif
/proxy_endpoint/images/green_dot.gif
...

We have already double-checked the Docker and ShinyProxy configurations, found no significant resource constraints, and verified network settings. Nonetheless, the problem persists.

Has anyone experienced similar issues or could provide insights into what might be causing this problem? Any advice or suggestions would be greatly appreciated. Thank you in advance for your support! :slight_smile:

Best regards,
bathyscapher

While waiting and hoping for an answer, I posted on SO as well.

I am glad you reposted. I am having similar issues with UT005028-type errors. The issues I am troubleshooting happen during startup. However, they are intermittent.

I’m not sure how to go about troubleshooting them, as they are happening in users’ browsers. Mine tend to all be related to DT::datatables content and downloading the CSS/JS files, not the same one each time. I suspect corrupted cookies and HTTP problems like oversized headers. Maybe load balancer issues, or even issues in Shiny itself handling HTTPS requests. I occasionally get these types of errors, and deleting the cookies seems to clear them up. Too many cookies? I looked at the HAR file for one of the users, and they have around 150 SAML cookies.
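A rough way to count those cookies from an exported HAR file (a sketch; it assumes jq is installed and that the relevant cookie names actually contain “SAML”):

# Count distinct cookie names containing "SAML" across all requests in a HAR export
# (the file name and the name pattern are assumptions; adjust to your case)
jq '[.log.entries[].request.cookies[].name | select(test("SAML"))] | unique | length' user_session.har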

In the meantime, we continued trying to figure out where these error messages come from (unsuccessfully though :frowning:). We could observe them irrespective of the ShinyProxy version (tested: 2.6.1, 3.0.1 and 3.0.2), the web server (tested: NGINX and Apache), the operating system (tested: Amazon Linux 2 and Debian) or ShinyProxy’s HTML templates (tested: default and custom). They also seem to occur independently of the Docker version (tested: 20.10.23, 20.10.25 and 24.0.6).

I’m also wondering if there might be a connection to this issue? Or to this related one? … ¯\_(ツ)_/¯

Hi @bathyscapher

It seems your Docker containers are being stopped or are crashing. When a user then performs some action in the app, ShinyProxy sends an HTTP request to the container, but since the container has disappeared, the request fails and ShinyProxy stops the app because it considers it crashed. (It will still try to stop the container, but this fails as well, since the container is already gone.)

I would keep an eye on the logs of the Docker daemon whenever you see users experiencing this issue. They should tell you why the container was stopped or crashed.
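For example (a sketch, assuming Docker runs as a systemd service, as it does on Amazon Linux 2023), you can pull up the daemon logs for a time window around the incident:

# Show Docker daemon logs around the time a container was "lost"
# (adjust the --since/--until timestamps to the time reported in the ShinyProxy logs)
journalctl -u docker --since "2023-11-06 14:00" --until "2023-11-06 14:20" --no-pager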

1 Like

Hey @tdekoninck,

great that you answered, thank you :slight_smile:

In the meantime, we figured out that we had mixed up two different things here: the first refers to what you posted (but it is still valuable to have confirmation!), and the second is traces of unsuccessful scanning attacks.

“Lost containers”

By logging the status of all components every 30 seconds, we discovered via docker system events that the containers stop with exit code 137 (i.e. SIGKILL, which typically indicates an out-of-memory kill). With heavy clicking in the app, it is possible to provoke this error (although I assume this is outside the usual range of user behavior :wink: ).
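For reference, a sketch of the kind of event watcher we used (the --format template is an assumption and may need adjusting):

# Stream container "die" events together with the exit code reported by Docker
docker system events --filter 'event=die' \
  --format '{{.Time}} {{.Actor.Attributes.name}} exitCode={{.Actor.Attributes.exitCode}}'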

Contradicting this, the resource usage reported by docker stats suggests that RAM is fine, but CPU usage is close to 100 % shortly before a container is “lost”:

CONTAINER ID   NAME           CPU %     MEM USAGE / LIMIT     MEM %     NET I/O          BLOCK I/O     PIDS
55151b9e1304   brave_wright   91.87%    470.9MiB / 3.793GiB   12.13%    5.3MB / 13.5MB   19MB / 0B     5

This pattern was always visible, except for one case where CPU load was average (which could be an artifact, as the depicted values represent the mean over a time span) but RAM was still low. We could also confirm this by tracing the output of top while “losing” a container:

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
11211 root      20   0 1724360 778500  60316 R  95.0  4.9   1:37.96 R

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
11211 root      20   0 1724360 778500  60316 R  96.7  4.9   1:40.86 R
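For completeness, the snapshots above were collected with a loop roughly like the following (a simplified sketch; file names and the 30-second interval are illustrative):

# Append resource snapshots every 30 seconds (simplified sketch of our logging)
while true; do
  date >> container_stats.log
  docker stats --no-stream --format 'table {{.Name}}\t{{.CPUPerc}}\t{{.MemUsage}}\t{{.MemPerc}}' >> container_stats.log
  top -b -n 1 | head -n 15 >> host_top.log
  sleep 30
done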

So we assume that the error stems from the CPU load (which contradicts the out-of-memory interpretation of exit code 137). We tried limiting the container’s resource usage with container-memory-limit and/or container-cpu-limit, as well as increasing resources (CPU and RAM), without any success.
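One check that could help resolve the contradiction (exit code 137 only means the process received SIGKILL, which may or may not come from the kernel OOM killer) is inspecting the container state right after it dies, assuming the container has not been removed yet, and looking at the host’s kernel log:

# Was the container actually killed by the OOM killer?
# (<container_id> is a placeholder for the ID of a "lost" container)
docker inspect --format 'ExitCode={{.State.ExitCode}} OOMKilled={{.State.OOMKilled}}' <container_id>

# The host kernel log records OOM kills, if any occurred
dmesg -T | grep -i -E 'killed process|out of memory'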

If you have any ideas or suspicions here, we would be happy to know. :slight_smile:

Scanning attacks

Errors such as:

2023-10-10 21:43:52.779 ERROR 7990 --- [XNIO-1 task-2] io.undertow.request                      : UT005023: Exception handling request to /proxy_endpoint/moin_static170rc3/favicon.ico

or

ERROR 7801 --- [XNIO-1 I/O-2] io.undertow.proxy : UT005028: Proxy request to /proxy_endpoint/{session_id}/highcharts-x.y.z/modules/timeline.js failed
java.io.IOException: UT001000: Connection closed

stem from rejected requests of scanning attacks that presumably attempt to find a weakness in the infrastructure. This explains the occurrence of file names that are not part of our code base (from which we conclude that they failed to reach a vulnerable entry point).
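A quick way to see what is being probed (a sketch; the location of the ShinyProxy log file is an assumption and depends on your setup) is to extract and count the requested paths from these error lines:

# List and count the probed paths from "Exception handling request" errors
# (/var/log/shinyproxy/shinyproxy.log is an assumed log location; adjust to your setup)
grep 'UT005023: Exception handling request to' /var/log/shinyproxy/shinyproxy.log \
  | grep -o '/proxy_endpoint/[^ ]*' | sort | uniq -c | sort -rn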

1 Like