As part of the openvolley project we are working on multi-camera approaches to ball and player tracking for volleyball. This is being conducted in parallel with the work being undertaken by Andrew Tao and Chad Gordon. The two approaches are complementary, and if we can eventually combine them, the result promises to be a powerful tool.
The field of view of a camera can be described by its camera matrix C, which defines the transformation from court coordinates [x, y, z] to image coordinates [u, v]. But for tracking purposes we actually want the reverse: from the coordinates of an object in the camera image, we want to calculate its corresponding location in real-world (court) 3D coordinates. Inverting the camera equation gives a ray in 3D court space, extending outwards from the camera. The true location of the object lies somewhere along this ray, but unless we have another piece of information we can’t tell where along the ray it is. If we know the real-world elevation (z) of the object, that gives us the extra piece of information that we need and we can then calculate the corresponding court x and y coordinates. This works well for e.g. players in contact with the floor, but becomes difficult once the object is in the air, like a jumping player or the ball.
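The known-elevation case can be illustrated with a minimal Python sketch. It assumes C is a standard 3x4 projective camera matrix; the numeric values below are invented for illustration (a hypothetical overhead camera 20 m above the court), and the function names `project` and `court_xy_at_height` are placeholders rather than part of any openvolley package.

```python
# Hypothetical 3x4 camera matrix (invented values): a camera 20 m above the
# court point (4.5, 9), looking straight down, focal length 1000 px.
C = [
    [1000.0, 0.0, -640.0, 8300.0],
    [0.0, -1000.0, -360.0, 16200.0],
    [0.0, 0.0, -1.0, 20.0],
]

def project(C, x, y, z):
    """Court coordinates [x, y, z] -> image coordinates [u, v]."""
    p = [row[0] * x + row[1] * y + row[2] * z + row[3] for row in C]
    return p[0] / p[2], p[1] / p[2]

def court_xy_at_height(C, u, v, z):
    """Image coordinates [u, v] plus a known elevation z -> court [x, y]."""
    # Each image coordinate gives one linear equation in the unknowns x, y:
    #   (C[0] - u*C[2]) . [x, y, z, 1] = 0
    #   (C[1] - v*C[2]) . [x, y, z, 1] = 0
    r1 = [C[0][j] - u * C[2][j] for j in range(4)]
    r2 = [C[1][j] - v * C[2][j] for j in range(4)]
    b1 = -(r1[2] * z + r1[3])
    b2 = -(r2[2] * z + r2[3])
    det = r1[0] * r2[1] - r1[1] * r2[0]
    if abs(det) < 1e-12:
        raise ValueError("no unique solution for this pixel and height")
    # Solve the 2x2 system with Cramer's rule.
    x = (b1 * r2[1] - r1[1] * b2) / det
    y = (r1[0] * b2 - b1 * r2[0]) / det
    return x, y

# A player's foot on the floor (z = 0) is recovered correctly.
u, v = project(C, 2.0, 3.0, 0.0)
print(court_xy_at_height(C, u, v, 0.0))

# A ball 3 m in the air, wrongly assumed to be on the floor, is misplaced.
u, v = project(C, 2.0, 3.0, 3.0)
print(court_xy_at_height(C, u, v, 0.0))
```

The second call shows why a single camera struggles with airborne objects: the same pixel maps to different court positions depending on the elevation that is assumed.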
Using two or more cameras offers a potential solution to this problem. If we know the image coordinates of the same object in two different camera images, then we have two rays in 3D space, and we know that the object lies somewhere along both of them. The object must therefore lie at the intersection of these two rays (or, since measurement noise means the rays rarely intersect exactly, at the point closest to both).
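As a rough illustration of the two-ray idea, the Python sketch below finds the midpoint of the shortest segment joining two rays, which is one common way to handle rays that do not quite intersect. This is an assumed approach for illustration, not necessarily the method used in our tracker; the camera positions and the function name `closest_point_between_rays` are hypothetical.

```python
import math

def closest_point_between_rays(p1, d1, p2, d2):
    """Return (midpoint, gap) for the lines p1 + t1*d1 and p2 + t2*d2.

    midpoint is the centre of the shortest segment joining the two lines and
    gap is the length of that segment (zero if they intersect exactly).
    The rays are treated as infinite lines, so t1 and t2 are not constrained
    to be positive.
    """
    dot = lambda a, b: sum(x * y for x, y in zip(a, b))
    w0 = [a - b for a, b in zip(p1, p2)]
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w0), dot(d2, w0)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are parallel")
    t1 = (b * e - c * d) / denom
    t2 = (a * e - b * d) / denom
    q1 = [p + t1 * v for p, v in zip(p1, d1)]  # closest point on ray 1
    q2 = [p + t2 * v for p, v in zip(p2, d2)]  # closest point on ray 2
    midpoint = [(s + t) / 2 for s, t in zip(q1, q2)]
    return midpoint, math.dist(q1, q2)

# Hypothetical camera positions and a ball position, in court metres.
cam1, cam2 = (0.0, -6.0, 8.0), (9.0, -6.0, 8.0)
ball = (4.0, 9.0, 3.0)
ray1 = [b - c for b, c in zip(ball, cam1)]
ray2 = [b - c for b, c in zip(ball, cam2)]

estimate, gap = closest_point_between_rays(cam1, ray1, cam2, ray2)
print(estimate, gap)
```

The returned gap may also be useful as a sanity check: a large value suggests that the two detections might not correspond to the same object.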
For the time being we use a method with several steps:
1. Identify all volleyballs in each frame of each video stream. This part uses the publicly-available ovml R package, which includes a network specifically trained for volleyball detection (work still in progress to improve this though).
2. Synchronize the two camera streams and, for each frame, estimate the 3D location of the ball from the two sets of image coordinates following the general method outlined above.
We now have an estimated ball position in each frame, provided that a ball was detected in both camera streams for that frame. For frames where we haven’t detected a ball in both camera images, we cannot estimate the ball position, so our ball trajectory estimate is not continuous over the length of the rally. Our position estimates from step 2 will also include erroneous positions for the small number of frames that include a false-positive ball detection (i.e. something that wasn’t a volleyball, being classified as a volleyball).
The results are demonstrated below on a rally from the VNL 2021 match between Brazil and Germany.
From this, the estimated contact height of the serve (Isac, Brazil #12) is 3.28m (that’s the bottom of the ball, so the top of the ball would be at about 3.49m assuming a 21cm diameter ball). The back-row spike by Alan (Brazil #21) was estimated at 3.43m for the top of the ball.
You will notice that the 2D trajectory (the top-right court plot) has a smoothed appearance, without the sharp changes in direction that you expect to see when a player makes contact with the ball. (The height plot below that is not smoothed, because this isn’t currently part of the tracker output). In future work we will look to integrate the above steps, especially steps 2 and 3. This will allow simultaneous x-y and height estimation, provide information about transition points in the ball direction (ball contacts), and allow more complete filling of the gaps that occur when the ball is occluded from view by players or moves out of the field of view entirely. It will also potentially help when multiple balls are in view, which sometimes happens when spare match balls are being moved around the sidelines during a point.
This will be made freely available as part of the openvolley project for the whole volleyball community to use and build on.