We’ve certainly come a long way since the first recognised use of videoconferencing (as opposed to CCTV) when NASA used two radiofrequency channels to communicate with the astronauts during the first manned space flights.
Advances in digital camera (CCD) and projector technology (contrast and brightness), coupled with the lower cost of high-bandwidth connections, are driving amazing results.
At the high end, holographic telepresence (Musion) can project life-size, high-definition, hologram-style images onto a specially designed “film”, which allows remote or animated participants to interact with live performers on stage. The “Pepper’s ghost” effect is truly mesmerising, as anyone who has witnessed Madonna performing live with Gorillaz, or been lucky enough to be escorted through the Disaster attraction at Universal Studios in Florida, will attest.
Participants who are not really “there” can and do walk in front of and behind props that are present on the stage, obscuring and revealing the objects as their image passes.
Stage sets can be as modest as a 2m square of the Musion “Eyeliner” foil-type film, or as large as 20m by 100m to re-create giant life-size images of trucks, for example.
The current Musion technology is being widely used for press events and launches, corporate conferences, concerts and, more recently, TV news channels.
Room Based Telepresence
The next level of high-definition conferencing is best represented by custom room-based installations.
Systems such as the Cisco Systems CTS 3000, HP Halo and Tandberg T3 even include the furniture required to ensure high-quality sound and video for the meeting participants, and are usually configured in the “split table” layout, where conference participants are effectively looking at their remote colleagues on the “other side of the table”.
Telepresence as a term is usually differentiated by the use of horizontal lighting to deliver more realistic colour reproduction, 1080p cameras, spatial microphone technology with multi-channel sound, and large 1080p monitors.
While arguably delivering an immersive experience, this kind of system requires significant investment by the customer. An average CTS 3000 system like the one shown above from Cisco, or the Tandberg T3 equivalent, will cost in the order of $299,000 per room (not including painting both rooms similar colours, installing blinds or the cost of seating).
The bandwidth required is usually 6-8 Mbps (3 x 1080p streams), rising to as much as 10 Mbps for a “busy” meeting with lots of movement.
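To put those room figures into per-screen terms, here is a rough sketch in Python. It assumes the total is shared evenly across the three 1080p streams and ignores audio and network overhead, which is my simplification rather than anything the vendors publish.

```python
# Figures quoted above for a three-screen telepresence room.
STREAMS = 3                    # 3 x 1080p streams
TYPICAL_MBPS = (6.0, 8.0)      # usual total range for the room
BUSY_MBPS = 10.0               # "busy" meeting with lots of movement


def per_stream(total_mbps, streams=STREAMS):
    """Split a room's total bandwidth evenly across its video streams.

    Assumption: an even split, with audio and protocol overhead ignored.
    """
    return total_mbps / streams


low, high = (per_stream(total) for total in TYPICAL_MBPS)
peak = per_stream(BUSY_MBPS)
print(f"Typical: {low:.1f}-{high:.1f} Mbps per stream, peak {peak:.1f} Mbps")
```

On these assumptions each screen is working with somewhere between 2 and a little over 3 Mbps.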
Pure Video Software Clients
Standard-definition software clients are plentiful, and there are good reasons for this that we’ll address later. Some, like Skype, use proprietary technology such as P2P software or codecs; others, like ooVoo or iVisit, follow a “sign up for a feature-rich service” model.
High-definition video clients are a little rarer, probably due to the usual limitations of the processor, memory, video card, sound card and microphone quality available on the standard desktop or laptop.
The developers that already have solutions using H.264 or H.264 SVC are Mirial, Radvision with their SCOPIA offering, and Vidyo (with whom Cisco Systems recently came to a licensing agreement).
Most of these allow point-to-point calls via SIP or H.323, or point-to-multipoint calls via an MCU (Multipoint Control Unit) such as those we use by Codian from Tandberg. Some MCU devices can accept ISDN connectivity, either by way of expansion modules or built in to the main unit, enabling IP to ISDN conferences to take place.
So why is HD so tricky on laptops and desktop PCs?
Well, leaving aside the bandwidth requirements, here are the top issues with current software HD clients.
Lights, Camera, Action!
HD requires a camera that can capture 720p or 1080p.
Many laptops and desktop monitors are now supplied with webcams built in to the bezel, but these tend to be 1.3 or 2.0 megapixel CCDs.
This is fine for VGA, but they rarely cope with low light conditions or reproduce colours very well.
External HD cameras are available but at considerable cost, since they are generally sold as Small Business conferencing solutions rather than personal or individual systems.
Examples of better quality webcams are the Logitech QuickCam Pro 9000 and QuickCam Sphere (with pan and tilt), unless you are lucky enough to have one of the new Samsung LCD monitors, which have high-quality optics built in to the bezel.
Echo & Noise
Most sound cards in business PCs use analogue 3.5mm jacks to connect microphones and speakers, which is fine for everyday usage and the occasional toll-quality VoIP call.
However, these analogue microphone and speaker connections really are not up to coping with standard or wideband voice codecs, and don’t compare favourably with digitally connected professional-grade components.
Full duplex audio is a must for multi-party conferences to avoid clipping and constantly missing the other party’s input.
Some software clients try very hard to overcome the echo and delay by employing noise cancelling techniques, AGC (Automatic Gain Control) and echo cancellation, with varying degrees of success.
If you are stuck with the embedded sound card in your PC and want to maintain hands-free participation, then it’s generally better to purchase a headset.
As headset technology progresses, a quality USB headset (Plantronics, for example) with its own built-in sound card and DSP is far superior to an analogue alternative.
If you are likely to spend a lot of time on conference calls, personal conferencing speaker and microphone sets are available – such as those from Polycom or ClearOne.
These provide full-duplex audio (some support wideband) with hardware based echo cancellation and noise suppression.
Processor, Memory and More Processor
Any of the HD software clients requires real power to encode and decode even a single HD video stream.
The minimum would be an Intel dual-core processor, and these are finally becoming available in laptop PCs without draining the battery in a matter of minutes.
As another example of the processing power needed, a single H.264 call to an MCU on a 1.7GHz Pentium Mobile shows processor utilisation in Windows Performance Monitor of 47-50%, rising to around 80% once connected to the called party.
Frequent processor alarms appeared in the video window, even though no other applications were running.
Video cards that support H.264 offload processing are available, but they generally aren’t found in laptops (due to the cooling required) or business PC configurations.
So how much download, and more importantly upload, capacity do you get from your current ISP?
What is the contention ratio?
Do you have a monthly traffic allowance on a capped service?
The answers to all of these questions suddenly become vital when considering whether you can employ HD while working from home, which of course in these periods of economic uncertainty is exactly what you should be doing! Reducing travel expenses and carbon footprint is top of mind for many organisations right now.
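Those three questions are easy to turn into numbers. The Python sketch below is only a back-of-envelope aid: the function names are my own, the example line (8 Mbps headline, 50:1 contention, 40 GB cap) is hypothetical, and I am assuming your ISP counts both upload and download against the allowance.

```python
def worst_case_mbps(headline_mbps, contention_ratio):
    """Bandwidth left if every subscriber sharing the line is active at once."""
    return headline_mbps / contention_ratio


def gb_per_hour(stream_mbps):
    """Data moved in one direction by a constant-rate stream, in decimal GB."""
    megabits = stream_mbps * 3600      # megabits sent in one hour
    return megabits / 8 / 1000         # megabits -> megabytes -> gigabytes


def hours_within_cap(cap_gb, send_mbps, receive_mbps):
    """Hours of calls that fit in a monthly cap.

    Assumption: the ISP counts traffic in both directions against the cap.
    """
    return cap_gb / (gb_per_hour(send_mbps) + gb_per_hour(receive_mbps))


# Hypothetical home connection, for illustration only.
print(f"Worst case: {worst_case_mbps(8.0, 50):.2f} Mbps")
print(f"One hour at 2 Mbps: {gb_per_hour(2.0):.1f} GB each way")
print(f"Hours per month: {hours_within_cap(40.0, 2.0, 2.0):.1f}")
```

With those made-up figures, a two-way 2 Mbps call uses about 1.8 GB an hour, so a 40 GB allowance covers roughly 22 hours of meetings a month before anything else in the household is counted.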
Despite its efficient compression algorithms, H.264 is still subject to the usual challenges: compression, complexity and visual artefacts.
A rough rule of thumb is that a 720p camera sending a “talking head” type scene will require between 1.5 and 2.0 Mbps of guaranteed bandwidth.
Disclaimer – This obviously depends on numerous factors such as camera and lens quality, the encoder and the complexity of the scene.
If you wanted to send a constant 1080p quality stream you would require more like 6 Mbps to sustain a high quality image.
So while downloading a 720p stream may be possible in some areas, for most home workers (who don’t have a guaranteed 10 Mbps duplex connection) the choice of transmission quality tends to be more economical, such as CIF (288p) or 4CIF (576p).
And most people believe they “look” better at these SD rates too. I know I do.
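The rule-of-thumb figures above can be wrapped into a quick check of what a given upload link is likely to sustain. This is a sketch only: the thresholds are the rough numbers quoted in this article (using the upper end of the 720p range), the function name is my own, and real results will depend on the camera, encoder and scene.

```python
# Rule-of-thumb guaranteed bandwidth per stream, from the article (Mbps).
# Ordered from most to least demanding.
REQUIRED_MBPS = [
    ("1080p", 6.0),   # constant high-quality 1080p stream
    ("720p", 2.0),    # upper end of the 1.5-2.0 "talking head" range
]


def best_send_quality(guaranteed_upload_mbps):
    """Return the highest HD format the upload can sustain, else fall back to SD."""
    for name, needed in REQUIRED_MBPS:
        if guaranteed_upload_mbps >= needed:
            return name
    return "SD (CIF or 4CIF)"


for upload in (0.4, 2.5, 10.0):
    print(f"{upload} Mbps upload -> {best_send_quality(upload)}")
```

The check is on upload rather than download because that is usually the tighter limit on a home connection.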
The H.264 SVC (Scalable Video Coding) extension could be a reasonable solution to the bandwidth issue. Without going into huge depth here (Google is your friend), it allows for varying frame rates, quality and resolution by using layering techniques.
This means that rather than packet loss showing up immediately as artefacts and picture break-up, the codec smooths out the speed bumps in an “elastic” way.
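To give a feel for the layering idea, here is a toy Python sketch of how a sender or relay might decide which layers to pass on for a given link. It is not how any real SVC product works internally: the layer names and bit rates are invented for illustration, and the only property it tries to show is that each enhancement layer depends on the ones beneath it.

```python
# Hypothetical layer ladder: (label, extra kbps this layer adds).
# These are illustrative numbers, not real encoder figures.
LAYERS = [
    ("base: low resolution, low frame rate", 300),
    ("enhancement 1: full frame rate", 400),
    ("enhancement 2: 720p resolution", 800),
    ("enhancement 3: higher picture quality", 500),
]


def layers_to_forward(available_kbps, layers=LAYERS):
    """Add layers in order until the next one would exceed the link capacity."""
    chosen, used = [], 0
    for label, extra in layers:
        if used + extra > available_kbps:
            break   # higher layers depend on lower ones, so stop here
        chosen.append(label)
        used += extra
    return chosen, used


for link in (2000, 1000, 200):
    chosen, used = layers_to_forward(link)
    print(f"{link} kbps link: {len(chosen)} layer(s), {used} kbps used")
```

When the link shrinks, the top layers are dropped and the picture gets softer or less fluid, but the base layer keeps the call alive, which is the “elastic” behaviour described above.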
As mentioned previously, there are only a couple of software clients available right now, but the results are impressive when viewed over lossy networks (like the Internet) in comparison with traditional H.264 AVC clients.
Since this piece was written, Radvision have won TMC’s Communications Product of the Year 2008 with their SCOPIA solution, which champions H.264 SVC.
Software as a Service – Hosted Video Conferencing
Since the H.264 SVC extensions can cope with lossy networks, early adopters can provide a chargeable “portal access” VC bureau service over the Internet.
One such example is the latest service offering (launched May 15 2009) from a UK-based company called Videocall.
Their MyPresence “personal telepresence service” allows conferencing on-demand with desktop collaboration (based upon the Vidyo software platform) for as little as £3 per week.
Megameetings has a slightly different model with a monthly charge based upon the capacity required (maximum number of participants in the meeting at any one time).
This includes the ability to share applications, presentations and screens.
Tiered pricing ranges from $45 per month for a 3 person capacity limit to $499 per month for a 100 person limit.
Presumably the larger capacity is for less interactive classroom monologues or CEO broadcasts. (EGO TV)
Pricing for larger capacities is available upon application.
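For a quick comparison of the two Megameetings tiers quoted above, this small Python sketch works out the cost per seat. It assumes every seat in the tier is actually used, which will rarely be true in practice.

```python
# Capacity limit -> monthly price in USD, as quoted in this article.
TIERS = {3: 45.0, 100: 499.0}


def cost_per_seat(capacity, monthly_usd):
    """Monthly price divided by the tier's capacity limit (assumes a full meeting)."""
    return monthly_usd / capacity


for capacity, price in TIERS.items():
    print(f"{capacity}-person tier: ${cost_per_seat(capacity, price):.2f} per seat per month")
```

That works out at $15.00 a seat on the small tier against $4.99 on the large one.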
Fuzemeeting offers a similar service based on up-front payment, utilising cloud computing and virtualisation in its browser-based solution.
This also means that people can take part in the web conference via mobile devices such as iPhones, Symbian, Palm or Windows CE, but maybe not with HD video.
The Future – 3D without the glasses?
While processor technology takes a vacation, camera and display technology is making Moore’s Law look underpowered.
Companies like PureDepth are ramping up production of their 20.1 and 12.1 inch 3D displays, and these don’t require those polarising glasses.
By layering LCD screens inside a slightly deeper casing, they can superimpose the stereo components of an image within one unit to provide the depth of field experience.
Alioscopy takes a different approach, utilising autostereoscopic imagery in its 42-inch plasma displays.
The displays are equipped with an array of lenticular lenses that cast a different image onto each eye, dependent upon the viewing angle.
At the moment the consumer / home 3D TV market still requires polarising glasses.
Until the technology becomes more cost effective, we’ll have to make do with walking past occasional 3D advertising concepts like the one at 750 7th Avenue at 50th Street in New York.
For most people with contended cable and ADSL connections, HD video is a “nice to have” but by no means necessary.
Perfectly engaging video conferencing sessions can be executed with a good quality camera at VGA quality (30 frames per second), and until bandwidth becomes more affordable and everyone has refreshed their existing computer estate with HD offload capability and monster processor and RAM specifications …
I would invest the additional money in ensuring that the audio quality is the best it can be.
I may be writing a separate positioning article on HD Voice very shortly.
Submitted By Darren Gallagher