World Music's DIVERSITY and Data Visualisation's EXPRESSIVE POWER collide. A galaxy of INTERACTIVE, SCORE-DRIVEN instrument model and theory tool animations is born. Entirely Graphical Toolset Supporting World Music Teaching & Learning Via Video Chat ◦ Paradigm Change ◦ Music Visualization Greenfield ◦ Crowd Funding In Ramp-Up ◦ Please Share

Monday, August 29, 2016

Cantillate

Online Music Learning and Augmented (AR) Or Virtual Reality (VR)


Augmented reality (AR) and virtual reality (VR) promise online music learning experiences radically different from those to which we are accustomed, yet both are hamstrung on several levels.

Chief among these is perhaps the 'digital disconnect' between music exchange formats such as MusicXML and the AR and VR production workflows, but there has also been a more general failure on the part of the music industry to move on from legacy notation generation technologies, and a recent focus on 'mobile first' - with all the device, screen real estate and design mindset limitations that entails.

Given the interest (and potential) in AR and VR, the industry could be accused of a lack of imagination, yet AR and VR are not alone. Even the graphical modelling strengths of the conventional browser DOM (especially scalable vector graphics, or SVG) have yet to be exploited to any meaningful degree. This despite steadily improving central processing unit (CPU) capabilities.

What are the implications of this apparent blind spot for music, and especially in an online and remote learning context?

Online Music Learning and Virtual Reality

Virtual Reality finds itself at the (wannabe) business end of a long, well-established and hard-fought music education value chain, but one that is otherwise in technical stagnation.

The entire music teaching value chain or stack needs to be revisited, and paths found that support the new technology directions. Efforts have been made (Soundslice), but these are focussed on bitmapped displays, which are conducive neither to data transparency nor to user interface flexibility and innovation.

We touched (above) on the two main online learning scenarios applying to augmented and virtual reality: on the one hand the 'mobile' context, and on the other a more relaxed and focussed 'workstation' context.

The limitations of mobile apps reflect an 'on the move' mindset: beleaguered by distractions, interruptions and fat-fingered inaccuracies; fiddly, small-screened, glare-prone detail; fragmented, spur-of-the-moment needs; and -unaddressed- a stream of intrusive, bandwidth-sapping advertising.

Conventional face-to-face and online music learning is, by comparison, almost ascetic. Concentration, precision, flow, fine motor skills, awareness of -and constant adjustment to- the whole. In the background often, role and mental models, reference material, and the landmark of that long-term, clearly envisioned goal. Ideation, action and low distraction: a powerful combination.

Assuming an active ad-blocker, the workstation experience can be made not too far removed from that of face-to-face, but only if the entire delivery stack is revisited and overhauled. That is pretty much the aim of the 'proof of concept' implementation in focus in this blog.

The important takeaway is that whether SVG and browser DOM on the CPU, or Web3D (AR and VR) on the GPU, the workstation context is much the more conducive to learning.

I hope that is clear, as I'm now going to focus in on augmented and virtual reality.

Virtual Reality And Musical Context

There is, it appears, a widespread impression that with online scoring and practice tools such as Noteflight, MuseScore and Soundslice, our musical journey is otherwise at an end.

With the vast majority of world music systems, instruments and theory tools wholly untapped, this is far from the case. Moreover, SVG (scalable vector graphics), which offers comprehensive modelling and music visualization capabilities and has already been around for some 20 years, has hardly been exploited.

Another differentiator? Of the successful teaching platforms, Soundslice is clearly an online learning platform, yet not strictly a remote learning platform. A video is not a live, remote teacher.

Big, brave, open-source, non-profit, community-provisioned, cross-cultural and shotgun crazy. → Like, share, back-link, pin, tweet and mail. Hashtags? For the crowdfunding: #VisualFutureOfMusic. For the future live platform: #WorldMusicInstrumentsAndTheory. Or simply register as a potential crowdfunder..


To compete in this as-yet-untapped market (which, I anticipate, will soon flower into life), VR content providers are going to have to add considerable value, while keeping navigation simple and intuitive, and costs low. Importantly, like it or not, they will continue to rely on conventional online music provisioning channels, but dramatically data-enhanced to support new capabilities and teaching ideas.

Two major hurdles at the moment are the long content creation workflows in VR, and the challenge of giving the user a choice of driver content (here, the music exchange file).

Though things will change as commerce finds ways of bringing more solid value to VR, much current content has a limited shelf life. Simple eye and reflex candy is quickly discarded. New scenarios and possibilities are emerging fast.

Whether on workstation or mobile, virtual reality applications should ideally be built around the user's choice of music, and hence exchange file.

By this, I mean that a user should be able to load a piece of music in a format of their choice (MusicXML, MIDI, audio, ABC..), and expect this to play a role -alongside other data feeds- in driving on-screen events. Should it achieve this, VR hints at immersive, dramatic learning experiences not possible on a 2D screen.
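
By way of illustration, here is a minimal sketch of what 'user's choice of music' might mean at the code level: a browser file input routed towards a format-specific handler. Every name here (detectFormat, the input id 'score') is my own illustrative invention, not an existing API:

```typescript
// Hypothetical sketch: detect the format of a user-supplied score file and
// hand it off for parsing. Names are illustrative, not a real API.
type ScoreFormat = 'musicxml' | 'midi' | 'abc' | 'audio';

function detectFormat(file: File): ScoreFormat {
  const name = file.name.toLowerCase();
  if (name.endsWith('.xml') || name.endsWith('.musicxml')) return 'musicxml';
  if (name.endsWith('.mid') || name.endsWith('.midi')) return 'midi';
  if (name.endsWith('.abc')) return 'abc';
  return 'audio'; // fall back to treating the file as an audio recording
}

// Assumes a file input somewhere in the page: <input type="file" id="score">
const input = document.getElementById('score') as HTMLInputElement;
input.addEventListener('change', () => {
  const file = input.files && input.files[0];
  if (!file) return;
  const reader = new FileReader();
  reader.onload = () => {
    console.log(`Loaded ${detectFormat(file)} source (${file.size} bytes)`);
    // ...parse, then let the resulting note events drive on-screen objects
  };
  reader.readAsArrayBuffer(file);
});
```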

In this sense, as a learning source, music represents a vast, durable and reusable form of content for a variety of augmented and virtual reality solutions.

Of the potential sources of high-value application-relevant music data, the richest is perhaps MusicXML. It provides us with music system information such as notes per octave, temperament or intonation and key, but also real-time information on instrumentation (parts), voices, note names, durations and dynamics - and can be used to calculate pitch, or deduce the underlying theoretical models. Moreover, given use of the right music application, it can represent complex orchestral works with dozens of instrument and voice parts.
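
To make that concrete: a minimal sketch (mine, and far from production-ready) of pulling parts, note names and durations out of a partwise MusicXML string using the browser's DOMParser. Real MusicXML demands far more care - chords, ties, voices and backup/forward elements are all ignored here:

```typescript
// Minimal sketch: extract per-part note names and durations from a
// score-partwise MusicXML document.
interface SimpleNote { step: string; octave: string; duration: number; }

function parseMusicXml(xml: string): Map<string, SimpleNote[]> {
  const doc = new DOMParser().parseFromString(xml, 'application/xml');
  const parts = new Map<string, SimpleNote[]>();
  Array.from(doc.querySelectorAll('part')).forEach(part => {
    const notes: SimpleNote[] = [];
    Array.from(part.querySelectorAll('note')).forEach(note => {
      const step = note.querySelector('pitch > step');
      const octave = note.querySelector('pitch > octave');
      const duration = note.querySelector('duration');
      notes.push({
        step: step ? step.textContent || '' : 'rest',   // no <pitch> means a rest
        octave: octave ? octave.textContent || '' : '',
        duration: duration ? Number(duration.textContent) : 0,
      });
    });
    parts.set(part.getAttribute('id') || 'unknown', notes);
  });
  return parts;
}
```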

Anything other than dynamically loaded music sources -and especially the notion of music elements or properties being 'baked-in' to screen artefacts- is going to have a hard time justifying categorisation as a music application, and will in the long run lose out to those achieving it. This is something to keep in mind as you read on..

VR and AR Versus Web3D (WebVR and WebAR)

How do we distinguish VR and AR (virtual and augmented reality) from Web3D (WebVR and WebAR)?

The answer is twofold. First off, VR is totally immersive, the head effectively enclosed, the world view entirely synthetic. VR devices respond to an individual’s actions in a natural way, usually through immersive head-mounted displays and head tracking. Gloves or other hand tracking and haptic (touch sensitive) devices (musical instruments included) may provide feedback. Room-based systems provide a 3D experience for multiple participants, but tend to be more limited in their interaction capabilities…

AR is a mix of the real and the imaginary, integrating real-time information such as text, graphics, audio and other virtual enhancements with real-world objects.

It is this “real world” element that differentiates AR from virtual reality. Indeed, AR's primary selling point is adding value to the user’s interaction with the real world.

WebVR and WebAR mirror these experiences - but via the web browser, generally on a mobile device temporarily dropped into a headset holster. Freedom of choice between device and headset is far from guaranteed. Moreover, with the web browser go less raw processing power, fewer peripheral interfaces and often a marginally less convincing ("six degrees of freedom") 3D experience. On the upside, you have the internet..

In these senses, we can distinguish broadly between AR and VR systems on dedicated, closed-ecosystem devices, and nominally platform-independent, browser-based WebAR and WebVR experiences. There are, however, nuances - Web3D on the desktop browser, AR games such as Pokémon Go playable without a headset, and so on.

Spatial Contexts In VR

Spatial or situational contexts in VR are defined in terms of 'scale' and 'degrees of freedom'.

'Degrees of freedom' signifies the freedom of immediate movement and orientation afforded by the VR device. These are forward/back, up/down, left/right, yaw, pitch, roll.
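
As a data structure this is nothing exotic - a pose is just six numbers. An illustrative sketch, not any particular device's API:

```typescript
// The six degrees of freedom as a simple pose record: three translations
// plus three rotations (yaw, pitch, roll).
interface Pose6DoF {
  x: number;     // left/right
  y: number;     // up/down
  z: number;     // forward/back
  yaw: number;   // rotation about the vertical axis (radians)
  pitch: number; // nose up/down
  roll: number;  // tilt left/right
}

// A seated learner at 'swivel scale' mostly exercises the three rotations:
const seatedLearner: Pose6DoF = { x: 0, y: 0, z: 0, yaw: 0.4, pitch: -0.1, roll: 0 };
```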

The VR headsets generating the most buzz currently are wholly 'untethered' (no wiring or other impediments to movement). Depending on the instrument, however, much of the time spent learning involves the player being effectively tethered - whether to a seat, or to a standing position. For this reason alone, untethered 3D roving is likely to prove of limited use.

'Scale', a rough guide to the visual volume or range available, is defined by terms such as 'swivel' (seat) scale, 'desk' scale, 'room' scale and 'factory' scale.

For mass social music events such as an online orchestral play, even 'factory' scale may already prove limiting. For a virtual band practicing on-stage for a concert, factory scale may be adequate. For pub session play, room size may be fine. For P2P music teaching (our focus, and as depicted by the banner photo at the head of this page) desk scale may be adequate.

"3D Or Not 3D? That Is The Question."

Simply transferring the notation-instrument-theory tool environment (see banner photo) and data feeds into the planes of a 3D AR / VR environment may be a satisfying technical challenge, but in itself adds -other than a sense of greater freedom of navigation- little user value. Indeed, it tends to obscure the underlying data.

How, too, do we directly interact with musical objects? Control gestures are quickly going to get complex, yet we are here to learn to play an instrument, not to navigate a VR interaction maze.

Moreover, if music theory models are a conscious abstraction, arbitrarily adding a third dimension arguably risks defeating their purpose. There should, then, be a very concrete benefit from 3D as opposed to 2D display.

The fun begins, you could say, with the merger of multiple subjects into one view. Merging a teacher or student in the form of an avatar or background video with score, instrument and theory tool can lend the whole a feeling of immersion, but this should go beyond the structural.

As graphical elements are merged, so too should their data be. Head and eye tracking, movement and touch sensor feedback, and laser environmental sensors need to be associated with merged graphical object and domain context data, possibly pulled in from the internet. Though not unique to AR / VR, we could also add to these the value delivered by simple algorithms or artificial intelligence.
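
A speculative sketch of what such an association might look like in code - every name here is hypothetical:

```typescript
// Speculative sketch: tag incoming sensor readings with the domain data
// behind the graphical object they refer to. All names are hypothetical.
interface SensorReading { source: 'head' | 'eye' | 'hand' | 'haptic'; targetId: string; value: number; }
interface DomainDatum { objectId: string; role: 'note' | 'string' | 'fret' | 'key'; pitch?: string; }

function annotateReadings(readings: SensorReading[], domain: Map<string, DomainDatum>) {
  // Merge each reading with whatever we know about the object it targets
  return readings.map(r => ({ ...r, context: domain.get(r.targetId) || null }));
}
```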

In effect, then, the data's value is -at least in theory- multiplied. Nevertheless, given all these challenges and the perhaps dubious added value in comparison to the browser example, it is only when we break from the notation-instrument-theory tool-teacher model and align ourselves with the spirit of gamification - with storytelling, metaphor, allegory or analogy, and imaginative use of graphics - that VR is likely to properly differentiate itself from impending DOM offerings.

VR objects are prototyped using simple, often online tools, then modelled, textured, and rendered in lengthy pre-production workflows using (in many cases free) 3D modelling tools and engines such as Maya, Blender, Unity and Unreal. To avoid every new musical source being forced through a lengthy pre-production graphics pipeline, however, four features central to our 2D browser mindset will need to be brought along:
  1. freely interchangeable music sources (MusicXML/MIDI/audio etc - self-loaded or streamed)
  2. real-time, music-source-driven graphical elements
  3. direct, P2P interworking controls
  4. configuration changes at any time
While conceding that the following is a gross (and from some perspectives distorted) simplification of the web and VR landscape, let's try and plug the first two into some of the available technology stacks:


On the left, the (on the whole) declarative browser DOM technologies; on the right, the (on the whole) procedural / imperative 3D polygonal wireframe world of AR/MR/VR.

The web DOM space is focussed on structured data, the VR space on the dramatic experience. The former provides a framework for understanding, the latter encourages the user to suspend disbelief and go with the dramatic flow. As concepts go, poles apart..

AR / VR Graphics

Where the web browser has long been focussed on direct interaction with structured data (manipulating a strict 2D hierarchy of DOM elements), VR is concerned with interacting with ad-hoc collections of pre-baked graphical assets in a virtual space.

The single biggest challenge to bringing conventional music learning to AR / VR, therefore, is run-time, data-driven population and manipulation of music elements directly in the virtual space. Coming from the browser's DOM, long-winded VR content delivery pipelines run counter to our music delivery expectations.

Indeed, what we risk is a virtual world populated with shapes with those six degrees of freedom ("outeractivity") but little real "interactivity". In this respect there are huge differences in the capabilities of browser-based 2D scalable vector graphics (SVG) and VR's 3D 'polygon art'. The former is entirely run-time scriptable and addressable in the browser, allowing all manner of complex interactions (getter/setter calls) and hence near-real-time transformations. The latter may be sovereign in the six degrees of freedom, but in many cases allows only limited and predetermined scripted actions.
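
For a flavour of what 'entirely run-time scriptable' means in the SVG case, a minimal sketch; the notehead id scheme is an assumption about how the score SVG was generated:

```typescript
// Sketch of run-time getter/setter interaction with a score rendered as SVG:
// momentarily recolour a notehead during playback.
function highlightNote(svg: SVGSVGElement, noteheadId: string, durationMs: number): void {
  const head = svg.getElementById(noteheadId) as SVGGraphicsElement | null;
  if (!head) return;
  const previousFill = head.getAttribute('fill'); // getter
  head.setAttribute('fill', 'orange');            // setter
  window.setTimeout(() => {
    head.setAttribute('fill', previousFill || 'black');
  }, durationMs);
}
```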

Interactivity in conjunction with exchangeable scores more or less pushes us in the direction of programmatic animation, but another consideration is expressivity. A musical score is an information-rich document and a perfect match for the browser's 2D DOM. Score-dependent animations call not only for powerful data visualisation techniques, but also for the binding of musical notes and their dynamics to specific events or actions.
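
A crude sketch of that binding: a scheduler firing an animation callback at each note's onset. A real implementation would sync against the Web Audio clock rather than setTimeout; the TimedNote shape is illustrative:

```typescript
// Sketch: bind parsed note events to animation callbacks on a shared timeline.
interface TimedNote { id: string; onsetMs: number; durationMs: number; velocity: number; }

function scheduleNoteAnimations(notes: TimedNote[], onNote: (n: TimedNote) => void): void {
  notes.forEach(n => window.setTimeout(() => onNote(n), n.onsetMs));
}

// Usage, driving the SVG highlighter sketched above:
// scheduleNoteAnimations(notes, n => highlightNote(scoreSvg, n.id, n.durationMs));
```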

Were the web DOM and VR to be united, music could become as much a driver of the VR storyline as the user's actions. VR, with its immersive qualities (autonomy, interactivity and presence), could open some radically new paths to music learning.

There are both declarative and procedural approaches to the 3D web. You will find a useful intro to the declarative approach >here<.

From our 'any music source' point of view, however, holding on to real-time, data-driven animation is a must. So how advanced is declarative 3D? How far has it been integrated into the VR polygon world? Alternatively, could both be used simultaneously in overlaid views, and at what control cost? How far can VR objects adapt to real-time data? What are the fall-back options? In a future post we will look at the problem space from both ends: virtual and real.

Bringing 2D Capabilities Into AR / VR

To sum up, transferring elements of the 2D music learning constellation as depicted in the banner image above to AR / VR will continue to rely on:
  1. freely interchangeable music sources (MusicXML/MIDI/audio etc - self-loaded or streamed)
  2. real-time, entirely data-driven graphical elements (fingerings, note highlighting on playback, colour preferences and consistency)
  3. interactivity (allowing, for example, free choice of notation type, instrument tunings, theory model type, hover-over display of properties)
  4. free reconfiguration (of music system etc) at any time
There would, moreover, be huge benefit in:
  1. real-time interactions with polygonal graphical elements (getting/setting of properties)
  2. haptic instrument interfaces for feedback and control - supplementing generic hand, eye or other controls
  3. for P2P interworking, inter-device and application controls
..which imply the following:
  1. built-in web browsing capabilities
  2. 2- and 3D graphical elements as first-class DOM elements OR real-time scriptable, source-driven 3D bitmap graphics objects
These may be realisable individually within specific configurations, but taken together they more or less rule out current AR and VR offerings. Even tethered to a desktop with a conventional browser, there are currently hardly any APIs linking the browser with virtual reality (VR) devices, including sensors and head-mounted displays.
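
The nearest thing at the time of writing is the draft WebVR proposal. A hedged feature-detection sketch - navigator.getVRDisplays comes from the experimental WebVR 1.x spec and most browsers do not yet expose it, hence the cast:

```typescript
// Hedged sketch: feature-detect the (experimental, draft) WebVR API.
async function findVrDisplay(): Promise<any> {
  const nav = navigator as any;
  if (typeof nav.getVRDisplays !== 'function') {
    console.log('No WebVR support in this browser');
    return null;
  }
  const displays: any[] = await nav.getVRDisplays();
  return displays.length > 0 ? displays[0] : null;
}
```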

Merging AR and VR with web technologies raises complex interworking challenges. Not the least of these lies between often proprietary 3D technologies and open, largely 2D standards - and especially the integration of the value locked up in existing sites. Bringing these en bloc to AR and VR implies at best clunky 2D planes. What we effectively end up with, then, is SVG within a 2D browser plane within a 3D AR/VR environment, a.k.a. a 'hack'.

All this is a shame in as much as content production pipeline engines such as Unity are delivering VR or VR-like experiences to just about every platform one can imagine. Nevertheless, the internet and VR's future are woven inextricably together. Our immediate challenge is to pick out those approaches and technologies likely to survive the disruption.

What we would seem to need is a 3D web document object model or DOM - but fully integrated into the AR and VR wireframe world. The simpler alternative, naturally, is to provide a data-driven API - and wait. :-/
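
What might such a data-driven API look like? A purely hypothetical surface - nothing here corresponds to any shipping product:

```typescript
// Purely hypothetical: the shape a data-driven bridge between a music
// source and a 3D scene might take.
interface NoteEventData { pitch: string; velocity: number; onsetMs: number; }

interface MusicDrivenScene {
  // Load a user-chosen music source in a supported exchange format
  loadSource(data: ArrayBuffer, format: 'musicxml' | 'midi' | 'audio'): Promise<void>;
  // Bind note events matching a selector to a scene-side handler
  bindNotes(selector: string, handler: (note: NoteEventData) => void): void;
  // Run-time getter/setter access to properties of objects in the scene
  getProperty(objectId: string, key: string): string | number | undefined;
  setProperty(objectId: string, key: string, value: string | number): void;
}
```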


Keywords



online music learning,
online music lessons
distance music learning,
distance music lessons
remote music lessons,
remote music learning
p2p music lessons,
p2p music learning
music visualisation
music visualization
musical instrument models
interactive music instrument models
music theory tools
musical theory
p2p music interworking
p2p musical interworking
comparative musicology
ethnomusicology
world music
international music
folk music
traditional music
P2P musical interworking,
Peer-to-peer musical interworking
WebGL, Web3D,
WebVR, WebAR
Virtual Reality,
Augmented or Mixed Reality
Artificial Intelligence,
Machine Learning
Scalable Vector Graphics,
SVG
3D Cascading Style Sheets,
CSS3D
X3Dom,
XML3D


Comments, questions and (especially) critique welcome.