AIMS'26 will run from 09:00 to 17:00 on Monday 30th and Tuesday 31st of March,
and from 10:00 to 16:30 on Wednesday 1st of April.
Author: Oliver Bown, University of New South Wales
While the legality of scraping training data has dominated generative AI music debates, and remains unresolved, there is a strong push to introduce payment strategies that would reward creators for the use of their work in generative AI models. However, the opaque nature of machine learning means that there is no natural model for how to divide royalties among a set of contributors. A number of companies and rights management organisations have taken steps towards implementing fair copyright licenses and payment strategies. These include STIM (Sweden), APRA AMCOS (Australia), UMG and Udio, and Kobalt Music and ElevenLabs. The result is a very complex landscape in which very different model and interface designs, use cases, forms of creative content ownership, and cultures of creative practice all interact to mutually shape each other into emergent structures.
In this paper I build on a systematic classification of payment strategies for AI training data and the related politics of how these strategies capture creative value (Bown, AIMC presentation 2024, and forthcoming paper for the Journal of Music & Data). I present a walkthrough of a series of worked examples of how payments are already being made from generative music AI tools to contributors to those tools’ training sets, considering a range of emerging model designs, interface designs and user scenarios. I develop these worked examples into speculative case studies of how specific designs and scenarios reward different cultural forms, incentivise creative production strategies and iteratively influence the business models of the companies developing them.
Grounding discussion in four underlying principles of individual creative value — originality, talent, identity, labour — I consider how this complex design landscape may be perceived as fair or unfair. Additionally, I consider how different designs might support or diminish four dimensions of cultural value — community participation, cultural expression, cultural richness, and psychological benefits.
Lastly, I step back to consider how power is exerted as these emerging forms play out, and specifically how a general veil of opacity may be falling over individual action in this space, caused by a complexity of forms driven both by AI and by the complexification of global music economics. I consider where governmental, supra-governmental or civil society power needs to be exercised to better bring together competing stakeholders and correct overly perverse incentives.
Author: Bob L. T. Sturm & Elin Kanhov, KTH Royal Institute of Technology
On Sep 9 2025, the Swedish musicians’ rights organization STIM announced in a press release (STIM 2025) that they are “introducing the world’s first collective AI license for music”, establishing a relationship with the commercial “attribution” service Sureel.ai (founded Oct 2022): “The license model is an important step toward a long-term, sustainable structure that protects creators’ income while providing AI and music companies with a legal path forward”. Similar services are now emerging, such as Prorata (founded Oct 2024), Musical AI (founded Jan 2023), SpaceHeater (founded Oct 2024), and SoundPatrol (founded Feb 2025). What are these services and how do they work? What is meant by “attribution”, and how does that relate to legal concepts of intellectual property, as well as musicological concepts of authorship and influence? How are these services helping artists, rightsholders, the music industries, and AI companies? How are they hurting these stakeholders?
This paper takes a close look at the service Sureel.ai, and how it is situated within the AI music landscape. Through an analysis of its homepage, social media activity, media coverage, research papers, and patents, we see a service monetizing the needs (and fears) of different stakeholders: data sources (artists and rightsholders wanting to protect their work from exploitation) on the one hand, and data seekers (AI companies wanting to avoid legal jeopardy) on the other. We see how the term “attribution” takes on several different meanings, from a designation of influence and a portion of authorship to a form of responsibility or cause and an algorithm for computing a number expressing data similarity. Two Sureel.ai patents (Aykut and Kuhn 2025; Kuhn and Aykut 2025) discuss three attribution methods that correspond to characteristics of modern generative AI models: one based on the input given by a user (“input attribution”), another based on the output produced by a model (“output attribution”), and a third based on statistics collected while training a model on data (“model attribution”). Each of these methods makes choices that are questionable with reference to the intended task, and that in the end appear more convenient than relevant.
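To make the flavour of these methods concrete, consider a deliberately naive sketch of “output attribution”: score each training item by the cosine similarity of its audio embedding to the embedding of a generated output, then normalise the scores into royalty-like shares. This is our own hypothetical reconstruction for discussion, not the method claimed in the patents; the embedding space and the proportional payout rule are assumptions.

```python
import numpy as np

def output_attribution(output_emb: np.ndarray, training_embs: np.ndarray) -> np.ndarray:
    """Toy 'output attribution': credit shares per training item,
    proportional to cosine similarity with a generated output.
    (An illustrative reconstruction, not the patented methods.)"""
    out = output_emb / np.linalg.norm(output_emb)
    train = training_embs / np.linalg.norm(training_embs, axis=1, keepdims=True)
    sims = np.clip(train @ out, 0.0, None)  # cosine similarities; negatives ignored
    total = sims.sum()
    # Fall back to equal shares if nothing is positively similar.
    return sims / total if total > 0 else np.full(len(sims), 1.0 / len(sims))

# Example: four training items with 8-dimensional embeddings.
rng = np.random.default_rng(0)
shares = output_attribution(rng.normal(size=8), rng.normal(size=(4, 8)))
print(shares, shares.sum())  # shares sum to 1.0
```

Every line hides a contestable choice: which embedding space to use, whether negative similarities should count, and why shares should be proportional to similarity at all. These are precisely the kinds of convenient-rather-than-relevant decisions the paper interrogates.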
In light of the continued uncertainty regarding the legality of the unauthorized use of any data for training AI systems (see Craig 2025 and Sobel 2025 for competing opinions), we discuss how AI music attribution services could be the “lesser evil” responding to the needs of platform music industries (Morris 2020; Leyshon and Watson 2025), musicians and artists, labels and distributors, and AI companies seeking to tap into cultural revenue streams. While these attribution services may not actually be delivering what they claim in terms of fair and targeted remuneration for content rightsholders, they are acting as “digital data brokers”, operating a marketplace of digital music “assets” expressly focused on simultaneously facilitating and fighting against AI innovation.
References:
T. Aykut and C. B. Kuhn. United States Patent US 12,013,891 B2, 2025.
T. Aykut et al. Evaluating the role of composition and recording style in AI music. In Proc. IEEE International Conference on Responsible, Generative and Explainable AI, 2025.
C. J. Craig. The AI-Copyright Trap. Chicago-Kent Law Review, 100(1):107–139, 2025.
C. B. Kuhn and T. Aykut. United States Patent US 12,314,308 B2, 2025.
A. Leyshon and A. Watson. The Rise of the Platform Music Industries. Agenda Publishing, 2025.
J. W. Morris. Music Platforms and the Optimization of Culture. Social Media + Society, 2020.
B. L. W. Sobel. Copyright accelerationism. Chicago-Kent Law Review, 100(1):73–106, 2025.
STIM. “STIM Launches the World’s First AI License for Music”, blog post, 2025. https://www.stim.se/en/news
Author: Andrew Brown, University of Durham
Since Karpathy’s much-circulated tweet dubbed “vibe coding” a “new kind of coding” based on natural-language interaction with AI assistants (Karpathy, 2025), a growing ecosystem of platforms has promised non-specialists the ability to build complete applications conversationally. Recent surveys and reviews frame vibe coding as intent-first, conversational programming layered over large language model backends (Ray, 2025; Ge et al., 2025), while applied accounts describe structured, prompt-driven workflows that can compress idea-to-analysis timelines, relieve pressure on scarce technical staff, and “democratise” coding for researchers and clinicians (Crowson and Celi, 2025; Lee et al., 2025; Fawzy et al., 2025).
Here I offer a critical, first-person account of such platforms as they are marketed to musicians, coders, and creative practitioners. Drawing on an auto-ethnographic study of my own attempts to build music-AI tools using Replit agents, Lovable.dev, and Google’s Gemini 3.0 AI Studio, I argue that, contrary to the democratising narrative, current vibe-coding platforms often function as unreliable, opaque, and economically inefficient environments that actively hinder creative work. While promotional and academic accounts highlight productivity gains, they also acknowledge risks around correctness, cost creep, and skill dilution (Crowson and Celi, 2025; Fawzy et al., 2025), which my experiences amplify rather than allay.
Methodologically, I approached these systems as a practising musician-programmer tasked with building concrete artefacts (e.g. chord generators and analysis tools) for real use. Across multiple projects, the pattern was consistent: platforms could scaffold plausible-looking code and user interfaces in the chat window, but this veneer of fluency collapsed at execution and deployment. I repeatedly encountered brittle code, contradictory or circular error states, and anthropomorphised but uninformative failure messages such as “The creative spirits are busy right now (Model Overloaded)”. Fixes proposed by the same systems frequently broke other parts of the stack, producing regeneration loops that burned through substantial paid API credits and subscription fees without ever yielding a stable application, mirroring wider concerns that AI-native development can quietly drive up costs while lacking clear organisational controls over how these tools are used (Crowson and Celi, 2025; Hassan et al., 2024).
On this empirical basis, I do not regard current vibe-coding platforms as a viable development paradigm for AI music practice. By contrast, a simpler workflow, using a single LLM as an assistant to generate, critique, and iteratively edit conventional source code, proved more robust, transparent, and effectively free at the margin, aligning with findings on “programmer’s assistant”-style interaction that preserves human judgement and oversight (Ross et al., 2023; Peng et al., 2023; Prather et al., 2023). The promise of “no-code” or “chat-only” development thus obscures the reality that successful systems still require software-engineering expertise, and that hiding this behind automation can intensify, rather than relieve, the cognitive and economic load on creative practitioners.
For AI Music Studies, this negative result is itself a contribution: it foregrounds how labour, risk, and responsibility are redistributed in AI-native toolchains, and how practice-based accounts can puncture overly optimistic imaginaries of frictionless, “democratised” AI music creation.
Author: Karl Simu, KTH Royal Institute of Technology
This paper examines the nature of design-induced bias in generative “AI Music” tools. Systems in which human creators share agency with machine learning models are increasingly prominent across musical applications. A growing number of services, including the examined tools AIVA, Boomy, and Mubert, allow users to rapidly generate musical material by manipulating only a limited set of parameters. Although these systems are often framed as democratizing creativity, they remain embedded in cultural, technological, and market-driven assumptions that warrant critical analysis.
Drawing on Friedman and Nissenbaum’s concept of bias in computer systems, Lindgren’s theory of AI assemblage, and Born’s dimensions of diversity, this study adopts a social-constructionist perspective to situate AI music technologies within their cultural and material contexts. Methodologically, the study combines qualitative content analyses of tool interfaces with semi-structured interviews with developers. These materials were interpreted through thematic analysis to explore how musical parameters, design decisions, and commercial priorities interact.
The findings highlight three interrelated concerns. First, music is primarily conceptualized through Western popular traditions, with acoustic-aesthetic features often subsumed under familiar abstractions such as genre, mood, harmony, and melody. Second, while the tools emphasize user agency, personalization, and accessibility, many aesthetic and technical constraints remain concealed behind opaque interfaces, further interacting with prevailing mythologies of AI automation. Third, the tools reflect platform-capitalist logics, emphasizing features that facilitate shareability and the monetization of outputs.
By unpacking these dynamics, the study offers an early contribution to AI Music Studies, underscoring that bias is not only a technical issue but an inherent feature of AI Music assemblages. It calls for more critical, socially situated approaches to the design and evaluation of AI music tools. Promising directions for future research include targeted ethnographies with company employees and users, as well as analyses of the economic structures and multifaceted forms of creative platform labor involved.
Author: Yerim Gim, Seoul National University
The recent rapid advancement and commercialisation of artificial intelligence music technology highlight the need for an academic understanding of the AI music ecosystem. Evaluations of AI creativity too often focus only on the final aesthetic result. This tendency ignores the complex human-AI interactions that occur during the entire creative process. In this paper, I question whether algorithmically generated music can truly be regarded as an entirely independent work of AI creativity. I propose a shift in methodology: analysing the creative process, not just the output.
The subject of this research is the Beethoven X Project’s Symphony No. 10, which gained attention for completing Beethoven’s unfinished symphony. Utilising Margaret Boden’s three types of creativity (combinational, exploratory, and transformational) as a theoretical framework, this research aims to go beyond simple musical analysis. Instead, I examine how the creative process itself was structured – specifically, how human experts interpreted Beethoven’s surviving sketches, selected among AI-generated options, and orchestrated the final piece in response to the algorithm’s capabilities and limitations.
Based on publicly available documentation, this paper examines movements three and four in comparison with movements one and two, analysing musical structure, creative outcomes, and process. This preliminary analysis suggests that the AI-generated movements achieve mixed results when evaluated through Boden’s creativity lens. While the initial findings show combinational and some exploratory creativity, there is limited evidence of the transformational creativity found in Beethoven’s original movements. Most importantly, the examination of project materials shows that decisive intervention by human experts determined the instrumentation and structural integration. Although the work displays a certain level of creativity, this raises questions about the degree of AI autonomy.
Ultimately, I argue that AI music research should shift from evaluating only the result to analysing the entire creative process. This approach could illuminate the complex roles of human musicians, developers, and musicologists within the ecosystem. The methodology proposed here is also extendable to analysing other AI music beyond Beethoven X.
Panel Chair: Prof. Yang Chien-Chang, National Taiwan University
This panel presents two case studies utilizing AI models, alongside a critical reflection on recent developments in digital humanities and AI-assisted musicological research. It considers how AI-assisted analyses intersect with, interrogate, and extend the narrative frameworks developed in traditional historical and analytical music studies. Three panel members and one Chair will participate.
The first paper reviews recent developments in AI-assisted studies in musicology. It examines how these recent trends re-evaluate historical and structural events, suggesting revisions to traditional views on the ontological status of music. AI-assisted studies indeed reveal recurring patterns, allowing researchers to pursue potential underlying causes—for example, whether certain vector characteristics can be traced to actual collaborative relationships (such as co-performance or interview statements about artistic influence). Yet rather than treating machine-learning models merely as auxiliary tools for “verifying human hypotheses,” we might ask: do the vector spaces generated through AI analysis themselves constitute an alternative form of objectivity? These perspectives demonstrate the inherent plurality of musical meaning and the absence of any singular interpretive standard. Thus, rather than asking whether Music Information Retrieval (MIR) features possess “validatory” or “privileged” status, it may be more productive to recognize that such capacity may, in itself, constitute an emerging mode of objectivity within musicology.
The second paper investigates the localisation of production practices in post-war Taiwanese popular music (1975-1995). Grounded in the Social Construction of Technology, this study integrates Social Network Analysis with AI-assisted audio analysis to quantify the "sonic identifiability" of sociotechnical actors. By mapping collaboration networks from archival credits, we identify key social groups—such as arrangers and studios—as historical labels. To validate their sonic agency, we employ a representation learning framework that projects instrumental tracks (isolated via Source Separation) into a high-dimensional latent space. This approach allows us to investigate whether the social consolidation of production roles is mirrored by intrinsic sonic distinctiveness. By analyzing these representations, the study offers a computational means to narrate the power dynamics between human creativity and technological materiality, providing a scalable framework for writing localized production histories.
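To indicate the kind of quantitative check this validation implies, here is a minimal sketch under our own assumptions (the embeddings, the silhouette metric and the permutation test are illustrative stand-ins, not necessarily the authors' pipeline): it asks whether tracks sharing a production credit sit closer together in the latent space than a random reassignment of credits would predict.

```python
import numpy as np
from sklearn.metrics import silhouette_score

def sonic_identifiability(embeddings: np.ndarray, credits: list[str],
                          n_permutations: int = 1000, seed: int = 0):
    """Test whether tracks sharing a production credit (e.g. arranger,
    studio) cluster in embedding space more tightly than chance.
    Returns (observed silhouette score, permutation p-value)."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(credits)
    observed = silhouette_score(embeddings, labels)
    # Null distribution: shuffle credits over tracks and re-measure separation.
    null = np.array([silhouette_score(embeddings, rng.permutation(labels))
                     for _ in range(n_permutations)])
    p_value = float((null >= observed).mean())
    return float(observed), p_value

# Toy example: 60 tracks, 128-d embeddings, three hypothetical studios.
rng = np.random.default_rng(1)
embs = rng.normal(size=(60, 128))
credits = ["studio_A"] * 20 + ["studio_B"] * 20 + ["studio_C"] * 20
print(sonic_identifiability(embs, credits, n_permutations=200))
```

A significant result would support the claim that socially consolidated roles leave an intrinsic sonic signature; a null result would suggest the "identifiability" lives in the credits rather than the sound.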
The third paper proposes an AI-assisted framework for analysing modal structure in Asian instrumental traditions by demonstrating how micro-level pitch behaviour encodes higher-level musical organisation. Focusing on two contrasting systems—Guqin performance in the Chinese literati tradition and Beiquan ensemble practice in Taiwanese ritual music—the study integrates computational tools with musicological interpretation to link qualitative and quantitative approaches. For Guqin, we argue that global modal and phrase structures shape local pitch contours and playing techniques. Using a newly built performance dataset, we develop a structural mode-detection method and fuse its output with refined pitch-contour extraction for neural-network-based technique classification. Findings reveal a reciprocal dependency: mode context guides gesture recognition, while sliding behaviours provide cues for mode identification. For Beiquan, AI-based pitch analysis identifies mode-specific microtonal distributions that challenge equal-temperament assumptions and clarify how microtones articulate modal identity. Across both cases, AI operates as an analytical partner, surfacing insider musical knowledge and opening new methodological pathways for musicology and ethnomusicology.
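As a hedged sketch of the micro-level pitch analysis this implies (the tooling here is our assumption; the authors' own dataset and methods may differ), one can extract a fundamental-frequency contour with librosa's pYIN implementation and histogram its deviations, in cents, from the nearest equal-tempered semitone. Stable peaks away from zero cents would be exactly the mode-specific microtonal distributions that challenge equal-temperament assumptions.

```python
import numpy as np
import librosa

def microtonal_histogram(audio_path: str, n_bins: int = 50):
    """Histogram of pitch deviations (in cents) from 12-TET,
    computed over the voiced frames of a pYIN f0 contour."""
    y, sr = librosa.load(audio_path, sr=None, mono=True)
    f0, voiced, _ = librosa.pyin(y, fmin=librosa.note_to_hz("C2"),
                                 fmax=librosa.note_to_hz("C7"), sr=sr)
    f0 = f0[voiced & ~np.isnan(f0)]
    # MIDI pitch is logarithmic in frequency; its fractional part,
    # recentred to [-50, 50) cents, measures deviation from 12-TET.
    midi = librosa.hz_to_midi(f0)
    cents = (midi - np.round(midi)) * 100.0
    return np.histogram(cents, bins=n_bins, range=(-50, 50))

# Usage (hypothetical file): peaks far from 0 cents indicate stable
# microtones that an equal-temperament analysis would miss.
# counts, edges = microtonal_histogram("beiquan_excerpt.wav")
```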
Authors: Dr Robert Prey & Prof. David Hesmondhalgh
Since the global spread of recorded music in the early twentieth century, powerful companies – mainly based outside the actual music industries – have constantly shifted the predominant ways in which recorded music is experienced. Since the 1990s, digitalisation and platformisation have shifted the sector exercising such power from consumer electronics to information technology. In this paper, we make a number of claims about the ways in which AI is extending and intensifying the changes brought about by digitalisation and platformisation. First, we argue that AI continues the often contentious but frequently mutually dependent relationship between music industries and tech companies. For example, major recording and publishing companies are simultaneously litigating against “unlicensed” AI players like Suno while partnering with others that agree to work within a licensing framework. Second, we show that AI intensifies a long history of musical surplus and overaccumulation. Each new technology adds more catalogue without subtracting the old. AI-generated “slop” is another layer in this process: it adds further abundance on top of already vast streaming catalogues. Third, on the consumption side we argue that AI extends the shift to “networked mobile personalisation” that emerged in the various phases of digitalisation taking place in the late 20th and early 21st centuries, via adaptive soundtracks, personalised mood-music, and so on. Fourth, on the production side, we show that AI tools extend a long-standing pattern in which new technologies shrink the number of people needed to write, record, and release a track—even as the cultural value of musicking continually pushes back against that reduction. Fifth, we claim that AI adds further levels of instability, obsolescence and waste centred on music. Overall, AI Music has emerged as the latest “testing ground” for cultural tech, in ways that strongly echo earlier periods. Against much talk of rupture, our argument is that AI represents a continuation and extension of long-standing dynamics in the music industries.
Author: Ravi Krishnaswami, Brown University
The United States Constitution establishes copyright “to promote the progress of science and useful arts, by securing for limited times to authors and inventors the exclusive right to their respective writings and discoveries.” This original language gestures towards the implications of new “AI” music generators like Suno and Udio, where any user can describe a piece of music in language and hear a fully produced recording in seconds. These outputs are poised to devalue the “useful art” of functional music, where music libraries and composers market their intellectual property for advertising, television, and other media placements. Suno and Udio, which scraped all their musical training data without any licensing deals in place, recently settled lawsuits brought by major labels and publishers, while Stable Audio, ElevenLabs, and others are “fairly trained” on licensed datasets, often drawing from production music libraries. The US Copyright Office does not consider the outputs of these products eligible for copyright, because they do not represent “human expression.”
This paper presents ethnographic research conducted in the functional music industry, AI generation user groups, and the music information retrieval industry, with the goal of understanding AI music generation products as a continuation of the functional music industry rather than a disruption of it. Functional music has historically provided income opportunities for musicians and composers working towards professional careers as artists. As part of a neoliberal “portfolio career,” functional music has contributed to the music industry’s labor ecosystem rather than existing outside of it. Yet, perhaps because of its utilitarian approach to creativity, the functional music industry has often been the fastest sector to adopt new technological and financial paradigms that have automated tasks and shifted the locus of value creation. Generative AI is the latest example of the industry’s habit of creative destruction: the reification of musical knowledge within machine learning shifts the locus of value from musical IP to technoscientific rentiership, and the labor from creatives to coders. One question we might ask in this moment is: what will it mean for young musicians that knowledge of Python is more useful for building a career than knowledge of Pro Tools? Through ethnography and historicization, I attempt to answer such questions by connecting the dots between disparate communities: professionals in the music-for-media industry, computer scientists working on music information retrieval, and generative AI user communities, looking for harmonies and dissonances between their discourses.
The themes of this research align with the goals of The Second International Conference in AI Music Studies, especially in speaking to how established musical ecosystems such as the functional music industry are grappling with AI, while also addressing how AI challenges existing business practices built around the protection of IP.
Author: Suren Pahlevan, University of Cambridge
As scholars of AI Music Studies, we must place ethnography and participant observation at the centre of this newly emerging field. In this paper I reveal qualitative and quantitative findings from ethnography and interviews conducted with 24 digital audio workstation-based Hip Hop producers regarding their usage of, and views on, a variety of AI tools, such as the stem splitter, generative AI music platforms such as Suno and Udio, and AI MIDI plugins. Interviewees include multiple multi-platinum certified producers, a 6x GRAMMY-nominated producer, a BAFTA-winning producer, an OSCAR-nominated producer, as well as many semi-professional and hobbyist producers at earlier stages in their music careers. Artists that the professional producers within this cohort of 24 interviewees have produced for include: Drake, The Weeknd, Kanye West, Cardi B, Madonna, Stormzy, Soulja Boy, Nines, Travis Scott, Bryson Tiller, Lil Tecca, Juice WRLD, Dave, J Hus, Central Cee, Idris Elba, 50 Cent, Ludacris, Lil Baby, Roddy Ricch and many others. With a thematic focus on the topic of ‘robophobia’, findings reveal where in their musicking workflows (Small, 1998) producers desire ethically trained AI, what types of tools they see as threats or enhancements to their livelihoods, and their thoughts on being out-streamed by AI-generated artists. Examining the political economy of music streaming in the age of AI production tools (Drott, 2023; Sturm et al., 2024; Pelly, 2025), I also compare and contrast the findings from professional producers with those from semi-professional and hobbyist producers in order to illustrate whether (or not) music industry success alters one’s artistic relationship with AI, and elaborate on the variety of nuanced viewpoints found amongst professional producers themselves regarding the future of their crafts and music scenes.
This work itself, along with my proposal that the field must place greater emphasis on ethnography and participant observation, aligns fairly comfortably with the broader disciplinary boundaries of ethnomusicology. Nonetheless, I additionally reflect on how incorporating academic works and modes of knowledge production from computer science, computer music, human-computer interaction, law, history, music technology studies, development studies, science and technology studies, AI ethics and AI philosophy has enriched and enhanced this doctoral research. Moreover, by centering producers’ experiences and viewpoints, I explain how the field of AI Music Studies has the unique opportunity to leverage its highly interdisciplinary nature and foundations to communicate insider (producer) knowledge of music AI products and tools to a wide range of stakeholders, including music consumers, policymakers, music producers, and of course AI developers and companies, with the ultimate goal of ensuring that AI enhances rather than stifles human musical creativity.
Author: Matthew Day Blackmar, Indiana University
With AI-generated audio flooding streaming platforms (Drott, 2024; Pelly, 2025), settlements in RIAA v. Suno and RIAA v. Udio ensuring no clear case law on the matter, and the US Congress far from the consensus needed to reform the dated Copyright Act, now is a good time to ask: when and how will musicians get paid for the proprietary audio ingested, without permission, as AI training data? When I asked one software developer what attribution engines—micro-licensing software solutions for training data—will mean for the field of forensic musicology, this founder of a Silicon Valley startup did not mince words: "We aim to put them out of business." This paper seeks to test this claim.
Forensic musicology and attribution engines share a similar premise: that the proportion of inputs into a recording/composition can be reconstructed from the output, thereby performing scientistic rhetoric alongside less-than-scientific rigor. As part of an in-progress book-length study of the emerging AI music copyright business, this paper asks whether the methods of forensic musicology might in fact prove valuable tools for critiquing the outsized claims to "objectivity" of the attribution-engine industry. Where such engines seek to "disentangle" training data and generative output, forensic musicologists use music theory and analysis to establish claims to "substantial similarity" between original and allegedly infringing works. But, as self-reflexive musicologists (e.g., Kerman, 1980) have long known, one cannot remove traces of the analyst from the analysis: the latter has long been mystified as objective knowledge production. Nowhere is this more apparent than in copyright jurisprudence (cf. Leo, 2020).
Perhaps forensic musicology is not facing an existential threat. Perhaps analysis of substantial similarity between generative-AI inputs and outputs can instead help us to understand how audio- and symbolic-music generators "think" in music-theoretical terms. After all, the weights and biases of machine-learning algorithms are ultimately enculturated (Seaver, 2022), "tuned" by software developers possessed of ears and entrained to musical conventions. Perhaps, ultimately, forensic musicology promises to show these very developers what mathematics alone cannot.
Author: Ken Déguernel, University of Lille
This presentation proposes to study AI music through the lens of Yuk Hui’s concept of cosmotechnics, defined as the “unification of the cosmos and the moral through technical activities” (Hui 2017; 2021). Cosmotechnics challenges the assumption that technology forms a universal, value-neutral trajectory, by developing Simondon’s idea that technology evolves through individuation: a process where technical objects and human practices co-develop within a cultural milieu (Simondon 1958). Cosmotechnics highlights how this process is shaped by local cosmologies (cultural worldviews and philosophical systems), not just material or economic factors.
AI music systems follow a modern Western trajectory of technics shaped by liberal-capitalist extraction, in which nature, and by extension culture, is treated as raw material for optimisation and instrumentalisation. This trajectory remains entangled with romantic and modernist notions of art that privilege universal aesthetics and individual creativity, leading to techno-cultural homogenisation, even while its rhetoric celebrates “creativity” and “democratisation”.
Even though recent initiatives within the MIR community have rightly called attention to dataset bias, representational diversity, and ethical implications of AI music (Huang et al. 2023), at the level of technical practice, the field has seen a progressive narrowing, first through the dominance of deep neural networks, and now with the focus on large language models, standardising architectures and representations of what AI music can be. Current commercial AI music systems amplify this tendency by using the same models, interfaces, and commercial strategies, further reinforcing a monolithic vision of music creation. This convergence exemplifies what Stiegler identifies as the entropic drift of the Anthropocene: a global alignment of technical systems that diminishes plurality and individuation, and accelerates cultural and ecological exhaustion (Stiegler 2018).
Cosmotechnics offers a neganthropocenic counterpoint: it reminds us that AI music is not, and should not become, a monolith, but a field of potential technodiversities. Recognising the multiplicity of cosmologies opens space for alternative trajectories and for the decolonisation of musical technics, where different communities may cultivate technologies grounded in their own epistemologies and artistic practices. The practical implication is a threefold need: 1) to free data governance from private interests and to develop libre, open-source infrastructures adapted to specific cultural needs; 2) to create tools that resist the normalising tendencies of dominant Western technologies and can accommodate different conceptions of sound, time, technique and relationality; 3) to challenge the Western-centric framing of music and technology in education (Ewell 2023).
Acknowledging the need for technodiversity requires confronting the asymmetry of the struggle against the tech industry currently dominating AI music, particularly for independent musicians and communities whose practices fall outside commercial logics. This makes it all the more urgent for academics within the MIR community to resist the pull of industrial priorities rather than adapting their research agendas to them. Supporting technodiversity is not just a critical gesture but a practical one: it demands the preservation of technical knowledges, the support of situated practices, and the active development of multiple cosmotechnics as resistance.
Author: Yinuo Chen, King's College London
This paper asks, through the lens of (queer) temporality, how AI music looks and feels from inside Chinese digital music ecologies, as a complex epistemic location in relation to the hegemonic music ecosystems of the Global North that organise most debates on “AI music.” China here is not simply “the Global South,” nor yet a new Global North. It is a contradictory formation whose tech giants operate at the speed and scale of Northern capital, proliferating the rhetoric of the AI race, while its musics remain marked as local, perpetually on the verge of being “not quite” what the global models know how to hear. I stay with this ambiguity and ask what happens when generative systems trained on asymmetrically Northern catalogues are asked to write for Chinese ears.
This project studies how Chinese digital music creators embed generative AI tools in their workflow and showcase them in their content, and how they narrate these embeddings and the outcomes. The methodology is grounded in digital ethnography and in my own position as a working musician moving through these platforms. I listen to how they talk about incorporating generative AI music, about AI improving or impeding their productivity, about whether AI can "really" understand prompts of desire like “Chinese style,” and about the emotions that surface when they compare working before and after generative AI music. I read these valuable on-the-ground experiences as situated theories of AI music, tracing the mentalities upholding how AI is embedded in, or refused from, the workflow of creativity.
I further propose temporality as a critical lens to examine the junctures in these subjective experiences that allow us to read collisions between AI and the human, the global and the local, centre and periphery. Narratives of “imperial lag” cast Southern industries as late and derivative. At the same time, Chinese recommendation systems and AI tools, under a “rapidly catching up” pace, demand that digital music creators be permanently prompt: feeding an algorithm that never sleeps, adjusting to metrics that refresh by the minute. Yet creativity on the human side is often framed as needing cyclical processes of reflection, experimentation and rest, or put simply, time. I propose that queer temporalities help frame this dissonance and bridge the conceptual gap between a romanticised creative world and the sociotechnical. Queer temporalities are helpful here because they name what it means to live and work “out of sync” with normative schedules of progress, productivity and success. As the artist is pulled between tempos – between AI and human, and between an AI future underpinned by listening habits of whiteness and a Chinese sonic imagination that is linguistically and affectively otherwise – it is time to turn to queer(ing) time to unsettle these binaries and to open critically inspiring ways of using AI as a creative tool that reconfigures human–machine relations. In doing so, queer temporalities become not only a metaphor but a method for tracking how perceptions of time mark out an alternative politics of creativity in AI music.
Author: Marco Amerotti, University of Nottingham
“Small data” is often considered more comprehensible, personal, and ethically grounded than its counterpart, “big data”. While in many cases these assumptions seem to hold, we argue that the situation is more complex than that.
Like big data, small data needs to be properly situated, and its specificity and limited availability can make that task far from trivial. Moreover, small data often comes from a variety of sources and in a variety of formats, while big data usually imposes a standard that allows a large number of data points to be analysed in the same model. While big data approaches can benefit from generalisation (and often have generalisation as their goal), small data techniques need to consider multiple, overlapping, and sometimes contradictory sources for the same concept, and each needs to be situated accordingly. To make things more complex, small data might be situated in a Weltanschauung that radically differs from that of the researchers, developers, and end users involved with the model.
Drawing on feminist and anthropological theory, through examples from our research practice within AI and Irish Traditional Dance Music, we argue that small data for AI music needs to be situated in the cultural contexts and practices it refers to, and researchers need to be aware of their own positioning while collecting or using it. For this to happen, small data should be understood as “thick”, complex, and multilayered, containing more than “thin” descriptions of musical practice and integrating their meaning in context. Finally, we argue that small data is ontologically different from big data, not only in structure and size, but also in the way it represents and relates to the world; as such, researchers need to consider that the lenses through which they analyse a certain musical practice might not be valid interpretative categories in the target domain. We hold this characterisation to be a necessary premise for small data in AI music truly to be comprehensible, personal, and ethically grounded.
Prof. Nanette Nielsen, University of Oslo
This talk responds to philosopher Alva Noë’s recent critique of reductive comparisons between the human mind and AI, and his insistence that ‘to think is to resist—something no machine does’ (Noë, 2024). Drawing on music studies and enactivist philosophy, I investigate what musicking (Small, 1998) contributes to current debates on AI and consciousness. Using the recent public discourse surrounding the AI band The Velvet Sundown as a case study, I examine the mental-health and ethical implications of engaging with AI-generated music. Building on current scholarship on musical creativity and AI, I analyse the possibilities and limits of AI’s creative agency and argue that human creativity should be valued more highly than its artificial counterparts. I also consider the ethical stakes of sharing agency with AI, proposing that we must continue to cultivate sensitivity to our own creative capacities as empowered yet vulnerable musicking agents. Through a range of examples, I suggest that musicking affords an enactive entanglement that enables us to embrace risk, resist rules, and practise new forms of orientation and reorientation—tools that help us navigate the more resistant, but distinctly human, paths of existence.
Bio
Nanette Nielsen is professor at the Department of Musicology, University of Oslo. She works on music and philosophy, especially 4E cognition and musical experience, on intersections of ethics and aesthetics in twentieth- and twenty-first-century music (across different genres), on sound on screen (film, TV series, gaming), on the rhythm and temporality of musical experience, and on AI and creativity. At the RITMO Centre for Interdisciplinary Studies in Rhythm, Time and Motion, Nielsen leads the project Engagement & Absorption (2018-2027). She is also a member of the consortium MishMash Centre for AI & Creativity (2025-2030).
Is ‘depth’ Real? Sample-based music as a challenge to semantically-mediated GenAI
Author: Ashley Noel-Hirst, Queen Mary University of London
Live and retrospective evaluations of embodied digital scores: A mixed methods framework for combining human-AI perceptions of interaction
Author: Oliver Miles, University of Nottingham
Repairing Creative AI for Artists
Author: Anna-Kaisa Kaila, KTH Royal Institute of Technology
Playing Off the Page: Reviving Clementi-Inspired Improvisation for AI- and MR-Enhanced Piano Pedagogy
Author: Xinying Wang
Tempos and Temporalities of AI-Assisted Music-Making in China: Folk Theories and the Politics of Time
Author: Yinuo Chen
Exploring Creative-Cultural Bias in the Design of AI Music: A Critical Analysis of Three Generative AI Music Services
Author: Karl Simu, KTH Royal Institute of Technology
(Posters to be confirmed)
When AI doesn’t sound like AI: negotiating aesthetic expectations in technology-mediated musical practice
Authors: Teresa Pelinski, Adam Pultz Melbye and Andrew McPherson
CMI-Bench: A Comprehensive Benchmark for Evaluating Music Instruction Following
Author: Yinghao Ma
Author: Anna-Kaisa Kaila, KTH Royal Institute of Technology
The myth of an autonomous, artistically competent machine has captured the imagination of communities around the world since ancient times. To this day, such transhistorical and transcultural narratives are incessantly evoked by staged performances that involve novel technology such as AI and robotics. What happens when this ancient fantasy is materialised in front of a concert audience? What do the narratives of algorithms-on-stage tell us about the ideological tensions brewing in the encounters of novel artistic technologies and the infrastructures of cultural production and valuation?
Narratives function as a site for the formation and legitimation of shared meaning in various domains of public discourse, not only reflecting cultural self-understandings but also steering and transforming social and material reality. Professional music criticism is one such forum where narratives about music and technology are constructed and negotiated. Yet it remains sorely underexplored in the study of the emerging AI music ecosystem.
This project focuses on the critical media reception of two recent, algorithmically mediated musical performances: Beethoven X (2021) and Veer (bot) (2024). Each involves a staged performance by a symphony orchestra, but with a technological twist: in Beethoven X, the last two unfinished movements of Ludwig van Beethoven's 10th symphony are “completed” with the help of AI models, whereas in Veer, the solo part of the human-made composition is performed by an industrial robot programmed to operate an acoustic cello. The temporal distance between these two premieres cuts across an era of both accelerating technology development and hype, highlighting narrative shifts and permanences within contemporary discourses around technology and the arts.
Using the lens of discourse analysis, I examine how music critics conceptualize and valuate symphonic algorithms-on-stage; what narrative and rhetorical strategies they use to navigate the tensions and ambivalence between techno-optimistic and cultural-artisanal worldviews; and how the shifting power relations between cultural and technological fields seem to factor into these choices. The analysis brings forth three recurring themes: the narratives of firsts, the spectacle of the machinic Other, and the reconstitution of hierarchies of expertise.
The study contributes to the socio-cultural contextualization of novel technologies in the performing arts, valuable both for developer and arts-producer communities working with performative technologies and for the critical analysis and evaluation of these phenomena in media and research contexts. Furthermore, it explores the methodological potential of discourse analysis to enrich the interdisciplinary understanding of musical algorithms-on-stage.
Authors: Dr Laura Bishop (University of Oslo) with Prof. Craig Vear (University of Nottingham)
Small’s concept of musicking places a focus on the act of music-making, which is seen as a participatory process. There are many ways for people to participate in music-making: as performers, listeners, dancers, composers, and so on. Increasingly, machines can also participate in musicking, fulfilling various roles: as interfaces for sound production, as digital scores providing a visual source of ideas, or as interactive musical partners, among other possibilities. This paper is about a project investigating musical interaction between people and a “dancing” AI-driven robot arm called RAMI. RAMI was designed as a digital score but also integrates some aspects of a musical partner through its interactivity and embodied, movement-based cues. Our central aim was to investigate how people integrate RAMI into their musicking. Specifically, we questioned whether people would relate to RAMI as a musical partner and what effect this would have on their musicking experience.
We based our concept of what defines relationships between musical partners on the musical togetherness model (Bishop, 2024), which posits a set of factors that contribute to musicians’ rewarding experiences of togetherness when playing together. The model identifies musicians’ perception of their partners as responsive as a necessary condition for the interaction to feel social. We therefore focused in particular on whether musicians felt RAMI to be responsive. While interactivity is part of its design, RAMI has no concept of musical structure and is not responsive in the way a human partner would be. We carried out a series of studies with trained musicians who were invited to improvise with RAMI and, afterwards, to report on their experiences through interviews and questionnaires. In one of the experiments, we manipulated whether RAMI engaged its interactive AI as per its original design or, instead, ran off a non-interactive script.
Our findings show that musicians found moments of responsivity and connection while improvising with RAMI. Their perception of responsivity was tied to how much the robot moved and to moments of compatibility that arose between the timing or energy of the robot’s movements and the music. These moments offered a sense of connection, which also arose in response to movements of RAMI that felt socially motivated or were aesthetically pleasing. This sense of connection persisted even though musicians expressed doubt over whether the robot was really listening or processing high-level musical features. Musicians felt partially responsible for creating moments of connection by attending to RAMI and coming up with aesthetically interesting ideas. A search for connection defined their early improvisations with RAMI, while in later improvisations they focused more on the music and drew on RAMI for creative inspiration. These findings suggest that musicians used their knowledge of how musical interaction typically unfolds between human partners to scaffold the process of integrating RAMI into their musicking. They also show that people can experience some degree of musical togetherness with a machine (i.e. feelings of connection) despite holding undermining beliefs about its lack of intentionality.
Panel Chair: Dr Charalampos Saitis, Queen Mary University of London
While central to musical experience, timbre, the quality or “colour” of sound, remains elusive: difficult to define across disciplines and practices. As emerging music technologies, including generative AI audio systems, increasingly rely on computational and data-driven representations of sound and timbre, this ambiguity has become both a technical challenge and an opportunity for critical insight. This panel will discuss how the ambiguity of timbre makes it an ideal lens through which to explore how music and technology intertwine and co-construct one another—a site of (productive?) epistemological and methodological tensions between technoscientific and constructivist approaches. It invites participants to explore how, if unpacked (but not necessarily resolved), these tensions could be generative for interdisciplinary and transdisciplinary collaboration.
Recent developments in AI-driven (“neural”) audio synthesis, and the concepts originating from them such as “latent space,” “timbre transfer,” or “text-to-audio generation,” underscore these tensions. The same systems that promise new creative possibilities also implicitly operationalise a “wastebasket” definition of timbre (the “everything else that is not pitch and loudness”) by fixing other musical attributes (pitch, loudness, note segmentation) and leaving timbre to the mercies of opaque latent spaces informed principally by data, not by musical or auditory principles.
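To make the "wastebasket" operationalisation tangible, here is a minimal sketch under our own assumptions (MFCCs stand in for a model's opaque learned latent; real neural synthesisers use learned encoders instead): pitch and loudness are extracted as named, interpretable control signals, while "timbre" becomes whatever an unlabelled feature vector is left to absorb.

```python
import librosa

def explicit_controls(y, sr):
    """The attributes such pipelines typically name and fix:
    a fundamental-frequency (pitch) contour and a loudness envelope."""
    f0, _, _ = librosa.pyin(y, fmin=librosa.note_to_hz("C2"),
                            fmax=librosa.note_to_hz("C7"), sr=sr)
    loudness = librosa.feature.rms(y=y)[0]
    return f0, loudness

def leftover_timbre(y, sr, n_mfcc=20):
    """'Everything else' is delegated to an opaque feature vector;
    MFCCs stand in here for a model's learned latent space."""
    return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)

# Hypothetical usage:
# y, sr = librosa.load("example.wav")
# pitch, loud = explicit_controls(y, sr)
# latent = leftover_timbre(y, sr)   # "timbre" = whatever is left
```

Nothing in the second function reflects musical or auditory principles; that absence is precisely the panel's point.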
This panel will therefore ask: What are the musical, aesthetic, and sociocultural implications of those technical practices, including using language as an intermediary for describing timbre in LLM-based generative audio models? How is timbre currently represented in training AI music systems, and what could be gained (or lost) from including more detailed numerical models of timbral attributes in the training process? If timbre is a “thing” that exists, then what form does it exist in? A numerical dimension space? Is it something ascribed through listening and context or generated through measurable physical properties? How is timbre conceptualised and communicated across different musical ecosystems? How might the humanities and social sciences critically engage with timbre to illuminate and problematise the assumptions currently embedded in artificial creativity?
Panelists:
Perception/Acoustics: Charalampos Saitis (moderator), Queen Mary University of London (UK)
MIR/GenAI: Bob L. T. Sturm, KTH (Sweden)
Instruments/Engineering: Andrew McPherson, Imperial College London (UK)
HCI/Voice: Courtney N. Reed, Loughborough University (UK)
Education/Improvisation: Danae Stefanou, Aristotle University of Thessaloniki (Greece)
Composition/Artistic Research: Artemi-Maria Gioti, Mozarteum University Salzburg (Austria)
Author: Tom Wilma, University of New South Wales
With the popularisation of generative AI technologies across the global music sector, AI persona musicians have emerged as a novel, technologically informed music practice. These musicians leverage generative AI not only in music creation but as part of their artist brand and identity, often reimagining themselves as fictional, AI-visualised characters. Xania Monet, the AI-persona alter ego of human poet Telisha Jones, is one such example, reportedly signing a US$3 million contract earlier this year (Melville, 2025). Sitting at the intersection of AI-generated music, virtual band performers and the conflicting potentialities of AI as either appropriative or ‘democratizing’, AI persona musicians invite a novel recontextualisation of longstanding concepts like authenticity, usually evoked in popular music literature as a qualitative descriptor of integrity or truthfulness: a ‘thoroughly relational’ (Askin & Mol, 2018) term that negotiates communities, identities and platform contexts, while simultaneously grappling with the industry demands of scalability, marketability and transferability across platforms and technologies.
Previous literature has explored the processes of authenticity (Moore, 2002; Auslander, 1999/2022) and the authenticity narratives institutionalised within music markets (Askin & Mol, 2018); however, there remains a significant gap in theorising how new technological conditions interact with, alter or sustain authenticity as a conceptual blueprint. Here, AI persona musicians pose a research opportunity to revisit authenticity, particularly as practitioners move to assert authenticity through the often-contradictory imaginations of appropriated music and virtualised bodies. In line with discussion of how we, as academics, communicate and understand the insider knowledge generated through AI music ecosystems, in this paper I gather and analyse interview data from several amateur AI persona musicians regarding their perceptions of authenticity. Guided by the research question ‘which properties of authenticity persist or are reformed in an AI era’, I propose the term machine (in)authenticity to denote the replication, adoption and de/recontextualisation of the signifiers that appeal to authenticity and that are navigated by AI-informed music practice. Machine (in)authenticity foregrounds the juxtaposed claims to authenticity and inauthenticity surrounding users of machine learning systems, particularly giving voice to the conflicting assertions around identity, self-expression, appropriation, access, novelty and stigma. It offers music studies a decentralised definition of authenticity that highlights how machine learning datasets, processes and users co-construct authenticity, and further suggests that strong remuneration and transparency laws may also be beneficial for asserting a greater degree of authenticity in AI-informed music practice.
References
Melville, D. (2025). AI singer Xania Monet just charted on Billboard, signed $3 million deal. Forbes. https://www.forbes.com/sites/dougmelville/2025/09/27/al-singer-xania-monet-just-charted-on-billboard-signed-3m-deal-is-this-the-future-of-music/
Moore, A. (2002). Authenticity as authentication. Popular Music, 21(2), 209-223. https://doi.org/10.1017/S0261143002002131
Auslander, P. (2022). Liveness. London: Routledge. https://doi.org/10.4324/9781003031314
Askin, N., & Mol, J. (2018). Institutionalizing authenticity in the digitized world of music. Research in the Sociology of Organizations, 55, 159-202. https://doi.org/10.1108/S0733-558X20180000055007
Authors: Andrew McPherson (Imperial College London) & Teresa Pelinski (University of the Arts London)
Lucy Suchman (2023) asks how boosters and critics alike have come to uncontroversially accept AI as being a singular and stable "thing", while Scott and Orlikowski (2025) suggest we should treat AI not as a thing, but as "phenomena-in-the-making". Suchman argues: "our task is to challenge the discourses that position AI as ahistorical, mystify 'its' agency and/or deploy the term as a floating signifier." Is it therefore even reasonable to consider "AI music" as being a meaningful category capable of having aesthetic properties? The opposite position, advanced by corporate interests promoting AI tools, is also unsatisfying: that generative AI is a boundless, neutral canvas for personal creativity with no aesthetic or cultural fingerprint.
This talk explores the possibility that there might be cross-cutting aesthetic tendencies of the multifarious socio-technical systems that are currently known as "generative AI music". Previous literature has investigated the aesthetic non-neutrality of even apparently open-ended music programming languages, whose affordances influence users' ideational processes (McPherson and Tahiroglu 2020; Snape and Born 2022). It is reasonable to suppose that generative AI systems produce a similar influence, but what is the specific direction of that influence, and are there any consistent elements from one technology to the next? Moreover, the particular ways that the language of music analysis is reified into numerical metrics used to train neural networks have aesthetic effects, especially in relation to elements such as instrumental technique which are left out of such descriptions (Morrison and McPherson, forthcoming).
Any unified analysis of AI music aesthetics is made more challenging by the divergent goals, methods and discourses of AI in experimental arts practices versus in mass-market commercial systems. Where artist-researchers might emphasise the eerie, uncanny or haunted qualities of generative AI systems (Privato and Magnusson 2024; Pelinski et al. forthcoming), mass-market systems are often "individualist, globalist, techno-liberal, and ethically evasive" (Pram and Morreale 2025). Concerns about so-called "AI slop" displacing human artists are widespread (Morreale 2021; Berry 2025), but is this slop a perfect replica of the aesthetic landscape it is trained on, or are there recognisable tendencies? In the latter case, are those tendencies shared between experimental and commercial systems, and what are the cultural or ethical implications for arts or research practices that seek a critical use of AI tools? By considering a series of case studies from commercial and experimental practice, including several real-time neural networks for instrumental audio transformation (Reed et al., 2025), the talk will seek to draw out connections between fine-grained technical decisions and high-level sociocultural implications.
Genre, the limits of MIR, and computational modelling
Author: Dr. Owen Green - Respondents: Prof. Andrew McPherson and Dr. Fabio Morreale
Towards public infrastructure: Developing a public interest recommender system
Author: Prof. Georgina Born - Respondent: Prof. David Hesmondhalgh
AI and artistic critique: Maryanne Amacher and David Tudor
Author: Prof. Christopher Haworth - Respondents: Prof. Jennifer Walshe and Max Ardito
Listening like an algorithm: Ethnography of a machine listening system
Author: Dr. Artemi-Maria Gioti - Respondent: Dr Frances Morgan
Beyond ethics, for a critical interdisciplinary AI music pedagogy
Authors: Teresa Pelinski and Prof. Georgina Born - Respondents: Prof. Jennifer Walshe and Dr. Francisco Bernardo
Evening drinks around Historic Nottingham (not included in conference registration)
Free time for delegates. Some venues you might like to visit:
Ye Olde Trip to Jerusalem (Greene King), 1 Brewhouse Yard, Nottingham NG1 6AD
Pitcher & Piano, High Pavement, Nottingham NG1 1HN
Nottingham Secret Garden, 17 3/4 Trinity Walk, Nottingham NG1 2AN
Canalhouse, 48-52 Canal Street, Nottingham NG1 7EH
Cosy Club, 16-18 Victoria Street, Nottingham NG1 2EX
Delegate event (included in conference registration)
Join us at Peggy’s Skylight, 3 George Street, Hockley, Nottingham NG1 3BH for an evening of entertainment, from 6pm onwards. A hot buffet dinner will be served at 18:45, followed by a performance given by students from the University of Limerick, incorporating Loeric (an AI music performance system for Irish traditional dance music).
Drinks not included
18:00-18:45 Arrival and bar open
18:45 Hot Buffet dinner
20:00 Performance
22:00 Close