A Quick Update - Trying to Dig Myself Out
October 21, 2023
It’s been 4 months since the last post, and I wanted to put out this interstitial post just to cover the time lapse.
Since my last post, many things have been happening in the ML space that have impacted how I think about all of this.
OpenAI launched Code Interpreter, which became the Advanced Data Analysis beta - this allows you to upload a file to the chat session that becomes part of the chat context.
While this isn’t made available through the API, and OpenAI’s strategy in keeping it a UI-only feature is unclear, it did seem to support the direction I was going in - that some other processing chain or chains could be used to frame the input data most appropriately within the context of the prompt.
The direction in which I was walking was around breaking down code into structured elements - something I’d used Roslyn for in the past when doing static analysis of C# code.
I’ve written and deleted a lot of code over the past few months. Like, a lot. I’ve fallen into Python parsers, used Pydantic for analysis, shifted into other languages like C#, and while my thinking has evolved, I unfortunately have nothing to show you for my work. It’s been overwhelming and a little discouraging.
But I’m not done.
Code Analysis for Coaching
As a technical coach, I’d often be looking for new ways to show an organization what their workflow and code looked like from different directions.
With my development background, I’d be putting together code that would scrape different systems and create unique visualizations to affirm my thoughts and open conversation around ways in which they could adjust their work to go more smoothly.
What started with simple git branch aging reports turned into boundary analysis and how work from different areas of the organization splashed across various code repositories.
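For flavour, here is a minimal sketch of what a branch-aging report could look like in C# - the repository path is a placeholder, and it assumes the git CLI is available on the PATH:

// A minimal sketch of a branch-aging report, assuming the `git` CLI is on the PATH.
// The repository path is a placeholder.
using System;
using System.Diagnostics;

class BranchAging
{
    static void Main()
    {
        var psi = new ProcessStartInfo("git",
            "for-each-ref refs/heads --format=%(refname:short)|%(committerdate:iso8601-strict)")
        {
            WorkingDirectory = "path/to/YourRepo",
            RedirectStandardOutput = true,
            UseShellExecute = false
        };

        using var process = Process.Start(psi);
        string line;
        while ((line = process.StandardOutput.ReadLine()) != null)
        {
            // Each line is "branch|timestamp"; report how stale the branch is.
            var parts = line.Split('|');
            var age = DateTimeOffset.Now - DateTimeOffset.Parse(parts[1]);
            Console.WriteLine($"{parts[0]}: last commit {age.Days} days ago");
        }
        process.WaitForExit();
    }
}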
The last organization I was working with in this way used a lot of C# and had some amazing key technologists that had a breadth of experience, and with whom I could have deeper conversations around this idea. They introduced me to Roslyn for static code analysis and I started to apply it to boundary analysis.
Armed with these analyses, I’d have a deeper understanding of the organization’s code base, and the flows of change between people and code. As you might imagine I was deeply impacted by Adam Tornhill’s work and his book, “Your Code as a Crime Scene.”
Application with ML
Just like I needed to take millions of lines of code and somehow fit some kind of broader structure in my head so that I could talk intelligently about change opportunities, I am trying to take more code than fits in an LLM’s context window and ask the LLM to incorporate those fragments into the context of its generated responses.
Now this depends on a whole bunch of things, but these first two have been my current focus:
- Expressiveness - if the code-base is not expressive, poorly chosen variable and function names will confuse a human, and they will just as badly misinform the LLM (a small before/after sketch follows this list).
- Cohesion - if the code-base lacks cohesion, that is, if related logic is splayed out across the code-base, it becomes harder to draw those related elements together to inform the LLM.
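To illustrate the expressiveness point, here is a contrived, hypothetical pair of methods - the names and the domain are invented for illustration. The behaviour is identical, but the first version gives an LLM (or a human) almost nothing to anchor on, while the second carries its intent in its names:

// Hypothetical example - the names and domain are invented for illustration.

// Inexpressive: nothing here tells a reader (or an LLM) what "Calc", "a", "b" or "d" mean.
public decimal Calc(decimal a, decimal b, int d)
{
    return (a + b) * (1 - d / 100m);
}

// Expressive: the same logic, but the symbols now describe the domain.
public decimal CalculateDiscountedTotal(decimal subtotal, decimal shipping, int discountPercent)
{
    return (subtotal + shipping) * (1 - discountPercent / 100m);
}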
As you can imagine, this is quite the rabbit hole. None of the code I wrote is worthy of publishing, and honestly, if you want to try a path similar to mine, just ask ChatGPT to generate you some starter code.
For example, pulling the symbols out of the codebase with Roslyn:
using System;
using System.Linq;
using System.Threading.Tasks;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;
using Microsoft.CodeAnalysis.MSBuild;

namespace SymbolExtractionExample
{
    class Program
    {
        static async Task Main(string[] args)
        {
            // Load a C# solution.
            // Note: a real console app typically needs MSBuildLocator.RegisterDefaults()
            // (from the Microsoft.Build.Locator package) before creating the workspace.
            var workspace = MSBuildWorkspace.Create();
            var solution = await workspace.OpenSolutionAsync("path/to/YourSolution.sln");

            foreach (var projectId in solution.ProjectIds)
            {
                var project = solution.GetProject(projectId);
                foreach (var documentId in project.DocumentIds)
                {
                    var document = project.GetDocument(documentId);
                    var syntaxRoot = await document.GetSyntaxRootAsync();
                    var semanticModel = await document.GetSemanticModelAsync();

                    // Here we're only looking at variable, method and class declarations.
                    // You can extend this further.
                    var nodes = syntaxRoot.DescendantNodes()
                        .Where(n => n is VariableDeclarationSyntax
                                 || n is MethodDeclarationSyntax
                                 || n is ClassDeclarationSyntax);

                    foreach (var node in nodes)
                    {
                        if (node is VariableDeclarationSyntax variableNode)
                        {
                            var symbol = semanticModel.GetDeclaredSymbol(variableNode.Variables.First());
                            Console.WriteLine($"Variable: {symbol.Name}");
                        }
                        else if (node is MethodDeclarationSyntax methodNode)
                        {
                            var symbol = semanticModel.GetDeclaredSymbol(methodNode);
                            Console.WriteLine($"Method: {symbol.Name}");
                        }
                        else if (node is ClassDeclarationSyntax classNode)
                        {
                            var symbol = semanticModel.GetDeclaredSymbol(classNode);
                            Console.WriteLine($"Class: {symbol.Name}");
                        }
                    }
                }
            }
        }
    }
}
I realize most of my examples until now have been in Python, but honestly I couldn’t find anything like Roslyn in the Python community. If you know of something, please drop me a line!
Roslyn’s facilities like Renamer.RenameSymbolAsync are just incredibly simple and powerful. In fact, if you’re curious, the Roslynator project built on top of Roslyn is an amazing project that abstracts many refactorings / structural shifts and provides you an API to perform them programmatically (though it’s designed to expose these to a human within Visual Studio, so there’s not a lot of documentation on using them programmatically).
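To give a sense of how little code a programmatic rename takes, here is a minimal sketch, assuming the solution and document come from an MSBuildWorkspace as in the earlier example; the method name being renamed and the new name are invented for illustration, and the exact Renamer overload varies a little between Roslyn versions:

using System.Linq;
using System.Threading.Tasks;
using Microsoft.CodeAnalysis;
using Microsoft.CodeAnalysis.CSharp.Syntax;
using Microsoft.CodeAnalysis.Rename;

static class RenameExample
{
    // Renames a (hypothetical) method named "Calc" everywhere in the solution.
    public static async Task<Solution> RenameMethodAsync(Solution solution, Document document)
    {
        var semanticModel = await document.GetSemanticModelAsync();
        var syntaxRoot = await document.GetSyntaxRootAsync();

        // Find the declaration of the method we want to rename.
        var methodNode = syntaxRoot.DescendantNodes()
            .OfType<MethodDeclarationSyntax>()
            .First(m => m.Identifier.Text == "Calc");
        var methodSymbol = semanticModel.GetDeclaredSymbol(methodNode);

        // Rename it across the whole solution; Roslyn returns a new, immutable Solution.
        return await Renamer.RenameSymbolAsync(
            solution,
            methodSymbol,
            new SymbolRenameOptions(),
            "CalculateDiscountedTotal");
    }
}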
Possibilities
My shift into C# has indicated a number of opportunities for exploration:
- use the LLM to extract meaning for symbols by examining the implementation code, and then suggest and refactor those symbols to more meaningful names
- use the LLM to seek out poor cohesion by examining implementation code grouped by symbols and finding similarities, and then refactoring to draw those areas together, perhaps generating adapters for all the disparate usage patterns
- what would it look like for a human to supervise such refactoring work - given that it would be quite risky
- it seems to me that analyzing a code-base and reorganizing it across different class-method boundaries would be an interesting combinatorial-optimization challenge, if we had a test suite that could be used in an objective function
- e.g. the ML algorithm would explore different ways of dividing code between methods / classes, maximizing for cohesion, and rejecting scenarios that did not pass the test suite (a rough sketch of this loop follows the list)
- I do this myself a lot using extract-method and inline-method refactorings, so this may not be as disruptive as it sounds to someone unused to exploring boundaries in this way - it can be done safely
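To make that fourth bullet a little more concrete, here is a rough, hypothetical sketch of such a search loop. Everything in it apart from Roslyn’s Solution type - CandidateRefactoring and the proposal, cohesion and test-run delegates - is invented for illustration, not an existing API:

// A hypothetical sketch of using a test suite as the gate while searching for
// better class/method boundaries. All helpers below are invented for illustration.
using System;
using System.Collections.Generic;
using Microsoft.CodeAnalysis;

public record CandidateRefactoring(Solution Proposed, string Description);

public static class BoundaryExplorer
{
    public static Solution Explore(
        Solution current,
        Func<Solution, IEnumerable<CandidateRefactoring>> proposeCandidates, // e.g. extract-method / move-method moves
        Func<Solution, double> measureCohesion,                              // higher is better
        Func<Solution, bool> runTestSuite)                                   // behaviour gate
    {
        var bestScore = measureCohesion(current);

        foreach (var candidate in proposeCandidates(current))
        {
            // Reject any reorganization that does not pass the test suite.
            if (!runTestSuite(candidate.Proposed))
                continue;

            // Keep the candidate only if it measurably improves cohesion.
            var score = measureCohesion(candidate.Proposed);
            if (score > bestScore)
            {
                current = candidate.Proposed;
                bestScore = score;
            }
        }

        return current;
    }
}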
Will C# be my only focus going forward?
Probably not. But it’s where my head’s at right now.
And as you can see, I’m not done with this rabbit-hole. This is still just looking at the problem of taking a sub-optimal code-base and making it better, just so that I can provide an accurate enough stripped-down context for generating new contributions to the code-base from the LLM.
Related Work
There’s some really interesting work going on in MemGPT around managing a memory context that’s larger than the LLM’s available context.
The fun thing is that I already had in my mind the ideas of long-term memory vs short-term memory vs working memory, and their relatedness to how we treat memory usage in modern operating systems (e.g. physical memory, virtual memory, disk storage).
The direction I’m thinking, though, is that rather than shredding the original content into a bunch of embeddings around fixed-length boundaries and re-assembling context from those, I want to divide the original content along the boundaries declared in the software itself when assembling context.
While the tools in MemGPT are about generically managing which knowledge lives in what memory, the tools I see for code management are about managing which bounded domain-context code (by proximity to the subject matter at hand) lives in what memory.
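To illustrate the kind of boundary-aware splitting I mean, here is a minimal sketch that chunks a C# file per type and method declaration rather than per fixed-length window - the file path is a placeholder, and embedding or ranking the resulting chunks is left out:

// A minimal sketch of splitting code along declared boundaries rather than
// fixed-length windows. The file path is a placeholder; embedding/ranking of
// the resulting chunks is out of scope here.
using System;
using System.IO;
using System.Linq;
using Microsoft.CodeAnalysis.CSharp;
using Microsoft.CodeAnalysis.CSharp.Syntax;

class BoundaryChunker
{
    static void Main()
    {
        var source = File.ReadAllText("path/to/SomeFile.cs");
        var root = CSharpSyntaxTree.ParseText(source).GetRoot();

        // One chunk per type or method declaration - the boundaries the software
        // itself declares - instead of arbitrary fixed-length slices.
        var chunks = root.DescendantNodes()
            .Where(n => n is TypeDeclarationSyntax || n is MethodDeclarationSyntax)
            .Select(n => new
            {
                Kind = n.Kind().ToString(),
                Name = n switch
                {
                    TypeDeclarationSyntax t => t.Identifier.Text,
                    MethodDeclarationSyntax m => m.Identifier.Text,
                    _ => "?"
                },
                Text = n.ToFullString()
            });

        foreach (var chunk in chunks)
            Console.WriteLine($"{chunk.Kind} {chunk.Name}: {chunk.Text.Length} chars");
    }
}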
Exposing the Innards
I find it interesting, though, how many ways this work has reminded me of how I, as a developer, cognitively approach coding and refactoring.
We keep trying to categorize all kinds of tacit knowledge we gain through experience, and pull it into bite-sized chunks for novices. I’m thinking SOLID, the 4 Rules of Simple Design, CUPID, you name it. These categorizations are lossy and broken, and can be fuelled more by the notoriety of their creators than by their practical merit.
We categorize maneuvers in code like Martin Fowler’s refactoring patterns and code smells, or Joshua Kerievsky’s refactoring to patterns.
Working through what I should do as a human (as the artisan, if you will) and what drudgery I could offload to a computer or “AI” to enhance my work, I do wonder about all the times I’ve discovered things in the drudgery that made me a better artisan.
The boundary here is thorny, but it does consume my thoughts.