Tag: machine learning

When DOES it make sense to use AI?

I created my first neural network back in the late 90s, as part of my PhD, to do handwriting recognition on images of whiteboards. It wasn't a very good network; I had to write the whole thing from scratch as there weren't any suitable off-the-shelf libraries available, I didn't know much about neural networks, and I didn't have nearly enough training data. I quickly abandoned it for a more hand-tailored system. But one of the early textbooks I was reading at the time had a quote, I think from John S. Denker, which I've never forgotten: "Neural networks are the second-best way to do almost anything."

In other words, if you know how to do it properly, for example by evaluating rules, or by rigorous statistical analysis, don't try using a neural network. It will introduce inaccuracies and unpredictability, and make it very much harder either to prove that your system works or to debug it when anything goes wrong.

The problem is that there are many situations in which we don't know how to do it 'properly', or where writing the necessary rules would take far too much time. And 'machine learning', the more generic term encompassing neural networks and similar trainable systems, has advanced amazingly since I was playing with it. For many tasks, we also now have masses of data available, thanks to the internet. (I was playing with my toy system at about the same time as I was experimenting with these brand new 'web browsers'.) So while it remains the case, as a Professor of Computer Science friend of mine likes to put it, that "Machine learning is statistics done badly", it can still be exceedingly useful. It would almost certainly be the right way for me to do my handwriting-recognition system now, for example, and over the last few decades we've discovered lots of other pattern-matching operations for which it is essential -- analysing X-rays for evidence of tumours is just one example where it has saved countless lives.
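(To give a sense of how much the off-the-shelf situation has changed, here's a minimal sketch of the modern route, using scikit-learn's bundled digits dataset as a stand-in for my whiteboard images. The network I once spent months hand-rolling is now a few lines:)

```python
# A minimal sketch of the modern off-the-shelf route: scikit-learn's
# bundled 8x8 digit images stand in for my whiteboard data.
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

digits = load_digits()
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.25, random_state=0)

# A small multi-layer perceptron -- the kind of thing I once had to
# write from scratch -- is now a three-line affair.
clf = MLPClassifier(hidden_layer_sizes=(64,), max_iter=500, random_state=0)
clf.fit(X_train, y_train)
print(f"Test accuracy: {clf.score(X_test, y_test):.1%}")
```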

But all of this is nothing new. So why the current excitement about 'AI'? After all, 'artificial intelligence', like 'expert system', is one of those phrases we heard a lot in the 70s and 80s but had largely abandoned in more recent decades, until it came back with a rush and is now the darling of every marketing department. Every project that involves any kind of machine learning (and many things that don't) will now be reported with 'AI' somewhere in the title of the article, even though it has nothing to do with ChatGPT, Claude, or Gemini.

And the reason is that, by appearing to have an understanding of natural language, generative LLMs have opened up the power of many of these systems to the non-technical general public, in the same way that the web browser in the 90s opened up the power of the Internet, which had also been in existence for decades beforehand, to ordinary users. (Many people ended up thinking the Web was the Internet, just as many people probably think ChatGPT has something to do with newspaper headlines about AIs diagnosing cancer.)

But it's not an analogy I'd like to push too far, because the technology of the World Wide Web did not invent new data, did not mislead people, did not presume to counsel them or tell them that it loved them. The similarity is that you needed to be something of an expert to make use of the Internet before the web, and you were therefore probably better able to judge what you might learn from it. If machine learning is statistics done badly, then 'AI' is machine learning made more unreliable, sounding much more plausible, and sold to the more gullible. Take any charlatan and give him skills in rhetoric, and you make him much more dangerous.

Regular readers will know that I am quite a cynic when it comes to most current uses of AI, and I consider myself fortunate that I was able to spot lots of its failings very early on. A few recent examples from ChatGPT, Gemini and other systems, some of which have been reported here, include:

  • Telling me that one eighth of 360 degrees was 11.25 degrees; see the two-line check after this list. (Don't trust it to do your financial planning!)
  • Telling a teenage friend that the distance from Cambridge to Oxford was 180 miles; she swallowed that whole and repeated it to me confidently. (It's actually more like 80 miles.)
  • Telling me that my blog was written by... well, several other people over the years, some of whom were flattering possibilities! (But there are several thousand pages here which all say "Quentin Stafford-Fraser's Blog" at the top.)
  • Suggesting a Greek ferry to a friend, as a good way to get to Santorini in time for our flight. (It didn't actually run on the days suggested, and we would have missed our flight if we had relied on it.)
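The first of those is also the easiest to check, which rather proves the point: two lines of Python settle it (one eighth of 360 is 45; 11.25 degrees is a thirty-second of a circle).

```python
print(360 / 8)   # 45.0  -- what one eighth of 360 degrees actually is
print(360 / 32)  # 11.25 -- the answer it gave me is one thirty-second
```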
And of course, the press regularly reports more serious problems. So, some time ago, I announced Quentin's AI Maxim, which states that
"You should never ask an AI anything to which you don't already know the answer".
And for those who say, "But the AI systems have got a lot better recently!", I would agree. Some of my examples are from a few months ago, and a few months is a long time in AI. But I would also point out that, on Friday, when I asked the latest version of Claude to suggest some interesting places for a long weekend in our campervan, within about 2 hours' drive from Cambridge, one of its suggestions was Durham, which would probably take you twice that if you didn't stop on the way. I pointed this out, and it agreed.
"You're right to question that...I shouldn't have included it. Apologies for the error..."
Now, if I had been asking a human for suggestions, they might have said, "Mmm. What about Durham? How far is that from here?" But the biggest danger with these systems is that they announce facts just as confidently when they are wrong as when they are right, and they will do that whether you are asking about a cake recipe or about treatment for bowel cancer. Fortunately, I already knew the answer when it came to the suitability of Durham for a quick weekend jaunt!

But here's the thing... Thirty-four years ago, I was very enthusiastic about two new technologies I had recently discovered. One was the Python programming language. The other was the World Wide Web. In both cases, more experienced research colleagues were dismissive. "It's not a proper compiled language." "We've seen several hypertext systems before, and none of them has really caught on." They were probably about the age that I am now. So, I don't want to be 'that guy' when it comes to AI. (Though I'm glad I *was* when it came to blockchains, cryptocurrencies and NFTs!)

All of which brings to mind that wonderful quote from Douglas Adams:
"There's a set of rules that anything that was in the world when you were born is normal and natural. Anything invented between when you were 15 and 35 is new and revolutionary and exciting, and you'll probably get a career in it. Anything invented after you're 35 is against the natural order of things."
So in the last few weeks I have been doing some more extensive experiments with AI systems, mostly using the paid-for version of Claude, and the results have often been very impressive. They can be great brainstorming tools; I have to admit that some of the suggestions as to where we might go in our campervan were good ones... I'm just glad I didn't select the Durham option. They can be great search engines... just don't believe what they tell you without going to the source, or you too may have to call the coastguard. But perhaps the 2026 version of Quentin's AI Maxim should say something like:
"You should never ask an AI anything where you don't have the ability, and the discipline, to check the answer."
And one of the areas where checking the answer can sometimes be an easier and more rigorous process is the writing of software. I've been doing that a fair bit recently, and will write about it shortly. In the meantime, I leave you with this delightful YouTube short from Steve Mould. His long-form videos are always interesting -- he has 3.5M subscribers for a good reason -- and though I tend to avoid 'shorts' in general, this one is worth a minute and a half of your time.
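(Returning to the software point for a moment: the reason code is friendlier territory for the maxim is that an answer expressed as code can be checked mechanically. A toy illustration -- the function and the checks are invented for the purpose, not anything an AI actually gave me:)

```python
# Toy illustration: a function an AI might hand you can be pinned
# down with tests, in a way a ferry timetable or a driving time can't.
def fraction_of_circle(n):
    """Return one nth of a full circle, in degrees."""
    return 360 / n

assert fraction_of_circle(8) == 45.0     # the answer I already knew
assert fraction_of_circle(32) == 11.25   # where that 11.25 belonged
print("All checks passed")
```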

Wisdom of the crowds, or lowest common denominator?

I liked this:

People have too inflated sense of what it means to "ask an AI" about something. The AI are language models trained basically by imitation on data from human labelers. Instead of the mysticism of "asking an AI", think of it more as "asking the average data labeler" on the internet.

...

But roughly speaking (and today), you're not asking some magical AI. You're asking a human data labeler. Whose average essence was lossily distilled into statistical token tumblers that are LLMs. This can still be super useful of course. Post triggered by someone suggesting we ask an AI how to run the government etc. TLDR you're not asking an AI, you're asking some mashup spirit of its average data labeler.

Andrej Karpathy

Thanks to Simon Willison for the link.

Two households, both alike in dignity

Two long-established names in the world of journalism are approaching the challenges of AI in very different ways.

The New York Times is suing OpenAI, in an expensive landmark case that the world is watching carefully, because it could have very far-reaching ramifications.

The Atlantic, on the other hand, has just done a deal with them.

This isn't a subject I normally follow very closely, but in what I found to be an intriguing interview, Nicholas Thompson, The Atlantic's CEO, explains how and why they made this decision, and explores areas well beyond the simple issues of copyright and accreditation. It's an episode of the Decoder podcast, hosted by The Verge's Nilay Patel, who is an excellent and intelligent interviewer.

Recommended listening if you have a car journey, commute, or dog-walk coming up -- just search for 'Decoder' on your favourite podcast app -- or you can get the audio, and/or a transcript, from the link above.

Behind the Tesla 'Full Self Driving' system

If I were giving advice to somebody considering buying a Tesla at the moment, it would be (a) buy it and (b) don't believe the 'full self-driving' hype... yet.

You'll be getting a car that is great fun to drive, has amazing range, a splendid safety record, a brilliant charging network, etc... and, in the standard included 'autopilot', has a really good cruise control and lane-keeping facility. One thing I've noticed when comparing it to the smart cruise control on my previous car, for example, is that it's much better at handling the situation where somebody overtakes and then pulls into the lane just in front of you. Systems that are primarily concerned with keeping your distance from the car in front have a difficult decision to make at that point: how much, and how suddenly, should they back off to maintain the preferred gap? The Tesla, in contrast, is constantly tracking all the vehicles around you, and has therefore been following that car and its speed relative to yours for some time, so can react much more smoothly.
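Here's a toy sketch of the difference -- my guess at the shape of the logic, emphatically not Tesla's actual code:

```python
# Toy sketch, not Tesla's code: why tracking the other car's relative
# speed beats reacting to the gap alone when somebody cuts in.

def gap_only_braking(gap_m, preferred_gap_m=40.0, k=0.3):
    """React to the gap error alone."""
    return max(0.0, k * (preferred_gap_m - gap_m))

def tracking_braking(gap_m, closing_speed_ms, preferred_gap_m=40.0,
                     k=0.3, kv=1.0):
    """Blend the gap error with how fast the gap is actually closing."""
    return max(0.0, k * (preferred_gap_m - gap_m) + kv * closing_speed_ms)

# A car pulls in 15 m ahead, but it is pulling away at 3 m/s:
print(gap_only_braking(15.0))        # 7.5 -- brakes hard on gap alone
print(tracking_braking(15.0, -3.0))  # 4.5 -- eases off; gap is opening
```

A system that already knows the cut-in car is pulling away can afford a much gentler response.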

The dubiously-named 'Full Self-Driving' package is an expensive optional extra which you can buy at the time of purchase or add on later with a couple of clicks in the app. At the moment, it doesn't give you very much more: the extra functionality (especially outside the US) hasn't been worth the money. If you purchase it now, you're primarily buying into the promise of what it will offer in the future, and the hope that this will provide you with significant benefits in the time between now and when you sell the car!

But at some point in the not-too-distant future, the new version -- currently known as the 'FSD Beta' -- will be released more widely to the general public. 'Full Self Driving' will then still be a misnomer, but will be quite a bit closer to the truth. YouTube is awash with videos of the FSD Beta doing some amazing things: people with a 45-minute California commute essentially being driven door-to-door, for example, while just resting their hands lightly on the steering wheel... and also with a few examples of it doing some pretty scary things. It seems clear, though, that it's improving very fast, and will be genuinely valuable on highways, especially American highways, before too long, but also that it's likely to be useless on the typical British country road or high street for a very long time!

What Tesla has, to a much greater degree than other companies, is the ability to gather data from its existing vehicles out on the road in order to improve the training of its neural nets. The more cars there are running the software, the better it should become. But the back-at-base process of training the machine learning models on vast amounts of video data (to produce the parameters which are then sent out to all the cars) is computationally very expensive, and the speed of an organisation's innovation, and how fast it can distribute the results to the world, depends significantly on how fast it can do this.
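The shape of that loop, as I understand it -- invented stand-in functions below, nothing to do with Tesla's real pipeline:

```python
# Toy sketch of the fleet-learning flywheel: invented stubs,
# just the shape of the cycle described above.

def auto_label(clip):
    return (clip, "labelled")       # stand-in for the labelling step

def train(params, dataset):
    return params + len(dataset)    # stand-in for the expensive training run

def deploy_to_fleet(params):
    print(f"Shipping parameter set v{params} to the fleet")

def fleet_learning_cycle(fleet_clips, params):
    labelled = [auto_label(c) for c in fleet_clips]  # 1. gather and label
    params = train(params, labelled)                 # 2. back-at-base training
    deploy_to_fleet(params)                          # 3. push to every car
    return params

params = 0
for week in range(3):   # more cars on the road -> more clips each cycle
    clips = [f"clip-{week}-{i}" for i in range(week + 2)]
    params = fleet_learning_cycle(clips, params)
```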

Last week, Tesla held their 'AI Day', where Elon Musk got up on stage and, in his usual way, mumbled a few disjointed sentences. Did nobody ever tell the man that it's worth actually preparing before you get up on a stage, especially the world stage?

However, between these slightly embarrassing moments are some amazing talks by the Tesla team, going into enormous detail about how they architect their neural nets, the challenges of the driving task, the incredible chips they are designing and rolling out to create what may be the fastest ML-training installation in the world, and the systems they're building around all this new stuff.

For most people, this will be too much technical detail and will make little sense. For those with a smattering of knowledge about machine learning, it's a chance to sit back and enjoy the ride; there are lots of pictures and video clips amidst the details! And for those with a deeper interest in AI/ML systems, I would say it is well worth watching.

There are two key things that struck me during the talks.

First, as my friend Pilgrim pointed out, it's amazing how open they're being. Perhaps, he suggested, they can safely assume that the competition is so far behind that they're not a threat!

Secondly, it suddenly occurred to me -- halfway through the discussions of petaflop-speed calculations -- that I was watching a video from a motor manufacturer! An automobile company! If you're considering buying a Tesla, this is a part of what you're buying into, and it's astonishingly different from anything you'd ever see from any other car-maker. Full self-driving is a very difficult problem. But this kind of thing goes a long way towards convincing me that if anybody is going to get there, it will be Tesla.

You may or may not ever pay for the full FSD package, but it's safe to assume much of the output of these endeavours will be incorporated into other parts of the system. So, at the very least, you should eventually get one hell of a cruise control!

The livestream is here, and the interesting stuff actually starts about 46 minutes in.

Street View Statistics

Google Street View is, I think, one of the most amazing achievements in recent times, and it's one of the things that keeps me using Google Maps even though many of the alternatives are rather good. If I'm heading to a new destination, I'll often look in advance at, say, the entrance gate, or the correct exit from the last roundabout, so those final manoeuvres when the traffic is slowing down behind you are less stressful: you're in familiar surroundings. Street View is, in that sense, a déjà-vu-generator.

And of course, it's great for bringing back memories of places *qu'on a vraiment déjà vu*. We can all think of dozens of examples; for me, this morning, the sea front by the Ullapool ferry terminal is somewhere I remember as a launching point into the unknown; it's where I stayed at a lovely inn before catching the ferry to the Outer Hebrides. Happy memories.

But there are interesting questions to be asked about Street View as well. For someone who enjoys window-shopping on Rightmove for a possible next home, it's a very valuable tool, and I've often wondered how much the market appeal of your property is affected by whether the Google car drove by on a sunny or a cloudy day!

And this morning, I saw debates on Twitter about research that used images of your house on Street View to estimate how likely you were to have a car accident, something which could be used against you by insurance companies (or, of course, in your favour, but that doesn't make such good headlines).

The paper's here, and I was most surprised by just how poor the insurance company's existing model was; information about your age, gender, postcode, etc. apparently doesn't give them as much insight as you might expect, and knowing whether you live in a well-maintained detached house in a nice neighbourhood gives them just a little bit more. Some see this as very sinister, but you need to remember that this wasn't some automated image-analysis system; the researchers had to spend a lot of time looking at Street View pictures of houses and annotating them by hand with their assessment of the condition, type of house, etc. Some of this could be performed by machines in future, but there are lots of other factors to consider as well: is the issue that you are more likely to crash into somebody in certain neighbourhoods, or that they are more likely to crash into you? What's the speed limit on the surrounding streets? How close is the pub? And so on...
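For the curious, the shape of the underlying question -- does an extra, hand-annotated feature improve an existing risk model? -- is easy to reproduce on synthetic data. Everything below is invented (feature names, effect sizes, all of it), not the paper's actual variables or results:

```python
# Synthetic sketch of 'does a house-condition feature add predictive
# power to a baseline insurance model?' -- all numbers invented.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 2000
age = rng.uniform(18, 80, n)
postcode_risk = rng.normal(0, 1, n)
house_condition = rng.integers(0, 3, n)   # hand-annotated: 0=poor .. 2=good

# Synthetic claim outcomes, weakly driven by all three factors:
logit = -1.5 + 0.01 * (50 - age) + 0.3 * postcode_risk - 0.2 * house_condition
claims = rng.random(n) < 1 / (1 + np.exp(-logit))

baseline = np.column_stack([age, postcode_risk])
augmented = np.column_stack([age, postcode_risk, house_condition])

for name, X in [("baseline", baseline), ("with house condition", augmented)]:
    auc = cross_val_score(LogisticRegression(max_iter=1000), X, claims,
                          scoring="roc_auc", cv=5).mean()
    print(f"{name}: AUC = {auc:.3f}")
```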

So I sat down thinking I would write about this, but one thing I failed to notice was the date of the research. I assumed that because people were talking about it on Twitter today, it must be new -- a fatal mistake. What's more, just a little bit of further research showed me that my friend John Naughton had written a good piece about it in the Observer two years ago.

So it's perhaps not surprising that I like technologies that can give me a sense of déjà vu. My own abilities in that area are clearly lacking!

Testing Turing?

Stephen Pulman gave the Wheeler Lecture in our department this afternoon: an excellent discussion of whether current machine-learning techniques would ever allow us to build a machine that passes the Turing Test.

It made me wonder about the value of a variation on the theme, which I propose to call the Meta-Turing-Test.

It would work like this:

Can we build a machine which, given a Turing Test scenario, can work out whether the responses are from a human or a machine, even when a human can't?
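Scoring such a test would at least be straightforward. A minimal sketch, with entirely invented data, assuming we have transcripts with ground-truth labels plus guesses from a machine judge and from human judges:

```python
# Minimal sketch of scoring the Meta-Turing-Test; all data invented.

def accuracy(guesses, truth):
    return sum(g == t for g, t in zip(guesses, truth)) / len(truth)

truth         = ["human", "machine", "machine", "human", "machine"]
machine_judge = ["human", "machine", "machine", "human", "human"]
human_judges  = ["human", "human",   "machine", "human", "human"]

machine_acc = accuracy(machine_judge, truth)  # 0.8
human_acc   = accuracy(human_judges, truth)   # 0.6

# The machine passes the meta-test if it out-judges the humans:
print("Meta-Turing-Test passed" if machine_acc > human_acc else "Not yet")
```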