Experiences with DALL-E

David Stevenson

I’ve been developing AI and machine-learning systems for over 10 years now, and have patents in the field (though not specifically in photographic applications; the maths is much the same for any type of data – pixels, text, sound – the AI algorithm doesn’t really care what the data represents).

If I were to summarise my experience of OpenAI’s DALL-E, which I have used not to create photographs as such, but to generate the equivalent of ‘stock images’, it would be ‘far from perfect’, though I’m sure that will improve with each new generation of AI engine. As an example, I’m an instructor with the RAF Air Cadets, and some months ago I gave an evening’s presentation on ‘AI in the military’, for which I generated some images using DALL-E.


Here’s one example:

In the third picture, you’ll see that the sense of geometry has gone haywire: the cockpit canopy is behind the cadet, not in front, which makes no sense visually.


In the next one, where I used the word ‘pilot’ instead of ‘cadet’ in the query, it picked up that it needed to generate an adult rather than a cadet, but again the geometry is out in the last image:

And to emphasise that all current generative AI is based on probabilities, not certainties or logic, and has no concept of the real world, I asked it to create a cadet on a flying elephant. It did so quite happily, and without any sense of irony – it’s a machine, after all, with zero emotions and no sense of the real world, and it doesn’t know when it has been asked to do (or is doing) something crazy.
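The point about probabilities can be made concrete with a toy sketch. A generative model, at its core, samples likely continuations from learned probabilities; it never checks real-world plausibility. The model below is entirely made up for illustration – the words and the probability values are assumptions, not anything DALL-E actually contains:

```python
import random

# Toy "next word" table: probabilities as a model might learn them from
# training data. Note there is no rule anywhere saying elephants can't fly --
# "elephant" is simply a less likely, but perfectly valid, continuation.
next_word_probs = {
    "flying": {"aircraft": 0.6, "bird": 0.3, "elephant": 0.1},
}

def sample_next(word, rng):
    """Pick a continuation by weighted chance, not by logic."""
    candidates = next_word_probs[word]
    words = list(candidates)
    weights = [candidates[w] for w in words]
    return rng.choices(words, weights=weights, k=1)[0]

rng = random.Random(42)
samples = [sample_next("flying", rng) for _ in range(1000)]
# Roughly one sample in ten is "flying elephant" -- the model happily emits
# it because it is probable enough, never because it makes sense.
print(samples.count("elephant"))
```

Ask it for a flying elephant often enough and you will get one, with no objection raised, because “objection” would require the logic and world knowledge the engine simply doesn’t have.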

So I’m not worried about Arnie walking down the street any time soon, despite all the press hype about AI taking over the world. It’s not that good, and it may not be for years, if ever, as data scientists still haven’t cracked how to build sound logic into the engines.