Saturday, August 20, 2022

Camera gremlins DALL-E style

Last week I wrote about going out to walk and take pictures without realizing my camera mode setting had somehow mysteriously changed from my most commonly used mode, "P" (Programmed), to "Tv" (Time Value). I didn't make enough excuses for taking as many pictures as I did and not noticing what was a significant difference in the picture-taking experience. Let's just chalk that up to a certain level of anxiousness to simply walk and get the walk over with, beat the heat, etc. I did, however, take time to provide a reason for the mode dial magically moving up a notch without my awareness. I explained, "Gremlins will change the mode on your camera during the night just to mess with you..." Sure enough, the next evening I set up a camera near my desk and caught the little boogers.



Beware, they will also mess with your phone and whatever you do, don't give them food or water!

 

Of course all of you smart readers know this is fake, and the even more informed readers know that these images were created with the machine learning model DALL-E, developed by OpenAI to generate digital images from natural language descriptions. For all of the above images the description was, "A gremlin is touching a DSLR on a desk at night." I'm not sure how the AI confused a DSLR with a phone, but that's okay. I intentionally left out the word "camera." I added the yellow catlike eyes to the last image. DALL-E seems to have a number of anatomy issues, and I've noticed this especially when it comes to eyes. If you look beyond the shadows around the mouth you will notice the teeth are well off center of what would be a "normal" looking gremlin mouth. The original AI eyes were awful. I asked DALL-E to edit them to "cat eyes" but it didn't work out. I'm still learning too.

DALL-E participation is currently by invitation only. I put in my request on June 29th and was accepted on August 10th. They give you "credits" and you may buy more. Right now I'm low, with 6 credits remaining and a refill of 15 free credits coming on September 10th. Much of my credit usage was due to my trying to describe a dog I once owned to see if I could get DALL-E to come close. Let me provide a brief introduction to Chinook. I felt a moral obligation to make Chinook my dog at the end of my senior year in college when a roommate put him up for adoption in the local paper and some redneck called up asking if he was "a good huntin' dog" and "is he gun shy." It was a good thing I answered the phone. By the way, phones were attached to walls with a wire back then, just in case you're confused. This was Chinook, taken the same week I became his dad. He's not full-grown here. I'd say he was three-quarters grown. Chinook was Great Pyrenees and German Shepherd. Imagine the largest German Shepherd you've ever seen. Chinook was much bigger than what you're imagining.

Me and Chinook post college, here in SoCal.


I asked DALL-E to first create a Great Pyrenees because I wanted to see how well it would do on that breed alone. I tried going through a few iterations to give the Great Pyrenees Chinook's tan, sort of overgrown, German Shepherd-like ears and tan back, but it didn't seem to work. I'm sure there are people out there who get more accurate results from their descriptions than I do, but I've yet to look into the means and methods of others. This was the best output on the Great Pyrenees.


Once again, some anatomical issues, this time with the hind legs. Then again... pretty amazing likeness of a Great Pyrenees from a one-line description. Imagine this: try telling someone in a remote African jungle village what a penguin looks like and then asking them to draw one based on what you told them.

Here's what happened when I told DALL-E to mix the two breeds into one dog.


Finally, I got the closest to Chinook when I merely asked for a very large white German Shepherd.


 I tried to edit and get larger tan ears and a tan back on that image to no avail and once again the eyes were really creepy, more creepy than what you see. The eyes here were both Chinook's right eye from a photograph, added in haste inside of Photoshop.

Here are a few other things I explored in DALL-E. I tried to get DALL-E to make something like an image I created in 2008 when I was searching for a job. I kept seeing online, in job listings... "needs to be able to think outside of the box." I saw this as a rather trite metaphor from people incapable of thinking individually. I mean, we're talking about a lot of job listings using this term, that and, "must be able to multitask." Multitasking is a myth. Computers have a hard enough time multitasking. Anyway, I digress. Here's the image I created...

Here's a post I created in 2012 about thinking outside the box. I just looked again and was reminded I already went on a rant about some of this.

Here's what DALL-E came up with when I tried to describe my image to the AI.




Here's one from when I was thinking about how camouflaged the stray cat Cam was in my backyard. If you've read about Cam here and wonder about her, she's fine. She's in good shape. I'm going to write about Cam again sometime soon.

Let's note, while the ears are a little weird, the eyes came out really well. Go figure.

In 2010 I got the idea to make myself a T-shirt of Michelangelo's statue David playing a guitar I designed for a contest.


I asked DALL-E for "The statue of David playing an electric guitar." Here are two of the AI outputs that were created.



Interesting observation: DALL-E clothed David and cut him off at the waist. The DALL-E AI has issues with nudity. In their "content policy" they state, "Do not attempt to create, upload, or share images that are not G-rated or that could cause harm." And they specify "nudity" inside of that policy.

Also in 2010, I came up with a children's story tentatively titled Butterfly in a Bubble. I worked on it a bunch, but the thing that ultimately frustrated me was I didn't feel I had the drawing chops to illustrate the story. I could probably do it but it would be extremely time-consuming. That, and I was about to make a move that would change my life rather dramatically. With DALL-E I started to think that perhaps I could overcome the artwork handicap in short order. I'm still somewhat confident about that; however, I think I'll wait for the September credits before I explore it more. Here's the image I originally created as a cover.

Here's what I felt was the best of four images from DALL-E. I'm gonna be bold and say, I win.

That is the least challenging image in my storybook mind by the way.

One other thing I tried was plugging some lyric/story lines I was familiar with into DALL-E to see what it would come up with. Without getting into too much detail, on the album The Lamb Lies Down on Broadway (1974) by the progressive rock band Genesis, there's a song called "The Lamia." In Greek mythology the Lamia is a child-eating monster, and in later tradition Lamia were regarded as a type of night-haunting spirit. They are usually depicted as half woman, half snake. A line from the song... "Rael welcome, we are the Lamia of the pool. We have been waiting for our waters to bring you cool." Rael is the protagonist of the story. Here's how DALL-E depicted the Lamia by the pool.

Right before starting this post, there was this in my backyard. Back to reality. A giant swallowtail (Papilio cresphontes) on Tithonia rotundifolia flowers. 


Thanks for stopping by. I didn't really know where this was going to go when I started it. That's typical. The thing here is I still don't know. I guess I have to read it now. Last word on DALL-E and AI in general: it both bothers me and fascinates me. Having been around programs like Photoshop and numerous CGI apps, like Maya, Lightwave, Vue (outside the box, and the butterfly-bubble) and Blender, since they were in beta, I have a pretty good idea how some of this will evolve. The thing that bothers me most is what humans will do with it. Have a nice day!




1 comment:

  1. Next month have AI do a lamb lying down on Broadway. This is very entertaining. The technology blows my mind — as most does . Trey
