Hi,

 

I know there are a lot of apps for doing image descriptions. I've been
working on one I call the Image Description Toolkit for close to a year now.
I'm not trying to replace anything else, but rather to fill a need I had
that I didn't find other products meeting.


The main features of my app are support for Ollama, so you can run AI models
locally, and the ability to process large batches of images and save the
descriptions. I have the first version of the apps working on a Mac, if
anyone is interested in giving them a try.

 

These require an Apple silicon Mac, and even then most Ollama models will be
far from instant. On an M1 MacBook Air, I get reasonable descriptions from a
model called Moondream in about 6 seconds per image.
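
For anyone curious what batch description against a local Ollama model looks
like in general, here is a minimal sketch. It is not the toolkit's own code.
It assumes Ollama is running at its default local address with the Moondream
model already pulled, and the function names, the prompt text and the JSON
output layout are all made up for illustration.

```python
import base64
import json
import urllib.request
from pathlib import Path

# Assumption: Ollama's default local endpoint; change it if your server differs.
OLLAMA_URL = "http://localhost:11434/api/generate"
# Assumption: only these file types are treated as images in this sketch.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def build_payload(image_bytes, model="moondream", prompt="Describe this image."):
    """Build the JSON body for a non-streaming Ollama generate request."""
    return {
        "model": model,
        "prompt": prompt,
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,
    }


def describe_with_ollama(image_path, model="moondream"):
    """Send one image to the local Ollama server and return its description."""
    payload = build_payload(Path(image_path).read_bytes(), model)
    request = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"].strip()


def describe_folder(folder, describe=describe_with_ollama, limit=None):
    """Describe each image in a folder; `limit` allows a small trial run first."""
    images = sorted(
        p for p in Path(folder).iterdir() if p.suffix.lower() in IMAGE_SUFFIXES
    )
    if limit is not None:
        images = images[:limit]
    return {p.name: describe(p) for p in images}


def save_descriptions(descriptions, out_path):
    """Write descriptions to a JSON file, keyed by image file name."""
    Path(out_path).write_text(json.dumps(descriptions, indent=2), encoding="utf-8")
```

Calling describe_folder with limit=2 is a cheap way to check a model and
prompt before committing to a whole folder. At roughly 6 seconds an image, a
folder of 600 images would take about an hour.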

 

If you have a Claude or OpenAI API key, the app also lets you use many of
those models and process batches of images from either a command-line tool
or a graphical app.

 

This is still under development and is my first Mac app. It is written in
Python, and there is also a Windows version, which was my main focus until
recently.

 

You can get a copy of the toolkit at
http://www.theideaplace.net/projects/IDT-4.0.0Beta1Bld050.dmg.

 

I recently wrote a blog post about the latest release. The Mac version
wasn't quite ready at the time, but the functionality is the same.

 

 
Introducing IDT 4.0 Beta 1: An Enhanced Way to Describe Your Digital Images
- The Idea Place
<https://theideaplace.net/introducing-idt-4-0-beta-1-an-enhanced-way-to-describe-your-digital-images/>

 

You can learn more on my projects page, Software Projects Powered By My
Ideas in Partnership with AI Coding <https://theideaplace.net/projects/>.

 

This is still a beta. I've done my best to ensure quality, but I'm sure
there are issues. My projects page links to an issue list.

 

If you do try this and use the Claude or OpenAI options with your own keys,
be sure to try one or two images first with the models and prompts you
choose. I've done my best to test the models, my predefined prompts and the
way I'm handling images to maximize description quality and minimize cost
and token use, but I have found that the different models can behave quite
differently here.

 

Kelly

-- 
The following information is important for all members of the Mac Visionaries 
list.

If you have any questions or concerns about the running of this list, or if you 
feel that a member's post is inappropriate, please contact the owners or 
moderators directly rather than posting on the list itself.

Your Mac Visionaries list moderator is Mark Taylor.  You can reach mark at:  
[email protected] and your owner is Cara Quinn - you can reach Cara at 
[email protected]

The archives for this list can be searched at:
http://www.mail-archive.com/[email protected]/