On Thu, 12 Oct 2006, [IDC]Dragon wrote:
> P#6159: Add voice to roughly 100 splash screens and yesno menus, without
> cluttering the code too much.
> The menus should all be voiced from the early days on (except for the
> debug menu), definitely including the yes/no choices.
> Am I missing something, did something get lost while I was out?

Well, I'm new to Rockbox: I've only had an RB-supported player for about
two weeks. I bought a bigger player than I really wanted, just so it could
run RB. Being blind myself, I figured that way I'd finally have access to
all my player's functions because of the speech interface. And I hoped I
could tinker a bit. Frankly, I found it a bit less accessible than I had
been made to believe :-). There are a lot of situations in which you are
dropped in a non-talking context, you don't know where you are or how to
get out... I don't really mean to complain: this is a fantastic piece of
software, and since it is free and open, I can improve on it.
I don't know if speech has been mistakenly taken out of some contexts, I
haven't been around long enough, but I do know lots of contexts don't
speak:
-yesno confirmation dialogs,
-splash screens: some are just confirmations, lots are error messages...
-no speech when playback is paused. Among other things this makes it very
hard to place a bookmark in a precise spot. Not to mention it's pretty
confusing...
-The auto bookmark creation query is not spoken, because playback is
paused at that point.
-Somehow some of the menus don't speak during playback, while they do when
playback is stopped.
-The FM radio doesn't speak.
-The recording screen doesn't speak. It also ought to have some sort of
optional beep on starting recording, and some other beep on stopping, so
you know it got the key press.
-I haven't noticed a way to get to the ID3 info, or to info like file
sizes and durations...
-The playlist viewer doesn't seem to be accessible AFAICT.
-The simple equalizer menu options are not spoken.
-The keyboard was quite unusable when I tried it. I got sighted
assistance, and it seemed to be buggy even for sighted people (on the X5).
-The file browser could use some short audio clips as icons to quickly
indicate the type of a file or directory.
-None of the plugins talk apparently. I could use a stop watch when
working out at the gym. It ought to be feasible.
-And there's a few features that would be useful to audio book readers...
for one, it's too easy to skip to the next track when you just wanted to
adjust the volume, and the bus happened to hit a bump at that moment...
Audio book tracks can be very long, so it can be VERY annoying to find
your place again.
... and more.
Anyway I'd like to work on some of that, in what little spare time I have
:-). I just wanted to make sure there's no actual opposition to this sort
of stuff. Also I am a bit disconcerted at the number of patches that
appear to be languishing in FlySpray. Is there an issue with getting stuff
reviewed and committed? It's hard to find anything in there... Will it be
hard to get my code into RB?

> In general, more voiced strings bring a problem for Archos targets. The
> voice file must be no larger than the playback buffer plus some headroom
> for dir+file name clips, about 1.5MB. We've been compressing the clips
> more and more aggressively to make it fit, now it's close to maximum.

Yes, that's one of the first issues I spotted when I tried to make my own
voice file. I think the maximum has indeed been reached. Now it seems it
only fits by making the speech faster (so it takes less time) :-). We
can't just stop adding more speech though: features keep being added, and
newer players have more resources. We need to work out a solution for
this. One way would be to have the Archos load only the most essential
clips and do without the rest. Alternatively, we could break the
monolithic voice file concept somehow, loading clips on demand and keeping
a cache... But perhaps this has been discussed before?
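
To make the on-demand idea a bit more concrete, here is a minimal sketch in
C of a small least-recently-used clip cache. Everything in it is
hypothetical: the names clip_cache_get, clip_loader and demo_loader, the
slot count and the clip size limit are my own placeholders, not anything in
Rockbox's talk code. The loader stands in for "read one clip from the voice
file on disk".

```c
#include <assert.h>
#include <stddef.h>

#define CACHE_SLOTS    4     /* hypothetical; would depend on free RAM */
#define MAX_CLIP_BYTES 2048  /* hypothetical upper bound for one clip */

struct clip_slot {
    int id;                  /* clip ID, or -1 when the slot is empty */
    unsigned long last_used; /* access counter value, used for LRU eviction */
    size_t size;             /* number of valid bytes in data[] */
    unsigned char data[MAX_CLIP_BYTES];
};

/* Hypothetical loader: fills buf with clip `id`, returns bytes read, 0 on error. */
typedef size_t (*clip_loader)(int id, unsigned char *buf, size_t max);

static struct clip_slot cache[CACHE_SLOTS];
static unsigned long tick;

static void clip_cache_init(void)
{
    for (int i = 0; i < CACHE_SLOTS; i++)
        cache[i].id = -1;
    tick = 0;
}

/* Return the clip from the cache, loading it on a miss. */
static const struct clip_slot *clip_cache_get(int id, clip_loader load)
{
    struct clip_slot *victim = &cache[0];

    assert(id >= 0 && load != NULL);
    tick++;

    /* First pass: is the clip already cached? */
    for (int i = 0; i < CACHE_SLOTS; i++) {
        if (cache[i].id == id) {
            cache[i].last_used = tick;
            return &cache[i];
        }
    }

    /* Miss: take an empty slot if there is one, else the least recently used. */
    for (int i = 0; i < CACHE_SLOTS; i++) {
        if (cache[i].id < 0) {
            victim = &cache[i];
            break;
        }
        if (cache[i].last_used < victim->last_used)
            victim = &cache[i];
    }

    size_t n = load(id, victim->data, sizeof(victim->data));
    if (n == 0) {
        victim->id = -1;     /* old contents may be overwritten, so drop them */
        return NULL;
    }
    victim->id = id;
    victim->size = n;
    victim->last_used = tick;
    return victim;
}

/* Stand-in loader for experimenting: counts loads instead of touching a disk. */
static int demo_loads;
static size_t demo_loader(int id, unsigned char *buf, size_t max)
{
    if (max == 0)
        return 0;
    buf[0] = (unsigned char)id;
    demo_loads++;
    return 1;
}
```

Frequently used clips (menu names, "yes", "no") would then stay resident,
while rarely used ones cost a disk access. Whether that latency is
acceptable on the Archos is exactly what I don't know.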
Perhaps we could have one voice clip per word, instead of one clip per
phrase (although I haven't really checked how many duplications there
really are). Some contexts display multiple phrases, but the infrastructure
isn't really there to list multiple IDs. I see code like:

    gui_syncsplash(HZ, true, (unsigned char *)"%s %s",
                   str(LANG_PASTE), str(LANG_FAILED));

So in that case we can just say "Failed", but in general this is becoming
tricky to handle. Just thinking out loud here but one trick we might do is
submit the actual text to the talk functions, split that into words, and
look for a voice clip for each of the words and speak them in sequence.
Plus special processing for numbers... and spell the rest.
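
A rough sketch in C of what I mean by splitting the text into words. This
is only an illustration under my own assumptions: the word table, the clip
IDs, CLIP_SPELL_BASE and the function names are all made up, and nothing
here is an existing Rockbox API. Unknown words are spelled character by
character; numbers are not handled specially and simply get spelled digit
by digit.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical word -> clip ID table; not Rockbox's real voice IDs. */
struct word_clip {
    const char *word;        /* lower-case word */
    int clip_id;
};

static const struct word_clip word_clips[] = {
    { "paste",  1 },
    { "failed", 2 },
};

/* Hypothetical: clip CLIP_SPELL_BASE + c speaks the single character c. */
#define CLIP_SPELL_BASE 1000

/* Case-insensitive lookup of one word; returns its clip ID or -1. */
static int find_word_clip(const char *w, size_t len)
{
    for (size_t i = 0; i < sizeof(word_clips) / sizeof(word_clips[0]); i++) {
        const char *cand = word_clips[i].word;
        size_t j = 0;

        if (strlen(cand) != len)
            continue;
        while (j < len && tolower((unsigned char)w[j]) == cand[j])
            j++;
        if (j == len)
            return word_clips[i].clip_id;
    }
    return -1;
}

/* Split text into words and write the clip IDs to speak into out[].
 * Returns the number of IDs written (at most max). */
static size_t text_to_clips(const char *text, int *out, size_t max)
{
    size_t n = 0;

    assert(text != NULL && out != NULL);
    while (*text && n < max) {
        /* Skip separators (spaces, punctuation). */
        while (*text && !isalnum((unsigned char)*text))
            text++;

        const char *start = text;
        while (isalnum((unsigned char)*text))
            text++;

        size_t len = (size_t)(text - start);
        if (len == 0)
            break;           /* reached the end of the string */

        int id = find_word_clip(start, len);
        if (id >= 0) {
            out[n++] = id;   /* whole-word clip available */
        } else {
            /* No clip for this word: fall back to spelling it. */
            for (size_t i = 0; i < len && n < max; i++)
                out[n++] = CLIP_SPELL_BASE + tolower((unsigned char)start[i]);
        }
    }
    return n;
}
```

With that, the splash text from the example above would be passed as a
plain string, e.g. text_to_clips("Paste failed", ids, 16), and the
resulting IDs queued for speaking in order.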
Anyway this gets complicated, which is why I thought I'd start with simple
stuff, and that gets us most of the way anyway.

> In general, cheers for maintaining my voice feature!

Well, for all the possible improvements I've listed, I really must say
that the feature is very useful and very appreciated, and Rockbox seems like a
lot of fun to hack.
--
Stéphane Doyon
<[EMAIL PROTECTED]>
http://pages.infinit.net/sdoyon/