On Thu, 12 Oct 2006, [IDC]Dragon wrote:

P#6159: Add voice to roughly 100 splash screens and yesno menus, without
cluttering the code too much.

The menus should all be voiced from the early days on (except for the debug menu), definitely including the yes/no choices.

Am I missing something? Did something get lost while I was out?

Well, I'm new to Rockbox: I've only had an RB-supported player for about two weeks. I bought a bigger player than I really wanted, just so it could run RB. Being blind myself, I figured that way I'd finally have access to all my player's functions because of the speech interface. And I hoped I could tinker a bit. Frankly, I found it a bit less accessible than I had been made to believe :-). There are a lot of situations in which you are dropped in a non-talking context, and you don't know where you are or how to get out... I don't really mean to complain: this is a fantastic piece of software, and since it is free and open, I can improve on it.

I don't know if speech has been mistakenly taken out of some contexts, I haven't been around long enough, but I do know lots of contexts don't speak.
-yesno confirmation dialogs,
-splash screens: some are just confirmations, lots are error messages...
-no speech when playback is paused. Among other things this makes it very hard to place a bookmark in a precise spot. Not to mention it's pretty confusing...
-The auto bookmark creation query is not spoken, because playback is paused at that point.
-Somehow some of the menus don't speak during playback, while they do when playback is stopped.
-The FM radio doesn't speak.
-The recording screen doesn't speak. It also ought to have some sort of optional beep on starting recording, and some other beep on stopping, so you know it got the key press.
-I haven't noticed a way to get to the ID3 info, or to info like file sizes and durations...
-The playlist viewer doesn't seem to be accessible AFAICT.
-The simple equalizer menu options are not spoken.
-The keyboard was quite unusable when I tried it. I got sighted assistance, and it seemed to be buggy even for sighted people (on the X5).
-The file browser could use some short audio clips as icons to quickly indicate the type of a file or directory.
-None of the plugins talk apparently. I could use a stop watch when working out at the gym. It ought to be feasible.
-And there's a few features that would be useful to audio book readers... for one, it's too easy to skip to the next track when you just wanted to adjust the volume, and the bus happened to hit a bump at that moment... Audio book tracks can be very long, so it can be VERY annoying to find your place again.

... and more.

Anyway I'd like to work on some of that, in what little spare time I have :-). I just wanted to make sure there's no actual opposition to this sort of stuff. Also I am a bit disconcerted at the number of patches that appear to be languishing in FlySpray. Is there an issue with getting stuff reviewed and committed? It's hard to find anything in there... Will it be hard to get my code into RB?

In general, more voiced strings bring a problem for Archos targets. The voice file must be no larger than the playback buffer plus some headroom for dir+file name clips, about 1.5MB. We've been compressing the clips more and more aggressively to make it fit, and now it's close to the maximum.

Yes that's one of the first issues I spotted when I tried to make my own voice file. I think the maximum has indeed been reached. Now it seems it only fits by making the speech faster (so it takes less time) :-). We can't just stop adding more speech though: features keep being added, and newer players have more resources. We need to work out a solution for this. One way would be to have the Archos load only the most essential clips and do without the rest. Alternatively, we could break the monolithic voice file concept somehow, loading clips on demand and keeping a cache... But perhaps this has been discussed before?
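
To make the "load on demand and keep a cache" idea a bit more concrete, here is a minimal sketch in C. It is only an illustration under stated assumptions, not how Rockbox's voice code works: the slot count, the names (clip_slot, get_clip, load_clip_from_disk) and the least-recently-used eviction policy are all made up for the example, and the disk read is simulated by a counter.

```c
#include <assert.h>
#include <stddef.h>

#define CACHE_SLOTS 4   /* hypothetical: sized to the RAM a target can spare */

struct clip_slot {
    int id;             /* clip ID held in this slot, -1 if empty */
    unsigned last_use;  /* tick of the most recent access (0 = never used) */
};

static struct clip_slot cache[CACHE_SLOTS];
static unsigned tick;        /* incremented on every lookup */
static unsigned disk_loads;  /* counts simulated reads from the voice file */

/* Mark every slot empty and reset the counters. */
static void cache_init(void)
{
    for (size_t i = 0; i < CACHE_SLOTS; i++) {
        cache[i].id = -1;
        cache[i].last_use = 0;
    }
    tick = 0;
    disk_loads = 0;
}

/* Stand-in for reading one clip from the voice file (assumption:
 * a real version would seek to the clip and copy its audio data). */
static void load_clip_from_disk(int id, struct clip_slot *slot)
{
    slot->id = id;
    disk_loads++;
}

/* Return the slot holding clip `id`. On a miss, the least recently
 * used slot (empty slots first) is overwritten with the new clip. */
static struct clip_slot *get_clip(int id)
{
    struct clip_slot *victim = &cache[0];

    assert(id >= 0);
    tick++;
    for (size_t i = 0; i < CACHE_SLOTS; i++) {
        if (cache[i].id == id) {        /* hit: refresh and return */
            cache[i].last_use = tick;
            return &cache[i];
        }
        if (cache[i].last_use < victim->last_use)
            victim = &cache[i];         /* remember the oldest slot so far */
    }
    load_clip_from_disk(id, victim);    /* miss: evict and load */
    victim->last_use = tick;
    return victim;
}
```

With a scheme like this, frequently used clips (menu names, "yes", "no") would tend to stay resident, while rare error messages would cost a disk access when they are needed.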

Perhaps we could have one voice clip per word, instead of one clip per phrase (although I haven't really checked how many duplications there really are). Some contexts display multiple phrases, but the infrastructure isn't really there to list multiple IDs. I see code like:
        gui_syncsplash(HZ, true, (unsigned char *)"%s %s",
                       str(LANG_PASTE), str(LANG_FAILED));

So in that case we can just say "Failed", but in general this is becoming tricky to handle. Just thinking out loud here but one trick we might do is submit the actual text to the talk functions, split that into words, and look for a voice clip for each of the words and speak them in sequence. Plus special processing for numbers... and spell the rest.
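
The word-splitting trick could look roughly like the following sketch. Everything here is hypothetical: the clip table, the clip IDs, SPELL_BASE, and the function names are invented for illustration and are not part of the existing talk code. Numbers get no special processing in this sketch; digits are simply spelled like any other unknown word.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-word clip table. In practice this would be
 * generated from the language file, not written by hand. */
struct word_clip { const char *word; int id; };

static const struct word_clip clip_table[] = {
    { "paste",  1 },
    { "failed", 2 },
    { "delete", 3 },
};

/* Hypothetical: the spelling clip for character c has ID SPELL_BASE + c. */
#define SPELL_BASE 1000

/* Case-insensitive lookup of one word; returns its clip ID or -1. */
static int find_clip(const char *word, size_t len)
{
    for (size_t i = 0; i < sizeof clip_table / sizeof clip_table[0]; i++) {
        const char *w = clip_table[i].word;
        size_t j = 0;

        if (strlen(w) != len)
            continue;
        while (j < len && tolower((unsigned char)word[j]) == w[j])
            j++;
        if (j == len)
            return clip_table[i].id;
    }
    return -1;
}

/* Split `text` on whitespace. Each known word contributes one clip ID;
 * each unknown word is spelled character by character.
 * Returns the number of IDs written to `out` (at most `max`). */
static size_t text_to_clips(const char *text, int *out, size_t max)
{
    size_t n = 0;

    assert(out != NULL);
    while (*text && n < max) {
        while (*text && isspace((unsigned char)*text))
            text++;                     /* skip leading whitespace */
        const char *start = text;
        while (*text && !isspace((unsigned char)*text))
            text++;                     /* advance to end of word */
        size_t len = (size_t)(text - start);
        if (len == 0)
            break;                      /* only trailing whitespace left */

        int id = find_clip(start, len);
        if (id >= 0) {
            out[n++] = id;
        } else {
            for (size_t i = 0; i < len && n < max; i++)
                out[n++] = SPELL_BASE + tolower((unsigned char)start[i]);
        }
    }
    return n;
}
```

Passing the already formatted splash text (here "Paste failed") through text_to_clips would yield the two word clips in order, which the talk functions could then queue one after another.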

Anyway this gets complicated, which is why I thought I'd start with simple stuff, and that gets us most of the way anyway.

In general, cheers for maintaining my voice feature!

Well for all the possible improvements I've listed, I really must say that feature is very useful and very appreciated, and Rockbox seems like a lot of fun to hack.

--
Stéphane Doyon
<[EMAIL PROTECTED]>
http://pages.infinit.net/sdoyon/
