[digitalradio] psk-125r
How does PSK-125R work? Does it use the same varicode? Does it have an error-correcting code like QPSK-125? How many phases does it use? Would it work well for EME? n6ief
[digitalradio] the economy
How I would fix the economy

Greed is hurting all of us. When is there enough wealth to stop getting more? If someone gets $100,000,000,000, he can afford to pay 99% tax and learn to live on $1,000,000,000. But our tax laws give that person $65,000,000,000. I don't believe that is fair. So I propose eliminating the current 35% flat tax.

Mike's non-flat tax plan (wealth = tax + spending)

Wealth    Tax    Spending
$100B     99%    $1B
$10B      95%    $500M
$1B       90%    $100M
$100M     80%    $20M
$10M      70%    $3M
$1M       50%    $500K
$100K     10%    $90K
$20K       0%    $20K

Some people say that, if poor people did not pay taxes, they would not have any incentive to try to gain wealth and would live on welfare. This is crazy, because wealth is the incentive and a tax on poor people is just insulting.

But no one has $100,000,000,000. I think Exxon-Mobil is trying to, real hard. If they had to pay 99% tax, they would not try so hard, and we would all benefit from that.

Michael E. Lebo
San Diego
Re: [digitalradio] Re: digital voice within 100 Hz bandwidth
Bob,

I was thinking about an SSB signal that is off frequency. Most of the time I could still get the information I need to make the contact. I never intended this to be hi-fidelity. I just want it to be good enough.

Mike n6ief

On Nov 18, 2007 12:11 PM, Robert Thompson wrote:

That is not entirely true. Besides, I wasn't focusing so much on their real research as on the voice characterization research that they had to do before they could usefully work on recognition. It turns out that the very areas that are most necessary for digital voice recognition are the ones most necessary for human brains to recognize and interpret. Voice is a mixed-information-density signal, and if you simplify the signal by filtering out and discarding the less necessary elements, you have significantly reduced the work the next stage has to do, whether it's digital encoding or speech recognition.
Re: [digitalradio] Re: digital voice within 100 Hz bandwidth
Vojtech,

Thank you for reading my papers. I have no intention of reinventing the wheel. The project is like EchoLink; it does not understand speech or convert it to text. The books written in the past did not have narrow bandwidth as their main objective. I do not need hi-fidelity to understand what is being said. I am used to slightly detuned SSB voice. I just need something that is good enough. My big problem is that none of this will ever happen unless someone steps up and helps me learn how to modify free, public-domain C++ software. Could you or someone you know be that person?

73's
Mike n6ief

On Nov 17, 2007 7:11 AM, Vojtěch Bubník wrote:

Hi Mike.

I studied some aspects of voice recognition about 10 years ago, when I thought of joining a research group at the Czech Technical University in Prague. I have a 260-page textbook on voice recognition on my bookshelf.

A voice signal has high redundancy compared to a text transcription. But the voice signal carries additional information such as pitch, intonation and speed. One could, for example, estimate the mood of the speaker from the utterance.

The vocal tract can be described by a generator (tone for vowels, hiss for consonants) and a filter. Translating voice into generator and filter coefficients greatly reduces the redundancy of the voice data. This is roughly what the common voice codecs do. GSM voice compression is a kind of Algebraic Code Excited Linear Prediction. Another interesting codec is AMBE (Advanced Multi-Band Excitation), used by the D-STAR system. The GSM half-rate codec squeezes voice to 5.6 kbit/s, AMBE to 3.6 kbit/s. Both systems use excitation tables, but AMBE is more efficient and closed source. I think the key to the efficiency lies in the size and quality of the excitation tables. Creating such an algorithm requires a considerable amount of research and data analysis. The intelligibility of the GSM and AMBE codecs is very good.
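The generator-plus-filter description above can be illustrated with a minimal sketch. It is not taken from any codec mentioned in the thread; the 8 kHz sample rate, the 20 ms frame, the single two-pole resonator and the formant values are all assumptions chosen only to show the idea that a frame can be described by a few parameters instead of raw samples.

```python
import math
import random

SAMPLE_RATE = 8000      # Hz, assumed telephone-quality rate
FRAME_SAMPLES = 160     # 20 ms frame at 8 kHz (assumption)
PARAMS_PER_FRAME = 4    # voiced flag, pitch, two filter coefficients

def excitation(n, voiced, pitch_hz=100.0, seed=0):
    """Generator: impulse train (tone) for vowels, white noise (hiss) for consonants."""
    if voiced:
        period = int(round(SAMPLE_RATE / pitch_hz))
        return [1.0 if i % period == 0 else 0.0 for i in range(n)]
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(n)]

def resonator_coeffs(freq_hz, bandwidth_hz):
    """Coefficients (a1, a2) of a two-pole resonator modelling one formant."""
    r = math.exp(-math.pi * bandwidth_hz / SAMPLE_RATE)
    theta = 2.0 * math.pi * freq_hz / SAMPLE_RATE
    return (2.0 * r * math.cos(theta), -r * r)

def all_pole_filter(x, coeffs):
    """Filter: y[n] = x[n] + a1*y[n-1] + a2*y[n-2]."""
    a1, a2 = coeffs
    y1 = y2 = 0.0
    out = []
    for sample in x:
        y = sample + a1 * y1 + a2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out

# A vowel-like frame: 100 Hz tone shaped by a formant near 700 Hz (illustrative values).
vowel = all_pole_filter(excitation(FRAME_SAMPLES, True), resonator_coeffs(700.0, 100.0))
# A consonant-like frame: hiss shaped by a broad resonance near 3 kHz (illustrative values).
hiss = all_pole_filter(excitation(FRAME_SAMPLES, False), resonator_coeffs(3000.0, 500.0))

print(f"{FRAME_SAMPLES} samples per frame described by {PARAMS_PER_FRAME} parameters")
```

Real codecs use many more filter coefficients and far richer excitation, so this only shows where the redundancy reduction comes from, not how GSM or AMBE actually work.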
You could buy the intellectual property of the AMBE codec by buying the chip. There are a couple of projects under way trying to build D-STAR into legacy transceivers.

About 10 years ago we at the OK1KPI club experimented with an EchoLink-like system. We modified the Speak Freely software to control an FM transceiver, and we added a web interface to control the tuning and subtone of the transceiver. It was a lot of fun and a unique system at that time. http://www.speakfreely.org/

The LPC-10 codec offers the best compression factor (3460 bps), but the sound is very robot-like and quite hard to understand. In the end we reverted to GSM. I think IVOX is a variant of the LPC system that we tried.

Your proposal is to increase the compression rate by transmitting phonemes. I once had the same idea, but I quickly rejected it. Although it may be a nice exercise, I do not find it very useful until good continuous-speech, multi-speaker, multi-language recognition systems are available. I will try to explain my reasoning behind that statement.

Let's classify voice recognition systems by implementation complexity:

1) Single-speaker, limited set of utterances recognized (control your desktop by voice)
2) Multiple-speaker, limited set of utterances recognized (automated phone system)
3) Dictating system
4) Continuous speech transcription
5) Speech recognition and understanding

Your proposal will need to implement most of the code from 4) or 5) to be really usable, and it has to be reliable.

State-of-the-art voice recognition systems use hidden Markov models to detect phonemes. A phoneme is found by traversing a state diagram while evaluating multiple recorded spectra. The phoneme is soft-decoded: the output of the classifier is a list of phonemes with their detection probabilities. To cope with phoneme smearing at the boundaries, either sub-phonemes or phoneme pairs need to be detected. After the phonemes are classified, they are chained into words.
Depending on the dictionary, the most probable words are picked. You suppose that your system will not need a dictionary. But the trouble is the consonants. They carry much less energy than vowels and are much more easily confused. The dictionary is used to pick some of the second-highest-probability consonants detected in a word. Not only the dictionary but also the phoneme classifier is language dependent.

I think the human brain works in the same way. Imagine learning a foreign language. Even if you are able to recognize slowly pronounced words, you will be unable to pick them out of a rapidly spoken sentence. The words will sound different. A human needs considerable training to understand a language. You could decrease the complexity of the decoder by constraining the detection to slowly dictated separate words. If you simply pick the highest-probability phoneme, you will experience the comprehension problems of people with hearing loss. Oh yes, I am currently working for a hearing instrument manufacturer (I have nothing to do with merck.com).
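The point about the dictionary rescuing a second-highest-probability consonant can be shown with a toy sketch. The soft-decoder output, the three-word dictionary and the use of letters as stand-ins for phonemes are all made up for illustration; a real recognizer would score hidden Markov model paths rather than multiply independent per-position probabilities.

```python
# Hypothetical soft-decoder output: for each position, candidate phonemes
# (letters stand in for phonemes) with their detection probabilities.
soft_output = [
    {"t": 0.55, "d": 0.45},   # weak consonant, easily confused
    {"o": 0.90, "a": 0.10},   # strong vowel
    {"g": 0.70, "k": 0.30},
]

# Hypothetical dictionary of known words.
dictionary = ["dig", "dog", "tag"]

def greedy_decode(soft):
    """Pick the highest-probability phoneme at every position."""
    return "".join(max(cands, key=cands.get) for cands in soft)

def dictionary_decode(soft, words):
    """Pick the dictionary word whose phonemes have the highest joint probability."""
    best_word, best_score = None, 0.0
    for word in words:
        if len(word) != len(soft):
            continue
        score = 1.0
        for phoneme, cands in zip(word, soft):
            score *= cands.get(phoneme, 0.0)
        if score > best_score:
            best_word, best_score = word, score
    return best_word, best_score

greedy = greedy_decode(soft_output)
word, score = dictionary_decode(soft_output, dictionary)
print(greedy, word, round(score, 4))
```

Here the greedy choice yields "tog", which is not a word, while the dictionary selects "dog" by accepting the second-ranked consonant in the first position.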
Re: [digitalradio] Re: digital voice within 100 Hz bandwidth
Hi Vojtech,

Thank you for your reply to my papers. I will do more work on the phonemes. The project I want to do uses new computers that were not available 10 years ago. Every 10 ms a decision is made to send a one or a zero. To make that decision I have 68 parallel FFTs running in the background. I believe the brain could handle mispronounced words better than you think.

Mike

On Nov 17, 2007 3:55 PM, r_lwesterfield wrote:

I have a few radios (ARC-210-1851, PSC-5D, PRC-117F) at work that use MELP (Mixed Excitation Linear Prediction) as the vocoder. We have found MELP to be superior to LPC-10 (more human-like voice quality, less like Charlie Brown's teacher), but we use far larger bandwidths than 100 Hz. I do not know how well any of this will play out at such a narrow bandwidth. Listening to Charlie Brown's teacher will send you running away quickly, and you should think of your listeners: they will tire very quickly. Just because voice can be sent at such narrow bandwidths does not necessarily mean that people will like to listen to it.

Rick, KH2DF
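As a rough check on the numbers in this exchange, the sketch below works out the bit rate implied by one binary decision every 10 ms and compares it with the codec rates quoted elsewhere in the thread (GSM half-rate at 5.6 kbit/s, AMBE at 3.6 kbit/s). The 40-phoneme inventory and the fixed-length phoneme code are assumptions added for illustration only; nothing in the thread specifies how phonemes would be encoded.

```python
import math

DECISION_INTERVAL_MS = 10                   # from the message: one bit every 10 ms
bit_rate = 1000 / DECISION_INTERVAL_MS      # bits per second

# Codec rates quoted in the thread, in bits per second.
codec_rates = {"GSM half-rate": 5600, "AMBE": 3600}
ratios = {name: rate / bit_rate for name, rate in codec_rates.items()}

# Assumption: about 40 distinct phonemes, each sent as a fixed-length code.
PHONEME_INVENTORY = 40
bits_per_phoneme = math.ceil(math.log2(PHONEME_INVENTORY))
max_phonemes_per_second = bit_rate / bits_per_phoneme

print(f"channel rate: {bit_rate:.0f} bit/s")
for name, ratio in ratios.items():
    print(f"{name} needs {ratio:.0f}x that rate")
print(f"{bits_per_phoneme} bits per phoneme, at most "
      f"{max_phonemes_per_second:.1f} phonemes per second with no error protection")
```

Under these assumptions the channel carries 100 bit/s, so any error-correcting overhead would come directly out of the phoneme budget.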
Re: [digitalradio] Re: digital voice within 100 Hz bandwidth
Robert,

I agree. The thing that is different is that speech recognition is not real time. Voice over the radio is real time.

Mike n6ief

On Nov 18, 2007 10:46 AM, Robert Thompson wrote:

There are several (military/gov) standard intelligibility tests that do a pretty good job of scoring what most humans can and cannot reliably understand. You might try taking a look at them to get some ideas of which voice characteristics make the most difference to intelligibility. There is actually a surprising amount of data out there, especially if you include the data peripheral to the various computerized speech translator research projects. It's not *exactly* signal processing... but understanding what parts of the signal matter the most can be surprisingly helpful.

This may be unusually productive, because as yet there hasn't been a huge amount of cross-discipline work between the codec researchers and the speech-to-meaning researchers. While there's a lot of duplicate research in there, it tends to be from slightly different perspectives, and the stereo view can sometimes help.