On Saturday, 21 March 2015 at 20:51:52 UTC, FG wrote:
But if you look at my function definition, it doesn't have that, nor does it use parentheses, semicolons, etc., so it's "voice-ready". My question is: at what point would that be considered an efficient way to define a program component, one we would choose over the current succinct symbolic notation?

We will probably keep some variation on the current textual symbolic notation but develop verbal shorthand and IDEs that enable faster programming than even keyboards allow today.

Yeah, right, people will create drawings with voice commands. :) Every interface has its rightful domain and voice ain't best for everything. Or do you mean that touch will go away but instead people will be waving their hands around?

Yes, hand and fingers, as I said before, and you will even be able to "paint" in 3D:

https://www.thurrott.com/windows/windows-10/573/hands-microsoft-hololens

On Saturday, 21 March 2015 at 21:46:10 UTC, H. S. Teoh wrote:
Of course. But we're talking here about interfaces for *programmers*, not for your average Joe, for whom a pretty GUI with a button or two would suffice.

When you said "I think rodent-based UIs will go the way of the dinosaur," you seemed to be talking about more than just programmers.

This is the unpopular opinion, but I'm skeptical that this day will ever come. The problem with voice recognition is that it's based on natural language, and natural language is inherently ambiguous. You say that heuristics can solve this; I call BS on that. Heuristics are bug-prone and unreliable (because otherwise they'd be algorithms!), precisely because they fail to capture the essence of the problem, but are merely crutches to get us mostly there in lieu of an actual solution.

You don't have to handle full "natural language" to handle voice input; you can constrain the user to a verbal shorthand for certain tasks. Eventually, you can loosen that requirement as the recognition engines get better. You can never have algorithms that handle all the complexity of human speech, especially since the speech recognition engine has no understanding of what the words actually mean. But thousands upon thousands of heuristics might just do the job.
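
To make that concrete, here's a toy D sketch of what I mean by a constrained verbal shorthand -- the recognizer only has to tell a handful of fixed words apart and map each one to a token, rather than understand free speech. The vocabulary and the mapping are entirely made up for illustration:

import std.algorithm : map;
import std.array : join, split;
import std.stdio : writeln;

// Map a tiny spoken vocabulary onto D source tokens; unknown words pass
// through unchanged as identifiers.
string transcribe(string spoken)
{
    string[string] vocab = [
        "function"  : "void",
        "open"      : "(",
        "close"     : ")",
        "begin"     : "{",
        "end"       : "}",
        "print"     : "writeln",
        "semicolon" : ";",
    ];
    return spoken.split.map!(w => vocab.get(w, w)).join(" ");
}

void main()
{
    writeln(transcribe("function greet open close begin print open close semicolon end"));
    // Prints: void greet ( ) { writeln ( ) ; }
}

A real engine would obviously need error correction and a much larger vocabulary, but the point is that the recognition problem shrinks to distinguishing a small, known word list.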

The inherent ambiguity in natural language comes not from some kind of inherent flaw, as most people tend to believe, but is actually a side-effect of the brain's capacity for context-sensitive comprehension. The exact same utterance, spoken in different contexts, can mean totally different things, and the brain has no problem with that (provided it is given sufficient context, of course). The brain is also constantly optimizing itself -- if it can convey its intended meaning in fewer, simpler words, it will prefer to do that instead of going through the effort of uttering the full phrase. This is one of the main factors behind language change, which happens over time and is mostly unconscious. Long, convoluted phrases, if spoken often enough, tend to contract into shorter, sometimes ambiguous, utterances, as long as there is sufficient context to disambiguate. This is why we have a tendency toward acronyms -- the brain is optimizing away the long utterance in preference to a short acronym, which, based on the context of a group of speakers who mutually share similar contexts (e.g., computer lingo), is unambiguous, but may very well be ambiguous in a wider context.

If I talk to you about UFCS, you'd immediately understand what I was talking about, but if I said that to my wife, she would have no idea what I just said -- she may not even realize it's an acronym, because it sounds like a malformed sentence starting with "you ...". The only way to disambiguate this kind of context-specific utterance is to *share* in that context in the first place. Talk to a Java programmer about UFCS, and he probably wouldn't know what you just said either, unless he has been reading up on D.

This acronym example is actually fairly easy for the computer to handle, given its great memory. But yes, there are many contexts where the meaning of the words is necessary to disambiguate what is meant, and without some sort of AI, you have to rely on various heuristics.
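
Here's a rough D sketch of what I mean by leaning on the computer's memory: keep per-context dictionaries and expand whatever acronym was heard against whichever dictionary is active. The dictionaries and entries are invented for illustration:

import std.stdio : writeln;

// Expand an acronym the engine heard against whichever contextual
// dictionary is currently active; fall back to the raw transcription.
string expand(string heard, string context, string[string][string] lexicons)
{
    if (auto dict = context in lexicons)
        if (auto meaning = heard in *dict)
            return *meaning;
    return heard;
}

void main()
{
    string[string][string] lexicons = [
        "d-forum" : ["UFCS" : "uniform function call syntax",
                     "CTFE" : "compile-time function evaluation"],
        "general" : ["UFO"  : "unidentified flying object"],
    ];

    writeln(expand("UFCS", "d-forum", lexicons)); // uniform function call syntax
    writeln(expand("UFCS", "general", lexicons)); // UFCS (no match, passed through)
}

The hard part, of course, is choosing which context is active at any moment, and that's exactly where the heuristics come in.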

The only way speech recognition can acquire this level of context in order to disambiguate is to customize itself to that specific user -- in essence learn his personal lingo, pick up his (sub)culture, learn the contexts associated with his areas of interest, even adapt to his peculiarities of pronunciation. If software can get to this level, it might as well pass the Turing test, 'cos then it'd have enough context to carry out an essentially human conversation. I'd say we're far, far from that point today, and it's not clear we'd ever get there.

I'd say we're fairly close, given the vast computing power in even our mobile devices these days, and that is nowhere near the Turing test: extracting a bunch of personalized info and contextual dictionaries is nowhere close to the complexity of a computer generating a human-like answer to any question a human asks it.

We haven't even mastered context-sensitive languages, except via the crutch of parsing a context-free grammar and then applying a patchwork of semantic analysis after the fact, let alone natural language, which is not only context-sensitive but may depend on context outside of the input (cultural background, implicit common shared knowledge, etc.). Before we can get there, we'd have to grapple with knowledge representation, context-sensitive semantics, and all those hard problems that today seem to have no tractable solution in sight.

Certain of these can be dealt with by heuristics; others are too hard to deal with right now, but likely wouldn't make much of a difference in accuracy anyway.
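
For anyone who hasn't written a compiler front end, the "crutch" Teoh describes looks roughly like this toy D sketch: the grammar itself only knows the shape of an assignment, and the context-sensitive rule (identifiers must be declared) is bolted on as a separate pass afterwards. The names and the one-line "grammar" are made up for illustration:

import std.algorithm : canFind;
import std.array : split;
import std.stdio : writeln;
import std.string : strip;

struct Assign { string target, source; }

// Context-free step: anything shaped like "a = b" parses fine,
// whether or not it makes sense. (Assumes well-formed input.)
Assign parse(string line)
{
    auto parts = line.split("=");
    return Assign(parts[0].strip, parts[1].strip);
}

// Context-sensitive step, bolted on afterwards: is the source declared?
string check(Assign a, string[] declared)
{
    return declared.canFind(a.source)
        ? "ok"
        : "error: '" ~ a.source ~ "' is undeclared (the grammar alone can't know that)";
}

void main()
{
    auto declared = ["x", "y"];
    writeln(check(parse("z = x"), declared)); // ok
    writeln(check(parse("z = w"), declared)); // error: 'w' is undeclared ...
}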

P.S. Haha, it looks like my Perl script has serendipitously selected a quote that captures the inherent ambiguity of natural language -- you can't even tell, at a glance, where the verbs are! I'd like to see an algorithm parse *that* (and then see it fall flat on its face when I actually meant one of the "non-standard" interpretations of it, such as if this were in the context of a sci-fi movie where there are insects called "time flies"...).

It took me a minute to even figure out what "Fruit flies like a banana" meant, as that pair of sentences takes advantage of the human heuristic of assuming that two parallel sentences place their verb in the same spot. But a speech recognition engine doesn't _need_ to parse the grammar in those sentences, as it'll likely get it right just from the pronunciation, so it would likely confuse only us humans, who are trying to understand what it actually means, and not the computer, which just wants to transcribe literally what we said.
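
If you want the ambiguity spelled out, here's a small D sketch that just enumerates the part-of-speech combinations a parser would have to sift through, while a transcriber can ignore the problem entirely. The part-of-speech lists are simplified guesses, not real NLP:

import std.algorithm : cartesianProduct;
import std.stdio : writefln;

void main()
{
    // Plausible parts of speech for the ambiguous words; "a banana" is
    // left fixed to keep the example small.
    auto fruit = ["noun"];
    auto flies = ["verb", "noun"];        // an action vs. the insects
    auto like  = ["preposition", "verb"]; // comparison vs. enjoyment

    foreach (reading; cartesianProduct(fruit, flies, like))
        writefln("fruit/%s flies/%s like/%s a banana",
                 reading[0], reading[1], reading[2]);

    // A transcriber writes "fruit flies like a banana" no matter which
    // reading was meant; only a parser has to sift through these
    // combinations and pick a coherent one.
}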

On Saturday, 21 March 2015 at 22:00:21 UTC, Ola Fosheim Grøstad wrote:
Right, but it is likely that the nature of programming will change. In the beginning of the web, search engines had trouble matching anything but exact phrases; now they are capable of figuring out what you probably wanted.

Take music composition: people still write notes explicitly as discrete symbols, yet others compose music by recording a song and then manipulating it (e.g. auto-tune). So, even though you can do pitch recognition, many probably use discrete interfaces like a keyboard or a mouse for writing music, yet new forms of music and composition have come with the ability to process audio in a more intuitive, evolutionary fashion.

Good point. This was a good video from a couple of years ago:

http://forum.dlang.org/thread/op.wn0ye9uy54x...@puck.auriga.bhead.co.uk

On Saturday, 21 March 2015 at 23:58:18 UTC, Laeeth Isharc wrote:
HS Teoh is right about context, and about the superiority of the written word for organizing and expressing thinking at a very high level. The nature of human memory and perception means that this is unlikely to change very soon, if ever.

Nobody is arguing against text, only about the best way to provide it to the computer, whether by tapping keys or voicing it.

On Sunday, 22 March 2015 at 09:30:38 UTC, Atila Neves wrote:
On Friday, 20 March 2015 at 22:55:24 UTC, Laeeth Isharc wrote:
There is nothing intrinsically more scientific about basing a decision on a study rather than experience and judgement (including aesthetic judgement), which is not to say that more data cannot be useful,


Of course there is. Experience and judgement aren't measurable. You don't have science without numbers.

But it takes experience and judgement to interpret and contextualize those numbers, so we are back to square one. :) Note that Laeeth specifically said more data can be useful, only that one has to be careful that it's the right data, not just data that happens to be lying around.
