On Saturday, 21 March 2015 at 20:51:52 UTC, FG wrote:
But if you look at my function definition, it doesn't have
that, nor does it use parentheses, semicolons, etc., so it's
"voice-ready". My question is: at which point would that be
considered an efficient method to define a program's component
that we would choose to use instead of the current succinct
symbolic notation?
We will probably keep some variation on the current textual
symbolic notation but develop verbal shorthand and IDEs that
enable faster programming than even keyboards allow today.
Yeah, right, people will create drawings with voice commands.
:) Every interface has its rightful domain and voice ain't
best for everything. Or do you mean that touch will go away but
instead people will be waving their hands around?
Yes, hand and fingers, as I said before, and you will even be
able to "paint" in 3D:
https://www.thurrott.com/windows/windows-10/573/hands-microsoft-hololens
On Saturday, 21 March 2015 at 21:46:10 UTC, H. S. Teoh wrote:
Of course. But we're talking here about interfaces for *programmers*, not for your average Joe, for whom a pretty GUI with a button or two would suffice.
When you said "I think rodent-based UIs will go the way of the
dinosaur," you seemed to be talking about more than just
programmers.
This is the unpopular opinion, but I'm skeptical if this day will ever come. The problem with voice recognition is that it's based on natural language, and natural language is inherently ambiguous. You say that heuristics can solve this, I call BS on that. Heuristics are bug-prone and unreliable (because otherwise they'd be algorithms!), precisely because they fail to capture the essence of the problem, but are merely crutches to get us mostly there in lieu of an actual solution.
You don't have to handle full "natural language" to handle voice
input, you can constrain the user to a verbal shorthand for
certain tasks. Eventually, you can loosen that requirement as
the recognition engines get better. You can never have
algorithms that handle all the complexity of human speech,
especially since the speech recognition engine has no
understanding of what the words actually mean. But thousands
upon thousands of heuristics might just do the job.
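Just to sketch what I mean by a constrained verbal shorthand, here's a toy example in Python -- the command phrasing and the emitted D-style snippets are entirely made up, only meant to show how small a vocabulary the recognizer would have to match:

import re

def parse_command(transcript: str) -> str:
    """Map a spoken shorthand phrase to a code fragment, or report failure."""
    words = transcript.lower().split()

    # "function <name> takes <type> <arg> ... returns <type>"
    if words and words[0] == "function":
        m = re.match(r"function (\w+) takes (.+) returns (\w+)", " ".join(words))
        if m:
            name, params, ret = m.groups()
            toks = params.split()
            args = ", ".join(f"{t} {n}" for t, n in zip(toks[0::2], toks[1::2]))
            return f"{ret} {name}({args})"

    # "call <name> with <args...>"
    if words[:1] == ["call"] and "with" in words:
        name = words[1]
        args = ", ".join(words[words.index("with") + 1:])
        return f"{name}({args});"

    return "// unrecognized -- fall back to dictation or ask for a rephrase"

print(parse_command("function sum takes int a int b returns int"))
# -> int sum(int a, int b)
print(parse_command("call writeln with total"))
# -> writeln(total);

As the engines get better, you loosen the phrasing the matcher accepts rather than throwing the approach away.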
The inherent ambiguity in natural language comes not from some kind of inherent flaw as most people tend to believe, but it's actually a side-effect of the brain's ability at context-sensitive comprehension. The exact same utterance, spoken in different contexts, can mean totally different things, and the brain has no problem with that (provided it is given sufficient context, of course). The brain is also constantly optimizing itself -- if it can convey its intended meaning in fewer, simpler words, it will prefer to do that instead of going through the effort of uttering the full phrase. This is one of the main factors behind language change, which happens over time and is mostly unconscious. Long, convoluted phrases, if spoken often enough, tend to contract into shorter, sometimes ambiguous, utterances, as long as there is sufficient context to disambiguate. This is why we have a tendency toward acronyms -- the brain is optimizing away the long utterance in preference to a short acronym, which, based on the context of a group of speakers who mutually share similar contexts (e.g., computer lingo), is unambiguous, but may very well be ambiguous in a wider context. If I talk to you about UFCS, you'd immediately understand what I was talking about, but if I said that to my wife, she would have no idea what I just said -- she may not even realize it's an acronym, because it sounds like a malformed sentence "you ...". The only way to disambiguate this kind of context-specific utterance is to *share* in that context in the first place. Talk to a Java programmer about UFCS, and he probably wouldn't know what you just said either, unless he has been reading up on D.
This acronym example is actually fairly easy for the computer to handle, given its great memory. But yes, there are many contexts where the meaning of the words is necessary to disambiguate what is meant, and without some sort of AI, you have to rely on various heuristics.
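To illustrate what that "great memory" buys you, here's a rough sketch (Python, with invented lexicon entries) of a per-user contextual dictionary, where the same spoken acronym expands differently depending on which contexts the user has been working in lately:

CONTEXT_LEXICON = {
    "d-programming": {"ufcs": "Uniform Function Call Syntax",
                      "ctfe": "compile-time function evaluation"},
    "finance":       {"roi": "return on investment"},
}

def expand(token: str, active_contexts: list[str]) -> str:
    """Return the expansion from the most recently used matching context."""
    for ctx in active_contexts:                # ordered by recency of use
        expansion = CONTEXT_LEXICON.get(ctx, {}).get(token.lower())
        if expansion is not None:
            return expansion
    return token                               # no match: leave it literal

# Talking to a D programmer vs. to someone outside that context:
print(expand("UFCS", ["d-programming", "finance"]))  # -> Uniform Function Call Syntax
print(expand("UFCS", ["finance"]))                   # -> UFCS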
The only way speech recognition can acquire this level of context in order to disambiguate is to customize itself to that specific user -- in essence learn his personal lingo, pick up his (sub)culture, learn the contexts associated with his areas of interest, even adapt to his peculiarities of pronunciation. If software can get to this level, it might as well pass the Turing test, 'cos then it'd have enough context to carry out an essentially human conversation. I'd say we're far, far from that point today, and it's not clear we'd ever get there.
I'd say we're fairly close, given the vast computing power in even our mobile devices these days. And that is nowhere near the Turing test: extracting a bunch of personalized info and contextual dictionaries is nowhere close to the complexity of a computer generating a human-like answer to any question a human asks it.
We haven't even mastered context-sensitive languages, except via the crutch of parsing a context-free grammar and then applying a patchwork of semantic analysis after the fact, let alone natural language, which is not only context-sensitive but may depend on context outside of the input (cultural background, implicit common shared knowledge, etc.). Before we can get there, we'd have to grapple with knowledge representation, context-sensitive semantics, and all those hard problems that today seem to have no tractable solution in sight.
Some of these can be dealt with by heuristics; others are too hard to deal with right now, but likely wouldn't make much of a difference in accuracy anyway.
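For what it's worth, the "parse a context-free grammar, then patch it up with semantic analysis" crutch is easy to illustrate with the classic C ambiguity "a * b;", which is a pointer declaration if "a" names a type and a multiplication otherwise. A toy sketch (Python, hypothetical helper names, nothing like a real compiler) of that disambiguation step:

def classify_statement(stmt: str, typedef_names: set[str]) -> str:
    """Decide what a three-token "a * b;" statement means, using context."""
    tokens = stmt.rstrip(";").split()
    if len(tokens) == 3 and tokens[1] == "*":
        lhs = tokens[0]
        # The grammar sees the same three tokens either way; only the
        # symbol table (context gathered elsewhere) says which parse was meant.
        if lhs in typedef_names:
            return f"declaration: {lhs}* {tokens[2]}"
        return f"expression: {lhs} times {tokens[2]}"
    return "something else"

symbols = {"node_t"}                                  # collected by earlier analysis
print(classify_statement("node_t * head;", symbols))  # -> declaration: node_t* head
print(classify_statement("count * size;", symbols))   # -> expression: count times size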
P.S. Haha, it looks like my Perl script has serendipitously selected a quote that captures the inherent ambiguity of natural language -- you can't even tell, at a glance, where the verbs are! I'd like to see an algorithm parse *that* (and then see it fall flat on its face when I actually meant one of the "non-standard" interpretations of it, such as if this were in the context of a sci-fi movie where there are insects called "time flies"...).
It took me a minute to even figure out what "Fruit flies like a banana" meant, as that pair of sentences takes advantage of the human heuristic of assuming that two rhyming sentences place their verb in the same spot. But a speech recognition engine doesn't _need_ to parse the grammar in those sentences: it will likely get them right just from the pronunciation. The ambiguity confuses only us humans, who are trying to understand what the sentences actually mean, not the computer, which just wants to transcribe literally what we said.
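For reference, the two competing readings of "fruit flies like a banana" can simply be written down side by side; a dictation engine only ever needs the flat word sequence, while a parser (or a human listener) has to pick one of the trees. These are hand-written bracketings, not the output of any real parser:

# The flat transcription is all a dictation engine needs to produce.
words = ["fruit", "flies", "like", "a", "banana"]

# Reading 1: "fruit flies" (the insects) enjoy a banana -- "like" is the verb.
insect_reading = ("S", ("NP", "fruit flies"),
                       ("VP", ("V", "like"), ("NP", "a banana")))

# Reading 2: fruit moves through the air the way a banana does -- "flies" is the verb.
motion_reading = ("S", ("NP", "fruit"),
                       ("VP", ("V", "flies"), ("PP", "like a banana")))

print(insect_reading == motion_reading)  # False: same five words, different structure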
On Saturday, 21 March 2015 at 22:00:21 UTC, Ola Fosheim Grøstad wrote:
Right, but it is likely that the nature of programming will change. In the beginning of the web, the search engines had trouble matching anything but exact phrases; now they are capable of figuring out what you probably wanted.

Take music composition: people still write notes explicitly as discrete symbols, yet others compose music by recording a song and then manipulating it (e.g. auto-tune). So, even though you can do pitch recognition, many probably use discrete interfaces like a keyboard or a mouse for writing music, yet new forms of music and composition have come with the ability to process audio in a more intuitive, evolutionary fashion.
Good point, this was a good video from a couple years ago:
http://forum.dlang.org/thread/op.wn0ye9uy54x...@puck.auriga.bhead.co.uk
On Saturday, 21 March 2015 at 23:58:18 UTC, Laeeth Isharc wrote:
HS Teoh is right about context, and the superiority of the
written word for organizing and expressing thinking at a very
high level. The nature of human memory and perception means
that is unlikely to change very soon, if ever.
Nobody is arguing against text, only about the best way to provide it to the computer, whether by tapping keys or voicing it.
On Sunday, 22 March 2015 at 09:30:38 UTC, Atila Neves wrote:
On Friday, 20 March 2015 at 22:55:24 UTC, Laeeth Isharc wrote:
There is nothing intrinsically more scientific about basing a
decision on a study rather than experience and judgement
(including aesthetic judgement), which is not to say that more
data cannot be useful,
Of course there is. Experience and judgement aren't measurable.
You don't have science without numbers.
But it takes experience and judgement to interpret and
contextualize those numbers, so we are back to square one. :)
Note that Laeeth specifically said more data can be useful, only
that one has to be careful that it's the right data, not just
data that happens to be lying around.