[Mscore-developer] (GSOC 2016) Regarding the Virtual Singer project idea...

syrma Sat, 19 Mar 2016 00:52:56 -0700

Hello! 

I have been researching the possibility of using a Virtual Singer for
MuseScore.

I compiled some of the software below from source and tested the rest from
installed packages. I will first go through everything I have looked at or
tested, then discuss the options I consider promising. As I lack the
experience and insight to give a definitive judgement, I would be grateful
for any input.

- E-Cantorix (https://github.com/divVerent/ecantorix):

A singing synthesis program written in Perl that uses espeak. Unfortunately,
it does not look like something that can be used directly, judging by the
headache-inducing robotic voice. There may be some good ideas to take from
it, although I have nothing specific in mind yet.

- Festival Speech Synthesis System's singing mode
(http://www.festvox.org/festival/ ):

Festival's singing mode turned out to be much more usable than E-Cantorix
(in my own experience, which may not be representative), although the
output still lacks quality. The input for this mode is a special XML file
that specifies the note and duration for each word (Festival being first
and foremost a speech synthesis system).
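
To make the input format concrete, here is a minimal Python sketch that
writes such a file from a list of (note, beats, word) entries. The element
and attribute names (SINGING, PITCH NOTE, DURATION BEATS) and the DOCTYPE
line follow my recollection of the example song shipped with Festival, so
treat them as assumptions to check against the Festival documentation. The
function name festival_song_xml is made up for this illustration.

```python
from xml.sax.saxutils import escape, quoteattr


def festival_song_xml(notes, bpm=60):
    """Build a Festival singing-mode document.

    notes: iterable of (pitch_name, beats, word) tuples, e.g. ("C4", 1.0, "do").
    The tag names below are assumed from Festival's bundled example song.
    """
    lines = [
        '<?xml version="1.0"?>',
        '<!DOCTYPE SINGING PUBLIC "-//SINGING//DTD SINGING mark up//EN"',
        '  "Singing.v0_1.dtd" []>',
        f"<SINGING BPM={quoteattr(str(bpm))}>",
    ]
    for pitch, beats, word in notes:
        # One word per note: the pitch wraps the duration, which wraps the text.
        lines.append(
            f"<PITCH NOTE={quoteattr(pitch)}>"
            f"<DURATION BEATS={quoteattr(str(beats))}>{escape(word)}</DURATION>"
            "</PITCH>"
        )
    lines.append("</SINGING>")
    return "\n".join(lines) + "\n"


song = festival_song_xml(
    [("C4", 1.0, "do"), ("D4", 1.0, "re"), ("E4", 2.0, "mi")], bpm=90
)
print(song)
```

If I recall the Festival manual correctly, a file like this is then sung
from the Festival prompt with (tts "song.xml" 'singing).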

As for the singing mode output, aside from the robotic voice (which is still
far more decent than E-Cantorix's), I have to say it sounded pretty random.
The British voice would pronounce some words faster than the American one,
completely messing up the rhythm, and sometimes the pitch is off. I think
this comes mainly from the differences between how we speak and how we
sing, which can have a huge effect.

- Sinsy (http://sinsy.sourceforge.net/):

Aside from the pretty impressive (non-open-source) version presented on
their website (singing synthesis from a MusicXML file in Japanese (3
voices), somewhat dubious English (2 voices), and Chinese (1 voice)), the
open source version only supports Japanese, and only one voice is available
(which is clearly of lesser quality than the ones on the website). It uses
the hts_engine API (http://hts-engine.sourceforge.net/).

Pros:
- Quite easy to use; compiled and ran with minor trouble.
- Supports Japanese well.
- It is straightforward to get results, as it directly converts from
MusicXML files (as generated from MuseScore) to audio.
- The free voice can sound pretty decent.

Cons:
- Depending on which kind of project is preferred, integration into Mscore
could be a problem. The software takes a descriptive file and a voice and
converts them into audio. That could be fine for an external tool, but I am
not sure how the audio could be used in real time or during playback inside
the software.
- Only supports Japanese (it might be possible to add other languages
through espeak).
- Has only one voice available (even setting aside that it is Japanese-only,
the lack of choice might be limiting).
- The free voice sounds horrible with long notes. (Really.)
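
To illustrate what a MusicXML-driven synthesizer has to work with, here is a
small Python sketch that pulls the pitch, duration and lyric syllable of
each sung note out of a MusicXML document, using only the standard library.
The sample score is hand-written for this example (it is not real MuseScore
output), the function name sung_notes is made up, and this says nothing
about how Sinsy itself parses its input.

```python
import xml.etree.ElementTree as ET

# Tiny hand-written MusicXML fragment, for illustration only.
SAMPLE = """<score-partwise version="3.0">
  <part id="P1">
    <measure number="1">
      <attributes><divisions>1</divisions></attributes>
      <note>
        <pitch><step>C</step><octave>4</octave></pitch>
        <duration>1</duration>
        <lyric><text>la</text></lyric>
      </note>
      <note>
        <pitch><step>F</step><alter>1</alter><octave>4</octave></pitch>
        <duration>2</duration>
        <lyric><text>li</text></lyric>
      </note>
      <note><rest/><duration>1</duration></note>
    </measure>
  </part>
</score-partwise>"""


def sung_notes(musicxml_text):
    """Return (step, alter, octave, duration, syllable) for each pitched note."""
    root = ET.fromstring(musicxml_text)
    events = []
    for note in root.iter("note"):
        pitch = note.find("pitch")
        if pitch is None:
            # Rests and unpitched notes carry nothing to sing.
            continue
        step = pitch.findtext("step")
        alter = float(pitch.findtext("alter", default="0"))
        octave = int(pitch.findtext("octave"))
        # Duration is expressed in <divisions> units per quarter note.
        duration = float(note.findtext("duration", default="0"))
        syllable = note.findtext("lyric/text", default="")
        events.append((step, alter, octave, duration, syllable))
    return events


for event in sung_notes(SAMPLE):
    print(event)
```

Everything beyond these fields (phonemes, vibrato, breath) has to come from
the synthesizer and its voice model, which is where the language support
matters.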

- World (https://github.com/mmorise/World):

World is an open source speech analysis and synthesis system, although it is
very unlike anything else I have looked at: it can analyse a voice and
resynthesize it. I must admit that the result is impressive: very natural
sounding, or at least far from robot-like (even when playing with
unrealistic parameters). However, it has no notion of language, so something
needs to be built on top of it. (vConnect-STAND is a possible option. It is
built upon World and sounds nice according to YouTube demos, but I haven't
tried it yet. The documentation I've come across is in Japanese, so I am
slowly going through it.)

Pros:
- Very good results.
- Can be used in real time; it might be possible to integrate it into
Mscore.

Cons:
- Very low level.
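
To give an idea of what "low level" means here: a layer built on top of a
vocoder like World would have to supply, among other things, a fundamental
frequency (F0) value for every short analysis frame. The Python sketch below
turns a list of notes into such a per-frame F0 contour. It does not call
World at all; the 5 ms frame period and the convention that 0 Hz marks an
unvoiced frame are assumptions based on my understanding of World's
defaults, and the function names are made up for this illustration.

```python
# Semitone offsets of the natural notes within an octave.
NOTE_OFFSETS = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}


def note_to_hz(step, octave, alter=0):
    """Equal-tempered frequency of a note, with A4 = 440 Hz (MIDI note 69)."""
    midi = 12 * (octave + 1) + NOTE_OFFSETS[step] + alter
    return 440.0 * 2.0 ** ((midi - 69) / 12.0)


def f0_contour(notes, frame_period_ms=5.0):
    """Flat F0 contour, one value per frame.

    notes: list of (step, octave, seconds); step=None means a rest,
    written as 0.0 Hz (assumed "unvoiced" marker).
    """
    contour = []
    for step, octave, seconds in notes:
        frames = round(seconds * 1000.0 / frame_period_ms)
        hz = 0.0 if step is None else note_to_hz(step, octave)
        contour.extend([hz] * frames)
    return contour


contour = f0_contour([("A", 4, 0.5), (None, 0, 0.25), ("C", 5, 0.5)])
print(len(contour), contour[0], contour[120])
```

A real singing layer would also need to shape this contour (note
transitions, vibrato) and provide the spectral data for each phoneme, which
is the part that something like vConnect-STAND would have to cover.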

- QTau (https://notabug.org/isengaara/qtau) and Cadencii
(https://github.com/cadencii/cadencii-nt):

Two free software editors written in C++ and Qt. Although neither of them
is a voice synthesis technology, they both make use of vConnect-STAND (in
addition to E-Cantorix for QTau, and Utau + Vocaloid for Cadencii). I think
the way they do things may be interesting, but I have yet to study them in
depth. I would like to do so after figuring vConnect-STAND out.

The ideas page stated that an external tool would be good to practice along
with, but I am not sure what kind of project would be best to consider.
Some tools may or may not be suitable depending on this, so I would really
like to discuss this project idea.

I would greatly appreciate any kind of input or guidance. Please let me know
what I am missing, whether I have overlooked an interesting possibility, or
whether I should keep going down this path.

Thank you!
