Blizzard Challenge 2011 Rules

DATABASE ACCESS

REGISTRATION FEE

  • A registration fee of 500GBP (approx 800USD) is due to offset the costs of running the challenge, including paying local assistants and undergraduate listeners. The fee is fixed, regardless of how many tasks you participate in. The fee must be paid by Friday 8th April 2011. You can pay this fee using Edinburgh University's online payments system: HERE and register for the event called 'Blizzard Challenge 2011'. After doing this, please also email blizzard@festvox.org to notify us that you have paid. If you are really unable to use the online payments system, please contact blizzard@festvox.org for assistance with other methods of payment. However, we strongly prefer the epay system because it reduces the costs and admin work for us. If you must pay by bank transfer, please contact us in plenty of time (several weeks before the payment deadline); an additional charge of 50GBP will be made for any payments not made using the epay system.

EXPERT LISTENERS

  • Each participant is expected to provide at least ten speech experts as listeners for the evaluation tests. Native English speakers are preferred where possible. The organisers would also appreciate assistance in advertising the Challenge as widely as possible (e.g., to your students or colleagues).

BUILDING VOICES

  • It is not permissible for a single participant to submit multiple entries for any task, because the listening test will become unmanageable.
  • Participants involved in joint projects or consortia who wish to submit multiple systems (e.g., an individual entry and a joint system) should contact the organisers in advance to agree this. We will try to accommodate all reasonable requests, provided the listening test remains manageable.

Hub task

  • Task EH1: build a voice from the full 'Nancy' database. You may use either the 16kHz or higher sampling rate versions, and the submitted wav files can be at any sampling rate. All entries will be downsampled to 16kHz for the main listening test. If there are sufficient entries, we will run an additional listening test for higher sampling rate entries.
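As a rough illustration of what downsampling an entry to 16kHz involves, the sketch below computes the integer up/down factors for a rational sample-rate conversion. It is only an assumption-laden example, not the organisers' actual procedure: the function names are hypothetical, and the 44.1kHz and 48kHz source rates are example values that do not come from the rules. The resulting factors could then be passed to a polyphase resampler of your choice.

```python
from math import gcd

# Target rate stated in the rules for the main listening test.
TARGET_RATE_HZ = 16000


def resample_factors(source_rate_hz, target_rate_hz=TARGET_RATE_HZ):
    """Return (up, down) such that source_rate * up / down == target_rate.

    Hypothetical helper for illustration only.
    """
    if source_rate_hz <= 0 or target_rate_hz <= 0:
        raise ValueError("sampling rates must be positive")
    common = gcd(source_rate_hz, target_rate_hz)
    return target_rate_hz // common, source_rate_hz // common


def resampled_length(num_samples, source_rate_hz, target_rate_hz=TARGET_RATE_HZ):
    """Estimate the number of samples after conversion, rounded up."""
    up, down = resample_factors(source_rate_hz, target_rate_hz)
    return -(-num_samples * up // down)  # ceiling division on integers


if __name__ == "__main__":
    # Example source rates (assumed, not taken from the rules).
    for rate in (16000, 44100, 48000):
        up, down = resample_factors(rate)
        print(f"{rate} Hz -> {TARGET_RATE_HZ} Hz: up={up}, down={down}")
```

Reducing the ratio by the greatest common divisor keeps the factors as small as possible, which matters for rates such as 44.1kHz that are not an integer multiple of 16kHz.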

Spoke tasks

  • Task ES1: build a voice designed to read names and addresses (in US format). The evaluation of this task will focus mainly on intelligibility.


USE OF EXTERNAL DATA

  • "External data" is defined as data, of any type, that is not part of the provided database.
  • You are allowed to use external data in any way you wish, subject to the restriction on data from the same speaker stated in this section.
  • Use of external data is entirely optional.
  • You may exclude any parts of the provided databases if you wish.
  • Use of the provided labels, pitchmarks, etc. is optional.
  • If you are in any doubt about how to apply these rules, please contact the organisers immediately.
  • You must not use ANY additional data from the same speaker (Nancy Krebbs).

SYNTHESISING THE TEST EXAMPLES

  • No manual intervention is allowed during synthesis. This includes, but is not limited to:
    • "Prompt sculpting"
    • Altering existing entries in your lexicon (however, you are allowed to add new words)
    • Using different subsets of the database to generate different test sentences or sentence types within a single task, unless this is a fully automatic part of your system. However, it is permissible to use a different subset of the data for each task.

RETENTION OF SUBMITTED SYNTHETIC SPEECH SAMPLES

  • Any examples that you submit for evaluation will be retained by the Blizzard organisers for future use.
  • You must include in your submission of the test sentences a statement of whether you give the organisers permission to publicly distribute your waveforms and the corresponding listening test results in anonymised form. In the past, all participants have agreed to this and we strongly encourage you to give this consent.

LISTENING TEST

  • The listening test design is likely to be similar to that used in the 2010 Challenge. Depending on the number of entries for each task, the organisers may only be able to evaluate certain subsets of the synthesised sentences or certain system configurations.

PAPER

  • Each participant will be expected to submit a six-page paper describing their entry for review.
  • One of the authors of each accepted paper should present it at the Blizzard 2011 Workshop, which will be a satellite of Interspeech 2011 in Italy.
  • In addition, each participant will be expected to complete a form giving the general technical specification of their system, to facilitate easy cross-system comparisons (e.g., is it unit selection? does it predict prosody?).

HOW ARE THESE RULES ENFORCED?

  • This is a challenge, which is designed to answer scientific questions, and not a competition. Therefore, we rely on your honesty in preparing your entry.

SynSIG is a Special Interest Group of ISCA, the International Speech Communication Association.
